Abstract: Modern handheld devices often employ neural processing units (NPUs) to accelerate deep neural network (DNN) inference. Unlike the AI accelerator in a data center, the NPU of an ...