AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
Sun Hao

Source: MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, October 2-6, 2023

Original paper: AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments

Content at a glance

  • Topic

    Generate models for diverse and dynamic edge environments.

  • Background

    It is increasingly common to deploy deep learning models to edge devices for latency and privacy reasons. To ensure stable service quality across diverse edge environments, it is highly desirable to generate model architectures tailored to different conditions. However, conventional pre-deployment model generation approaches fall short: they struggle to handle the diversity of edge environments and require collecting information from the edge.

  • Key idea

    Let the model adapt itself to the target environment after deployment, the so-called "post-deployment" approach.
    Benefits: the quality of model architectures can be measured more precisely in the target environment, and user privacy is better protected because no edge information needs to be collected.

Figure 1: Comparison of pre-deployment and post-deployment model generation approaches.
  • Challenges

    1. Generating the model search space for edge devices is difficult.
    2. The model performance evaluation process can be time-consuming at the edge.
  • Techniques

    1. On-cloud model elastification

    The on-cloud model elastification mainly consists of a granularity-aware graph expansion step and a distillation-based training step. The graph expansion step discovers the repeating basic blocks and adds optional branches to the model to extend it into a supernet, which includes branches that can replace multiple original layers as well as structured-pruned branches that reduce the computational cost of individual layers. The training step uses branch-wise distillation to efficiently train the newly added branches to mimic the original branches, followed by whole-graph fine-tuning to further improve the overall accuracy of the subnets.

    Figure 2: Supernet architecture.
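The elastification idea above can be illustrated with a minimal sketch. All names, the 0.6 pruning cost factor, and the data layout are hypothetical, used only to show how adding optional branches per block turns a fixed model into a supernet whose subnets trade latency for accuracy; this is not the paper's actual implementation.

```python
from itertools import product

def expand_to_supernet(blocks):
    """Give each original block an extra, cheaper structured-pruned branch.

    `blocks` is a list of (name, latency_ms) pairs; the 0.6 cost factor for
    the pruned branch is an illustrative assumption.
    """
    return [
        [
            {"name": name, "latency_ms": lat},                    # original branch
            {"name": name + "_pruned", "latency_ms": lat * 0.6},  # pruned branch
        ]
        for name, lat in blocks
    ]

def subnets(supernet):
    """Enumerate every subnet: one branch choice per block position."""
    for choice in product(*supernet):
        names = [c["name"] for c in choice]
        latency = sum(c["latency_ms"] for c in choice)
        yield names, latency
```

With two blocks and two branches each, this yields four subnets spanning a range of latencies; the real supernet additionally contains merged branches replacing several consecutive blocks, which is omitted here for brevity.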

    2. On-device subnet search

    AdaptiveNet builds a latency model by profiling the blocks of the supernet to precisely estimate subnet latency in the target environment. Based on this latency model, the search strategy initializes a set of promising candidate models and iteratively mutates the candidates around the latency budget. Search efficiency is further improved by reusing common intermediate features during candidate model evaluation. A runtime monitor also adaptively updates the deployed model to handle environment dynamics.
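The on-device search described above can be sketched as follows. This is a simplified, hypothetical version: the latency model is reduced to a per-branch lookup table built by profiling, the accuracy predictor is passed in as a callable, and the mutation loop keeps a single candidate rather than a population; feature reuse and the runtime monitor are omitted.

```python
import random

def estimate_latency(subnet, block_latency):
    """Latency model: sum of per-branch latencies profiled on-device."""
    return sum(block_latency[b] for b in subnet)

def mutate(subnet, branches):
    """Swap one randomly chosen position for an alternative branch."""
    i = random.randrange(len(subnet))
    child = list(subnet)
    child[i] = random.choice(branches[i])
    return child

def search(branches, block_latency, accuracy, budget_ms, iters=200, seed=0):
    """Mutate candidates around the latency budget, keeping the best one.

    `branches[i]` lists the branch names available at position i;
    `accuracy` is an (assumed) accuracy estimator for a subnet.
    """
    random.seed(seed)
    # start from the cheapest subnet so the initial candidate is under budget
    best = [min(opts, key=lambda b: block_latency[b]) for opts in branches]
    best_acc = accuracy(best)
    for _ in range(iters):
        cand = mutate(best, branches)
        if estimate_latency(cand, block_latency) <= budget_ms:
            cand_acc = accuracy(cand)
            if cand_acc > best_acc:
                best, best_acc = cand, cand_acc
    return best, best_acc
```

In the real system the accuracy of a candidate is measured by running it on-device, which is why reusing intermediate features shared between candidates matters for search efficiency.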

Comments

  1. Targeting the difficulty of deploying DL models on edge devices with differing environments, this work proposes dynamically adjusting the model after deployment. The overall idea is novel and practical, and offers an alternative, besides collaborative computing, for deploying complex models on edge devices.
  2. To address the problems in existing model-adaptation work, the authors design several techniques that improve both the flexibility and the efficiency of model search; the techniques are well-targeted and intuitive.
  3. The paper is logically rigorous and clearly structured; it avoids excessive detail while still describing the content clearly.
  4. The baseline search algorithms in the evaluation are somewhat ambiguous, leaving room for improvement.
