Paper Artifacts

CodeMatch: Obfuscation Won't Conceal Your Repackaged App

Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, Mira Mezini

This page lists all available artifacts for the paper and links to the source code of the implementation.

Datasets

We list the different datasets which were used in the evaluation.

The Maven APKs were used to show the robustness of our representations on known libraries. The file lists the paths under which the corresponding files can be found on Maven Central.

The list of APKs from different app stores was used to compare LibDetect against the whitelist of Li Li et al. and against LibRadar.
For each app, the list gives the app name, the number of library packages, the number of app packages, the name of the approach, and the true/false positives and true/false negatives.
The ground truth for the comparison is contained in a ZIP file.
For each app, the ZIP file records whether each package belongs to a library or not.
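
The per-app counts in the list can be aggregated into precision and recall per approach. The following is a minimal sketch of that aggregation, not the evaluation code used for the paper. The record type AppResult, its field names, and the numbers in the usage note are assumptions made for illustration; the actual column layout of the list may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AppResult:
    """One row of the APK list (hypothetical field names)."""
    app_name: str
    approach: str
    tp: int  # library packages correctly reported as library
    fp: int  # app packages wrongly reported as library
    tn: int  # app packages correctly reported as app code
    fn: int  # library packages that were missed


def precision_recall_by_approach(rows):
    """Sum the counts per approach and return {approach: (precision, recall)}."""
    totals = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for row in rows:
        totals[row.approach]["tp"] += row.tp
        totals[row.approach]["fp"] += row.fp
        totals[row.approach]["fn"] += row.fn

    scores = {}
    for approach, t in totals.items():
        reported = t["tp"] + t["fp"]  # everything flagged as library
        actual = t["tp"] + t["fn"]    # everything that is a library
        precision = t["tp"] / reported if reported else 0.0
        recall = t["tp"] / actual if actual else 0.0
        scores[approach] = (precision, recall)
    return scores
```

With two made-up rows for the same approach, for example AppResult("a", "X", 8, 2, 5, 2) and AppResult("b", "X", 2, 0, 3, 0), the function sums the counts to 10 true positives, 2 false positives, and 2 false negatives before computing the two ratios. Summing first (micro-averaging) is a design choice of this sketch; averaging per-app scores would weight small apps more heavily.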

The list of repackaging classifications contains app pairs taken from the PlayDrone Archive, a snapshot of the Google Play store, and states for each pair whether it is repackaged or not.

These datasets are made available for academic purposes only. You are not allowed to redistribute any part of these datasets. Do not distribute direct links to these files; link to this page instead.

Further Analyses

To determine the glue code threshold, we examined the instruction count distribution of 14,000 apps without libraries. For filtering purposes, we chose the peak of the apps' possible glue code plus 5% for distribution significance.
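
One possible reading of this threshold rule is sketched below: bin the instruction counts, take the most frequent bin as the peak, and add a 5% margin on top. This is an illustration under stated assumptions, not the procedure used for the paper. The function name glue_code_threshold, the bin width, and the interpretation of "plus 5%" as a multiplicative margin on the peak value are all assumptions.

```python
from collections import Counter


def glue_code_threshold(instruction_counts, bin_width=10, margin=0.05):
    """Return the centre of the most frequent bin, enlarged by `margin`.

    instruction_counts: one instruction count per app (assumed input).
    bin_width and margin are illustrative defaults, not values from the paper,
    except that margin=0.05 mirrors the 5% mentioned in the text.
    """
    if not instruction_counts:
        raise ValueError("need at least one instruction count")

    # Histogram: map each count to the index of its bin.
    bins = Counter(count // bin_width for count in instruction_counts)

    # The peak is the bin with the highest frequency.
    peak_bin, _ = bins.most_common(1)[0]
    peak_value = peak_bin * bin_width + bin_width / 2

    return peak_value * (1 + margin)
```

For the toy input [12, 15, 18, 11, 40, 95] with the default bin width, the bin covering 10 to 19 is the peak, so the sketch returns its centre (15) enlarged by 5%.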

To determine whether an app is obfuscated, we measured the distribution of method-name lengths in truly non-obfuscated projects, such as the Java Class Library, and compared it to the distribution in over 14,000 random apps taken from the Google Play Store.
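
The comparison of method-name length distributions can be sketched as follows. This page does not state which comparison measure was used, so the sketch substitutes the total variation distance between the two normalized length histograms as a stand-in. The function names and the sample method names in the usage note are assumptions made for illustration.

```python
from collections import Counter


def length_distribution(method_names):
    """Map each method-name length to its relative frequency."""
    counts = Counter(len(name) for name in method_names)
    total = sum(counts.values())
    return {length: c / total for length, c in counts.items()}


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions.

    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    lengths = set(p) | set(q)
    return 0.5 * sum(abs(p.get(n, 0.0) - q.get(n, 0.0)) for n in lengths)
```

As a usage example, a reference set such as ["toString", "hashCode", "equals", "getName"] and a set of short, obfuscation-style names such as ["a", "b", "c", "aa"] share no name lengths, so their distance is maximal. An app whose distance to the non-obfuscated reference exceeds some cutoff could then be flagged as likely obfuscated; the cutoff itself would have to be derived from the data.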