On 08/16/2010 09:39 PM, Dianne Hackborn wrote:
> On Mon, Aug 16, 2010 at 8:17 PM, Tim Bird <tim.b...@am.sony.com> wrote:

> The complete validation only happens the first time an .apk is scanned or if 
> it later changes.  If the package manager was actually verifying all contents 
> of the .apk at each boot, it would literally take something like 2 minutes to 
> run. :}  This often is actually the slowest part of installing an app.  (And 
> it is possibly a good target for optimization since this is just using 
> whatever the current implementation of the core library is, which is often 
> not the most optimal of code.)
> 
> If there is actually something going on that is causing all of the files to 
> be read, that should definitely be fixed.  It is not expected...  for 
> example, I have an app that runs through all of the .apks on my Nexus One, 
> opening each and loading the app's label (which requires retrieving its 
> resource table from the .apk), and it takes well less than 1 second to open 
> and load those all.  This is on a device with a good number of third party 
> apps installed in addition to all of the built-in .apks.
> 
> So there is no reason for the stuff that is loading the AndroidManifest.xml 
> from the .apk to be very slow.
> 
>> The sanity checks, IMHO, would be better placed at the time of reading the 
>> (sub)file.
>> As it is now, when the code opens a single file in an archive, every file in 
>> the
>> archive has its signature checked.  Once things are in the page cache, this 
>> happens
>> quickly, but it's still pretty wasteful.  The archive index is rebuilt in 
>> user space,
>> and the signature checks are repeated if a package is accessed again later 
>> in the
>> boot (which happens for at least Contacts.apk multiple times).
> 
> We definitely are not checking the signature of every single file in an .apk 
> at every boot.  I am sure of that...  when I first implemented the app 
> certificate stuff years ago, it made the package scan performance *horrible* 
> so I had to make sure we didn't do that unless something had changed. :}

I think we must be talking about different signatures.  I'm referring
to the one checked in frameworks/base/libs/utils/ZipFileRO.cpp,
in ZipFileRO::parseZipArchive(), at about line 255:

        localHdr = basePtr + localHdrOffset;
        if (get4LE(localHdr) != kLFHSignature) {
            LOGW("Bad offset to local header: %d (at %d)\n",
                localHdrOffset, i);
            goto bail;
        }

On Eclair, Contacts.apk is 2.3M and has 1103 sub-files, the vast
majority of which are less than 8K.  The biggest exception is the
resources file, which, it turns out, also consists of nested sub-files.

The code above walks the index, which is at the end of the archive,
but also peeks at a header word preceding each and every file in
the archive.  On my system, this results in page-faulting
approximately 1.5M of Contacts.apk into system memory, on the first
request to read AndroidManifest.xml.  Subsequent reads of the resource
file result in reading all pages of that portion of the file as well.

>> That's an interesting requirement, but I think you missed my point.
>> The multiple abstraction layers in the code are causing things to be re-read
>> multiple times.  The kernel has the capability to consolidate handling of
>> things like container boundaries, compression (or not) of individual
>> sub-elements of a package, management of meta-data for the files, and to
>> intelligently cache these operations, across processes, for performance.
>> User space can't do this as well.  Calling through multiple layers of code
>> and relying on faults to load data implicitly instead of managing it
>> explicitly with reads is inefficient, and less flexible in terms of I/O
>> scheduling.
> 
> Honestly I don't think that is a problem.  Doing an mmap of the file, getting 
> the TOC out of the end, and using that to go directly to the address where 
> the desired data is should be quite fast.
The code above does more than that.

> I don't see anything about that that would be intrinsically slow, or much 
> worse than opening files and using read() to copy their data into local RAM.  
> And using mmap() is advantageous in that it allows things that are large (such 
> as big resource tables with strings in lots of languages we don't care about) 
> to easily only have the parts we need mapped to RAM and the rest of the pages 
> available to be swapped out if needed.

Well, page faults are always inherently slower than reads, but you're right
that they shouldn't be much slower.  I'm still trying to figure out why
reading the resource table (apparently) reads every byte of that portion of
the archive, as opposed to only accessing the data sparsely, as you indicate
is the intent.
 -- Tim

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Network Entertainment
=============================

-- 
unsubscribe: android-porting+unsubscr...@googlegroups.com
website: http://groups.google.com/group/android-porting
