Hi,

There has been a bit of inevitable FUD with phar.  Although the manual
(http://php.net/phar) describes a fair amount of how phar works, the
design decisions are not documented.

Originally, the phar stream wrapper was a userspace thing.  Davey Shafik
designed it to take advantage of a neat loophole in the design of the
tar file format so that a valid tar could be run by PHP without needing
to have the phar stream wrapper loaded.  This was great until I started
using it to run the PEAR Installer.  The performance hit was tremendous,
as every newly included file required scanning the archive, header by
header, until we found the needed file.  Worst case, it meant loading
megabytes of information just to locate a file.  The zip file format has
the same limitation - the entire archive needs to be scanned.
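
To make the cost concrete, here is a rough sketch in Python (not PHP, and
not the original userspace wrapper) using the standard tarfile module.  The
file names and the helper names build_tar and headers_scanned are made up
for illustration.  It counts how many tar headers are read before a wanted
member is found:

```python
import io
import tarfile


def build_tar(files):
    """Pack a {name: bytes} mapping into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def headers_scanned(archive, wanted):
    """Walk the tar headers in order; return how many were read to find `wanted`."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for count, member in enumerate(tar, start=1):
            if member.name == wanted:
                return count
    raise KeyError(wanted)


# Hypothetical archive of 100 small scripts.
files = {f"file{i}.php": b"<?php // %d" % i for i in range(100)}
archive = build_tar(files)

first = headers_scanned(archive, "file0.php")   # 1 header read
last = headers_scanned(archive, "file99.php")   # 100 headers read
print(first, last)
```

The cost of locating a file grows with its position in the archive, which
is the behaviour described above for every include.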

Neither of these formats was designed for random access in the way a
traditional filesystem is.  In fact, I could not find an example of an
archive format that is designed for this.

As such, borrowing from the design of disk filesystems, I created a new
format that is very small and processes very quickly.  It is so much
faster that I can't detect a difference in performance between running
the PEAR installer off of the disk and running it out of a phar.  I am
sure there is a difference that apache benchmark would detect because of
the extra load-in time of the file manifest.

The way phar works now is that a file manifest sits at the start of the
phar archive (similar to a directory file in traditional filesystems).
Each file has a manifest entry containing the file name, the size of the
file, and its offset into the archive, plus some flags and optional
meta-data.  The manifest is currently limited in size to 1 MB, so some
applications probably could not be packaged as a phar under the current
design.
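
The manifest idea can be sketched as a toy model in Python.  The byte
layout below is an assumption made up for illustration and is NOT the real
phar on-disk format: flags and meta-data are left out, and build_archive,
load_manifest, and read_file are hypothetical names.  The point is only
that one up-front read of the manifest turns every later lookup into a
single seek:

```python
import io
import struct

MAX_MANIFEST = 1024 * 1024  # mirrors the 1 MB manifest limit mentioned above


def build_archive(files):
    """Toy layout: u32 manifest length, u32 entry count, entries, then file data.

    Each entry is: u16 name length, name, u32 size, u32 offset into the data area.
    """
    entries = b""
    offset = 0
    for name, data in files.items():
        raw = name.encode("utf-8")
        entries += struct.pack("<H", len(raw)) + raw
        entries += struct.pack("<II", len(data), offset)
        offset += len(data)
    manifest = struct.pack("<I", len(files)) + entries
    return struct.pack("<I", len(manifest)) + manifest + b"".join(files.values())


def load_manifest(fp):
    """Read the manifest once; return ({name: (size, offset)}, data_start)."""
    (length,) = struct.unpack("<I", fp.read(4))
    if length > MAX_MANIFEST:
        raise ValueError("manifest too large")
    blob = fp.read(length)
    (count,) = struct.unpack_from("<I", blob, 0)
    pos = 4
    table = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        size, offset = struct.unpack_from("<II", blob, pos)
        pos += 8
        table[name] = (size, offset)
    return table, 4 + length


def read_file(fp, table, data_start, name):
    """Jump straight to the file contents; no scanning of other entries."""
    size, offset = table[name]
    fp.seek(data_start + offset)
    return fp.read(size)


files = {"a.php": b"<?php echo 'a';", "lib/b.php": b"<?php echo 'b';"}
fp = io.BytesIO(build_archive(files))
table, data_start = load_manifest(fp)
contents = read_file(fp, table, data_start, "lib/b.php")
print(contents)
```

The trade-off is the one noted above: the manifest has to be loaded before
anything else, so its size is paid once per archive open.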

Each phar has a loader stub, which can be any PHP code but must contain
the __HALT_COMPILER(); token.  This makes it possible to create phars
that also contain PHP_Archive, so that they work even where the phar
extension is disabled.  It is the loader stub that makes it possible to
run a phar with plain vanilla PHP.
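
As a rough illustration of how the token separates code from data, the
Python sketch below splits a byte string at the first __HALT_COMPILER();
occurrence.  The helper name split_stub and the sample bytes are
hypothetical, and the sketch ignores whatever rules the real format has
about what may follow the token.  In PHP itself the equivalent position is
exposed as the __COMPILER_HALT_OFFSET__ constant.

```python
HALT_TOKEN = b"__HALT_COMPILER();"


def split_stub(raw):
    """Return (stub, payload): PHP stops parsing at the token, the rest is data."""
    pos = raw.find(HALT_TOKEN)
    if pos == -1:
        raise ValueError("no __HALT_COMPILER(); token found")
    end = pos + len(HALT_TOKEN)
    return raw[:end], raw[end:]


# Hypothetical archive: a tiny loader followed by opaque archive bytes.
raw = b"<?php echo 'loader'; __HALT_COMPILER();" + b"\x00\x01manifest-and-files"
stub, payload = split_stub(raw)
print(stub)
print(payload)
```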

I see two possible solutions to the concerns raised by others.

1) don't worry, be happy
2) re-design the phar file format such that it is a tar again, and put
the manifest for quick loading in one of the first files of the tar archive.
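
For comparison, option 2 could look roughly like the Python sketch below.
Everything specific here is an assumption for illustration: the member name
.manifest, the JSON encoding, the fixed 512-byte manifest size, and the
helper names.  It relies only on the ustar layout of one 512-byte header
per member followed by data padded to a multiple of 512 bytes.

```python
import io
import json
import tarfile

BLOCK = 512
MANIFEST_NAME = ".manifest"   # hypothetical name for the first member
MANIFEST_SIZE = BLOCK         # fixed size so offsets can be computed up front


def padded(size):
    """Round a data size up to the next 512-byte tar block boundary."""
    return (size + BLOCK - 1) // BLOCK * BLOCK


def build_tar_with_manifest(files):
    """Write a ustar archive whose first member maps names to [data offset, size]."""
    offset = BLOCK + MANIFEST_SIZE            # skip manifest header + manifest data
    manifest = {}
    for name, data in files.items():
        manifest[name] = [offset + BLOCK, len(data)]   # data follows its header
        offset += BLOCK + padded(len(data))
    body = json.dumps(manifest).encode("ascii")
    if len(body) > MANIFEST_SIZE:
        raise ValueError("manifest does not fit in the reserved space")
    body = body.ljust(MANIFEST_SIZE)          # pad with spaces to the fixed size

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in [(MANIFEST_NAME, body)] + list(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(archive, name):
    """Use the manifest to slice a member out directly, without walking headers."""
    manifest = json.loads(archive[BLOCK:BLOCK + MANIFEST_SIZE])
    offset, size = manifest[name]
    return archive[offset:offset + size]


files = {"a.php": b"<?php echo 'a';", "lib/b.php": b"<?php echo 'b';"}
tar_bytes = build_tar_with_manifest(files)
direct = read_member(tar_bytes, "lib/b.php")
print(direct)
```

The result is still an ordinary tar that standard tools can list and
extract, at the cost of keeping the manifest in sync with the members.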

If I had thought #2 was a good idea, I would have already done it, so
there is my opinion.

One basic assumption I would like to raise here is that nobody is going
to download a .phar archive who does not already have PHP.  Does this
assumption sound sane?

If so, I would like to provide some simple scripts for unpharring and
repharring a .phar archive.  This is not hard to do with a 5-line PHP
script.

One big question I do have, though, is for xdebug (hi Derick) and
designers of IDEs: it would be good to ensure that it is possible to
step through a phar, or even to dump the source line with an error
message.  These, to me, seem to be the most pressing disadvantages of
phar currently - it becomes much harder to debug a problem in a PHP
script when it is stuffed into a phar.

Thanks,
Greg
