Re: [Distutils] Design rationale for the egg format ?
At 01:25 PM 6/15/2010 +0900, David Cournapeau wrote:
> On Tue, Jun 15, 2010 at 10:36 AM, P.J. Eby p...@telecommunity.com wrote:
>> As I said above, it *also* scales better for performance -- i.e., it's a secondary concern.
>
> ok.
>
>> The #1 reason for separating metadata files is that it makes the addition of new metadata much easier than maintaining a single monolithic format.
>
> Do you still think it is true today ? I am asking because AFAIK, there aren't many packages besides setuptools which use those metadata ?

That depends on what you mean besides setuptools. Entry points, for example, are used by various apps and frameworks that implement plugins, and these in turn are used by app and plugin developers. There's also a package (EggTranslations I think?) that uses metadata files for i18n discovery, allowing plugins to provide translations for an app, or translations for other plugins.

So, it's true that it's not very common that a library or app would directly use metadata files -- in general, you'll go through an intermediary such as setuptools or EggTranslations... or even a third level, such as an app framework that then uses setuptools internally to implement a plugin API.

> I don't mean to criticize the design, just to see if you would do things differently today.

Oh, many things. But separating metadata files is definitely NOT one of them. As you might notice, PEP 376 and Distutils2 build even further on this pattern, with roughly the same rationale(s).

___
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Design rationale for the egg format ?
>> I don't mean to criticize the design, just to see if you would do things differently today.
>
> Oh, many things.

That would make for an interesting read.

Regards
[Distutils] Design rationale for the egg format ?
Hi,

I have a few questions about the egg format implementation, and was hoping people who designed it could answer them:

- why does the filename encode some metadata (which python version if the package contains extensions, platform specifier) ?
- why are the metadata split into files instead of one single metadata file ?

I tried to find a rationale for those, but could not find much info on distutils-sig or setuptools doc,

thanks,

David
Re: [Distutils] Design rationale for the egg format ?
On 14 June 2010 07:59, David Cournapeau courn...@gmail.com wrote:
> Hi, I have a few questions about the egg format implementation, and was hoping people who designed it could answer them:
> - why does the filename encode some metadata (which python version if the package contains extensions, platform specifier) ?

I'm not one of the designers, nor an expert, but I believe that this is so that the basic metadata can be obtained as part of an initial listdir(), without needing to open and read a file at all - so essentially it's to reduce the number of OS calls needed in certain key cases.

Paul.
Re: [Distutils] Design rationale for the egg format ?
On Mon, Jun 14, 2010 at 4:34 PM, Paul Moore p.f.mo...@gmail.com wrote:
> On 14 June 2010 07:59, David Cournapeau courn...@gmail.com wrote:
>> Hi, I have a few questions about the egg format implementation, and was hoping people who designed it could answer them:
>> - why does the filename encode some metadata (which python version if the package contains extensions, platform specifier) ?
>
> I'm not one of the designers, nor an expert, but I believe that this is so that the basic metadata can be obtained as part of an initial listdir(), without needing to open and read a file at all - so essentially it's to reduce the number of OS calls needed in certain key cases.

Do you happen to know what those key cases are ?

cheers,

David
Re: [Distutils] Design rationale for the egg format ?
David Cournapeau courn...@gmail.com writes:
> I have a few questions about the egg format implementation, and was hoping people who designed it could answer them:
> […]
> I tried to find a rationale for those, but could not find much info on distutils-sig or setuptools doc,

You might find it useful to read this archived thread:

> I've noticed that there seems to be a lot of confusion out there about what setuptools is and/or does, at least among Python-Dev folks, so I thought it might be a good idea to give an overview of its structure, so that people have a better idea of what is and isn't magic.
> […]

URL:http://mail.python.org/pipermail/python-dev/2006-April/064145.html

It goes into some amount of detail on the background and rationale for Setuptools and its early design decisions.

-- 
 \      “Sane people have an appropriate perspective on the relative |
  `\     importance of foodstuffs and human beings. Crazy people can't |
_o__)    tell the difference.” —Paul Z. Myers, 2010-04-18 |
Ben Finney
Re: [Distutils] Design rationale for the egg format ?
At 03:59 PM 6/14/2010 +0900, David Cournapeau wrote:
> Hi, I have a few questions about the egg format implementation, and was hoping people who designed it could answer them:
> - why does the filename encode some metadata (which python version if the package contains extensions, platform specifier) ?

As Paul mentioned, it's so that discovery can happen without needing anything more than a listdir(). (A legacy of eggs' origin as a binary plugin format.)

> - why are the metadata split into files instead of one single metadata file ?

Because that's simpler than trying to define a single universal file format that's forward and backward-compatible with every possible feature and use case. Each use case can have an optimized file format.

It also scales better for performance when you have multiple things you might (or might not) be reading. For example, since entry points are separate from dependencies, you don't need to read the dependencies from an egg that doesn't have an entry point you're scanning for.
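As an illustration of the listdir()-only discovery described in this message, here is a minimal sketch. The regular expression is a simplified assumption based on the `name-version[-pyX.Y[-platform]].egg` naming convention, not setuptools' own code, and the `scan` helper and the example filenames are hypothetical.

```python
import re

# Simplified pattern for "name-version[-pyX.Y[-platform]].egg".
# Assumption: modelled on the egg naming convention, not copied from setuptools.
EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+)(?:-(?P<platform>.+))?)?\.egg$"
)


def scan(filenames, pyver, platform):
    """Return (name, version) pairs for eggs usable on the given Python
    version and platform, looking only at the filenames (no file is opened)."""
    usable = []
    for filename in filenames:
        match = EGG_NAME.match(filename)
        if match is None:
            continue  # not an egg at all
        if match["pyver"] not in (None, pyver):
            continue  # built for a different Python version
        if match["platform"] not in (None, platform):
            continue  # built for a different platform
        usable.append((match["name"], match["version"]))
    return usable


# Hypothetical directory listing, e.g. the result of os.listdir(plugin_dir).
listing = [
    "foo-1.0-py2.6.egg",
    "foo-1.0-py2.5-win32.egg",
    "bar-0.3-py2.6-linux-x86_64.egg",
    "README.txt",
]
candidates = scan(listing, "2.6", "linux-x86_64")
```

Because the Python version and platform are part of the name, builds of the same project for different configurations can also sit side by side in one directory without clashing.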
Re: [Distutils] Design rationale for the egg format ?
At 08:29 AM 6/15/2010 +0900, David Cournapeau wrote:
> On Mon, Jun 14, 2010 at 11:28 PM, P.J. Eby p...@telecommunity.com wrote:
>> At 03:59 PM 6/14/2010 +0900, David Cournapeau wrote:
>>> - why are the metadata split into files instead of one single metadata file ?
>>
>> Because that's simpler than trying to define a single universal file format that's forward and backward-compatible with every possible feature and use case. Each use case can have an optimized file format.
>>
>> It also scales better for performance when you have multiple things you might (or might not) be reading. For example, since entry points are separate from dependencies, you don't need to read the dependencies from an egg that doesn't have an entry point you're scanning for.
>
> What I am interested in is the exact situations where this happens (there is the case where eggs are used as plugins, the case where eggs are namespace packages, etc...). For example, I don't quite understand why reading dependencies needs to be fast (it does not matter at install time, so I guess I am missing some use cases) ?

As I said above, it *also* scales better for performance -- i.e., it's a secondary concern.

The #1 reason for separating metadata files is that it makes the addition of new metadata much easier than maintaining a single monolithic format. That is, programs that don't understand new metadata don't have to read it. Plugins that write metadata don't need to co-operate with others - they can just write their own files. And so on.

That is the original reason for making separate metadata files: i.e. simplicity. It just turned out to also provide a performance benefit in the case of cross-egg scanning for distinct types of metadata.
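The cross-egg scanning case mentioned here can be sketched as follows. This is a rough illustration under stated assumptions, not how setuptools implements it: it assumes unpacked egg directories with an `EGG-INFO` subdirectory and an INI-style `entry_points.txt`, and the `find_entry_points` helper and the group name passed to it are hypothetical.

```python
import configparser
import os


def find_entry_points(egg_dirs, group):
    """Collect the entry points declared under `group` across several eggs.

    Only EGG-INFO/entry_points.txt is ever opened; other metadata files such
    as requires.txt are never read, and eggs without entry points cost a
    single existence check.
    """
    found = {}
    for egg in egg_dirs:
        path = os.path.join(egg, "EGG-INFO", "entry_points.txt")
        if not os.path.isfile(path):
            continue  # this egg declares no entry points at all
        parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
        parser.optionxform = str  # keep entry point names case-sensitive
        parser.read(path)
        if parser.has_section(group):
            found.update(parser.items(group))
    return found
```

A tool that defines a new kind of metadata would follow the same shape: write its own file next to the existing ones and scan only for that file, without touching or understanding the others.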
Re: [Distutils] Design rationale for the egg format ?
Paul Moore wrote:
> On 14 June 2010 07:59, David Cournapeau courn...@gmail.com wrote:
>> - why does the filename encode some metadata (which python version if the package contains extensions, platform specifier) ?
>
> I believe that this is so that the basic metadata can be obtained as part of an initial listdir()

It would also seem useful to be able to keep a collection of versions of a package for different python configurations in the same directory without their names clashing.

-- 
Greg
Re: [Distutils] Design rationale for the egg format ?
On Tue, Jun 15, 2010 at 10:36 AM, P.J. Eby p...@telecommunity.com wrote:
> As I said above, it *also* scales better for performance -- i.e., it's a secondary concern.

ok.

> The #1 reason for separating metadata files is that it makes the addition of new metadata much easier than maintaining a single monolithic format.

Do you still think it is true today ? I am asking because AFAIK, there aren't many packages besides setuptools which use those metadata ? I don't mean to criticize the design, just to see if you would do things differently today.

cheers,

David