[issue15295] Import machinery documentation
Phillip J. Eby added the comment:

Hope I'm not too late to the bikeshed painting party; just wanted to chip in with the suggestion of "self-contained package" for non-namespace packages (i.e., a self-contained package is one that cannot be split across different sys.path entries due to its use of an __init__ module). Also, technically, namespace portions do not only contribute subpackages; they can contribute modules as well.

Another point of possible confusion: meta finders and path finders are both described as "hooks", but this terminology seems in conflict with the use of "hook" as describing a callable on path_hooks. Perhaps we could drop the term "hook" from this section, retitle it "Import System Extensions", and say you can extend the system by writing meta finders and path entry finders. This would let the term "hook" be the exclusive property of path_hooks, which is how you extend the import system to use your custom finders.

The statement about __path__ being a list is also out of date; as of PEP 420, it can be an immutable, iterable object. Specification-wise, __path__ need only be a re-iterable object, and people reading its value must NOT assume that it's a list or even indexable.

The term "sys path finder" should also be replaced by "path entry finder". The former term is both incorrect and misleading, as it implies both that such a finder actually searches sys.path and that it is exclusive to sys.path. Path entry finders are used to look for modules within a location specified by a path entry: a string found in sys.path or in a __path__ attribute.

The term "path importer" is also horribly confusing in context... after some reflection, I suggest the following additional terminology changes:

1. Replace "meta path finder" with "import handler"
2. Replace "path importer" with "sys.path import handler"

Now we can say that you extend the import system by adding import handlers to sys.meta_path, and that one of the default handlers is the sys.path import handler, which processes imports using sys.path and module __path__ attributes.

The sys.path import handler can of course in turn be extended by adding path hooks to sys.path_hooks, which are used to create module finder objects for the path entry strings found in sys.path and module __path__ attributes. A path hook must return a finder object, which implements similar methods to those of an import handler, but with some important differences.

Whew. It's a bit of a mouthful, but I think that this set of terms would keep all the roles and functions clear, along with their relationships to one another. In addition, I think it provides greater clarity as to which pieces you need to extend when, why, and how.

What do you think?

--
nosy: +pje

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15295

Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
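The point about __path__ not being a list can be checked with a small experiment. The following is a minimal sketch, assuming a Python version that implements PEP 420; the package name nspath_demo_pkg and the temporary directory are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name. The directory has no __init__.py, so under
# PEP 420 it is imported as a namespace package.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "nspath_demo_pkg"))
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        pkg = importlib.import_module("nspath_demo_pkg")
        # Safe usage: iterate (as often as needed) rather than index or mutate.
        first_pass = list(pkg.__path__)
        second_pass = list(pkg.__path__)
        path_is_list = isinstance(pkg.__path__, list)
    finally:
        sys.path.remove(root)
        sys.modules.pop("nspath_demo_pkg", None)
```

Code that copies the value with list(pkg.__path__) before working on it keeps working whether the package is a namespace package or one with an __init__ module.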
[issue15295] Import machinery documentation
Nick Coghlan added the comment: We changed quite a bit already as we tried to make everything consistent, including the importlib ABCs. Current version is on trunk, current discussion is in #15502
[issue15295] Import machinery documentation
Brett Cannon added the comment: While saying "default path importer" vs. "meta path finder" somewhat muddles the term "importer", it definitely gets the point across that PathFinder does a lot more than any other default meta path finder. While _we_ might know that import does nothing more than call a method on sys.meta_path and has no concept of sys.path and friends, most people will consider the default path importer as part of import's semantics and thus not make the distinction. IOW I like Nick's suggestion.
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 29, 2012, at 05:10 AM, Nick Coghlan wrote:

> I would title the new section "Import system" rather than "Import machinery" as it is meant to be a specification documentation rather than an implementation description.

"Import system" it is.

> The statement that "from X import A" only performs a single import lookup is incorrect. The trick is that if A, B or C refers to a submodule of X then it will be imported.

I think I see where you and Eric are coming from on this. Actually, I don't think I changed the existing text in this regard, but probably once I refactored out all the details, it reads in such a way as to be confusing. I've tweaked the text under the import statement to hopefully be more clear. It could probably still use improvement.
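The behaviour Nick describes, where a name in the from list that refers to a submodule triggers a further import, can be observed directly. This is a minimal sketch; the package fromlist_demo_pkg and its submodule sub are hypothetical names created in a temporary directory just for the demonstration.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    pkg_dir = os.path.join(root, "fromlist_demo_pkg")
    os.mkdir(pkg_dir)
    # An empty __init__.py: the package itself never imports its submodule.
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
        f.write("VALUE = 1\n")
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        import fromlist_demo_pkg
        # Importing the package alone does not load the submodule.
        loaded_before = "fromlist_demo_pkg.sub" in sys.modules
        # The from list names a submodule, so a second import happens here.
        from fromlist_demo_pkg import sub
        loaded_after = "fromlist_demo_pkg.sub" in sys.modules
    finally:
        sys.path.remove(root)
```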
[issue15295] Import machinery documentation
Eric Snow added the comment: Part of the problem with the import nomenclature is that PEP 302 doesn't really nail it down and mixes the terms up a bit. This is understandable considering it broke ground in some regards. However, at this point we have a more comfortable relationship with the import system. Would it be feasible to lightly update PEP 302 to have a more concrete and consistent use of import terminology?
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 29, 2012, at 06:09 AM, Nick Coghlan wrote:

> runpy, pkgutil, et al should all get "See Also" links at the top pointing to the new import system section.

I've put an XXX in the import.rst file for this, but I probably won't get to adding all the cross references. Others can take that on once this lands.

> Opening section should refer to importlib.import_module(). Any mentions of __import__ should point out that its API is designed for the convenience of the interpreter, and thus it's a pain to use directly. However, we should also document explicitly that, unlike the import statement and calling __import__ directly, importlib.import_module will ignore any module level replacements of __import__.
>
> Replacing builtins.__import__ won't reliably override the entire import system (due to module level __import__ definitions, most notably importlib.__import__) and other tools that work with the process global import state directly (e.g. pkgutil, runpy).

While I've added a mention of import_module() in several places, I don't think the above detail is appropriate for the introduction. I don't want to overload folks with all those details before they understand the basics of Python's import model.

I would much rather add a section that goes into more detail about coarsely overriding the import system, and there we can discuss replacing built-in __import__() along with its implications and caveats, including any behavior changes in Python 3.3 with the adoption of importlib. I probably won't get to that so feel free to add such a section later.

> Since we have the privilege of defining *the* standard terminology for old-style packages, I suggest we use the term "initialised packages" (since having an __init__.py is what makes them special). We should also note explicitly that an initialised package can also behave as a namespace package, by setting __path__ appropriately in __init__.py

I don't like the term "initialized package" (even with the Americanized spelling :), because the term "initialized" means "set to the value or put in the condition appropriate to the start of an operation", which clearly refers to both types of packages.

What about "concrete package"? In a sense, namespace packages are virtual, so the opposite of that would be concrete. OTOH, while "regular package" may still not be the right term, it might be good enough. The bike shed is already looking pretty tie-dyed.

> Finally, I think this section needs to explicitly define the terms *import path* and *path entry*. The meta path docs later refer to find_module() accepting a module name and path, and the reader could be forgiven for thinking that meant a filesystem path, when it is actually an import path (which is a sequence of path entries, which may themselves be filesystem paths).

This is getting somewhere. I like using the term "path importer" for the thing that PathFinder is. ("path finder" doesn't quite do it for me, but maybe I'm clouded by the same term used as a car model. ;)

What we have are several default finders: one that knows how to locate frozen modules, one that knows how to locate built-in modules, and one that knows how to search an import path (which consists of path entries). This latter finder is the path importer and it has further extensibility so that new types of path entries can be used. A path entry is a location to search for modules, and sequences of path entries exist on import paths. When a search occurs, typically the import path is taken from sys.path, but for subpackages, it is taken from the __path__ attribute of its parent package.

This seems to make for much better reading, and while I've worded it differently to fit better in the flow of the documentation, it's terminology that feels more right to me. Thanks!
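The three default finders and the two sources of an import path mentioned in this message can be inspected from Python. The sketch below assumes a standard CPython where the default finders are exposed as importlib.machinery.BuiltinImporter, FrozenImporter and PathFinder; the json package is used only as a convenient example.

```python
import importlib
import importlib.machinery as machinery
import sys

# The built-in, frozen, and path-based finders normally sit on sys.meta_path.
default_finders = (
    machinery.BuiltinImporter,
    machinery.FrozenImporter,
    machinery.PathFinder,
)
present = [finder in sys.meta_path for finder in default_finders]

# A top-level import searches sys.path; a submodule import searches the
# __path__ of the parent package instead.
top_level = importlib.import_module("json")
submodule = importlib.import_module(".decoder", package="json")
package_path = list(top_level.__path__)
```

importlib.import_module() accepts either an absolute name or, as here, a relative name plus the anchoring package.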
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 30, 2012, at 09:41 PM, Brett Cannon wrote:

> As for the diagram(s), I have attached the overall PDF that I still have from my original OmniGraffle file (which I don't have a license to anymore) that I built my PyCon 2008 presentation with. It's probably outdated at this point. I will have to redo them for my PyCon Argentina/Brasil (maybe US?) import talks anyway.

Thanks. This isn't quite the level I was looking for, but we can add a diagram later.

(I think I've improved the discussion on __package__ based on your feedback and PEP 366. Thanks!)
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 31, 2012, at 12:28 AM, Eric Snow wrote:

>> You ask in [2] whether "path importer" refers specifically to the callables on sys.path_hooks. Can you cite a reference for this? I found one reference in PEP 302 to "path importer" but it's hard to tell exactly what that is referring to.
>
> Unfortunately not. There aren't many people that use import hook terminology and I already have a terrible memory. :) Regardless, I find "path importer" a little too ambiguous.

Dang. I've grown to really like "path importer" for the thing on sys.meta_path that provides sys.path and related functionality. It seems appropriate given the observation that we're talking about sys.path or __path__ and what this thing does is manage that corner of the import subsystem.

Thinking about Nick's suggestion then, the callables on sys.path_hooks would be "path entry finders" since they are given a chance to find modules for each entry on the import path (be it sys.path or __path__).

I think this terminology holds together well, and I think I'm going to land it as such. Then we can promote this terminology as we talk about the import system in other documentation.

>>> * (glossary.rst) sys path finder: having "sys" is a nice touch, making it more distinct and more explicit.
>>
>> TBH, I'm not crazy about the term "sys path finder" either but I couldn't think of anything better.
>
> What don't you like about the "sys path thingee" names? I find them to be nice and explicit. I'll mull this over some more.

Nick put his finger on it. "sys path" implies that only sys.path is involved, whereas __path__ is also involved.

>>> * (import_machinery.rst) Meta path loaders, end of paragraph 2: The finder could also be a classmethod that returns an instance of the class.
>>
>> I don't understand what you're suggesting here.
>
> Yeah, that was poorly worded. I'd meant to suggest that you could document the alternative to find_module() and load_module() being regular methods of the same object. For instance:
>
>     class MyMetaHook:
>         @classmethod
>         def find_module(cls, name, path=None):
>             return cls()
>         def load_module(self, name):
>             raise ImportError("You lose!")
>
> Thus, the finder is the class, and the loader is the instance.

While true, it's not required in the specification, so I'd like to leave this out. Smart Pythonistas can figure details like that out.
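For readers who want to try the "class is the finder, instance is the loader" arrangement, here is a minimal sketch. It is not the code from the thread: it assumes a later Python in which meta path finders implement find_spec() and loaders implement exec_module() rather than find_module()/load_module(), and the names DemoMetaFinder and demo_virtual_module are hypothetical.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class DemoMetaFinder(importlib.abc.Loader):
    """The class acts as the meta path finder; an instance acts as the loader."""

    TARGET = "demo_virtual_module"  # hypothetical module name

    @classmethod
    def find_spec(cls, name, path=None, target=None):
        if name != cls.TARGET:
            return None  # let the remaining meta path finders try
        # Hand back a spec whose loader is a fresh instance of this class.
        return importlib.util.spec_from_loader(name, cls())

    def create_module(self, spec):
        return None  # fall back to default module creation

    def exec_module(self, module):
        module.origin_note = "created by DemoMetaFinder"


# The class itself, not an instance, goes on sys.meta_path.
sys.meta_path.insert(0, DemoMetaFinder)
try:
    module = importlib.import_module(DemoMetaFinder.TARGET)
finally:
    sys.meta_path.remove(DemoMetaFinder)
```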
[issue15295] Import machinery documentation
Eric Snow added the comment: Well, I'm more -0 than -1 on "path importer", though I do like "default path importer" better. As to the rest, sounds good to me.
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

I think I was unclear in my previous follow up. Here are the objects involved, taken from the glossary.

import path
    A list of locations (or :term:`path entries <path entry>`) that are searched by the :term:`path importer` for modules to import. During import, this list of locations usually comes from :data:`sys.path`, but for subpackages it may also come from the parent package's ``__path__`` attribute.

meta path finder
    A finder returned by a search of :data:`sys.meta_path`. Meta path finders are related to, but different from :term:`path entry finders <path entry finder>`.

path entry
    A single location on the :term:`import path` which the :term:`path importer` consults to find modules for importing.

path entry finder
    A :term:`finder` returned by a callable on :data:`sys.path_hooks` (i.e. a :term:`path entry hook`) which knows how to locate modules given a :term:`path entry`.

path entry hook
    A callable on the :data:`sys.path_hooks` list which returns a :term:`path entry finder` if it knows how to find modules on a specific :term:`path entry`.

path importer
    One of the default :term:`meta path finders <meta path finder>` which searches an :term:`import path` for modules.
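The relationship between a path entry, a path entry hook, and a path entry finder can be made concrete with a toy example. This is only a sketch under stated assumptions: it uses the find_spec() protocol of later Python versions, and the "demo-entry:" prefix, the classes DemoLoader and DemoPathEntryFinder, and the module name demo_entry_module are all hypothetical.

```python
import importlib
import importlib.abc
import importlib.util
import sys

PREFIX = "demo-entry:"  # hypothetical scheme for a non-filesystem path entry


class DemoLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # default module creation

    def exec_module(self, module):
        module.found_via = module.__spec__.origin


class DemoPathEntryFinder:
    """Path entry finder: locates modules within one specific path entry."""

    def __init__(self, entry):
        self.entry = entry

    def find_spec(self, fullname, target=None):
        if fullname != "demo_entry_module":
            return None
        return importlib.util.spec_from_loader(
            fullname, DemoLoader(), origin=self.entry
        )


def demo_path_entry_hook(entry):
    """Path entry hook: return a finder, or raise ImportError to decline."""
    if not isinstance(entry, str) or not entry.startswith(PREFIX):
        raise ImportError("not a demo path entry")
    return DemoPathEntryFinder(entry)


entry = PREFIX + "example"
sys.path_hooks.insert(0, demo_path_entry_hook)
sys.path.append(entry)
sys.path_importer_cache.pop(entry, None)
try:
    # The path importer walks sys.path and consults the hooks for each entry.
    module = importlib.import_module("demo_entry_module")
finally:
    sys.path.remove(entry)
    sys.path_hooks.remove(demo_path_entry_hook)
    sys.path_importer_cache.pop(entry, None)
```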
[issue15295] Import machinery documentation
Antoine Pitrou added the comment: Shouldn't it be committed already? I don't see the point of refining documentation in a separate repo rather than in the main repo.

nosy: +pitrou
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 31, 2012, at 03:21 AM, Eric Snow wrote:

> 1. default path importer (a.k.a. PathFinder),

+1, although currently I am refraining from using "default" when describing this thing.

> 2. path hook (lives on sys.path_hooks),

I have called these "path entry hooks".

> 3. path entry handler (finder look-alike that a path hook returns),

I still call these "path entry finders". I understand the ambiguity, and despite supporting a slightly different protocol than meta path finders, they still serve the role of finding a loader for a module. So for now, I'm keeping "path entry finder", though I'll leave the door slightly open to persuasion. :)

> 4. module loader (business as usual).

I've pulled Loaders out into a separate higher level section because as you say, the loader API is the same for the things returned by both meta path finders and path entry finders.
[issue15295] Import machinery documentation
Roundup Robot added the comment:

New changeset c933ec7cafcf by Barry Warsaw in branch 'default':
Address substantially all of Eric Snow's comments in issue #15295, except for
http://hg.python.org/cpython/rev/c933ec7cafcf

New changeset d5317b8f455a by Barry Warsaw in branch 'default':
- Issue #15295: Reorganize and rewrite the documentation on the import system.
http://hg.python.org/cpython/rev/d5317b8f455a

nosy: +python-dev
[issue15295] Import machinery documentation
Changes by Barry A. Warsaw ba...@python.org:

resolution: -> fixed
status: open -> closed
[issue15295] Import machinery documentation
Brett Cannon added the comment: The import path definition is a little misleading as sys.path is only inferred when 'path' has None passed in. Otherwise 'path' is what __path__ in a package is set to, so technically sys.path never even comes into play except by choice from PathFinder as it just chooses to treat None to mean sys.path.
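Brett's point, that sys.path is only a default chosen by PathFinder when no path is passed, can be illustrated as follows. This sketch assumes a later Python in which importlib.machinery.PathFinder exposes find_spec(); json is used purely as an example package.

```python
import importlib.machinery
import json

PathFinder = importlib.machinery.PathFinder

# path is None, so PathFinder falls back to sys.path.
via_sys_path = PathFinder.find_spec("json")

# For a submodule, the import path is the parent package's __path__.
via_package_path = PathFinder.find_spec("json.decoder", json.__path__)

# An explicit but empty import path: sys.path is never consulted.
via_empty_path = PathFinder.find_spec("json", path=[])
```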
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 31, 2012, at 08:30 PM, Brett Cannon wrote:

> The import path definition is a little misleading as sys.path is only inferred when 'path' has None passed in. Otherwise 'path' is what __path__ in a package is set to, so technically sys.path never even comes into play except by choice from PathFinder as it just chooses to treat None to mean sys.path.

Do you think the glossary entry needs to be so precise? It may be difficult to explain all that in a concise definition. Maybe it's best to just remove the "During import...attribute" bit?
[issue15295] Import machinery documentation
Brett Cannon added the comment: I guess just saying it can be None depending on context would be enough.
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

On Jul 31, 2012, at 02:56 PM, Eric Snow wrote:

> Part of the problem with the import nomenclature is that PEP 302 doesn't really nail it down and mixes the terms up a bit. This is understandable considering it broke ground in some regards. However, at this point we have a more comfortable relationship with the import system. Would it be feasible to lightly update PEP 302 to have a more concrete and consistent use of import terminology?

Maybe not an update to PEP 302, but probably a big red warning that the terminology is out of date, with a reference to the import system documentation in the reference manual.

This also points out an interesting, more general problem with PEPs that get out of date, doesn't it?
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

Ah, the perils of email readers with quote folding and issue trackers without it. The important part of Brett's email is that PEP 420 has started splitting the meta path finder and path entry finder APIs, but importlib still uses a single ABC for both of them. That's probably a mistake, and something we want to address prior to the release of 3.3. I'll create a separate issue for that.

I just pushed a docs update to the PEP 420 repo that should address all of my comments. I went ahead with the "regular package" -> "initialized package" and "sys path finder" -> "path entry finder" name changes; they just make more sense given the way the components are used. I wanted to avoid "regular package" as I expect namespace packages to eventually become the norm and initialized packages the more unusual case. "sys path finder" was simply misleading, as those finders are used for *all* path entries, including those in package __path__ attributes.

I haven't reviewed Eric's comments in detail, so I don't know if I also picked up all of those.
[issue15295] Import machinery documentation
Nick Coghlan added the comment: #15502 records Brett's concern about the merged ABC
[issue15295] Import machinery documentation
Barry A. Warsaw added the comment:

Thanks for the review Eric. I'm slogging through these and many other comments, but I now have the docs integrated with trunk, and will probably land them for better or worse in the next day or so.

I'll respond just to a few of your comments. Whatever I omit you can consider them fixed!

On Jul 28, 2012, at 06:54 AM, Eric Snow wrote:

> * (glossary.rst) path importer: I like that you pointed out this specific metapath importer, but aren't path importers something else? [2] Perhaps the metapath importer doesn't need to be in the glossary. Then again I like the entry, though I'd change ":term:`finder` / :term:`loader`" to "metapath importer". Maybe just a different term would work, like "sys path subsystem". Regardless, it is certainly the big dog in the import machinery and deserves special attention.

I certainly struggled with this term. I almost picked "PathFinder" (or "path finder") since that's the name of the actual class used in the implementation, but then I thought this might be too implementation specific.

You ask in [2] whether "path importer" refers specifically to the callables on sys.path_hooks. Can you cite a reference for this? I found one reference in PEP 302 to "path importer" but it's hard to tell exactly what that is referring to. The sys module doesn't use that term.

If we agree that "path importer" is the name of the things on sys.path_hooks, then we need a name for the thing on sys.meta_path that implements the things on sys.path_hooks. :)

> * (glossary.rst) sys path finder: having "sys" is a nice touch, making it more distinct and more explicit.

TBH, I'm not crazy about the term "sys path finder" either but I couldn't think of anything better.

Keep the suggestions coming for both of these terms. I'll ruminate on it too and leave XXX's in the docs for now. (Maybe as I work through the rest of the comments, something better has already been suggested.)

> * (importlib.rst) I could have sworn that find_loader() and resolve_name() were public...

There's importlib.find_loader() and importlib.util.resolve_name(), but OTOH, this is not intended to be the importlib library documentation. So I'm happy to leave out such details and add a reference to those docs (done).

In a subsequent comment, Nick suggests this whole chapter be called the "Import System" instead of the "Import Machinery", but I've been thinking "Import Protocol" might be good too. The intent is really to describe the hooks, methods, and attributes used by Python to accomplish import, as well as allow Python code to extend or modify the import machinery. The background at the top of the chapter is really just there to set the stage (and because afaict, nothing like that existed before :). I'm still thinking about this.

> * (importlib.rst) SourceFileLoader.load_module(): What about when the name does not match? An ImportError gets raised?

Were you suggesting that some additional documentation should be added for this?

> * (import_machinery.rst) import machinery, end of first paragraph: Note that importlib.import_module() is the recommended method of calling the import machinery.

I rewrote the introductory paragraphs, and added a mention of import_module(), as well as a section on importlib. Hopefully this will provide enough information for people to figure things out. :)

> * (import_machinery.rst) how about a section devoted just to the attributes of modules and packages, perhaps expanding upon or supplanting the related entries in the data model reference page?

I've added an XXX for this. I think the right thing to do is to update the data model chapter, and add a link from here to there.

> * (import_machinery.rst) Meta path loaders, end of paragraph 2: The finder could also be a classmethod that returns an instance of the class.

I don't understand what you're suggesting here.

> * (import_machinery.rst) Meta path loaders: "If the load fails, the loader needs to remove any modules..." is a pretty exceptional case, since the module is not in charge of its parent or children, nor of import statements executed for it. Is this a new requirement?

I don't think so. I lifted it from somewhere (hard to remember exactly where now ;). PEP 302?

> * (import_machinery.rst) Meta path loaders: too bad there isn't something like __origin__ for the case where __file__ doesn't make semantic sense, but you still want to record where the module came from.

Yeah.

> * (import_machinery.rst) I'm surprised __name__ isn't required.

Indeed! AFAICT, it was only required by the module object's default repr, but I fixed that when I added module_repr(). :)

> * (import_machinery.rst) Meta path loaders: what should __package__ be set to for a top-level module?

Great question. I see no official recommendation in anything I've consulted, and I think CPython is all over the map. Some set it to None, some to the empty string. I added a footnote about this, and recommend the empty string,
[issue15295] Import machinery documentation
Éric Araujo added the comment: A small note in passing: “protocol” is used for things like the sequence protocol, the iterator protocol, or closer to home the finder and loader protocols, so it would sound weird or potentially confusing to me. Import system is how I’ve always thought about it (probably took that term from the docs).
[issue15295] Import machinery documentation
Brett Cannon added the comment:

To answer a couple of Barry's comments in reply to Eric...

__package__ should be set to the empty string if I'm reading PEP 366 correctly (and importlib isn't broken):

"When the import system encounters an explicit relative import in a module without __package__ set (or with it set to None), it will calculate and store the correct value (__name__.rpartition('.')[0] for normal modules and __name__ for package initialisation modules)."

If someone sets __package__ to None, then importlib fills it in as necessary.

As for the diagram(s), I have attached the overall PDF that I still have from my original OmniGraffle file (which I don't have a license to anymore) that I built my PyCon 2008 presentation with. It's probably outdated at this point. I will have to redo them for my PyCon Argentina/Brasil (maybe US?) import talks anyway.

Added file: http://bugs.python.org/file26606/__import__.pdf
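The quoted rule is easy to apply by hand. The sketch below implements it in a hypothetical helper, expected_package(), and compares the result with the parent attribute of each module's spec, assuming a later Python in which modules carry a __spec__; json, json.decoder and string are used only as examples.

```python
import importlib


def expected_package(name, is_package):
    """Apply the quoted rule: packages use __name__, modules use its parent."""
    return name if is_package else name.rpartition(".")[0]


modules = [
    importlib.import_module("json"),          # a package
    importlib.import_module("json.decoder"),  # a module inside a package
    importlib.import_module("string"),        # a top-level module
]

# Map each module name to (value from the rule, value recorded on the spec).
results = {
    module.__name__: (
        expected_package(module.__name__, hasattr(module, "__path__")),
        module.__spec__.parent,
    )
    for module in modules
}
```

For the top-level module the rule yields the empty string, which matches the recommendation discussed in the thread.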
[issue15295] Import machinery documentation
Eric Snow added the comment:

> I certainly struggled with this term. I almost picked "PathFinder" (or "path finder") since that's the name of the actual class used in the implementation, but then I thought this might be too implementation specific.

Considering that the goal is for importlib to be the common import machinery for the various Python implementations, this might not be inappropriate.

> You ask in [2] whether "path importer" refers specifically to the callables on sys.path_hooks. Can you cite a reference for this? I found one reference in PEP 302 to "path importer" but it's hard to tell exactly what that is referring to.

Unfortunately not. There aren't many people that use import hook terminology and I already have a terrible memory. :) Regardless, I find "path importer" a little too ambiguous.

>> * (glossary.rst) sys path finder: having "sys" is a nice touch, making it more distinct and more explicit.
>
> TBH, I'm not crazy about the term "sys path finder" either but I couldn't think of anything better.

What don't you like about the "sys path thingee" names? I find them to be nice and explicit. I'll mull this over some more.

> In a subsequent comment, Nick suggests this whole chapter be called the "Import System" instead of the "Import Machinery", but I've been thinking "Import Protocol" might be good too.

I agree with Nick.

> The background at the top of the chapter is really just there to set the stage (and because afaict, nothing like that existed before :).

And it does a good job of it.

>> * (importlib.rst) SourceFileLoader.load_module(): What about when the name does not match? An ImportError gets raised?
>
> Were you suggesting that some additional documentation should be added for this?

I guess I was just noting a possible hole in the Import System (sounds nice, doesn't it <wink>) specification. Since importlib is a complete reference implementation, it's not critical to have every detail spelled out (at least, that seems to be the status quo).

>> * (import_machinery.rst) how about a section devoted just to the attributes of modules and packages, perhaps expanding upon or supplanting the related entries in the data model reference page?
>
> I've added an XXX for this. I think the right thing to do is to update the data model chapter, and add a link from here to there.

Perfect.

>> * (import_machinery.rst) Meta path loaders, end of paragraph 2: The finder could also be a classmethod that returns an instance of the class.
>
> I don't understand what you're suggesting here.

Yeah, that was poorly worded. I'd meant to suggest that you could document the alternative to find_module() and load_module() being regular methods of the same object. For instance:

    class MyMetaHook:
        @classmethod
        def find_module(cls, name, path=None):
            return cls()
        def load_module(self, name):
            raise ImportError("You lose!")

Thus, the finder is the class, and the loader is the instance.

>> * (import_machinery.rst) Meta path loaders: "If the load fails, the loader needs to remove any modules..." is a pretty exceptional case, since the module is not in charge of its parent or children, nor of import statements executed for it. Is this a new requirement?
>
> I don't think so. I lifted it from somewhere (hard to remember exactly where now ;). PEP 302?

Nick made the point more clearly. :)

>> * (simple_stmts.rst) the from list can include submodules which must be imported separately, implying a step 1b
>
> I couldn't figure out what to say differently here.

No, you've got it covered.
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

As far as the path importer goes, it's important to keep in mind there are *four* different pieces in play:

1. The path importer itself

This is a meta path finder installed on sys.meta_path, which implements the find_module API. It scans the supplied search path (or sys.path) for path entries, using sys.path_importer_cache and sys.path_hooks to find the locate path entry finders. Path importer is an eminently appropriate name as it is responsible for *all* of the standard semantics of sys.path and package __path__ attribute processing. It could potentially be qualified with standard path importer or default path importer to distinguish it from other cases.

2. The path hooks

These are installed in sys.path_hooks, and are simply callables that accept a path entry and return an appropriate path entry handler or else raise ImportError. The specification is designed to make it easy to use the classes for path entry handlers directly as path hooks (since __init__ can throw ImportError, but it can't return None). For these, path hook is just fine as a name.

3. The path entry handlers

These are the objects returned by the path hooks. Historically, they implemented find_module() (without the second search path parameter), and now they can implement the find_loader() API instead. The reason I don't like sys path finder for these is that it misses their essential role in handling package __path__ attributes. I have previously suggested path entry finder, but that's a little ambiguous (since it suggests they're tools for *finding* path entries, rather than tools for finding module loaders *given* a path entry). Thus, my new suggestion here: path entry handler. They're objects that handle particular path entries on behalf of the path importer, so the name is perfectly appropriate, and better distinguishes them from the meta path finder objects.

4. The module loaders

As with any import, the module loaders implement the load_module() API to create, cache, initialise and return a loaded module object.
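The four pieces described above can be sketched in a few lines. This is only an illustration under stated assumptions: it targets the later find_spec()/exec_module() protocol rather than the find_module()/find_loader()/load_module() APIs named in the comment, and BUNDLES, DictEntryHandler, DictLoader, the "demo15295://bundle" path entry and the hello15295 module are all invented for the example.

```python
import importlib
import importlib.machinery
import sys

# Hypothetical in-memory storage: path entry string -> {module name: source}.
ENTRY = "demo15295://bundle"
BUNDLES = {ENTRY: {"hello15295": "GREETING = 'hi'\n"}}


class DictLoader:
    """Piece 4, the module loader: initialises a module from a source string."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(self.source, module.__dict__)


class DictEntryHandler:
    """Piece 3, the path entry handler; the class itself is piece 2, the path hook."""

    def __init__(self, entry):
        # A path hook must raise ImportError for entries it cannot handle.
        if entry not in BUNDLES:
            raise ImportError(f"not a demo bundle: {entry!r}")
        self.modules = BUNDLES[entry]

    def find_spec(self, fullname, target=None):
        source = self.modules.get(fullname)
        if source is None:
            return None
        return importlib.machinery.ModuleSpec(fullname, DictLoader(source))


# Piece 1, the path importer, is already on sys.meta_path; it consults
# sys.path_hooks for any path entry it has not seen before.
sys.path_hooks.append(DictEntryHandler)
sys.path.append(ENTRY)
try:
    module = importlib.import_module("hello15295")
finally:
    sys.path.remove(ENTRY)
    sys.path_hooks.remove(DictEntryHandler)
    sys.path_importer_cache.pop(ENTRY, None)

print(module.GREETING)
```

Using the handler class as the hook relies on exactly the property mentioned in item 2: __init__ cannot return None, but it can raise ImportError to decline a path entry.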
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

s/locate path entry finders/appropriate path entry handlers/
[issue15295] Import machinery documentation
Eric Snow added the comment:

Sounds good to me. As I understood them:

1. default path importer (a.k.a. PathFinder),
2. path hook (lives on sys.path_hooks),
3. path entry handler (finder look-alike that a path hook returns),
4. module loader (business as usual).

A path entry handler would stand in contrast to a meta path finder. These two would also map well to ABCs for issue15502.
[issue15295] Import machinery documentation
Eric Snow added the comment:

More on import-related terms. Given Nick's recommendation, here's a broader view, as related to the import state:

    sys.meta_path: meta path finder -> module loader
    sys.meta_path[-1] (initially): default path importer
    sys.path_hooks: path hook -> path entry handler
    sys.path_importer_cache: path entry handler -> module loader

One unfortunate name is sys.path_importer_cache, which implies either a cache of path importers or a cache belonging to path importer, both of which are still rather ambiguous.

In light of all the above, I've attached an updated patch just for the glossary. The import system reference then goes further into the protocols that the different objects implement and so forth.

keywords: +patch
Added file: http://bugs.python.org/file26611/issue15295_glossary_refactor.diff
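The import state listed above can be inspected from a running interpreter. The snippet below is a small sketch that only reads that state; the variable names are illustrative, and the exact contents of each list depend on the interpreter and on whatever tools have installed their own finders or hooks.

```python
import importlib.machinery
import sys

# The default path importer is the PathFinder class on sys.meta_path.
path_importer_installed = importlib.machinery.PathFinder in sys.meta_path
print("PathFinder on sys.meta_path:", path_importer_installed)

# Each path hook is a callable: path entry -> path entry handler,
# or ImportError if the hook cannot handle that entry.
hooks_are_callable = all(callable(hook) for hook in sys.path_hooks)
for hook in sys.path_hooks:
    print("path hook:", hook)

# The cache maps path entries to the handler a hook returned for them
# (or None when no hook accepted the entry).
for entry, handler in list(sys.path_importer_cache.items())[:5]:
    print(repr(entry), "->", type(handler).__name__)
```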
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

General comment: runpy, pkgutil, et al. should all get See Also links at the top pointing to the new import system section.

Import system intro:

As noted above, I suggest changing the name :)

Opening section should refer to importlib.import_module(). Any mentions of __import__ should point out that its API is designed for the convenience of the interpreter, and thus it's a pain to use directly. However, we should also document explicitly that, unlike the import statement and calling __import__ directly, importlib.import_module will ignore any module level replacements of __import__.

Replacing builtins.__import__ won't reliably override the entire import system (due to module level __import__ definitions, most notably importlib.__import__) and other tools that work with the process global import state directly (e.g. pkgutil, runpy).

5.1 Packages:

Don't tease readers, just tell them: the defining characteristic of a package is that it is a module object with a __path__ attribute.

Since we have the privilege of defining *the* standard terminology for old-style packages, I suggest we use the term initialised packages (since having an __init__.py is what makes them special). We should also note explicitly that an initialised package can also behave as a namespace package, by setting __path__ appropriately in __init__.py.

Also, I suggest adding a 5.1.3 Package Example subheading - currently you define an initialised package under the namespace package heading.

Finally, I think this section needs to explicitly define the terms *import path* and *path entry*. The meta path docs later refer to find_module() accepting a module name and path, and the reader could be forgiven for thinking that meant a filesystem path, when it is actually an import path (which is a sequence of path entries, which may themselves be filesystem paths).

5.2.2 Finders and loaders:

The term sys path finder is incorrect as registered path hooks are invoked for both sys.path entries *and* package __path__ entries. I suggest path entry finder. (I agree a longer name is needed to better distinguish them from metapath finders)

5.2.3 Import hooks:

While it does get cleared up in 5.2.4, this section could be clearer that the hooks *cannot* override the initial check of the module cache.

5.3.4 Metapath:

See above comment about clarifying that an import path is passed to find_module() rather than a filesystem path.

The description of the path importer is incorrect. It only knows how to scan an import path and interrogate the path hooks. It's the individual path entry finders that know how to do things like load modules from the filesystem or zip files.

5.2.5 Meta path loaders

I don't like the title here. There's no such thing as a meta path loader. There are only module loaders. Once they're created, it doesn't matter how you found them.

Clarify that the loader only has to remove the modules it inserted itself. Other modules that were loaded *successfully* as a side effect of the code execution will remain in the cache.

5.3 The Path Importer

As noted above, the path importer is *NOT* restricted to filesystem imports. All it cares about is arbitrary text sequences and path hooks. With the right path hook, you could use URLs or database connection strings as path entries.

5.5 References

I'd also point to PEP 328 (absolute imports by default and explicit relative import syntax) and PEP 338 (using the import system to find __main__).
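Two of the points in this comment are easy to check interactively: a package is simply a module with a __path__ attribute, and the path handed to a finder is an import path (a sequence of path entries), not a single filesystem path. The sketch below uses the standard library's json package as a convenient specimen and calls PathFinder.find_spec(), the later counterpart of the find_module() API named above; is_package is a helper invented for the example.

```python
import importlib
import importlib.machinery
import json
import json.decoder


def is_package(module):
    """A module is a package exactly when it has a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(json), is_package(json.decoder))

# importlib.import_module() is the convenient programmatic entry point;
# it returns the same module object the import statement would bind.
same_object = importlib.import_module("json.decoder") is json.decoder

# The second argument is an import path: the package's __path__, i.e. a
# sequence of path entries to search for the submodule.
spec = importlib.machinery.PathFinder.find_spec("json.decoder", json.__path__)
print(spec.name if spec is not None else None)
```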
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

Great start here Barry, I'll switch my checkout over to read/write access and start contributing fixes.
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

Pushed the import machinery -> import system change (which hopefully won't break Barry's world).

Also merged in a more recent version of trunk. This probably screwed up the default branch in this clone, but the clone should be done after these docs updates.
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

Updated the statement docs to accurately describe the from X import Y case.

I also noted that unlike the statement form, importlib.import_module ignores module level __import__ overrides.
[issue15295] Import machinery documentation
Brett Cannon added the comment:

On Jul 29, 2012 2:09 AM, Nick Coghlan rep...@bugs.python.org wrote:

[...]

5.3 The Path Importer

As noted above, the path importer is *NOT* restricted to filesystem imports. All it cares about is arbitrary text sequences and path hooks. With the right path hook, you could use URLs or database connection strings as path entries.

Just so this doesn't get lost and in case it is important enough to block on: it might be worth having separate ABCs for meta path finders and the finders pathfinder uses since they have different APIs now (if one ignores find_module for pathfinder finders). Just need to come up with names.

-brett
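For readers on a current Python: the standard library's importlib.abc module does provide two separate ABCs along these lines, MetaPathFinder and PathEntryFinder. The check below is a small sketch, not part of the thread, showing which of the built-in finder classes are registered against each; the two variable names are made up.

```python
import importlib.abc
import importlib.machinery

# PathFinder sits on sys.meta_path, so it counts as a meta path finder.
path_importer_is_meta = issubclass(
    importlib.machinery.PathFinder, importlib.abc.MetaPathFinder
)

# FileFinder is the kind of object the default path hook returns for a
# directory path entry, so it counts as a path entry finder.
file_finder_is_entry = issubclass(
    importlib.machinery.FileFinder, importlib.abc.PathEntryFinder
)

print(path_importer_is_meta, file_finder_is_entry)
```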
[issue15295] Import machinery documentation
Eric Snow ericsnowcurren...@gmail.com added the comment:

Awesome addition, Barry! Bless you for slogging through this. Here are some thoughts (prepare one grain of salt for each):

* (glossary.rst) finder: should also reference PEP 420.
* (glossary.rst) module: s/contain/containing/
* (glossary.rst) path importer: I like that you pointed out this specific metapath importer, but aren't path importers something else? [2] Perhaps the metapath importer doesn't need to be in the glossary. Then again I like the entry, though I'd change :term:`finder` / :term:`loader` to metapath importer. Maybe just a different term would work, like sys path subsystem. Regardless, it is certainly the big dog in the import machinery and deserves special attention.
* (glossary.rst) sys path finder: having sys is a nice touch, making it more distinct and more explicit.
* (importlib.rst) I could have sworn that find_loader() and resolve_name() were public...
* (importlib.rst) module_repr() is nice.
* (importlib.rst) SourceFileLoader.load_module(): What about when the name does not match?
* (import_machinery.rst) import machinery: really nice intro!
* (import_machinery.rst) import machinery, end of first paragraph: Note that importlib.import_module() is the recommended method of calling the import machinery.
* (import_machinery.rst) import machinery, third paragraph: though there are the side effects of the module getting added to sys.modules, and of parent modules getting imported (if not bound).
* (import_machinery.rst) package, second paragraph: "generally" implies further explanation which doesn't materialize. Perhaps s/modules generally do not contain other modules or packages/modules do not naturally contain other modules or packages/ or something like that?
* (import_machinery.rst) I like that you make it clear that even packages are not strictly a FS-based construct.
* (import_machinery.rst) how about a section devoted just to the attributes of modules and packages, perhaps expanding upon or supplanting the related entries in the data model reference page?
* (import_machinery.rst) Namespace packages: while "provided by a separate vendor installed container" does convey the broad possibilities, it's nearly equivalent to separate sys.path entries in practice (and in the example). Regardless, "separate vendor installed container" could be clarified.
* (import_machinery.rst) Searching, paragraph 1: don't forget importlib.import_module()! :)
* (import_machinery.rst) The module cache: A gotcha snuck in under the old machinery that may or may not be worth noting. [3]
* (import_machinery.rst) nice point about messing around with sys.modules.
* (import_machinery.rst) I like the sound of import protocol.
* (import_machinery.rst) Meta path loaders, end of paragraph 2: The finder could also be a classmethod that returns an instance of the class.
* (import_machinery.rst) Meta path loaders: reload() is no longer a builtin function.
* (import_machinery.rst) Meta path loaders: "If the load fails, the loader needs to remove any modules..." is a pretty exceptional case, since the module is not in charge of its parent or children, nor of import statements executed for it. Is this a new requirement?
* (import_machinery.rst) Meta path loaders: too bad there isn't something like __origin__ for the case where __file__ doesn't make semantic sense, but you still want to record where the module came from.
* (import_machinery.rst) I'm surprised __name__ isn't required.
* (import_machinery.rst) __loader__ is finally getting the respect it deserves (after nearly 10 long years)!
* (import_machinery.rst) Meta path loaders: what should __package__ be set to for a top-level module?
* (import_machinery.rst) Meta path loaders: s/it should execute the module's code/the loader should execute the module's code/.
* (import_machinery.rst) Module reprs: perhaps s/``loader.module_repr(module)``/``module.__loader__.module_repr(module)``/
* (import_machinery.rst) Module reprs: how does module.__qualname__ fit in?
* (import_machinery.rst) module.__path__: s/are consulted/is consulted/ ?
* (import_machinery.rst) The Path Importer: as noted above, this seems like a new usage of path importer, a term which carries other meaning already. It's an important and distinct thing though, worthy of its own name.
* (import_machinery.rst) sys path finders, third paragraph: maybe put a reference to the site module?
* (import_machinery.rst) sys path finders, last paragraph: s/it is used to load/that's what the import machinery uses to load/.
* (import_machinery.rst) NullImporter (issue15473)? I thought Brett had a plan for taking it to the gallows...
* (import_machinery.rst) Diagrams? Brett again. :) He put together some nice ones a few years back.
* (simple_stmts.rst) a wonderful improvement!
* (simple_stmts.rst) the from list can include submodules which must be imported separately, implying a step 1b
[issue15295] Import machinery documentation
Changes by Nick Coghlan ncogh...@gmail.com:

hgrepos: +142
[issue15295] Import machinery documentation
Nick Coghlan added the comment:

I would title the new section Import system rather than Import machinery, as it is meant to be a specification document rather than an implementation description.

Import statement:

The statement that from X import A, B, C only performs a single import lookup is incorrect. The trick is that if A, B or C refers to a submodule of X then it will be imported. I'll use a couple of examples from the logging package to make this clear:

    # Attribute access will fail for submodules that haven't been imported yet
    >>> import logging
    >>> logging.DEBUG
    10
    >>> logging.handlers
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'handlers'

    # Direct imports will fail for attributes that are not submodules
    >>> import logging.DEBUG
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named 'logging.DEBUG'
    >>> import logging.handlers

    # From imports check for an existing attribute first, but check for a
    # submodule if the attribute is missing
    >>> import sys
    >>> del sys.modules["logging"]
    >>> del sys.modules["logging.handlers"]
    >>> from logging import DEBUG
    >>> from logging import handlers

Aside from this flaw, the new content in the import statement looks good. More on the import system section in a subsequent comment.
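The same three behaviours can be reproduced without touching the real logging package. The script below is a sketch that builds a throwaway package in a temporary directory; the package name demo15295_pkg and its contents are invented for the example, and importlib.import_module() stands in for the plain import statement where the name has to be computed.

```python
import importlib
import pathlib
import sys
import tempfile

PKG = "demo15295_pkg"  # made-up package name for this demonstration

with tempfile.TemporaryDirectory() as root:
    pkg_dir = pathlib.Path(root, PKG)
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("DEBUG = 10\n")
    (pkg_dir / "handlers.py").write_text("NAME = 'handlers'\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module(PKG)

        # 1. A submodule is not an attribute until something imports it.
        attr_before = hasattr(pkg, "handlers")

        # 2. A direct import of a non-module attribute fails.
        try:
            importlib.import_module(f"{PKG}.DEBUG")
            direct_import_failed = False
        except ImportError:
            direct_import_failed = True

        # 3. A from-import checks for the attribute first and falls back
        #    to importing a submodule of that name.
        namespace = {}
        exec(f"from {PKG} import DEBUG, handlers", namespace)
    finally:
        sys.path.remove(root)

print(attr_before, direct_import_failed, namespace["DEBUG"])
```

A single from-import statement here resolves one plain attribute (DEBUG) and one submodule (handlers), which is the case the comment says the documentation must describe.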
[issue15295] Import machinery documentation
Changes by Barry A. Warsaw ba...@python.org:

priority: deferred blocker -> release blocker
stage: needs patch -> patch review
title: Document PEP 420 namespace packages -> Import machinery documentation