On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky <eli...@gmail.com> wrote:
> > One of the contended issues with PEP 435 on which Guido pronounced was the
> > functional API, that allows creating enumerations dynamically in a manner
> > similar to namedtuple:
> >
> >   Color = Enum('Color', 'red blue green')
> >
> > The biggest complaint reported against this API is interaction with pickle.
> > As promised, I want to discuss here how we're going to address this concern.
> >
> > At this point, the pickle docs say that module-top-level classes can be
> > pickled. This obviously works for the normal Enum classes, but is a problem
> > with the functional API because the class is created dynamically and has no
> > __module__.
> >
> > To solve this, the reference implementation uses the same approach as
> > namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real
> > code has some safeguards):
> >
> >   module_name = sys._getframe(1).f_globals['__name__']
> >   enum_class.__module__ = module_name
> >
> > According to an earlier discussion, this works on CPython, PyPy and
> > Jython, but not on IronPython. The alternative that works everywhere is to
> > define the Enum like this:
> >
> >   Color = Enum('the_module.Color', 'red blue green')
> >
> > The reference implementation supports this as well.
> >
> > Some points for discussion:
> >
> > 1) We can say that using the functional API when pickling can happen is not
> > recommended, but maybe a better way would be to just explain the way things
> > are and let users decide?
>
> It's probably worth creating a section in the pickle docs and
> explaining the vagaries of naming things and the dependency on knowing
> the module name. The issue comes up with defining classes in __main__
> and when implementing pseudo-modules as well (see PEP 395).

Any pickle-expert volunteers to do this? I guess we can start by creating a
documentation issue.
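For illustration, here is a minimal sketch of the frame-based approach from
the excerpt above. It is not the reference implementation: make_class and the
throwaway module 'the_module' are hypothetical stand-ins for the functional
API and for a user's module, and the sketch assumes an interpreter that
provides sys._getframe (CPython does).

```python
import pickle
import sys
import types


def make_class(name):
    """Hypothetical stand-in for a functional API such as Enum(...)."""
    cls = type(name, (object,), {})
    # Same idea as the quoted excerpt: take the caller's module name.
    # .get() is a small safeguard in case the caller has no __name__.
    caller_globals = sys._getframe(1).f_globals
    cls.__module__ = caller_globals.get('__name__', '__main__')
    return cls


# Simulate a user's module so the sketch is self-contained (assumption:
# a real user would simply call make_class at the top level of a module).
the_module = types.ModuleType('the_module')
the_module.make_class = make_class
sys.modules['the_module'] = the_module
exec("Color = make_class('Color')", the_module.__dict__)

Color = the_module.Color
print(Color.__module__)

# pickle stores only the module name and the class name, then looks the
# class up by name again when loading.
restored = pickle.loads(pickle.dumps(Color))
print(restored is Color)
```

The round trip only succeeds because the class ends up bound as Color in the
very module whose name was recorded in __module__.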
> > 2) namedtuple should also support the fully qualified name syntax. If this
> > is agreed upon, I can create an issue.
>
> Yes, I think that part should be done.

OK, I'll create an issue.

> > 3) Antoine mentioned that work is being done in 3.4 to enable pickling of
> > nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets
> > implemented, I don't see a reason why Enum and namedtuple can't be adjusted
> > to find the __qualname__ of the class they're internal to. Am I missing
> > something?
>
> The class based form should still work (assuming only classes are
> involved), the stack inspection will likely fail.

It can probably be made to work with a bit more effort than the current
"hack", but I don't see why it wouldn't be doable.

> > 4) Using _getframe(N) here seems like an overkill to me.
>
> It's not just overkill, it's fragile - it only works if you call the
> constructor directly. If you use a convenience function in a utility
> module, it will try to load your pickles from there rather than
> wherever you bound the name.

In theory you can climb the frame stack until the desired place, but this is
specifically what my proposal of adding a function tries to avoid.

> > What we really need
> > is just the module in which the current execution currently is (i.e. the
> > metaclass's __new__ in our case). Would it make sense to add a new function
> > somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides
> > the current module name? It seems that all Pythons should be able to easily
> > provide it, it's certainly a very small subset of the functionality provided
> > by walking the callframe stack. This function can then be used to build
> > fully qualified names for pickling of Enum and namedtuple. Moreover, it can
> > be useful even more widely - dynamic class building is quite common in
> > Python code, and as Nick mentioned somewhere earlier, the extra power of
> > metaclasses in the recent 3.x's will probably make it even more common.
>
> Yes, I've been thinking along these lines myself, although in a
> slightly more expanded form that also touches on the issues that
> stalled PEP 406 (the import engine API that tries to better
> encapsulate the import state). It may also potentially address some
> issues with initialisation of C extensions (I don't remember the exact
> details off the top of my head, but there's some info we want to get
> from the import machinery to modules initialised from Cython, but the
> loader API and the C module initialisation API both get in the way).
>
> Specifically, what I'm talking about is some kind of implicit context
> similar to the approach the decimal module uses to control operations
> on Decimal instances. In this case, what we're trying to track is the
> "active module", either __main__ (if the code has been triggered
> directly through an operation in that module), or else the module
> currently being imported (if the import machinery has been invoked).
>
> The bare minimum would just be to store the __name__ (using
> sys.modules to get access to the full module if needed) in a way that
> adequately handles nested, circular and threaded imports, but there
> may be a case for tracking a richer ModuleContext object instead.
>
> However, there's also a separate question of whether implicitly
> tracking the active module is really what we want. Do we want that, or
> is what we actually want the ability to define an arbitrary "naming
> context" in order to use functional APIs to construct classes without
> losing the pickle integration of class statements?
>
> What if there was a variant of the class statement that bound the
> result of a function call rather than using the normal syntax:
>
>     class Animal from enum.Enum(members="dog cat bear")
>
> And it was only class statements in that form which manipulated the
> naming context? (you could also use the def keyword rather than class)
>
> Either form would essentially be an ordinary assignment statement,
> *except* that they would manipulate the naming context to record the
> name being bound *and* relevant details of the active module.
>
> Regardless, I think the question is not really well enough defined to
> be a topic for python-dev, even though it came up in a python-dev
> discussion - it's more python-ideas territory.

Wait... I agree that having a special syntax for this is a novel idea that's
not well defined and can be discussed on python-ideas. But the utility
function I was mentioning is a pretty simple idea, and it's well defined. It
can be very useful in contexts where code is created dynamically, by reducing
the amount of explicit-frame-walking hacks.

Eli
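To make the fragility raised under point 4 concrete, here is a rough sketch.
It is only an illustration under stated assumptions, not code from the
reference implementation: make_class, new_module, build and the modules
demo_utils and demo_app are all hypothetical names, and the two modules are
simulated in-process so the example is self-contained.

```python
import pickle
import sys
import types


def make_class(name):
    """Hypothetical functional API; accepts 'Name' or 'module.Name'."""
    module, _, short_name = name.rpartition('.')
    cls = type(short_name, (object,), {})
    if not module:
        # No explicit module given: fall back to the direct caller's module.
        module = sys._getframe(1).f_globals.get('__name__', '__main__')
    cls.__module__ = module
    return cls


def new_module(name, source, **names):
    """Create and register a throwaway module from source text."""
    mod = types.ModuleType(name)
    mod.__dict__.update(names)
    sys.modules[name] = mod
    exec(source, mod.__dict__)
    return mod


# A utility module with a convenience wrapper around the constructor.
utils = new_module(
    'demo_utils',
    "def build(name):\n    return make_class(name)\n",
    make_class=make_class,
)

# The application binds the names, but goes through the wrapper.
app = new_module(
    'demo_app',
    "Color = build('Color')\nShape = build('demo_app.Shape')\n",
    build=utils.build,
)

# Frame inspection saw demo_utils (the wrapper), not demo_app.
print(app.Color.__module__)

try:
    pickle.dumps(app.Color)
    color_pickles = True
except (pickle.PicklingError, AttributeError):
    # demo_utils has no attribute named Color, so the lookup fails.
    color_pickles = False
print(color_pickles)

# The fully qualified form does not depend on who called the constructor.
print(pickle.loads(pickle.dumps(app.Shape)) is app.Shape)
```

A function that reports the current module name would have to answer
"demo_app" here for the first form to work through the wrapper; the fully
qualified name sidesteps the question entirely.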
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com