On 12/27/2012 07:05 AM, Omer Korat wrote:
> You're probably right in general, for me the 3.3 and 2.7 pickles definitely
> don't work the same:
>
> 3.3:
> >>> type(pickle.dumps(1))
> <type 'bytes'>
>
> 2.7:
> >>> type(pickle.dumps(1, pickle.HIGHEST_PROTOCOL))
> <type 'str'>

That is the same. In 2.7, str is a sequence of bytes, while in 3.3, str is unicode text. So 'bytes' is the 3.3 equivalent of the 2.7 str.

> As you can see, in 2.7 when I try to dump something, I get useless string.
> Look what I gen when I dump an NLTK object such as the sent_tokenize function:
>
> '\x80\x02cnltk.tokenize\nsent_tokenize\ng\x00'
>
> Now, this is useless. If I try to load it on a platform without NLTK
> installed on it, I get:
>
> ImportError: No module named 'nltk'
>
> So it means the actual sent_tokenizer wasn't saved. Just it's module.

As Peter Otten has already pointed out, that's how pickle works. It does not encode the whole module into the pickle, only enough information to recreate the particular objects you're saving, *using* the same modules. For a function like sent_tokenize, that information is just the module name and the function name, which is exactly what you see in the string above. I don't know of any way to avoid needing nltk on the destination machine, regardless of Python version.

Perhaps you'd rather see it in the Python docs.

http://docs.python.org/2/library/pickle.html
http://docs.python.org/3.3/library/pickle.html

"pickle can save and restore class instances transparently, however the class definition must be importable and live in the same module as when the object was stored."

and

"Similarly, when class instances are pickled, their class’s code and data are not pickled along with them. Only the instance data are pickled. This is done on purpose, so you can fix bugs in a class or add methods to the class and still load objects that were created with an earlier version of the class."

--
DaveA
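
P.S. If it helps, here is a minimal sketch of the behaviour described above. It uses textwrap.dedent from the standard library as a stand-in for sent_tokenize so it runs without nltk. The module name no_such_module_xyz and the name some_func are made up for illustration, and the hand-written pickle only imitates the layout of the string quoted above. Treat it as a sketch under those assumptions, not as a statement of what nltk itself does.

```python
import pickle
import textwrap

# dumps() returns a byte string on both versions: on 3.x that type is called
# bytes, and on 2.7 the name bytes is simply an alias for str.
payload = pickle.dumps(1)
is_bytes = type(payload) is bytes

# Pickling a module-level function stores a reference (module name plus
# function name), not the function's code.
func_pickle = pickle.dumps(textwrap.dedent, protocol=2)
has_module_name = b"textwrap" in func_pickle
has_func_name = b"dedent" in func_pickle

# Loading resolves that reference by importing the module again, so the
# result is the very same function object.
restored = pickle.loads(func_pickle)

# A hand-written protocol 2 pickle that refers to a hypothetical module which
# is not installed. Loading it fails at import time, like the nltk case.
missing = b"\x80\x02cno_such_module_xyz\nsome_func\nq\x00."
try:
    pickle.loads(missing)
    load_error = None
except ImportError as exc:
    load_error = exc

print("dumps() returned bytes:", is_bytes)
print("pickle contains module and function name:", has_module_name and has_func_name)
print("restored is the original function:", restored is textwrap.dedent)
print("loading without the module raised:", type(load_error).__name__)
```

The last case is the situation on a machine without nltk: the pickle itself is fine, but the import it asks for cannot be satisfied.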