Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-17 Thread Brett Cannon
Anyone other than Eric have something to say on this proposal? Obviously
the discussion went tangential before I saw a clear consensus that what I
was proposing was fine with people.

On Sat, Apr 14, 2012 at 16:56, Brett Cannon br...@python.org wrote:

 An open issue in PEP 302 is whether to require __loader__ attributes on
 modules. The claimed worry is memory consumption, but considering importlib
 and zipimport are already doing this that seems like a red herring.
 Requiring it, though, opens the door to people relying on its existence and
 thus starting to do things like loading assets with
 ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
 not care how modules are stored (e.g. zip file, sqlite database, etc.).

 What I would like to do is update the PEP to state that loaders are
 expected to set __loader__. Now importlib will get updated to do that
 implicitly so external code can expect it post-import, but requiring
 loaders to set it would mean that code executed during import can rely on
 it as well.

 As for __package__, PEP 366 states that modules should set it but it isn't
 referenced by PEP 302. What I want to do is add a reference and make it
 required like __loader__. Importlib already sets it implicitly post-import,
 but once again it would be nice to do this pre-import.

 To help facilitate both new requirements, I would update the
 importlib.util.module_for_loader decorator to set both on a module that
 doesn't have them before passing the module down to the decorated method.
 That way people already using the decorator don't have to worry about
 anything and it is one less detail to have to worry about. I would also
 update the docs on importlib.util.set_package and importlib.util.set_loader
 to suggest people use importlib.util.module_for_loader and only use the
 other two decorators for backwards-compatibility.
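
A minimal sketch of the decorator behaviour described above, written without importlib so that it runs on its own. The names set_import_attrs, DictLoader and exec_into are hypothetical, and the __package__ rule shown assumes a plain module rather than a package.

```python
import functools
import types


def set_import_attrs(method):
    """Hypothetical decorator: fill in __loader__ and __package__ if unset."""
    @functools.wraps(method)
    def wrapper(self, module):
        if getattr(module, "__loader__", None) is None:
            module.__loader__ = self
        if getattr(module, "__package__", None) is None:
            # Assumption: a plain module, so the parent package is
            # everything before the last dot in the module name.
            module.__package__ = module.__name__.rpartition(".")[0]
        return method(self, module)
    return wrapper


class DictLoader:
    """Hypothetical loader that keeps module source in a dict."""

    def __init__(self, sources):
        self.sources = sources

    @set_import_attrs
    def exec_into(self, module):
        # The module body runs after both attributes are in place.
        exec(self.sources[module.__name__], module.__dict__)
        return module


loader = DictLoader({"pkg.mod": "seen = (__loader__ is not None, __package__)"})
mod = loader.exec_into(types.ModuleType("pkg.mod"))
print(mod.seen)
```

The point is only that the module body itself can already see both attributes, which is the "pre-import" guarantee the proposal asks for.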

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-17 Thread Andrew Svetlov
+1 for initial proposition.

-- 
Thanks,
Andrew Svetlov


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-17 Thread Nick Coghlan
+1 here. Previously, it wasn't a reasonable requirement, since CPython
itself didn't comply with it.

--
Sent from my phone, thus the relative brevity :)


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-15 Thread Nick Coghlan
On Sun, Apr 15, 2012 at 12:59 PM, Guido van Rossum gu...@python.org wrote:
 Hm... Can you give an example of a library that needs a real file?
 That sounds like a poorly designed API.

If you're invoking a separate utility (e.g. via its command line
interface), you may need a real filesystem path that you can pass
along.

 The get_file() feature has a neat benefit. Since it transparently
 extracts files from the loader, users can ship binary extensions and
 shared libraries (dlls) in a ZIP file and use them without too much hassle.

 Yeah, DLLs are about the only example I can think of where even a
 virtual filesystem doesn't help...

An important example, though. However, I still don't believe it is
something we should necessarily be rushing into implementing in the
standard library in the *same* release that finally completes the
conversion started so long ago with PEP 302.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-15 Thread Nick Coghlan
On Sun, Apr 15, 2012 at 8:32 AM, Guido van Rossum gu...@python.org wrote:
 Funny, I was just thinking about having a simple standard API that
 will let you open files (and list directories) relative to a given
 module or package regardless of how the thing is loaded. If we
 guarantee that there's always a __loader__ that's a first step, though
 I think we may need to do a little more to get people who currently do
 things like open(os.path.join(os.path.dirname(__file__),
 'some_file_name')) to switch. I was thinking of having a stdlib
 function that you give a module/package object, a relative filename,
 and optionally a mode ('b' or 't') and returns a stream -- and sibling
 functions that return a string or bytes object (depending on what API
 the user is using either the stream or the data can be more useful).
 What would we call those functions and where would they live?

We already offer pkgutil.get_data() for the latter API:
http://docs.python.org/library/pkgutil#pkgutil.get_data
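
As a rough illustration of why this matters, the sketch below puts a package inside a zip file and reads a data file from it with pkgutil.get_data(), without the caller knowing how the package is stored. The package name demo_pkg and the file names are made up for the example.

```python
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zip archive containing a package plus a data file.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", "")
    zf.writestr("demo_pkg/greeting.txt", "hello from the zip")

# Make the archive importable; zipimport handles it from here.
sys.path.insert(0, archive)

# get_data() goes through the package's loader, so the same call would
# work if demo_pkg lived in an ordinary directory instead.
data = pkgutil.get_data("demo_pkg", "greeting.txt")
print(data)
```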

There's no get_file() or get_filename() equivalent, since there's no
relevant API formally defined for PEP 302 loader objects (the closest
we have is get_filename(), which is only defined for the actual module
objects, not for arbitrary colocated files).

Now that importlib is the official import implementation, and is fully
PEP 302 compliant, large sections of pkgutil should either be
deprecated (the import emulation) or updated to be thin wrappers
around importlib (the package walking components and other utility
functions).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-15 Thread Glyph

On Apr 14, 2012, at 3:32 PM, Guido van Rossum wrote:

 Funny, I was just thinking about having a simple standard API that
 will let you open files (and list directories) relative to a given
 module or package regardless of how the thing is loaded.


Twisted has such a thing, mostly written by me, called twisted.python.modules.

Sorry if I'm repeating myself here, I know I've brought it up on this list 
before, but it seems germane to this thread.  I'd be interested in getting 
feedback from the import-wizards participating in this thread in case it is 
doing anything bad (in particular I'd like to make sure it will keep working in 
future versions of Python), but I think it may provide quite a good template 
for a standard API.

The code's here: 
http://twistedmatrix.com/trac/browser/trunk/twisted/python/modules.py

The API is fairly simple.

>>> from twisted.python.modules import getModule
>>> e = getModule("email") # get an abstract module object (un-loaded)
>>> e
PythonModule<'email'>
>>> walker = e.walkModules() # walk the module hierarchy
>>> walker.next()
PythonModule<'email'>
>>> walker.next()
PythonModule<'email._parseaddr'>
>>> walker.next() # et cetera
PythonModule<'email.base64mime'>
>>> charset = e["charset"] # get the 'charset' child module of the 'e' package
>>> charset.filePath
FilePath('.../lib/python2.7/email/charset.py')
>>> charset.filePath.parent().children() # list the directory containing charset.py

Worth pointing out is that although in this example it's a FilePath, it could 
also be a ZipPath if you imported stuff from a zipfile.  We have an adapter 
that inspects path_importer_cache and produces appropriately-shaped 
filesystem-like objects depending on where your module was imported from.  
Thank you to authors of PEP 302; that was my religion while writing this code.

You can also, of course, ask to load something once you've identified it with 
the traversal API:

>>> charset.load()
<module 'email.charset' from '.../lib/python2.7/email/charset.pyc'>

You can also ask questions like this, which are very useful when debugging 
setup problems:

>>> ifaces = getModule("twisted.internet.interfaces")
>>> ifaces.pathEntry
PathEntry<FilePath('/Domicile/glyph/Projects/Twisted/trunk')>
>>> list(ifaces.pathEntry.iterModules())
[PythonModule<'setup'>, PythonModule<'twisted'>]

This asks what sys.path entry is responsible for twisted.internet.interfaces, and 
then what other modules could be loaded from there.  Just 'setup' and 'twisted' 
indicates that this is a development install (not surprising for one of my 
computers), since site-packages would be much more crowded.

The idiom for saying "there's a file installed near this module, and I'd like 
to grab it as a string" is pretty straightforward:

from twisted.python.modules import getModule
mod = getModule(__name__).filePath.sibling("my-file").open().read()

And hopefully it's obvious from this idiom how one might get the pathname, or a 
stream rather than the bytes.
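
For comparison, a very rough standard-library counterpart of the module walk above can be sketched with pkgutil.iter_modules(). This is not the Twisted API and it only lists names; it assumes the email package is importable in the usual way.

```python
import email
import pkgutil

# List the direct children of the 'email' package without importing them.
# iter_modules() yields (module_finder, name, ispkg) tuples.
names = [
    info.name
    for info in pkgutil.iter_modules(email.__path__, prefix="email.")
]

for name in sorted(names):
    print(name)
```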

-glyph


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-15 Thread Barry Warsaw
On Apr 15, 2012, at 02:12 PM, Glyph wrote:

Twisted has such a thing, mostly written by me, called
twisted.python.modules.

Sorry if I'm repeating myself here, I know I've brought it up on this list
before, but it seems germane to this thread.  I'd be interested in getting
feedback from the import-wizards participating in this thread in case it is
doing anything bad (in particular I'd like to make sure it will keep working
in future versions of Python), but I think it may provide quite a good
template for a standard API.

The code's here: 
http://twistedmatrix.com/trac/browser/trunk/twisted/python/modules.py

The API is fairly simple.

 >>> from twisted.python.modules import getModule
 >>> e = getModule("email") # get an abstract module object (un-loaded)

Got a PEP 8 friendly version? :)

-Barry


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-15 Thread Glyph

On Apr 15, 2012, at 6:38 PM, Barry Warsaw wrote:

 On Apr 15, 2012, at 02:12 PM, Glyph wrote:
 
 Twisted has such a thing, mostly written by me, called
 twisted.python.modules.
 
 Sorry if I'm repeating myself here, I know I've brought it up on this list
 before, but it seems germane to this thread.  I'd be interested in getting
 feedback from the import-wizards participating in this thread in case it is
 doing anything bad (in particular I'd like to make sure it will keep working
 in future versions of Python), but I think it may provide quite a good
 template for a standard API.
 
 The code's here: 
 http://twistedmatrix.com/trac/browser/trunk/twisted/python/modules.py
 
 The API is fairly simple.
 
 >>> from twisted.python.modules import getModule
 >>> e = getModule("email") # get an abstract module object (un-loaded)
 
 Got a PEP 8 friendly version? :)

No, but I'd be happy to do the translation manually if people actually prefer 
the shape of this API!

I am just pointing it out as a source of inspiration for whatever comes next, 
which I assume will be based on pkg_resources.

-glyph


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Eric Snow
On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon br...@python.org wrote:
 An open issue in PEP 302 is whether to require __loader__ attributes on
 modules. The claimed worry is memory consumption, but considering importlib
 and zipimport are already doing this that seems like a red herring.
 Requiring it, though, opens the door to people relying on its existence and
 thus starting to do things like loading assets with
 ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
 not care how modules are stored (e.g. zip file, sqlite database, etc.).

 What I would like to do is update the PEP to state that loaders are expected
 to set __loader__. Now importlib will get updated to do that implicitly so
 external code can expect it post-import, but requiring loaders to set it
 would mean that code executed during import can rely on it as well.

 As for __package__, PEP 366 states that modules should set it but it isn't
 referenced by PEP 302. What I want to do is add a reference and make it
 required like __loader__. Importlib already sets it implicitly post-import,
 but once again it would be nice to do this pre-import.

 To help facilitate both new requirements, I would update the
 importlib.util.module_for_loader decorator to set both on a module that
 doesn't have them before passing the module down to the decorated method.
 That way people already using the decorator don't have to worry about
 anything and it is one less detail to have to worry about. I would also
 update the docs on importlib.util.set_package and importlib.util.set_loader
 to suggest people use importlib.util.module_for_loader and only use the
 other two decorators for backwards-compatibility.

+1

-eric


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Guido van Rossum
On Sat, Apr 14, 2012 at 2:15 PM, Eric Snow ericsnowcurren...@gmail.com wrote:
 On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon br...@python.org wrote:
 An open issue in PEP 302 is whether to require __loader__ attributes on
 modules. The claimed worry is memory consumption, but considering importlib
 and zipimport are already doing this that seems like a red herring.
 Requiring it, though, opens the door to people relying on its existence and
 thus starting to do things like loading assets with
 ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
 not care how modules are stored (e.g. zip file, sqlite database, etc.).

 What I would like to do is update the PEP to state that loaders are expected
 to set __loader__. Now importlib will get updated to do that implicitly so
 external code can expect it post-import, but requiring loaders to set it
 would mean that code executed during import can rely on it as well.

 As for __package__, PEP 366 states that modules should set it but it isn't
 referenced by PEP 302. What I want to do is add a reference and make it
 required like __loader__. Importlib already sets it implicitly post-import,
 but once again it would be nice to do this pre-import.

 To help facilitate both new requirements, I would update the
 importlib.util.module_for_loader decorator to set both on a module that
 doesn't have them before passing the module down to the decorated method.
 That way people already using the decorator don't have to worry about
 anything and it is one less detail to have to worry about. I would also
 update the docs on importlib.util.set_package and importlib.util.set_loader
 to suggest people use importlib.util.module_for_loader and only use the
 other two decorators for backwards-compatibility.

 +1

Funny, I was just thinking about having a simple standard API that
will let you open files (and list directories) relative to a given
module or package regardless of how the thing is loaded. If we
guarantee that there's always a __loader__ that's a first step, though
I think we may need to do a little more to get people who currently do
things like open(os.path.join(os.path.dirname(__file__),
'some_file_name')) to switch. I was thinking of having a stdlib
function that you give a module/package object, a relative filename,
and optionally a mode ('b' or 't') and returns a stream -- and sibling
functions that return a string or bytes object (depending on what API
the user is using either the stream or the data can be more useful).
What would we call those functions and where would they live?
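
One possible shape for such functions, sketched on top of the existing __loader__.get_data() method. The names read_resource and open_resource are hypothetical, not an existing stdlib API, and the sketch assumes the module has both __file__ and a loader that implements get_data().

```python
import io
import json
import os


def read_resource(module, relative_name):
    """Return the bytes of a file stored next to `module` (hypothetical API)."""
    base = os.path.dirname(module.__file__)
    return module.__loader__.get_data(os.path.join(base, relative_name))


def open_resource(module, relative_name, mode="b", encoding="utf-8"):
    """Return a stream over the same data; mode is 'b' or 't'."""
    data = read_resource(module, relative_name)
    if mode == "b":
        return io.BytesIO(data)
    if mode == "t":
        return io.StringIO(data.decode(encoding))
    raise ValueError("mode must be 'b' or 't'")


# Read a file that ships next to the json package, via its loader.
with open_resource(json, "decoder.py", mode="t") as stream:
    first_line = stream.readline()
print(first_line)
```

Note that the stream version here is built from the bytes, which is exactly the wrapping cost that a dedicated loader method for streams would avoid.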

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Brett Cannon
On Sat, Apr 14, 2012 at 18:32, Guido van Rossum gu...@python.org wrote:

 On Sat, Apr 14, 2012 at 2:15 PM, Eric Snow ericsnowcurren...@gmail.com wrote:
  On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon br...@python.org wrote:
  An open issue in PEP 302 is whether to require __loader__ attributes on
  modules. The claimed worry is memory consumption, but considering importlib
  and zipimport are already doing this that seems like a red herring.
  Requiring it, though, opens the door to people relying on its existence and
  thus starting to do things like loading assets with
  ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
  not care how modules are stored (e.g. zip file, sqlite database, etc.).
 
  What I would like to do is update the PEP to state that loaders are expected
  to set __loader__. Now importlib will get updated to do that implicitly so
  external code can expect it post-import, but requiring loaders to set it
  would mean that code executed during import can rely on it as well.
 
  As for __package__, PEP 366 states that modules should set it but it isn't
  referenced by PEP 302. What I want to do is add a reference and make it
  required like __loader__. Importlib already sets it implicitly post-import,
  but once again it would be nice to do this pre-import.
 
  To help facilitate both new requirements, I would update the
  importlib.util.module_for_loader decorator to set both on a module that
  doesn't have them before passing the module down to the decorated method.
  That way people already using the decorator don't have to worry about
  anything and it is one less detail to have to worry about. I would also
  update the docs on importlib.util.set_package and importlib.util.set_loader
  to suggest people use importlib.util.module_for_loader and only use the
  other two decorators for backwards-compatibility.
 
  +1

 Funny, I was just thinking about having a simple standard API that
 will let you open files (and list directories) relative to a given
 module or package regardless of how the thing is loaded. If we
 guarantee that there's always a __loader__ that's a first step, though
 I think we may need to do a little more to get people who currently do
 things like open(os.path.join(os.path.dirname(__file__),
 'some_file_name')) to switch. I was thinking of having a stdlib
 function that you give a module/package object, a relative filename,
 and optionally a mode ('b' or 't') and returns a stream -- and sibling
 functions that return a string or bytes object (depending on what API
 the user is using either the stream or the data can be more useful).
 What would we call those functions and where would they live?


IOW go one level lower than get_data() and return the stream and then just
have helper functions which I guess just exhaust the stream for you to
return bytes or str? Or are you thinking that somehow providing a function
that can get an explicit bytes or str object will be more optimized than
doing something with the stream? Either way you will need new methods on
loaders to make it work more efficiently since loaders only have get_data()
which returns bytes and not a stream object. Plus there is currently no API
for listing the contents of a directory.

As for what to call such functions, I really don't know since they are
essentially abstract functions above the OS which work on whatever storage
backend a module uses.

For where they should live, it depends if you are viewing this as more of a
file abstraction or something that ties into modules. For the former it
seems like shutil or something that dealt with higher order file
manipulation. If it's the latter I would say importlib.util.


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Guido van Rossum
On Sat, Apr 14, 2012 at 3:50 PM, Brett Cannon br...@python.org wrote:
 On Sat, Apr 14, 2012 at 18:32, Guido van Rossum gu...@python.org wrote:
 Funny, I was just thinking about having a simple standard API that
 will let you open files (and list directories) relative to a given
 module or package regardless of how the thing is loaded. If we
 guarantee that there's always a __loader__ that's a first step, though
 I think we may need to do a little more to get people who currently do
 things like open(os.path.join(os.path.dirname(__file__),
 'some_file_name')) to switch. I was thinking of having a stdlib
 function that you give a module/package object, a relative filename,
 and optionally a mode ('b' or 't') and returns a stream -- and sibling
 functions that return a string or bytes object (depending on what API
 the user is using either the stream or the data can be more useful).
 What would we call those functions and where would they live?

 IOW go one level lower than get_data() and return the stream and then just
 have helper functions which I guess just exhaust the stream for you to
 return bytes or str? Or are you thinking that somehow providing a function
 that can get an explicit bytes or str object will be more optimized than
 doing something with the stream? Either way you will need new methods on
 loaders to make it work more efficiently since loaders only have get_data()
 which returns bytes and not a stream object. Plus there is currently no API
 for listing the contents of a directory.

Well, if it's a real file, and you need a stream, that's efficient,
and if you need the data, you can read it. But if it comes from a
loader, and you need a stream, you'd have to wrap it in a StringIO
instance. So having two APIs, one to get a stream, and one to get the
data, allows the implementation to be more optimal -- it would be bad
to wrap a StringIO instance around data only so you can read the data
from the stream again...
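
A small sketch of that argument, using two made-up storage backends (FileStore and MemoryStore are hypothetical names). Each one implements the cheap operation directly and derives the other from it, which a single-method API would not allow.

```python
import io
import os
import tempfile


class FileStore:
    """Backed by real files: the stream is the natural primitive."""

    def __init__(self, root):
        self.root = root

    def get_stream(self, name):
        return open(os.path.join(self.root, name), "rb")

    def get_data(self, name):
        with self.get_stream(name) as stream:
            return stream.read()


class MemoryStore:
    """Backed by bytes already in memory: the data is the natural primitive."""

    def __init__(self, blobs):
        self.blobs = blobs

    def get_data(self, name):
        # No stream is created just to be read back again.
        return self.blobs[name]

    def get_stream(self, name):
        return io.BytesIO(self.blobs[name])


root = tempfile.mkdtemp()
with open(os.path.join(root, "logo.bin"), "wb") as f:
    f.write(b"\x89PNG")

on_disk = FileStore(root)
in_memory = MemoryStore({"logo.bin": b"\x89PNG"})
for store in (on_disk, in_memory):
    with store.get_stream("logo.bin") as stream:
        print(type(store).__name__, stream.read() == store.get_data("logo.bin"))
```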

 As for what to call such functions, I really don't know since they are
 essentially abstract functions above the OS which work on whatever storage
 backend a module uses.

 For where they should live, it depends if you are viewing this as more of a
 file abstraction or something that ties into modules. For the former it
 seems like shutil or something that dealt with higher order file
 manipulation. If it's the latter I would say importlib.util.

If pkg_resources is in the stdlib that would be a fine place to put it.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Brett Cannon
On Sat, Apr 14, 2012 at 18:41, Christian Heimes li...@cheimes.de wrote:

 Am 15.04.2012 00:32, schrieb Guido van Rossum:
  Funny, I was just thinking about having a simple standard API that
  will let you open files (and list directories) relative to a given
  module or package regardless of how the thing is loaded. If we
  guarantee that there's always a __loader__ that's a first step, though
  I think we may need to do a little more to get people who currently do
  things like open(os.path.join(os.path.dirname(__file__),
  'some_file_name')) to switch. I was thinking of having a stdlib
  function that you give a module/package object, a relative filename,
  and optionally a mode ('b' or 't') and returns a stream -- and sibling
  functions that return a string or bytes object (depending on what API
  the user is using either the stream or the data can be more useful).
  What would we call those functions and where would they live?

 pkg_resources has a similar API [1] that supports dotted names.
 pkg_resources also does some caching for files that aren't stored on a
 local file system (database, ZIP file, you name it). It should be
 trivial to support both dotted names and module instances.


But that raises the question of whether this API should conflate module
hierarchies with file directories. Are we trying to support reading files
from within packages w/o caring about storage details but still
fundamentally working with files, or are we trying to abstract away the
concept of files and deal more with stored bytes inside packages? For the
former you would essentially want the root package and then simply specify
some file path. But for the latter you would want the module or package
that is next to or containing the data and grab it from there.

And I just realized that we would have to be quite clear that for namespace
packages it is what is in __file__ that people care about, else people
might expect some search to be performed on their behalf. Namespace
packages also dictate that you would want the module closest to the data in
the hierarchy to make sure you went down the right directory (e.g. if you
had the namespace package monty with modules spam and bacon but from
different directories, you really want to make sure you grab the right
module). I would argue that you can only go next to/within
modules/packages; going up would just cause confusion on where you were
grabbing from and going down could be done but makes things a little
messier.
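
To make the namespace package case concrete, here is a sketch that builds the monty example on disk with two separate portions and then reads a data file next to each module. The directory names, data.txt and read_next_to are all made up for the illustration, and it reads plain files rather than going through a loader.

```python
import importlib
import os
import sys
import tempfile

# Two sys.path entries, each contributing one module to the namespace
# package 'monty' (no __init__.py in either monty directory).
root = tempfile.mkdtemp()
for portion, module in (("first", "spam"), ("second", "bacon")):
    pkg_dir = os.path.join(root, portion, "monty")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("")
    with open(os.path.join(pkg_dir, "data.txt"), "w") as f:
        f.write("data for " + module)
    sys.path.insert(0, os.path.join(root, portion))


def read_next_to(module_name, filename):
    """Read a file from the directory of the named module (hypothetical helper)."""
    module = importlib.import_module(module_name)
    with open(os.path.join(os.path.dirname(module.__file__), filename)) as f:
        return f.read()


# Same relative file name, different answers depending on the module used.
print(read_next_to("monty.spam", "data.txt"))
print(read_next_to("monty.bacon", "data.txt"))
```

Asking the package monty itself for data.txt would be ambiguous here, which is why anchoring on the nearest module matters.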

-Brett


 Christian

 [1]

 http://packages.python.org/distribute/pkg_resources.html#resourcemanager-api



Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Brett Cannon
On Sat, Apr 14, 2012 at 18:56, Guido van Rossum gu...@python.org wrote:

 On Sat, Apr 14, 2012 at 3:50 PM, Brett Cannon br...@python.org wrote:
  On Sat, Apr 14, 2012 at 18:32, Guido van Rossum gu...@python.org wrote:
  Funny, I was just thinking about having a simple standard API that
  will let you open files (and list directories) relative to a given
  module or package regardless of how the thing is loaded. If we
  guarantee that there's always a __loader__ that's a first step, though
  I think we may need to do a little more to get people who currently do
  things like open(os.path.join(os.path.dirname(__file__),
  'some_file_name')) to switch. I was thinking of having a stdlib
  function that you give a module/package object, a relative filename,
  and optionally a mode ('b' or 't') and returns a stream -- and sibling
  functions that return a string or bytes object (depending on what API
  the user is using either the stream or the data can be more useful).
  What would we call those functions and where would they live?

  IOW go one level lower than get_data() and return the stream and then just
  have helper functions which I guess just exhaust the stream for you to
  return bytes or str? Or are you thinking that somehow providing a function
  that can get an explicit bytes or str object will be more optimized than
  doing something with the stream? Either way you will need new methods on
  loaders to make it work more efficiently since loaders only have get_data()
  which returns bytes and not a stream object. Plus there is currently no API
  for listing the contents of a directory.

 Well, if it's a real file, and you need a stream, that's efficient,
 and if you need the data, you can read it. But if it comes from a
 loader, and you need a stream, you'd have to wrap it in a StringIO
 instance. So having two APIs, one to get a stream, and one to get the
 data, allows the implementation to be more optimal -- it would be bad
 to wrap a StringIO instance around data only so you can read the data
 from the stream again...


Right, so the loader API would need to grow, which is fine and can be done in a
backwards-compatible way using io.BytesIO and io.StringIO.



  As for what to call such functions, I really don't know since they are
  essentially abstract functions above the OS which work on whatever storage
  backend a module uses.
 
  For where they should live, it depends if you are viewing this as more of a
  file abstraction or something that ties into modules. For the former it
  seems like shutil or something that dealt with higher order file
  manipulation. If it's the latter I would say importlib.util.

 if pkg_resources is in the stdlib that would be a fine place to put it.


It's not.

-Brett



 --
 --Guido van Rossum (python.org/~guido)



Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Christian Heimes
Am 15.04.2012 00:56, schrieb Guido van Rossum:
 Well, if it's a real file, and you need a stream, that's efficient,
 and if you need the data, you can read it. But if it comes from a
 loader, and you need a stream, you'd have to wrap it in a StringIO
 instance. So having two APIs, one to get a stream, and one to get the
 data, allows the implementation to be more optimal -- it would be bad
 to wrap a StringIO instance around data only so you can read the data
 from the stream again...

We need a third way to access a file. The two methods get_data() and
get_stream() aren't sufficient for libraries that need a real file that
lives on the file system. In order to have real files the loader (or
some other abstraction layer) needs to create a temporary directory for
the current process and clean it up when the process ends. The file is
saved to the temporary directory the first time it's accessed.

The get_file() feature has a neat benefit. Since it transparently
extracts files from the loader, users can ship binary extensions and
shared libraries (dlls) in a ZIP file and use them without too much hassle.
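
A minimal sketch of that extract-on-first-access idea. ExtractingCache is a hypothetical helper, not an existing API; it takes any get_data-style callable, and it flattens names to their basename, so two resources with the same file name would collide in this simplified version.

```python
import atexit
import os
import shutil
import tempfile


class ExtractingCache:
    """Hypothetical get_file() helper: copy loader data to a real temp file."""

    def __init__(self, get_data):
        self._get_data = get_data   # callable: name -> bytes
        self._tmpdir = None
        self._paths = {}

    def get_file(self, name):
        if name not in self._paths:
            if self._tmpdir is None:
                # Created lazily, removed when the process exits.
                self._tmpdir = tempfile.mkdtemp(prefix="resources-")
                atexit.register(shutil.rmtree, self._tmpdir, ignore_errors=True)
            target = os.path.join(self._tmpdir, os.path.basename(name))
            with open(target, "wb") as f:
                f.write(self._get_data(name))
            self._paths[name] = target
        return self._paths[name]


# Stand-in for a loader: a dict lookup plays the role of get_data().
cache = ExtractingCache({"lib/native.dll": b"MZ"}.__getitem__)
path = cache.get_file("lib/native.dll")
print(os.path.isfile(path), cache.get_file("lib/native.dll") == path)
```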

Christian


Re: [Python-Dev] Require loaders set __package__ and __loader__

2012-04-14 Thread Guido van Rossum
On Sat, Apr 14, 2012 at 5:06 PM, Christian Heimes li...@cheimes.de wrote:
 Am 15.04.2012 00:56, schrieb Guido van Rossum:
 Well, if it's a real file, and you need a stream, that's efficient,
 and if you need the data, you can read it. But if it comes from a
 loader, and you need a stream, you'd have to wrap it in a StringIO
 instance. So having two APIs, one to get a stream, and one to get the
 data, allows the implementation to be more optimal -- it would be bad
 to wrap a StringIO instance around data only so you can read the data
 from the stream again...

 We need a third way to access a file. The two methods get_data() and
 get_stream() aren't sufficient for libraries that need a real file that
 lives on the file system. In order to have real files the loader (or
 some other abstraction layer) needs to create a temporary directory for
 the current process and clean it up when the process ends. The file is
 saved to the temporary directory the first time it's accessed.

Hm... Can you give an example of a library that needs a real file?
That sounds like a poorly designed API.

Perhaps you're talking about APIs that take a filename instead of a
stream? Maybe for those it would be best to start getting serious
about a virtual filesystem... (Sorry, probably python-ideas stuff).

 The get_file() feature has a neat benefit. Since it transparently
 extracts files from the loader, users can ship binary extensions and
 shared libraries (dlls) in a ZIP file and use them without too much hassle.

Yeah, DLLs are about the only example I can think of where even a
virtual filesystem doesn't help...

-- 
--Guido van Rossum (python.org/~guido)