On 04/09/2016 12:48 AM, Nick Coghlan wrote:

> Considering the helper function usage, here's some examples in
> combination with os.fsencode and os.fsdecode:
>
>   # Status quo for binary/text path conversions
>   text_path = os.fsdecode(bytes_path)
>   bytes_path = os.fsencode(text_path)
>
>   # Getting a text path from an arbitrary object
>   text_path = os.fspath(obj) # This doesn't scream "returns text!"
>   text_path = os.fspathname(obj) # This does
>
>   # Getting a binary path from an arbitrary object
>   bytes_path = os.fsencode(os.fspath(obj))
>   bytes_path = os.fsencode(os.fspathname(obj))
>
> I'm starting to think the semantic nudge from the "name" suffix when
> reading the code is worth the extra four characters when writing it
> (keeping in mind that the whole point of this exercise is that most
> folks *won't* be writing explicit conversions - the stdlib will handle
> it on their behalf).
>
> I also think the more explicit name helps answer some of the type
> signature questions that have arisen:
>
> 1. Does os.fspathname return rich Path objects? No, it returns names
> as str objects
> 2. Will file descriptors pass through os.fspathname? No, as they're
> not names, they're numeric descriptors.
> 3. Will bytes-like objects pass through os.fspathname? No, as they're
> not names, they're encodings of names

This worries me.

I know the primary purpose of this change is to enable pathlib and os and the rest of the stdlib to work together, but consider . . .

If adding a new attribute/method was as far as we went, new code (stdlib or otherwise) would look like:

  if isinstance(a_path_thingy, bytes):
      # because os can accept bytes
      pass
  elif isinstance(a_path_thingy, str):
      # but it's usually text
      pass
  elif hasattr(a_path_thingy, '__fspath__'):
      a_path_thingy = a_path_thingy.__fspath__()
  else:
      raise TypeError('not a valid path')
  # do something with the path

If we add os.fspath() but don't allow it to return bytes, the example above looks more like:

  if isinstance(a_path_thingy, bytes):
      # because os can accept bytes
      pass
  else:
      a_path_thingy = os.fspath(a_path_thingy)
  # do something with the path

Yes, it's better -- but it still requires a pre-check before calling os.fspath().

It is my contention that this is better:

  a_path_thingy = os.fspath(a_path_thingy)

This raises two issues:

1) The stdlib now includes os.scandir(), which can work
   with, and return, both bytes and text -- if __fspath__ can only
   hold text, DirEntry will not get the __fspath__ method added,
   and the pre-check boilerplate code will flourish;

2) pathlib.Path accepts bytes -- so what happens when a byte-derived
   Path is passed to os.fspath()?  Is a TypeError raised?  Do we
   guess and auto-convert with fsdecode()?

I think the best answer is to

- let __fspath__ hold bytes as well as text
- let fspath() return bytes as well as text
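
A minimal sketch of what that could look like, assuming the semantics proposed above. The names fspath_sketch and FakeEntry are hypothetical and only for illustration; this is not the actual os.fspath() implementation.

```python
def fspath_sketch(path):
    """Hypothetical helper: return a str or bytes path for `path`."""
    # str and bytes pass straight through, so callers need no pre-check.
    if isinstance(path, (str, bytes)):
        return path
    # Look the method up on the type, as is usual for special methods.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError('not a valid path: %r' % type(path).__name__) from None
    result = method(path)
    # Under this proposal __fspath__ may hold bytes as well as text.
    if not isinstance(result, (str, bytes)):
        raise TypeError('__fspath__ must return str or bytes, not %r'
                        % type(result).__name__)
    return result


class FakeEntry:
    """Hypothetical stand-in for a DirEntry-like object (str or bytes name)."""

    def __init__(self, name):
        self._name = name

    def __fspath__(self):
        return self._name


text_path = fspath_sketch(FakeEntry('spam.txt'))    # stays text
bytes_path = fspath_sketch(FakeEntry(b'spam.txt'))  # stays bytes
```

With this shape the calling code is the single line shown earlier, whichever type the object carries.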

--
~Ethan~