jorisvandenbossche commented on a change in pull request #11447:
URL: https://github.com/apache/arrow/pull/11447#discussion_r731565164



##########
File path: python/pyarrow/_fs.pyx
##########
@@ -833,6 +833,12 @@ cdef class SubTreeFileSystem(FileSystem):
         FileSystem.init(self, wrapped)
         self.subtreefs = <CSubTreeFileSystem*> wrapped.get()
 
+    def __str__(self):
+        return f'SubTreeFileSystem: file:{self.base_path}'

Review comment:
   > Does it make sense to omit `subfs.base_fs` from `__repr__` in this case?
   
   I would certainly keep it, as it seems essential for understanding a
   `SubTreeFileSystem` object (i.e. what kind of filesystem it is wrapping).
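
To illustrate, here is a minimal sketch of a short `__str__` next to a `__repr__` that keeps the wrapped filesystem. `LocalFS` and `SubTreeFS` are hypothetical pure-Python stand-ins, not the pyarrow classes, and the exact output format is only an assumption:

```python
class LocalFS:
    """Hypothetical stand-in for the wrapped filesystem (not pyarrow's class)."""

    def __repr__(self):
        return "LocalFS()"


class SubTreeFS:
    """Hypothetical stand-in for SubTreeFileSystem."""

    def __init__(self, base_path, base_fs):
        self.base_path = base_path
        self.base_fs = base_fs

    def __str__(self):
        # Short, human-oriented form: only the path prefix.
        return f"SubTreeFS: {self.base_path}"

    def __repr__(self):
        # Debug-oriented form: also shows which filesystem is being wrapped.
        return f"SubTreeFS(base_path={self.base_path!r}, base_fs={self.base_fs!r})"


subfs = SubTreeFS("/tmp/data", LocalFS())
print(str(subfs))
print(repr(subfs))
```

With only the `__str__` form, two subtrees on different underlying filesystems would print identically; the `__repr__` form distinguishes them.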
   
   > Ideally, [`__repr__`](https://docs.python.org/3/library/functions.html#repr)
   > differs from `__str__` in that it allows "reconstruction" (eval) of an
   > equivalent object. But this would be difficult for `SubTreeFileSystem` and
   > I do not think Arrow has an established convention for representing Python
   > objects.
   
   Indeed, if making a distinction, that's the typical rule to differentiate
   repr and str. But making an "eval"-able repr is not always easy or possible.
   We don't generally do that in pyarrow (e.g. Array, Table, RecordBatch etc.
   don't have a separate repr). For FileSystems it might be possible, though.
   But I would defer that to later (because it requires updating the repr of
   other filesystems, see note below).
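
For reference, a rough sketch of what an "eval"-able repr means in practice, again with hypothetical stand-in classes (`LocalFS`, `SubTreeFS`) rather than the real pyarrow filesystems. It assumes every nested object also has an eval-able repr and an `__eq__`, which is why the other filesystems would need updating first:

```python
class LocalFS:
    """Hypothetical stand-in; takes no constructor arguments."""

    def __repr__(self):
        return "LocalFS()"

    def __eq__(self, other):
        return isinstance(other, LocalFS)


class SubTreeFS:
    """Hypothetical stand-in for a filesystem wrapping another one."""

    def __init__(self, base_path, base_fs):
        self.base_path = base_path
        self.base_fs = base_fs

    def __repr__(self):
        # Mirrors the constructor call, so the string can be evaluated again.
        return f"SubTreeFS({self.base_path!r}, {self.base_fs!r})"

    def __eq__(self, other):
        return (
            isinstance(other, SubTreeFS)
            and self.base_path == other.base_path
            and self.base_fs == other.base_fs
        )


original = SubTreeFS("/tmp/data", LocalFS())
# Evaluate the repr with only the two class names in scope.
rebuilt = eval(repr(original), {"SubTreeFS": SubTreeFS, "LocalFS": LocalFS})
round_trips = rebuilt == original
```

The round trip only works because the inner `LocalFS` repr is itself a valid constructor call; a default repr of the form `<... object at 0x...>` for the wrapped filesystem would break it.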
   
   > Would it make sense to put this PR on hold, and first add
   > `__str__/__repr__` to `FileSystem`. Then circle back to this PR and build
   > on top of that information? cc @jorisvandenbossche @ianmcook
   
   We could certainly improve the str/reprs of the other FileSystems as well
   (and we should open a JIRA for it). But I don't think that needs to hold up
   this PR. For example, the repr of LocalFileSystem is already informative
   (its only problem is some noise), while the repr of SubTreeFileSystem is
   really lacking information.
   
   
   
   
   





