jorisvandenbossche commented on a change in pull request #11447: URL: https://github.com/apache/arrow/pull/11447#discussion_r731565164
##########
File path: python/pyarrow/_fs.pyx
##########
@@ -833,6 +833,12 @@ cdef class SubTreeFileSystem(FileSystem):
         FileSystem.init(self, wrapped)
         self.subtreefs = <CSubTreeFileSystem*> wrapped.get()

+    def __str__(self):
+        return f'SubTreeFileSystem: file:{self.base_path}'

Review comment:

> Does it make sense to omit `subfs.base_fs` from `__repr__` in this case?

I would certainly keep it, as it seems essential for understanding a `SubTreeFileSystem` object (i.e. what kind of filesystem it is wrapping).

> Ideally, [`__repr__`](https://docs.python.org/3/library/functions.html#repr) differs from `__str__` in that it allows "reconstruction" (eval) of an equivalent object. But this would be difficult for `SubTreeFileSystem` and I do not think Arrow has an established convention for representing Python objects.

Indeed, when a distinction is made, that is the typical rule for differentiating repr and str. But making an "eval"-able repr is not always easy or even possible. We don't generally do that in pyarrow (e.g. Array, Table, RecordBatch etc. don't have a separate repr). For FileSystems it might be possible, though. But I would defer that to later (because it requires updating the repr of other filesystems, see note below).

> Would it make sense to put this PR on hold, and first add `__str__/__repr__` to `FileSystem`. Then circle back to this PR and build on top of that information? cc @jorisvandenbossche @ianmcook

We could certainly improve the str/reprs of the other FileSystems as well (and we should open a JIRA for it). But I don't think that needs to hold up this PR. For example, the repr of LocalFileSystem is already informative and merely contains some noise, while the repr of SubTreeFileSystem is really lacking information.
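To make the repr/str distinction discussed above concrete, here is a minimal pure-Python sketch. It is not the pyarrow implementation: `LocalFS` and `SubTreeFS` are hypothetical stand-ins, and only the attribute names `base_path` and `base_fs` are taken from this thread. It assumes the wrapped filesystem itself has an eval-able repr, which is exactly the part that would need work in the other filesystems first.

```python
class LocalFS:
    """Hypothetical stand-in for a concrete filesystem (not pyarrow's class)."""

    def __repr__(self):
        return "LocalFS()"

    def __eq__(self, other):
        return isinstance(other, LocalFS)


class SubTreeFS:
    """Hypothetical stand-in for a filesystem rooted at base_path in base_fs."""

    def __init__(self, base_path, base_fs):
        self.base_path = base_path
        self.base_fs = base_fs

    def __repr__(self):
        # Include the wrapped filesystem, so the reader can see what is
        # being wrapped and the string can be eval'ed back into an object.
        return (f"SubTreeFS(base_path={self.base_path!r}, "
                f"base_fs={self.base_fs!r})")

    def __str__(self):
        # Shorter, human-oriented form; not meant to be eval'ed.
        return f"SubTreeFS: {self.base_path}"

    def __eq__(self, other):
        return (isinstance(other, SubTreeFS)
                and self.base_path == other.base_path
                and self.base_fs == other.base_fs)


subfs = SubTreeFS("/tmp/data", LocalFS())
print(str(subfs))   # SubTreeFS: /tmp/data
print(repr(subfs))  # SubTreeFS(base_path='/tmp/data', base_fs=LocalFS())
```

In this sketch `eval(repr(subfs)) == subfs` holds only because every nested object has a reconstructable repr. That is why an eval-able repr for the subtree case depends on the reprs of the other filesystems.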