Hi Akbar,

The documentation on the legacy and new filesystem interfaces is indeed somewhat lacking. In general, we have the older, now legacy, filesystems in pyarrow.filesystem (https://arrow.apache.org/docs/python/filesystems_deprecated.html) and a new implementation in pyarrow.fs (https://arrow.apache.org/docs/python/filesystems.html, https://arrow.apache.org/docs/python/api/filesystems.html). We need to document this better (and actually deprecate the old module), but the long-term goal is certainly to remove pyarrow.filesystem eventually.
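To give an idea of what using the new pyarrow.fs interface could look like for copying a file to local disk, here is a minimal sketch. It relies only on the open_input_file method of a pyarrow.fs filesystem and on the returned file object being readable in chunks. The helper name download_file, the chunk size, and the paths in the usage comment are my own hypothetical choices, not part of pyarrow, and I have not run this against a real HDFS cluster.

```python
def download_file(fs, remote_path, local_path, chunk_size=1024 * 1024):
    """Copy a single file from a pyarrow.fs-style filesystem to local disk.

    `fs` is assumed to provide open_input_file(path), returning a readable
    file object usable as a context manager (as pyarrow.fs filesystems do).
    `download_file` itself is a hypothetical helper, not a pyarrow function.
    """
    with fs.open_input_file(remote_path) as src, open(local_path, "wb") as dst:
        # Read fixed-size chunks so a large file is never held fully in memory.
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
    return local_path


# Hypothetical usage against HDFS (host and paths are placeholders):
#
#   from pyarrow import fs
#   hdfs = fs.HadoopFileSystem("default")
#   download_file(hdfs, "/data/example.parquet", "example.parquet")
```

This only handles a single file. Fetching a whole folder would additionally need a recursive listing of the remote directory, which is left out here.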
So, regarding your specific HDFS-related questions:

- There is also a HadoopFileSystem in the new interface (https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html), so in general support for HDFS is not limited to the deprecated API.
- The available methods on the new interface are different though, and there is no "download" method anymore. However (although I am not fully familiar with this), I think you can achieve more or less the same with the "open_input_file" method of the new interface, which returns a NativeFile object, which in turn has a download method.

Best,
Joris

On Tue, 4 Aug 2020 at 04:19, Akbar <ed.ak...@gmail.com> wrote:

> apologies - I made of mess of that email. Let me try again
>
> - *Question 1* - pyarrow.HadoopFileSystem.download
>   <https://arrow.apache.org/docs/python/generated/pyarrow.HadoopFileSystem.download.html>
>   is listed under Filesystem Interface (Legacy) (and so are all the HDFS
>   APIs) - does this mean support for this is limited?
> - *Question 2* - is there an equivalent to
>   pyarrow.HadoopFileSystem.download in the newer
>   pyarrow.fs.HadoopFileSystem
>   <https://arrow.apache.org/docs/python/generated/pyarrow.fs.HadoopFileSystem.html#pyarrow.fs.HadoopFileSystem>.
>
> I want to be able to fetch/download an HDFS entire file or folder to the
> HOST OS filesystem - Let me know if you have any guidance
>
> On Fri, 31 Jul 2020 at 23:02, Akbar <ed.ak...@gmail.com> wrote:
>
>> Hello,
>>
>> Documentation on legacy file system interface is not quite clear. I’m not
>> sure if the HDFS API layers are still relevant and supported. I know the
>> HDFS API are operational
>>
>> The two questions I have
>> 1. if there is an equivalent to HadoopFileSystem.download in the new file
>>    system interface (pyarrow.fs.HadoopFileSystem)
>> 2. Will support for HDFS API be removed - I ask this based on the legacy
>>    tag on the documentation