[ https://issues.apache.org/jira/browse/ARROW-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ben Kietzman resolved ARROW-8136.
---------------------------------
    Resolution: Fixed

Issue resolved by pull request 6643
[https://github.com/apache/arrow/pull/6643]

> [C++][Python] Creating dataset from relative path no longer working
> -------------------------------------------------------------------
>
>                 Key: ARROW-8136
>                 URL: https://issues.apache.org/jira/browse/ARROW-8136
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>            Reporter: Joris Van den Bossche
>            Assignee: Joris Van den Bossche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Since https://github.com/apache/arrow/pull/6597, local relative paths don't
> work anymore:
> {code}
> In [1]: import pyarrow.dataset as ds
> In [2]: ds.dataset("test.parquet")
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-2-23ecfce52d13> in <module>
> ----> 1 ds.dataset("test.parquet")
> ~/scipy/repos/arrow/python/pyarrow/dataset.py in dataset(paths_or_factories,
> filesystem, partitioning, format)
> 327
> 328 if isinstance(paths_or_factories, str):
> --> 329 return factory(paths_or_factories, **kwargs).finish()
> 330
> 331 if not isinstance(paths_or_factories, list):
> ~/scipy/repos/arrow/python/pyarrow/dataset.py in factory(path_or_paths,
> filesystem, partitioning, format)
> 246 factories = []
> 247 for path in path_or_paths:
> --> 248 fs, paths_or_selector = _ensure_fs_and_paths(path, filesystem)
> 249 factories.append(FileSystemDatasetFactory(fs,
> paths_or_selector,
> 250 format, options))
> ~/scipy/repos/arrow/python/pyarrow/dataset.py in _ensure_fs_and_paths(path,
> filesystem)
> 165 from pyarrow.fs import FileType, FileSelector
> 166
> --> 167 filesystem, path = _ensure_fs(filesystem, _stringify_path(path))
> 168 infos = filesystem.get_target_infos([path])[0]
> 169 if infos.type == FileType.Directory:
> ~/scipy/repos/arrow/python/pyarrow/dataset.py in _ensure_fs(filesystem, path)
> 158 if filesystem is not None:
> 159 return filesystem, path
> --> 160 return FileSystem.from_uri(path)
> 161
> 162
> ~/scipy/repos/arrow/python/pyarrow/_fs.pyx in
> pyarrow._fs.FileSystem.from_uri()
> ~/scipy/repos/arrow/python/pyarrow/error.pxi in
> pyarrow.lib.pyarrow_internal_check_status()
> ~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: URI has empty scheme: 'test.parquet'
> {code}
> [~apitrou] Is this something that should be fixed in
> {{FileSystemFromUriOrPath}} or rather on the python side?
> ({{FileSystem.from_uri}} ensures to get the absolute path for Pathlib
> objects, but not for strings)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
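
For readers hitting the same "URI has empty scheme" error, here is a minimal sketch of the Python-side option raised in the question above: make a plain string path absolute before it reaches URI parsing. It is not necessarily what pull request 6643 does. The helper name `absolutize_local_path` is hypothetical, it uses only the standard library, and it assumes POSIX-style paths (a Windows drive letter such as `C:` would be mistaken for a URI scheme).

```python
import os
from urllib.parse import urlparse


def absolutize_local_path(path):
    """Hypothetical helper, not part of pyarrow.

    Strings that already carry a URI scheme (for example "s3://..." or
    "file:///...") are returned unchanged. Anything else is treated as a
    local path and resolved against the current working directory.
    """
    if urlparse(path).scheme:
        return path
    return os.path.abspath(path)


# A relative local path becomes absolute; a URI is left alone.
local = absolutize_local_path("test.parquet")
remote = absolutize_local_path("s3://bucket/test.parquet")
```

Under these assumptions, the string returned for `"test.parquet"` is an absolute path, which is the form the traceback suggests `FileSystem.from_uri` was able to handle for Pathlib objects.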