[ https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-8213:
-----------------------------------------
Description:

Even after the previous PRs related to local paths (https://github.com/apache/arrow/pull/6643, https://github.com/apache/arrow/pull/6655), I don't think the user experience is optimal when you are working with local files and pass a wrong, non-existent path (e.g. due to a typo).

Currently, you get this error:

{code}
>>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
...
ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
{code}

where "URI has empty scheme" is rather confusing for the user in the case of a non-existent path. I think ideally we should raise a "No such file or directory" error.

I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can also give other errors that we do want to propagate to the user. The most straightforward approach I can think of right now is to check whether "URI has empty scheme" is in the error message and then reword it, but that's not very clean.
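One possible alternative to matching on the error message is to decide up front whether the string looks like a local path or a URI, and to raise the "No such file or directory" error before any URI parsing happens. The following is only a minimal sketch of that idea using the Python standard library. The helper name {{classify_source}} is hypothetical, it does not call {{FileSystem.from_uri}} or any other pyarrow API, and it is not necessarily how the issue was or should be resolved in Arrow.

```python
import errno
import os
from urllib.parse import urlparse


def classify_source(source):
    """Hypothetical helper: classify `source` as a local path or a URI."""
    # An existing local path always wins, whatever the string looks like.
    if os.path.exists(source):
        return ("local", os.path.abspath(source))

    scheme = urlparse(source).scheme
    # Assumption: a one-letter "scheme" is a Windows drive letter (C:\...),
    # so it is treated the same as having no scheme at all.
    if len(scheme) <= 1:
        # No scheme and nothing on disk: report a missing file instead of
        # letting a URI parser complain about an empty scheme.
        raise FileNotFoundError(
            errno.ENOENT, os.strerror(errno.ENOENT), source
        )

    # Anything with a real scheme is left for the URI machinery, so that
    # its own errors still propagate to the user unchanged.
    return ("uri", source)
```

With this sketch, {{classify_source("data_with_typo.parquet")}} raises {{FileNotFoundError}} with {{errno.ENOENT}} and the offending path attached, while a string such as {{"s3://bucket/key"}} is passed through as a URI. The trade-off is one extra filesystem check per call, in exchange for not depending on the wording of another component's error message.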
> [Python][Dataset] Opening a dataset with a local incorrect path gives confusing error message
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-8213
>                 URL: https://issues.apache.org/jira/browse/ARROW-8213
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++ - Dataset, Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>             Fix For: 0.17.0