amoeba commented on issue #43303: URL: https://github.com/apache/arrow/issues/43303#issuecomment-2237103075
Hi @prayaggordy, you're right that when a dataset is written with partitioning, the partition fields aren't stored in the files themselves. Like other systems, Arrow auto-detects the types of partition fields by default, but you can provide a schema instead, which I think should get you what you want:

```r
> my_schema <- schema(field("cyl_ch", string()))
> open_dataset("output/partition_cyl_ch", partitioning = my_schema)
FileSystemDataset with 3 Parquet files
13 columns
mpg: double
cyl: double
disp: double
hp: double
drat: double
wt: double
qsec: double
vs: double
am: double
gear: double
carb: double
gear_ch: string
cyl_ch: string
```

Would this work for your use case?
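For anyone who wants to check this end to end, here is a minimal sketch that compares the default type detection with an explicit partitioning schema. It makes some assumptions that are not in the original report: the dataset is rebuilt from `mtcars` with character copies of `cyl` and `gear` (guessed from the printed schema), and it is written to a temporary directory rather than `output/partition_cyl_ch`.

```r
library(arrow)

# Assumption: rebuild a dataset shaped like the one in the printed schema,
# with character copies of cyl and gear added to mtcars.
df <- mtcars
df$gear_ch <- as.character(df$gear)
df$cyl_ch <- as.character(df$cyl)

# Hypothetical output location; a temp directory keeps the sketch self-contained.
out_dir <- file.path(tempdir(), "partition_cyl_ch")
write_dataset(df, out_dir, partitioning = "cyl_ch")

# The partition values now live only in directory names such as cyl_ch=4.
print(list.files(out_dir))

# Without a schema, the partition field's type is inferred from those names.
inferred <- open_dataset(out_dir)
print(inferred$schema$GetFieldByName("cyl_ch")$type)

# With a schema, the partition field is read back as the declared type.
my_schema <- schema(field("cyl_ch", string()))
typed <- open_dataset(out_dir, partitioning = my_schema)
print(typed$schema$GetFieldByName("cyl_ch")$type)
```

Comparing the two printed types shows whether the inferred type differs from the one the column had when it was written. Only fields that need a non-default type have to be declared in the partitioning schema.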