Re: [I] Partitioned variable does not read in as the correct type [arrow]

via GitHub Thu, 18 Jul 2024 10:10:56 -0700


amoeba commented on issue #43303:
URL: https://github.com/apache/arrow/issues/43303#issuecomment-2237103075


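   Hi @prayaggordy, you're right: when a dataset is written with 
partitioning, the partition fields aren't stored in the files. Their values 
live only in the directory names, so their types have to be worked out when 
the dataset is read back.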
   
   Like other systems, Arrow auto-detects the types of partition fields by 
default, but you can supply a schema for them instead, which I think should 
get you what you want:
   
   ```r
   > my_schema <- schema(field("cyl_ch", string()))
   > open_dataset("output/partition_cyl_ch", partitioning = my_schema)
   FileSystemDataset with 3 Parquet files
   13 columns
   mpg: double
   cyl: double
   disp: double
   hp: double
   drat: double
   wt: double
   qsec: double
   vs: double
   am: double
   gear: double
   carb: double
   gear_ch: string
   cyl_ch: string
   ```
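   
   If it helps, here is a minimal, self-contained sketch of the same idea. 
It is only an illustration under a few assumptions: the temporary directory 
and the names `df`, `tmp`, `inferred` and `explicit` are made up for the 
example, and it partitions `mtcars` on a string copy of `cyl` rather than 
reproducing your exact dataset.
   
   ```r
   library(arrow)

   # Hypothetical setup: add a string copy of `cyl` and partition on it.
   df <- mtcars
   df$cyl_ch <- as.character(df$cyl)

   tmp <- tempfile()  # illustrative output location, not your real path
   write_dataset(df, tmp, partitioning = "cyl_ch")

   # Default behaviour: the type of `cyl_ch` is inferred from the
   # directory names (cyl_ch=4, cyl_ch=6, cyl_ch=8).
   inferred <- open_dataset(tmp)
   print(inferred$schema$GetFieldByName("cyl_ch")$type)

   # Explicit partitioning schema: `cyl_ch` is read back as a string.
   explicit <- open_dataset(tmp, partitioning = schema(cyl_ch = string()))
   print(explicit$schema$GetFieldByName("cyl_ch")$type)
   ```
   
   Comparing the two printed types should show whether auto detection is 
what changes the column type in your case.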
   
   Would this work for your use case?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
