DDtKey opened a new issue, #10669:
URL: https://github.com/apache/datafusion/issues/10669
### Is your feature request related to a problem or challenge?
CSV writers usually support configuring the quote style (or mode) with the
following options:
- `Always`
- `Necessary`
- `Never`
- `NonNumeric`
Sometimes the quote style needs to be controlled explicitly. For now, the only
way to change it is to re-read the result file(s) and rewrite their content with
the desired quote style.
You can find such configs in many libraries:
- `csv` crate
([`QuoteStyle`](https://docs.rs/csv/latest/csv/enum.QuoteStyle.html)),
- `csv` from Python (constants like
[`QUOTE_ALL`](https://docs.python.org/3/library/csv.html#csv.QUOTE_ALL)),
- Apache Commons CSV for Java
([`QuoteMode`](https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/QuoteMode.html))
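
To make the four styles concrete, here is a minimal std-only sketch of how each one could decide whether a field gets quoted. The `QuoteStyle` enum below is a local stand-in named after the options above, not the `csv` crate's type, and `format_field` / `format_record` are hypothetical helpers. "Numeric" is simplified to "parses as `f64`", which may differ from what a real writer does.

```rust
/// Local stand-in for the quote styles listed above.
/// This is NOT the `csv` crate's `QuoteStyle`, only an illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum QuoteStyle {
    Always,
    Necessary,
    NonNumeric,
    Never,
}

/// Returns true when leaving the field unquoted would break parsing:
/// it contains the delimiter, the quote character, or a line break.
fn needs_quoting(field: &str, delimiter: char, quote: char) -> bool {
    field
        .chars()
        .any(|c| c == delimiter || c == quote || c == '\n' || c == '\r')
}

/// Formats a single field according to `style` (hypothetical helper).
/// Embedded quote characters are escaped by doubling them.
fn format_field(field: &str, style: QuoteStyle, delimiter: char, quote: char) -> String {
    let should_quote = match style {
        QuoteStyle::Always => true,
        QuoteStyle::Necessary => needs_quoting(field, delimiter, quote),
        // Simplifying assumption: "numeric" means "parses as f64".
        QuoteStyle::NonNumeric => field.parse::<f64>().is_err(),
        QuoteStyle::Never => false,
    };
    if !should_quote {
        return field.to_string();
    }
    let doubled = format!("{}{}", quote, quote);
    let escaped = field.replace(quote, &doubled);
    format!("{}{}{}", quote, escaped, quote)
}

/// Formats a whole record with `,` as delimiter and `"` as quote.
fn format_record(fields: &[&str], style: QuoteStyle) -> String {
    fields
        .iter()
        .map(|f| format_field(f, style, ',', '"'))
        .collect::<Vec<_>>()
        .join(",")
}
```

For the record `a`, `1`, `b,c`, only `Never` produces output that no longer round-trips, because the embedded comma is left unquoted. That is why the style has to be an explicit choice by the caller.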
### Describe the solution you'd like
Expose a way to pass the `QuoteStyle` enum along with other properties
like `quote` and `delimiter` (as part of `CsvOptions`). However, we need to
keep in mind that this configuration only makes sense for writers, not readers.
This shouldn't be hard to support, because `datafusion` relies on
`arrow-csv`, which uses the `csv` crate under the hood.
- update `arrow-csv` to accept a quote-style parameter (sub-issue for
  `arrow-rs`?)
  - add it to `WriterBuilder`:
    https://github.com/apache/arrow-rs/blob/4b5d9bfc958c06fb1ff71d90ba58497e965eff40/arrow-csv/src/writer.rs#L191-L214
  - pass it to `csv::Writer`:
    https://github.com/apache/arrow-rs/blob/4b5d9bfc958c06fb1ff71d90ba58497e965eff40/arrow-csv/src/writer.rs#L402-L408
- update `datafusion`
  - add the parameter to `CsvOptions`:
    https://github.com/apache/datafusion/blob/ea92ae72f7ec2e941d35aa077c6a39f74523ab63/datafusion/common/src/config.rs#L1554-L1570
  - pass it to `arrow-csv`:
    https://github.com/apache/datafusion/blob/ea92ae72f7ec2e941d35aa077c6a39f74523ab63/datafusion/common/src/file_options/csv_writer.rs#L48-L75
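
The plumbing described in the steps above could look roughly like the following sketch. All types here are trimmed-down, hypothetical stand-ins written for illustration. They are not the real `arrow-csv` `WriterBuilder` or DataFusion `CsvOptions`, and the `with_quote_style` method and `quote_style` field are assumed names for the proposed additions.

```rust
/// Stand-in for the quote-style enum (the real one would come from the `csv` crate).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum QuoteStyle {
    Always,
    Necessary,
    NonNumeric,
    Never,
}

impl Default for QuoteStyle {
    /// Quote only when required, so existing output stays unchanged.
    fn default() -> Self {
        QuoteStyle::Necessary
    }
}

/// Hypothetical, trimmed-down writer builder (stand-in for the `arrow-csv` one).
#[derive(Clone, Debug)]
struct WriterBuilder {
    delimiter: u8,
    quote: u8,
    quote_style: QuoteStyle,
}

impl WriterBuilder {
    fn new() -> Self {
        Self {
            delimiter: b',',
            quote: b'"',
            quote_style: QuoteStyle::default(),
        }
    }

    fn with_delimiter(mut self, delimiter: u8) -> Self {
        self.delimiter = delimiter;
        self
    }

    fn with_quote(mut self, quote: u8) -> Self {
        self.quote = quote;
        self
    }

    /// Assumed name for the proposed setter.
    fn with_quote_style(mut self, quote_style: QuoteStyle) -> Self {
        self.quote_style = quote_style;
        self
    }
}

/// Hypothetical, trimmed-down options struct (stand-in for `CsvOptions`).
#[derive(Clone, Debug)]
struct CsvOptions {
    delimiter: u8,
    quote: u8,
    /// Writer-only setting; a reader would ignore it.
    quote_style: Option<QuoteStyle>,
}

impl From<&CsvOptions> for WriterBuilder {
    /// Forwards the options to the builder, keeping the builder's
    /// default quote style when none was configured.
    fn from(options: &CsvOptions) -> Self {
        let builder = WriterBuilder::new()
            .with_delimiter(options.delimiter)
            .with_quote(options.quote);
        match options.quote_style {
            Some(style) => builder.with_quote_style(style),
            None => builder,
        }
    }
}
```

Modelling the option as `Option<QuoteStyle>` is one possible design choice: leaving it unset keeps the writer's current behavior, so the change would be backward compatible.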
### Describe alternatives you've considered
_No response_
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org