[GitHub] [arrow-rs] rguerreiromsft opened a new pull request, #3736: Add converter to allow custom parsing of json data

via GitHub Sun, 19 Feb 2023 22:09:49 -0800


rguerreiromsft opened a new pull request, #3736:
URL: https://github.com/apache/arrow-rs/pull/3736


   # Which issue does this PR close?
   None. This is a feature request.
   
   # Rationale for this change
   This would make it easier to bridge between Rust + Arrow and Apache Spark, 
because Spark can convert the JSON data when writing into Parquet.
   
   By using an interface and offering two default implementations, we can keep 
today's strict behaviour and also offer a loose JSON reader that tries its best 
to match the given schema.
   
   I'm not sure arrow-json is the best place to implement this conversion, but it 
would definitely be better than having to parse many JSON entries to conform to 
the schema before writing them into Parquet. Please advise on the best 
strategy.
   
   # What changes are included in this PR?
   Add StrictTapeConverter that keeps today's behaviour.
   Add LooseTapeConverter that can accept pretty much anything that can fit the 
desired schema.
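
To illustrate the strict/loose split described above, here is a minimal sketch. It is not the code in this PR: `JsonScalar`, `ConvertError`, and the `to_i64` method are hypothetical stand-ins, and the exact coercions performed by the loose converter (strings and booleans to integers) are assumptions made for the example.

```rust
/// Hypothetical, simplified JSON scalar. The real reader works on a tape of
/// parsed elements rather than on an enum like this one.
#[derive(Debug, Clone, PartialEq)]
pub enum JsonScalar {
    Null,
    Bool(bool),
    Number(String),
    Str(String),
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub struct ConvertError(pub String);

/// Assumed shape of the converter interface: one method per target type.
pub trait TapeConverter {
    fn to_i64(value: &JsonScalar) -> Result<Option<i64>, ConvertError>;
}

pub struct StrictTapeConverter;
pub struct LooseTapeConverter;

fn parse_i64(text: &str) -> Result<Option<i64>, ConvertError> {
    text.parse::<i64>()
        .map(Some)
        .map_err(|_| ConvertError(format!("invalid integer: {}", text)))
}

impl TapeConverter for StrictTapeConverter {
    // Strict: only JSON numbers (or null) are accepted for an integer column.
    fn to_i64(value: &JsonScalar) -> Result<Option<i64>, ConvertError> {
        match value {
            JsonScalar::Null => Ok(None),
            JsonScalar::Number(n) => parse_i64(n),
            other => Err(ConvertError(format!("expected number, got {:?}", other))),
        }
    }
}

impl TapeConverter for LooseTapeConverter {
    // Loose (assumed behaviour): also accept numeric strings and booleans.
    fn to_i64(value: &JsonScalar) -> Result<Option<i64>, ConvertError> {
        match value {
            JsonScalar::Null => Ok(None),
            JsonScalar::Bool(b) => Ok(Some(*b as i64)),
            JsonScalar::Number(s) | JsonScalar::Str(s) => parse_i64(s.trim()),
        }
    }
}
```

With these definitions, `StrictTapeConverter::to_i64` rejects the string `"42"`, while `LooseTapeConverter::to_i64` turns it into `Some(42)`.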
   
   # Are there any user-facing changes?
   Yes, RawReader now needs a <C: TapeConverter> generic in its declaration.
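
As a rough picture of what this means for callers, the sketch below uses a hypothetical `Reader<C: TapeConverter>` as a stand-in for `RawReader`. The `int` method and the `read_ints` helper are invented for the example and are not part of arrow-json. The point is only that the converter is chosen through the type parameter at the call site.

```rust
use std::marker::PhantomData;

/// Assumed converter interface for this sketch (not the PR's actual trait).
pub trait TapeConverter {
    /// Turn a raw JSON token into an i64, or `None` if it cannot be used.
    fn int(token: &str) -> Option<i64>;
}

pub struct StrictTapeConverter;
pub struct LooseTapeConverter;

impl TapeConverter for StrictTapeConverter {
    // Strict: the token must be a bare integer literal.
    fn int(token: &str) -> Option<i64> {
        token.parse().ok()
    }
}

impl TapeConverter for LooseTapeConverter {
    // Loose (assumed behaviour): also accept a quoted integer such as "2".
    fn int(token: &str) -> Option<i64> {
        token.trim().trim_matches('"').parse().ok()
    }
}

/// Hypothetical stand-in for a reader that is generic over its converter.
pub struct Reader<C: TapeConverter> {
    _converter: PhantomData<C>,
}

impl<C: TapeConverter> Reader<C> {
    pub fn new() -> Self {
        Self { _converter: PhantomData }
    }

    /// Convert each token using the converter selected by the type parameter.
    pub fn read_ints(&self, tokens: &[&str]) -> Vec<Option<i64>> {
        tokens.iter().map(|t| C::int(*t)).collect()
    }
}
```

A caller would then write `Reader::<StrictTapeConverter>::new()` to keep the current behaviour, or `Reader::<LooseTapeConverter>::new()` to opt in to the lenient one.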
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

