[ https://issues.apache.org/jira/browse/AVRO-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918338#action_12918338 ]
Doug Cutting commented on AVRO-672:
-----------------------------------
I am not convinced that the tool you need is a general-purpose tool that others
will use; it might be better to keep this in your application.
Avro's existing JSON encoding is primarily a tool for debugging. Tools that
can losslessly import and export JSON data into and out of Avro might also be
generally useful. A tool that adapts JSON data to pre-existing schemas could
be generally useful if it permitted enough control of how the adaptation is
done, but might also be rather application-specific. What do you think?
> Convert JSON Text Input to Avro Tool
> ------------------------------------
>
> Key: AVRO-672
> URL: https://issues.apache.org/jira/browse/AVRO-672
> Project: Avro
> Issue Type: New Feature
> Reporter: Ron Bodkin
> Attachments: AVRO-672.patch, AVRO-672.patch
>
>
> The attached patch reads a JSON-formatted text file and converts it to a
> conforming Avro text file, emitting one record per line. For example, it can
> read this input file:
> {"intval":12}
> {"intval":-73,"strval":"hello, there!!"}
> with this schema:
> { "type":"record", "name":"TestRecord", "fields": [
> {"name":"intval","type":"int"}, {"name":"strval","type":["string", "null"]}]}
> returning valid Avro. This is different from the DataFileWriteTool, which
> would read in the following internal encoding:
> {"intval":12,"strval":null}
> {"intval":-73,"strval":{"string":"hello, there!!"}}
> In general, the internal encodings used by Avro aren't natural when reading
> in JSON text that appears in the wild. The utility can also replace
> characters that are invalid in Avro identifiers with underscores, again to
> tolerate JSON that wasn't designed to be readable by Avro.
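
To make the difference between the two encodings concrete, here is a minimal
Python sketch of the adaptation described above. It is not the code in
AVRO-672.patch: the helper names (sanitize_name, branch_for, to_avro_json) are
hypothetical, only primitive union branches are handled, a missing field is
assumed to map to the union's "null" branch, and prefixing an underscore to a
name that starts with a digit is an assumption of this sketch.

```python
import json
import re

# Schema from the issue description.
SCHEMA = {
    "type": "record",
    "name": "TestRecord",
    "fields": [
        {"name": "intval", "type": "int"},
        {"name": "strval", "type": ["string", "null"]},
    ],
}


def sanitize_name(name):
    """Replace characters that are invalid in an Avro name with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Avro names may not start with a digit (assumed fix: prefix an underscore).
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def branch_for(value, union):
    """Return the first primitive union branch that matches a plain JSON value."""
    for branch in union:
        if branch == "null" and value is None:
            return branch
        if branch == "string" and isinstance(value, str):
            return branch
        if branch == "boolean" and isinstance(value, bool):
            return branch
        if (branch in ("int", "long") and isinstance(value, int)
                and not isinstance(value, bool)):
            return branch
        if (branch in ("float", "double") and isinstance(value, (int, float))
                and not isinstance(value, bool)):
            return branch
    raise ValueError("no union branch in %r matches %r" % (union, value))


def to_avro_json(record, schema):
    """Convert a plain JSON object into Avro's JSON encoding for a record schema."""
    cleaned = {sanitize_name(key): value for key, value in record.items()}
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        value = cleaned.get(name)  # a missing field is treated as null
        if isinstance(ftype, list):
            branch = branch_for(value, ftype)
            # Non-null union values are wrapped as {"<branch>": value}.
            out[name] = None if branch == "null" else {branch: value}
        elif value is None:
            raise ValueError("missing required field %r" % name)
        else:
            out[name] = value
    return out


if __name__ == "__main__":
    for line in ('{"intval":12}', '{"intval":-73,"strval":"hello, there!!"}'):
        converted = to_avro_json(json.loads(line), SCHEMA)
        print(json.dumps(converted, separators=(",", ":")))
```

Run against the two input lines from the description, this prints the two
lines of the internal encoding shown there, so the sketch only illustrates the
union wrapping and name cleanup, not how the patch writes Avro output.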