Convert JSON Text Input to Avro Tool
------------------------------------
Key: AVRO-672
URL: https://issues.apache.org/jira/browse/AVRO-672
Project: Avro
Issue Type: New Feature
Reporter: Ron Bodkin
The attached patch adds a tool that reads a JSON-formatted text file and
converts it to a conforming Avro file, one record per line of input. For
example, it can read this input file:
{"intval":12}
{"intval":-73,"strval":"hello, there!!"}
with this schema:
{ "type":"record", "name":"TestRecord", "fields": [
{"name":"intval","type":"int"}, {"name":"strval","type":["string", "null"]}]}
and return valid Avro. This differs from the DataFileWriteTool, which expects
input in the following internal encoding:
{"intval":12,"strval":null}
{"intval":-73,"strval":{"string":"hello, there!!"}}
In general, Avro's internal encodings are not a natural fit for JSON text
found in the wild. The utility can also replace characters that are invalid in
Avro identifiers with underscores, again to tolerate JSON that was not
designed to be read by Avro.
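The identifier cleanup could look roughly like the following Python sketch.
The function name sanitize_name is hypothetical and this is not the patch's
implementation. It relies on the Avro naming rule that a name starts with
[A-Za-z_] and continues with [A-Za-z0-9_]. How the patch treats a leading
digit or an empty name is not stated here; prefixing an underscore is an
assumption of this sketch.

```python
import re


def sanitize_name(name):
    """Replace characters that are not valid in an Avro name with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Assumption: an empty name or a leading digit gets an underscore prefix,
    # since Avro names may not start with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned
```

For instance, sanitize_name("user-id") gives "user_id", while a name that is
already valid is returned unchanged.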