[ https://issues.apache.org/jira/browse/AVRO-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790536#action_12790536 ]
Philip Zeyliger commented on AVRO-245:
--------------------------------------

bq. i think ant uses glob-like patterns, not regex. instead i just named the file AvroTestUtil.java.

Cool. I intended [A-Z]* to be a glob, not a regex, and I thought it worked locally, but perhaps I was deluding myself.

bq. // FIXME: re-create encoder to avoid extra spaces (Jackson bug?)

I don't think this comment says enough to figure out what the FIXME is implying. What's your TODO/FIXME convention? (Perhaps that comment was intended for yourself, and was never meant to be committed.)

I wandered into the Jackson code when I ran into this: it's reasonably set up to write one JSON object, and we're writing many. So I don't think they'd say it was a bug: I think they'd say we should use a different JsonGenerator for every bit of JSON we write.

{noformat}
while (true) {
  try {
    datum = reader.read(null, decoder);
  } catch (AvroRuntimeException e) {
    // FIXME: at EOF
{noformat}

It bugs me that this works. The example it ought to fail on is the JSON data "1 2 3\n" (note: no newlines between records) against the schema "int". This ought to throw an error. The core issue is that we've got two different things going on: we're both line-oriented and JSON-oriented. We should check that the JSON on every line is well-formed, and the code fails to do so. (My original code was broken too: when I wrote the test, it didn't throw an error for the malformed data; it just read one entry and went on. Also, StringInputStream was from ant, which shouldn't even be on avroj's classpath.)

One way to avoid this mess is to require that the input file be a JSON array, so "[1, 2, 3]" (with arbitrary whitespace). I think this makes it harder to use line-oriented unix tools with this, but it does solve both problems. What do you think?

It also worries me every time JsonDecoder calls "in.nextToken();" without checking that the value it got was expected (typically "null" or possibly END_ARRAY or END_OBJECT).
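
To make the two checks concrete (one record per line, and nothing unexpected after the value), here is a rough plain-Java sketch for the schema "int" case. It is only a stand-in: it uses neither Jackson nor Avro, the names LineOrientedInts and readInts are made up for illustration, and Integer.parseInt is looser than the JSON number grammar (it accepts a leading "+", for instance). The hasMoreTokens() test after the value plays the role of checking what nextToken() returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, not part of Avro or Jackson.
final class LineOrientedInts {

    // Reads one int per line and rejects any line that holds more than one value.
    static List<Integer> readInts(String data) {
        List<Integer> values = new ArrayList<>();
        String[] lines = data.split("\\R"); // split on any line break
        for (int i = 0; i < lines.length; i++) {
            StringTokenizer tokens = new StringTokenizer(lines[i]);
            if (!tokens.hasMoreTokens()) {
                continue; // blank line: nothing to read
            }
            String first = tokens.nextToken();
            int value;
            try {
                value = Integer.parseInt(first);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                    "line " + (i + 1) + ": not an int: " + first, e);
            }
            // After the value, the only acceptable "next token" is end of line.
            if (tokens.hasMoreTokens()) {
                throw new IllegalArgumentException(
                    "line " + (i + 1) + ": unexpected data after value");
            }
            values.add(value);
        }
        return values;
    }
}
```

With this sketch, "1\n2\n3\n" is read as three records, while "1 2 3\n" is rejected instead of silently yielding one entry.
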
It doesn't seem that using the ValidatingDecoder makes it check that, but I could be wrong.

> Commandline utility for converting to and from Avro's binary format.
> --------------------------------------------------------------------
>
>                 Key: AVRO-245
>                 URL: https://issues.apache.org/jira/browse/AVRO-245
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Philip Zeyliger
>            Assignee: Philip Zeyliger
>            Priority: Minor
>         Attachments: AVRO-245.patch, AVRO-245.patch.txt, AVRO-245.patch.txt, AVRO-245.patch.txt, AVRO-245.patch.txt
>
>
> A utility for avrotool that can convert between Avro binary data and the JSON textual form.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.