[ https://issues.apache.org/jira/browse/AVRO-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amichai Rothman updated AVRO-438:
---------------------------------

    Status: Patch Available  (was: Open)

The patch fixes most of the issues. A few more thoughts:

- The block mechanism described for arrays and maps is basically a copy and
  paste of a few paragraphs. Perhaps the map serialization can simply be
  described as an array where each item is a key immediately followed by its
  respective value?
- I added the binary encoding to the file format, but not to the RPC section,
  since I got more confused there. The discussion of the HTTP transport says
  to use the "avro/binary" content type, which suggests there might also be
  an "avro/json" version later on, or something like that. So maybe the
  serialization format is actually transport-dependent and not part of the
  spec? Maybe there should be another section for the binary socket transport
  implementation?
- Further, AIUI "avro/binary" is not a legal HTTP content type. It should be
  something more like "application/x-avro-binary" (or be registered with
  IANA). But this digresses into changes in the spec itself, not just its
  wording. Should I open this as a separate bug? Is it one?
- As for the example, yes, it would be mostly binary, but it can be annotated
  to explain what each group of bytes means.

> spec organization and clarification improvements
> ------------------------------------------------
>
>                 Key: AVRO-438
>                 URL: https://issues.apache.org/jira/browse/AVRO-438
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.3.0
>            Reporter: Amichai Rothman
>            Priority: Trivial
>         Attachments: fix_spec_loose_ends.patch
>
>
> There are a few improvements that can be made to make the spec better
> organized and clarify ambiguous meanings:
> 1. The binary encoding specifies string, then bytes, then longs. However, the
> first two are dependent on the latter, so in essence long encoding is being
> used before it was defined. In addition, string comes before bytes even
> though it is logically a special case of bytes.
> It would be clearer if these were ordered long, bytes, string so that each
> definition builds on its predecessors and nothing is used before it is
> defined. Maybe bytes/string should be at the end of the other primitives,
> since they are technically more complex structures. Note that it might be a
> good idea to do this in all places in the spec where primitives are
> enumerated.
> 2. The sentence about array count and size is a bit confusing. A possible
> alternative:
> "If a block's count is negative, its absolute value is used, and it is
> followed immediately by a long block size indicating the number of bytes in
> the block."
> and maybe this should be immediately followed by the sentence explaining why
> this is useful, which is currently a few lines below.
> 3. There is a note about blocks being in experimental stage, but it's unclear
> if this is only for map blocks or also for array blocks.
> 4. Object Container Files and Protocol Declarations are described in the spec
> using JSON objects and their schema is shown, but it doesn't say anywhere how
> these should be serialized. If it's using binary serialization, it should say
> so explicitly. If it can be either binary or JSON, then the file has no
> self-describing way of differentiating the two - this should be addressed
> somewhere (maybe have a different magic word for binary/JSON content).
> 5. Protocol Definition has a namespace and name (called protocol), but it is
> not clear whether the namespace rules defined in the first section apply here
> or not. It should be mentioned explicitly either way.
> 6. It would be extremely helpful to have a full sample of an RPC call over
> HTTP, possibly using the HelloWorld protocol from the previous example. This
> would show how the transport, framing, handshake, call format and messages
> all fit together.
> Examples in RFCs often help clarify any misunderstandings that might arise
> from the body of the specs, which makes for a better spec - and this would
> be great here too.
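
For illustration only, here is a minimal Python sketch of the ordering argued for in item 1 (long, then bytes, then string, each built on the previous one) and of the block rule quoted in item 2. It is not taken from the attached patch or from any Avro implementation. The function names and the with_size flag are made up for this example, and treating a map as an array of key/value items follows the suggestion in the comment above rather than the spec's own wording.

```python
def encode_long(n: int) -> bytes:
    """Zig-zag, then variable-length encode a signed 64-bit integer."""
    z = (n << 1) ^ (n >> 63)      # zig-zag: small magnitudes become small codes
    out = bytearray()
    while z & ~0x7F:              # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b: bytes) -> bytes:
    """bytes = a long length, followed by that many bytes."""
    return encode_long(len(b)) + b


def encode_string(s: str) -> bytes:
    """string = bytes holding UTF-8 data (the special case noted in item 1)."""
    return encode_bytes(s.encode("utf-8"))


def encode_array(items, encode_item, with_size=False) -> bytes:
    """Encode items as a single block followed by a zero-count terminator.

    with_size is a hypothetical switch: when set, the count is written
    negated and is followed by a long giving the block size in bytes.
    """
    out = bytearray()
    if items:
        body = b"".join(encode_item(i) for i in items)
        if with_size:
            out += encode_long(-len(items)) + encode_long(len(body))
        else:
            out += encode_long(len(items))
        out += body
    out += encode_long(0)         # a block with count zero ends the array
    return bytes(out)


def encode_map(d: dict, encode_value, with_size=False) -> bytes:
    """A map viewed as an array whose items are a string key then its value."""
    return encode_array(
        list(d.items()),
        lambda kv: encode_string(kv[0]) + encode_value(kv[1]),
        with_size,
    )


print(encode_string("foo").hex())                                  # 06666f6f
print(encode_array([3, 27], encode_long).hex())                    # 04063600
print(encode_array([3, 27], encode_long, with_size=True).hex())    # 0304063600
```

With the size present (the third line of output), a reader that does not care about the array can read the count and size and then skip that many bytes without decoding the items, which seems to be the usefulness item 2 refers to.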