[ 
https://issues.apache.org/jira/browse/AVRO-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amichai Rothman updated AVRO-438:
---------------------------------

    Status: Patch Available  (was: Open)

The patch fixes most of the issues. A few more thoughts:

- The block mechanism is described twice, for arrays and for maps, in nearly 
identical paragraphs. Perhaps map serialization can simply be described as an 
array in which each item is a key immediately followed by its value?

- I added the binary encoding to the file format section, but not to the RPC 
section, where I am less sure. The discussion of the HTTP transport says to use 
the "avro/binary" content type, which suggests there might also be an 
"avro/json" variant later on. So maybe the serialization format is actually 
transport-dependent and not part of the spec? Maybe there should be another 
section for the binary socket transport implementation?

- Further, AIUI "avro/binary" is not a legal HTTP content type. It should be 
something more like "application/x-avro-binary" (or a type registered with 
IANA). But this digresses into changes to the spec itself, not just its 
wording. Should I open this as a separate bug, if it is one?

- As for the example, yes, it would be mostly binary, but it can be annotated 
to explain what each group of bytes means.
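
To make the map suggestion and the annotated-bytes idea concrete, here is a 
minimal Python sketch. It assumes the block layout from the spec (a long item 
count, then the items, with a zero count as terminator) and writes a map by 
reusing the array writer with "key followed by value" as the item encoding. 
The helper names (encode_long, encode_string, encode_array, encode_map) are 
made up for illustration and are not taken from any Avro implementation.

```python
def encode_long(n):
    """Zigzag-encode a signed 64-bit integer, then write it as a varint."""
    z = (n << 1) ^ (n >> 63)          # zigzag: small magnitudes stay small
    out = bytearray()
    while z & ~0x7F:                  # more than 7 bits left
        out.append((z & 0x7F) | 0x80) # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(text):
    """A string is a long byte length followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_long(len(data)) + data


def encode_array(items, encode_item):
    """Write all items as one block, then a zero count to end the array."""
    out = bytearray()
    if items:
        out += encode_long(len(items))
        for item in items:
            out += encode_item(item)
    out += encode_long(0)
    return bytes(out)


def encode_map(mapping, encode_value):
    """A map is written as an array whose items are key + value."""
    def encode_entry(entry):
        key, value = entry
        return encode_string(key) + encode_value(value)
    return encode_array(list(mapping.items()), encode_entry)


encoded = encode_map({"a": 1, "b": 2}, encode_long)
print(encoded.hex(" "))
# Annotated output:
#   04        block count 2
#   02 61     key "a" (length 1, then 'a')
#   02        value 1
#   02 62     key "b" (length 1, then 'b')
#   04        value 2
#   00        end of map
```

This sketch always writes a single block; a real writer may split the items 
across several blocks.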


> spec organization and clarification improvements
> ------------------------------------------------
>
>                 Key: AVRO-438
>                 URL: https://issues.apache.org/jira/browse/AVRO-438
>             Project: Avro
>          Issue Type: Improvement
>          Components: spec
>    Affects Versions: 1.3.0
>            Reporter: Amichai Rothman
>            Priority: Trivial
>         Attachments: fix_spec_loose_ends.patch
>
>
> There are a few improvements that can be made to make the spec better 
> organized and clarify ambiguous meanings:
> 1. The binary encoding specifies string, then bytes, then longs. However, the 
> first two are dependent on the latter, so in essence long encoding is being 
> used before it was defined. In addition, string comes before bytes even 
> though it is logically a special case of bytes. It would be clearer if these 
> were ordered long, bytes, string so that each definition builds on its 
> predecessors and nothing is used before it is defined. Maybe bytes/string 
> should be at the end of the other primitives, since they are technically more 
> complex structures. Note that it might be a good idea to do this in all 
> places in the spec where primitives are enumerated.
> 2. The sentence about array count and size is a bit confusing. A possible 
> alternative:
> "If a block's count is negative, its absolute value is used, and it is 
> followed immediately by a long block size indicating the number of bytes in 
> the block."
> and maybe this should be immediately followed by the sentence explaining why 
> this is useful which is currently a few lines below.
> 3. There is a note about blocks being at an experimental stage, but it's 
> unclear whether this applies only to map blocks or also to array blocks.
> 4. Object Container Files and Protocol Declarations are described in the spec 
> using JSON objects and their schema is shown, but it doesn't say anywhere how 
> these should be serialized. If it's using binary serialization, it should say 
> so explicitly. If it can be either binary or JSON, then the file has no 
> self-describing way of differentiating the two - this should be addressed 
> somewhere (maybe have a different magic word for binary/JSON content).
> 5. Protocol Definition has a namespace and name (called protocol), but it is 
> not clear whether the namespace rules defined in the first section apply here 
> or not. It should be mentioned explicitly either way.
> 6. It would be extremely helpful to have a full sample of an RPC call over 
> HTTP, possibly using the HelloWorld protocol from the previous example. This 
> would show how the transport, framing, handshake, call format and messages 
> all fit together. Examples in RFCs often help clarify any misunderstandings 
> that might arise from the body of the specs, which makes for a better spec - 
> and this would be great here too.
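
As an illustration of points 1 and 2, the following Python sketch defines the 
encodings in the proposed order (long, then bytes, then string), so that each 
builds on the previous one, and shows why a negative block count followed by 
a byte size is useful: a reader can skip the block without decoding its 
items. All function names here are hypothetical, and the skip routine assumes 
every block was written with a negative count.

```python
import io


def encode_long(n):
    """Zigzag + varint encoding of a signed 64-bit integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(stream):
    """Inverse of encode_long, reading from a binary stream."""
    shift = 0
    z = 0
    while True:
        b = stream.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)        # undo zigzag


def encode_bytes(data):
    """bytes = long length + raw bytes (builds on long)."""
    return encode_long(len(data)) + data


def encode_string(text):
    """string = bytes holding UTF-8 text (builds on bytes)."""
    return encode_bytes(text.encode("utf-8"))


def encode_sized_block(items, encode_item):
    """Write a block as: negative count, byte size of the items, items."""
    body = b"".join(encode_item(item) for item in items)
    return encode_long(-len(items)) + encode_long(len(body)) + body


def skip_array(stream):
    """Skip an array using block sizes only; return the item count."""
    skipped = 0
    while True:
        count = decode_long(stream)
        if count == 0:
            return skipped
        if count > 0:
            raise ValueError("block has no byte size; items must be decoded")
        size = decode_long(stream)
        stream.seek(size, io.SEEK_CUR)  # jump over the items undecoded
        skipped += -count


array = encode_sized_block(["foo", "ba"], encode_string) + encode_long(0)
stream = io.BytesIO(array + b"\xff")    # 0xff stands in for following data
print(skip_array(stream), stream.read())
```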
