On Thu, 16 Jan 2020, 18:59 Zoltan Farkas, <zolyfar...@yahoo.com> wrote:

> answers inline
>
> On Jan 16, 2020, at 5:51 AM, roger peppe <rogpe...@gmail.com> wrote:
>
> On Wed, 15 Jan 2020 at 18:51, Zoltan Farkas <zolyfar...@yahoo.com> wrote:
>
>> What I mean with timestamp-micros is that it is currently restricted to
>> being bound to long.
>> I see no reason why it should not be allowed to be bound to string as
>> well. (The change should be simple to implement.)
>>
>
> Wouldn't that have the implication of changing the binary representation
> too, which is not necessarily desirable (it's bulkier, slower to decode and
> has more potential error cases)?
>
>
> Yes, it would, but this is how logical types work, and I see no good way
> to change this. (This is what I meant by paying the readability cost in
> places where it is irrelevant.)
>

So you think that the JSON representation should always match the
underlying type and ignore the logical type? I can understand the reasoning
behind that, but it doesn't feel very user friendly in some cases (thinking
of decimal and duration in particular).
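
To make the decimal case concrete, here is a rough Python sketch. It assumes
the spec's rules as I read them: a decimal is stored as the two's-complement
big-endian bytes of its unscaled integer, and the JSON encoding writes bytes
as a string whose code points equal the byte values. The function names are
made up for illustration.

```python
import json
from decimal import Decimal


def decimal_to_avro_bytes(value: Decimal, scale: int) -> bytes:
    """Two's-complement big-endian bytes of the unscaled integer."""
    scaled = value.scaleb(scale)
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(scaled)
    # Leave room for the sign bit (not always the minimal length).
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def avro_json_bytes(raw: bytes) -> str:
    """Map each byte to the code point of the same value (0-255)."""
    return raw.decode("latin-1")


price = Decimal("12.34")
raw = decimal_to_avro_bytes(price, scale=2)

# What a JSON encoding that follows the underlying type would emit:
print(json.dumps(avro_json_bytes(raw)))
# What a logical-type-aware encoding could emit instead:
print(json.dumps(str(price)))
```

The first form is faithful to the underlying bytes but opaque to a human
reader; the second is the kind of output I have in mind when I say "user
friendly".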

Given their privileged place in the specification, I was thinking that some
logical types could gain privilege here.

Aside: I'm a bit concerned about the potential for data corruption from
interchange between timestamp-micros and timestamp-millis, which, as far as
I understand the spec, look like they'll be treated as compatible with each
other.
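
A small Python sketch of the failure I'm worried about. It assumes that a
reader resolves the two schemas purely on the underlying long and then
applies its own logical type; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_millis(raw: int) -> datetime:
    """Interpret a long as timestamp-millis."""
    return EPOCH + timedelta(milliseconds=raw)


def decode_micros(raw: int) -> datetime:
    """Interpret a long as timestamp-micros."""
    return EPOCH + timedelta(microseconds=raw)


# The writer's schema says timestamp-millis.
written = datetime(2020, 1, 16, 18, 59, tzinfo=timezone.utc)
raw = (written - EPOCH) // timedelta(milliseconds=1)

# A reader whose schema says timestamp-micros gets the same long back
# with no error, but the value is off by a factor of 1000.
misread = decode_micros(raw)
print(written.isoformat())
print(misread.isoformat())
```

The misread value lands in January 1970, and nothing in the decoding path
complains. In the other direction (micros written, millis read) the result
is so far in the future that Python's datetime cannot even represent it.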


>
>
>> regarding the media type, something like: application/avro.2+json would
>> be fine.
>>
>
> Attaching the ".2" to "avro" rather than "json" seems to be implying a new
> Avro version, rather than a new JSON-encoding version? Or is the idea that
> the version number here is implying both the JSON-encoding version *and* the
> underlying Avro version?  The MIME standard seems to be silent on this
> AFAICS.
>
>
> The reason why I would use +json at the end is that it would be a
> subtype suffix: https://en.wikipedia.org/wiki/Media_type#Suffix and most
> browsers will recognize it as JSON, and potentially format it...
>

Ah, nice, I wasn't aware of RFC 6838.
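
For what it's worth, here is a minimal Python sketch of how a generic tool
might pick out the structured syntax suffix. It assumes the suffix is
whatever follows the last "+" in the subtype, and split_media_type is a name
I made up, not a library function.

```python
def split_media_type(media_type: str):
    """Split 'type/subtype+suffix;params' into (type, subtype, suffix).

    Parameters after ';' are ignored; suffix is None when there is no '+'.
    """
    essence = media_type.split(";", 1)[0].strip().lower()
    top, _, sub = essence.partition("/")
    base, plus, suffix = sub.rpartition("+")
    if not plus:
        return top, sub, None
    return top, base, suffix


for candidate in ("application/avro.2+json",
                  "application/avro+json.v2",
                  "application/avro"):
    print(candidate, "->", split_media_type(candidate))
```

With the version attached to "avro" the suffix stays a plain "json", which
generic tooling can recognise; with my earlier "+json.v2" suggestion the
suffix becomes "json.v2", which it presumably would not.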

>
>
>> Other than that, the proposal looks good. Can you start a PR with the spec
>> update?
>>
>
> I can do, but I don't hold out much hope of it getting merged. I started a
> PR with a much more minor change <https://github.com/apache/avro/pull/738>
> almost 2 months ago and haven't seen any response yet.
>
>
> Send out an email on the dev mailing list; the committers seem more
> responsive lately...
>

I'll give it a go :)

  cheers,
    rog.

>
>
>   cheers,
>     rog.
>
>>
>> —Z
>>
>> On Jan 15, 2020, at 12:30 PM, roger peppe <rogpe...@gmail.com> wrote:
>>
>> On Wed, 15 Jan 2020 at 16:27, Zoltan Farkas <zolyfar...@yahoo.com> wrote:
>>
>>> See comments in-line below:
>>>
>>> On Jan 15, 2020, at 3:42 AM, roger peppe <rogpe...@gmail.com> wrote:
>>>
>>> Oops, I left arrays out! Two other thoughts:
>>>
>>>
>>>    - I wonder if it might be worth hedging bets about logical types. It
>>>    would be nice if (for example) a `timestamp-micros` value could be
>>>    encoded as an RFC3339 string, so perhaps that should be allowed for,
>>>    but maybe that's a step too far.
>>>
>>> I think logical types should stay above the encoding/decoding…
>>> With timestamp-micros we could extend it to make it applicable to string
>>> and implement the converters, and then in json you would have something
>>> readable, but you would then have the same in binary and pay the
>>> readability cost there as well.
>>>
>>
>> I'm not sure what you mean there. I wouldn't expect the Avro binary
>> format to be readable at all.
>>
>>> I implemented special handling for the decimal logical type in my
>>> encoder/decoder, but the best implementation I could do still feels like a
>>> hack...
>>>
>>>
>>>    - I wonder if there should be some indication of version so that you
>>>    know which JSON encoding version you're reading. Perhaps the Avro schema
>>>    could include a version field (maybe as part of a definition) so you know
>>>    which version of the spec to use when encoding/decoding. Then bet-hedging
>>>    wouldn't be quite as important.
>>>
>>> I think Schema needs to stay decoupled from the encoding. The same
>>> schema can be encoded in various ways (I have a csv encoder/decoder for
>>> example, https://demo.spf4j.org/example/records?_Accept=text/csv ).
>>> I think the right abstraction for what you are looking for is the Media
>>> Type (https://en.wikipedia.org/wiki/Media_type).
>>> It would be helpful to “standardize” the media types for the Avro
>>> encodings:
>>>
>>
>> Yes, on reflection, I agree, even though not every possible medium has a
>> media type. For example, what if we're storing JSON data in a file? I guess
>> it would be up to us to store the type along with the data, as the registry
>> message wire format
>> <https://docs.confluent.io/current/schema-registry/serializer-formatter.html#wire-format>
>> does, for example by wrapping the entire value in another JSON object.
>>
>>
>>> Here is what I mean, (with some examples where the same schema is served
>>> with different encodings):
>>>
>>> 1) Binary: “application/avro”
>>> https://demo.spf4j.org/example/records?_Accept=application/avro
>>> 2) Current Json: “application/avro+json”
>>> https://demo.spf4j.org/example/records?_Accept=application/avro-x%2Bjson
>>> <https://demo.spf4j.org/example/records?_Accept=application/avro+json>
>>> 3) New Json: “application/avro-x+json” ?
>>> https://demo.spf4j.org/example/records?_Accept=application/avro-x%2Bjson
>>> <https://demo.spf4j.org/example/records?_Accept=application/avro+json>
>>>
>>
>> ISTM that "x" isn't a hugely descriptive qualifier there. How about
>> "application/avro+json.v2" ? Then it's clear what to do if we want to make
>> another version.
>>
>>
>>
>>> The media type including the Avro schema (like you can see in the
>>> response Content-Type in the headers above) can provide complete type
>>> information to be able to read an Avro object from a byte stream.
>>>
>>>
>>> application/avro-x+json;avsc="{\"type\":\"array\",\"items\":{\"$ref\":\"org.spf4j.demo:jaxrs-spf4j-demo-schema:0.8:b\"}}"
>>>
>>> In HTTP context this fits well with content negotiation, and a client
>>> can ask for a previous version like:
>>>
>>>
>>> https://demo.spf4j.org/example/records/1?_Accept=application/json;avsc=%22{\%22$ref\%22:\%22org.spf4j.demo:jaxrs-spf4j-demo-schema:0.4:b\%22}%22
>>> <https://demo.spf4j.org/example/records/1?_Accept=application/json;avsc=%22%7B%5C%22$ref%5C%22:%5C%22org.spf4j.demo:jaxrs-spf4j-demo-schema:0.4:b%5C%22%7D%22>
>>>
>>>
>>
>>> Note on $ref,  it is an extension to avsc I use to reference schemas
>>> from maven repos. (see
>>> https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroReferences if
>>> interested in more detail)
>>>
>>
>> Interesting stuff. I like the idea of being able to get the server to
>> check the desired client encoding, although I'm somewhat wary of the
>> potential security implications of $ref with arbitrary URLs.
>>
>> Apart from the issues you raised, does my description of the proposed
>> semantics seem reasonable? It could be slightly cleverer and avoid
>> type-name wrapping in more situations, but this seemed like a nice balance
>> between easy-to-explain and idiomatic-in-most-situations.
>>
>>    cheers,
>>      rog.
>>
>>
>>
>
