[
https://issues.apache.org/jira/browse/AVRO-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216629#comment-13216629
]
graham sanderson commented on AVRO-1006:
----------------------------------------
"A clarification, which addresses issues raised by Doug and Scott. The need I'm
solving for is to capture that part of a writer's schema which a reader needs
to read data. This is a relatively straight-forward notion of "equivalence,"
and a very useful one. And the good news is that this notion of equivalence
allows us to ignore many aspects of schemas (e.g., attributes, aliases, default
values)."
Perhaps this should be made clearer when naming the class/method. I came
across this feature because I wanted to hash/fingerprint Avro schemas for
messaging and was checking whether a util already existed. In my case I
might use custom properties on fields in the schema to indicate that they are
transmitted using a certain named dictionary, so those properties affect the
ability to interpret the message. I'd therefore rather stick with something
that I can reliably use on the producer end to encode the entire state of the
schema, rather than a particular well-defined subset of it.
Note that, thanks to someone having made Props a LinkedHashMap (a change made
after the version of the code base I'm using) and to the particular
implementation of Jackson, schema.toString() in the Java impl looks like it
will be fine for my purposes. If another language implementation happens to
produce a different hash value I'm fine with that, as long as the output is
relatively stable; for example:
SchemaInstance1 -toJson-> string x
string x -fromJson-> SchemaInstance2 -toJson-> string y
string x and string y being equal seems a reasonable enough guarantee for me.
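
As a rough illustration of the idea above, here is a minimal sketch that hashes
the full schema JSON text (custom properties included) and checks the x == y
round-trip property. The class and method names are hypothetical, SHA-256 is an
arbitrary choice of hash, and the round trip is passed in as a function so the
sketch does not depend on any particular Avro version; with Avro on the
classpath it could be something like
s -> new Schema.Parser().parse(s).toString().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Avro: fingerprints the *entire* schema
// text, so custom properties change the hash as well.
final class FullSchemaFingerprint {

    // Returns the SHA-256 digest of the schema JSON as a lowercase hex string.
    static String fingerprint(String schemaJson) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(schemaJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Checks the stability property: string x -> (parse, toJson) -> string y,
    // with x equal to y. The roundTrip function is an assumption supplied by
    // the caller (for example, parse with Avro and call toString()).
    static boolean isStable(String x, UnaryOperator<String> roundTrip) {
        String y = roundTrip.apply(x);
        return x.equals(y);
    }
}
```

The fingerprint is only as stable as the string it is computed from, so the
isStable check is worth running against each schema before relying on the hash
on the producer end.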
> Fingerprints for Avro Schemas
> -----------------------------
>
> Key: AVRO-1006
> URL: https://issues.apache.org/jira/browse/AVRO-1006
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Raymie Stata
> Assignee: Raymie Stata
> Labels: features
> Attachments: AVRO-1006-prelim.patch, AVRO-1006.patch,
> AVRO-1006.patch, schema-fingerprinting.html, schema-fingerprinting.html,
> schema-fingerprinting.html
>
>
> Add function that returns a standardized, 64-bit fingerprint for schemas.
> Fingerprints are designed such that the chance of a collision is very, very
> low.