Ottomata added a comment.

> > If we adopt a convention of always storing schema name and/or revision in 
> > the schemas themselves, then we can do like EventLogging does and infer and 
> > validate the schema based on this value. This would especially be helpful 
> > in associating a message with an Avro Schema when serializing into binary.

> The topic configuration will take precedence, so we wouldn't use 
> client-supplied values for these fields, and would basically just write a 
> part of the topic configuration into each event. We also decided that we will 
> only evolve schemas in backwards-compatible ways. In practice, this means 
> that we'll only add fields, and the latest schema will be able to validate 
> both new and old data in each topic.

> @Ottomata, which value do you see in recording the schema configured for a 
> topic at enqueue time in each event?


Mainly for analytics purposes.  For historical data and other analytics 
contexts, the data may be analyzed far downstream of Kafka.  In those 
contexts, the information about which topic the event came from will be 
lost.  If we don't have the topic, we won't be able to know which schema the 
event was validated with.

Also, it will be cumbersome to always need to load the topic/schema config from 
the schema repository for analytics purposes.
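
To make the idea concrete, here is a rough sketch of what I mean. The topic 
name, schema name, the `meta` field and both helper functions are made up for 
illustration; none of this is an existing API. The topic configuration takes 
precedence over anything the client supplied, and a downstream consumer can 
recover the schema from the event alone, without knowing the topic.

```python
import copy

# Hypothetical topic configuration; in practice this would be loaded
# from wherever the topic/schema config lives.
TOPIC_CONFIG = {
    "mediawiki.page_edit": {"schema_name": "PageEdit", "schema_revision": 2},
}


def stamp_event(topic, event, topic_config=TOPIC_CONFIG):
    """Return a copy of `event` with the topic's schema info written into it.

    Client-supplied schema fields are overwritten, so the topic
    configuration always takes precedence.
    """
    cfg = topic_config[topic]
    stamped = copy.deepcopy(event)
    meta = stamped.setdefault("meta", {})  # assumed location for schema info
    meta["schema_name"] = cfg["schema_name"]
    meta["schema_revision"] = cfg["schema_revision"]
    return stamped


def schema_of(event):
    """Recover (name, revision) from the event itself, no topic needed."""
    meta = event["meta"]
    return meta["schema_name"], meta["schema_revision"]


# Enqueue time: the producer service stamps the event.
stamped = stamp_event(
    "mediawiki.page_edit",
    {"title": "Foo", "meta": {"schema_name": "Bogus"}},
)

# Analytics time, far from Kafka: the topic is gone, the schema info is not.
name, revision = schema_of(stamped)
```

The cost is a couple of extra fields per event, which seems small compared to 
having to join against the topic config in every analytics job.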


TASK DETAIL
  https://phabricator.wikimedia.org/T116247

To: mobrovac, Ottomata
Cc: intracer, EBernhardson, Smalyshev, yuvipanda, Hardikj, daniel, aaron, 
GWicke, mobrovac, MZMcBride, bd808, JanZerebecki, Halfak, Krenair, brion, 
chasemp, Eevans, mmodell, Ottomata, Mattflaschen, Matanya, Aklapper, 
JAllemandou, jkroll, Wikidata-bugs, Jdouglas, aude, Deskana, Manybubbles, 
RobLa-WMF, jeremyb


