Ottomata added a comment.

@Joe, there are two parts to this MVP:

- Centralized (and CI-controlled) schema sharing
- An easy way to get valid data into Kafka

With EventLogging right now, we are spending a lot of resources just processing 
incoming data and making sure it is valid.  When data is produced by N clients, 
it is difficult to guarantee that all of them are producing valid data.  
Invalid data in a stream makes processing and sanity checking difficult, 
especially in a distributed environment like Hadoop.

The goal of this system is to standardize the way we pass messages between 
different infrastructures.  For that we need strict schemas and schema 
validation.  For lowish volume production data (job queue, change propagation, 
etc.), using a proxy that can tell a client that it has sent invalid data is 
valuable.  If a client tries to produce invalid data, they will get a 500 
response of some kind, and can react appropriately.
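
As a rough illustration of the proxy idea, here is a minimal sketch in Python.  
Everything in it is hypothetical: the SCHEMAS table, the field names, and the 
handle_produce() function are made up for this example, and the hand-rolled 
type check only stands in for a real schema validator.  How a rejection maps 
onto an HTTP response is left out on purpose.

```python
# Hypothetical stand-in for the shared, CI-controlled schemas:
# topic name -> required fields and their expected Python types.
SCHEMAS = {
    "job_queue": {"job_id": int, "type": str},
}


def validate(event, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(event, dict):
        return ["event must be an object"]
    errors = []
    for field, expected in schema.items():
        if field not in event:
            errors.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            errors.append(f"field {field} should be {expected.__name__}")
    return errors


def handle_produce(topic, event, send):
    """Validate an event and forward it only if it is valid.

    `send` is any callable taking (topic, event); in a real proxy it would
    hand the message to a Kafka producer.  Returns (accepted, errors).
    """
    schema = SCHEMAS.get(topic)
    if schema is None:
        return False, [f"unknown topic: {topic}"]
    errors = validate(event, schema)
    if errors:
        # Nothing reaches Kafka; the client is told what was wrong.
        return False, errors
    send(topic, event)
    return True, []
```

The point of the sketch is that the validation logic lives in one deployed 
place, and every client gets the same answer about the same message.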

You are right though, we could achieve a similar thing if we built Kafka 
wrappers, in every language we use, that know how to use our centralized 
schema system.  Then each wrapper could validate the message a client is 
trying to send before it actually sends it.  I think this could introduce 
more bugs, as there is more code to maintain and more places where that code 
is deployed.  E.g. we'd have to make sure that all clients update their 
wrapper library whenever we fix a bug.
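
For comparison, a client-side wrapper might look roughly like the Python 
sketch below.  The ValidatingProducer and InvalidEvent names are invented for 
this example, `producer` is assumed to be any object with a send(topic, value) 
method, and `schemas` is a plain dict standing in for the shared schemas.

```python
class InvalidEvent(ValueError):
    """Raised by the wrapper before anything is sent to Kafka."""


class ValidatingProducer:
    """Hypothetical wrapper that checks an event before producing it."""

    def __init__(self, producer, schemas):
        self.producer = producer  # assumed to expose send(topic, value)
        self.schemas = schemas    # topic -> {field name: expected type}

    def send(self, topic, event):
        schema = self.schemas[topic]
        # A field is a problem if it is missing or has the wrong type.
        problems = sorted(
            field
            for field, expected in schema.items()
            if not isinstance(event.get(field), expected)
        )
        if problems:
            raise InvalidEvent(f"invalid fields for {topic}: {problems}")
        return self.producer.send(topic, event)
```

Every language would need its own copy of this logic, and every client would 
have to pick up each new release of it.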


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

To: Ottomata
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, 
Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, 
mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair


