There is a way to wire the system to bypass enrichment and profiling, but you would then bypass a lot of key features of the system. It would be unwise to do that.
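
To make the trade-off concrete, here is a minimal toy sketch of the hop counting, not Metron code. It assumes the four stages form a simple linear chain with one Kafka topic between each pair; the topic names (`<stage>-out`) and the helpers `kafka_hops` and `bypass` are hypothetical and exist only for illustration.

```python
# Toy model only: stage names come from this thread; the linear chain,
# the topic names, and both helper functions are illustrative assumptions.
STAGES = ["parsing", "enrichment", "profiling", "indexing"]


def kafka_hops(stages):
    """Return one (producer, topic, consumer) hop per pair of adjacent stages."""
    return [(src, f"{src}-out", dst) for src, dst in zip(stages, stages[1:])]


def bypass(stages, skipped):
    """Drop the skipped stages so their neighbours are wired together directly."""
    return [stage for stage in stages if stage not in skipped]


full = kafka_hops(STAGES)
short = kafka_hops(bypass(STAGES, {"enrichment", "profiling"}))

# Each hop is one write to and one read from Kafka.
print(len(full), "hops with every stage")
print(len(short), "hop(s) when enrichment and profiling are bypassed")
```

In this model the bypass saves two Kafka round trips per message, but every message that takes the short path also skips whatever enrichment and profiling would have contributed, which is the cost described above.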

25.06.2018, 15:13, "Michel Sumbul" <michelsum...@gmail.com>:
> Hi Casey,
>
> That makes complete sense.
> Short question: if there is no enrichment or no profiling, does the message
> still pass through the enrichment/profiling topic?
>
> If yes, do you think it is possible to imagine a way for messages that
> don't need enrichment or profiling to skip the topic and go directly
> to the next one? This is again to avoid in/out in Kafka.
>
> Thanks for the explanation,
> Michel
>
> 2018-06-23 3:58 GMT+01:00 Casey Stella <ceste...@gmail.com>:
>
>> Hey Michel,
>>
>> Those are good questions and there were some reasons surrounding that. In
>> fact, historically, we had fewer topologies (e.g. indexing and enrichment
>> were merged). Even earlier on, we had just one giant topology per parser
>> that enriched and indexed. The long story short is that we moved this way
>> because we saw how people were using Metron and we gained more insight
>> tuning Metron. That led us down this architectural path.
>>
>> Some of the reasons that we went this way:
>>
>> - Fewer large topologies were a nightmare to tune
>> - Enrichment would have different memory requirements than, say,
>>   parsers or indexing
>> - You can adjust the Kafka topic params per topology to adjust the
>>   number of partitions, etc.
>> - Having the separate topologies gives a natural set of extension points
>>   for customization and enhancement (e.g. you want a phase between parsing
>>   and enrichment).
>> - Decoupling the topologies lets us spin up and down parts of Metron
>>   without affecting others (e.g. you don't have to take down enrichments
>>   to add a parser, even for a moment)
>> - The movement to Flux meant we were limited in how much we could adjust
>>   the topology at runtime (e.g. colocating parsers and enrichment would
>>   mean moving away from Flux essentially as the topology changes its
>>   structure)
>>
>> Best,
>>
>> Casey
>>
>> On Fri, Jun 22, 2018 at 5:25 PM Michel Sumbul <michelsum...@gmail.com>
>> wrote:
>>
>> > Hi Everyone,
>> >
>> > I was asking myself what the architectural reason was to split the
>> > ingestion in Metron into 4 different topologies that all read/write to
>> > Kafka.
>> >
>> > For example, why have the parsing and enrichment topologies not been
>> > merged? Would it not be possible, when you parse the message, to enrich
>> > it directly?
>> >
>> > I'm asking because splitting into several topologies means that all of
>> > the topologies read/write to Kafka, which produces a bigger load on the
>> > Kafka cluster and then a need for way more infrastructure/servers. The
>> > cost is especially real when we speak about TBs of data ingested every
>> > day.
>> >
>> > I'm sure there was a very good reason, I was just curious.
>> >
>> > Thanks,
>> > Michel
>> >

-------------------
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org