Completely agreed on the acking. I raised the question because, while I
believe dropping and acking is the correct behavior, I can see a few
alternative patterns for handling this:

   1. Require filtering to be handled by the message filter infrastructure
   and publish an error to the error queue if field transformations such as
   REGEX_SELECT violate this by dropping messages.
   2. Default records to be written to enrichments, or handle per my
   comments in #1
   3. Default records to be written to the topic defined by outputTopic
   (non-default version of #2)

At any rate, we should fix the acking problem; once that is done, the
dropped-messages pattern makes sense to me. I've created a Jira to track it:
https://issues.apache.org/jira/browse/METRON-1948.
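
To make the expected behavior concrete, here is a minimal sketch in Python
(not Metron code) of drop-and-ack routing. The names regex_select and route
are hypothetical, the message values are made up to stand in for the parsed
"message" field, and I'm assuming the pattern is applied with search
semantics; the real REGEX_SELECT implementation may differ.

```python
import re


def regex_select(message, config):
    """Return the first config key whose pattern is found in message, else None.

    Assumption: patterns use search semantics (re.search); this is an
    illustration, not the actual REGEX_SELECT implementation.
    """
    for value, pattern in config.items():
        if re.search(pattern, message):
            return value
    return None


def route(messages, config):
    """Simulate the proposed behavior: every message is acked, but only
    messages that resolve to an output topic are emitted."""
    emitted = {}   # output topic -> messages written to it
    acked = []     # messages acknowledged (no replay)
    dropped = []   # messages with no output topic
    for message in messages:
        topic = regex_select(message, config)
        if topic is None:
            dropped.append(message)
        else:
            emitted.setdefault(topic, []).append(message)
        # Ack in both branches so dropped messages are not replayed.
        acked.append(message)
    return emitted, acked, dropped


# Hypothetical parsed "message" values for the two input records.
emitted, acked, dropped = route(["Hello there", "Bye now"], {"world": "^Hello "})
```

With those two hypothetical inputs, the "Hello" message lands under the
"world" topic, the "Bye" message is dropped, and both are acked.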

On Wed, Dec 19, 2018 at 12:43 PM Casey Stella <ceste...@gmail.com> wrote:

> We absolutely should be acking the dropped messages; otherwise they'll be
> in a replay loop.  Not acking is a flat-out bug IMO.
>
> On Wed, Dec 19, 2018 at 2:37 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
> > When a message is filtered by the message filtering mechanism, we
> > explicitly drop the message (and presumably ack it in Storm), as explained
> > here:
> > https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#filtered
> >
> > When using the REGEX_SELECT field transformation (see
> > https://github.com/apache/metron/tree/master/metron-platform/metron-parsing#fieldtransformation-configuration
> > ) with the kafka.topicField option for parser-chaining, it's unclear to me
> > whether we expect the same behavior (drop message, ack it). The
> > interpretation I get from this example in the parser-chaining doc
> > https://github.com/apache/metron/tree/master/use-cases/parser_chaining#the-pix_syslog_router-parser
> > suggests to me that the approach we take for messages with message
> > filtering is the correct one; however, in testing an example with dropped
> > messages, we appear not to ack those dropped messages.
> >
> > Before I go creating a fix I thought it best to summarize and confirm my
> > expectations on this functionality. Messages from a REGEX_SELECT that don't
> > match a pattern, and therefore don't get a value assigned to their output
> > topic field, should be dropped and acked.
> >
> > *Example:*
> > {
> >   "parserClassName": "org.apache.metron.parsers.GrokParser",
> >   "sensorTopic": "myInTopic",
> >   ...
> >   "parserConfig": {
> >     ...,
> >     "kafka.topicField": "output_topic"
> >   },
> >   "fieldTransformations": [
> >     {
> >       "input": [
> >         "message"
> >       ],
> >       "output": [
> >         "output_topic"
> >       ],
> >       "transformation": "REGEX_SELECT",
> >       "config": {
> >         "world": "^Hello "
> >       }
> >     },
> >     ...
> > }
> >
> > *Input Records:*
> > "...sshd[32469]: Hello..."
> > "...sshd[30432]: Bye..."
> >
> > *Output:*
> > Kafka topic = "world" (as determined by the REGEX_SELECT pattern match that
> > sets the "output_topic" property used by kafka.topicField)
> > 1 record present
> > contents of that record = our record with "Hello" in it
> > 1 record is dropped ("Bye" record) and will not be forwarded any further
> > through the pipeline.
> >
>
