Yeah, it was set to `latest`, so the consumer probably started later than
the producer. In any case, I have not encountered it again. Thanks.
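
For the archive, here is a minimal sketch of the arithmetic Koji describes
in the quoted message below. It is not NiFi or Kafka client code. It assumes
a single-partition topic and a consumer group with no committed offsets, and
the function name `consumed_count` is made up for illustration.

```python
def consumed_count(total_published, published_before_join, offset_reset="latest"):
    """Messages a new consumer group would see on a single-partition topic.

    Assumption: the group has no committed offset, so the offset reset
    policy alone decides where consumption starts.
    """
    if not 0 <= published_before_join <= total_published:
        raise ValueError("published_before_join must be within 0..total_published")
    if offset_reset == "latest":
        # Start at the end of the log at join time: earlier messages are skipped.
        return total_published - published_before_join
    if offset_reset == "earliest":
        # Start at the beginning of the log: everything is consumed.
        return total_published
    raise ValueError("offset_reset must be 'latest' or 'earliest'")


# Numbers from this thread: 1507 messages in total, 807 already on the
# topic when the consumer connected.
print(consumed_count(1507, 807, "latest"))
print(consumed_count(1507, 807, "earliest"))
```

With `latest` this gives the 700 flowfiles I saw. With `earliest` it would
have been all 1507.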

On Thu, Jan 12, 2017 at 3:00 AM, Koji Kawamura <ijokaruma...@gmail.com>
wrote:

> Hello Raf,
>
> If I understand it correctly, there are two topics, let's say topic A
> and B, and the flow looks like this:
>
> 1. ProduceKafka -> topic A
> 2. External process consumes from topic A, processes the data, and
> pushes it to topic B
> 3. topic B -> ConsumeKafka
>
> I tested a flow like the one above, and the last ConsumeKafka received
> the expected number of flow files, and its status looks fine.
> Details and result are available here:
> https://gist.github.com/ijokarumawak/29a568f760cb24fb94ae3384037ab9e7
>
> A question to rule out a possible cause:
> Had the ConsumeKafka processor at step 3 already started when the
> External process started sending messages to topic B?
>
> If it started after that, then ConsumeKafka will start receiving
> messages from the 'latest' offset at that point by default (this can be
> configured with the 'offset reset' property). For example, if
> ConsumeKafka started after the External process had published 807
> messages, then ConsumeKafka can only consume the 700 messages that are
> sent after ConsumeKafka is connected to the topic.
>
> Otherwise, if it's reproducible, please let us know your environment
> details, such as whether NiFi is clustered or standalone, the number of
> Kafka partitions, etc.
>
> Thanks,
> Koji
>
> On Wed, Jan 11, 2017 at 10:17 PM, Raf Huys <raf.h...@gmail.com> wrote:
> > With the Produce_Kafka_0_10 processor 1507 flowfiles are pushed onto a
> > topic.
> >
> > A separate process mangles a bit with these records and pushes them back
> > on a separate topic. With the Consume_Kafka_0_10 processor I'm ingesting
> > this second topic again.
> >
> > According to Kafka, this NiFi consumer is at offset 1507 (so there's no
> > lag). The total amount of processing time was around 3 minutes (less
> > than 5 for sure).
> >
> > Weird thing is, I cannot correlate these numbers in the NiFi consumer.
> > The status history of the processor returns an aggregated sum of 700
> > flowfiles out (all flowfiles are transferred to a logging processor).
> > Data provenance has 700 messages as well.
> >
> > `Flowfiles in` is 0 (I suppose because there's no ingoing arrow on the
> > consumer) so that's not of any use.
> >
> > `Message Demarcator` is not set, `Max Poll Records` 1000, `Max
> > Uncommitted Time` 1 s have default values, so every incoming message
> > should result in 1 flowFile to my understanding.
> >
> > Not really sure what to think of this. Are there any other places I can
> > look for the total amount of flowfiles (in/out) during a certain amount
> > of time?
> >
> >
> > --
> > Mvg,
> >
> > Raf Huys
>



-- 
Mvg,

Raf Huys