Yeah, it was set to `latest`, so the consumer was probably started later than the producer. In any case, I have not encountered it again. Thanks.
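For the record, the numbers line up with that explanation. Below is a minimal sketch of the arithmetic in plain Python. It does not talk to Kafka, and the function name `messages_seen` is made up for illustration. It assumes a single partition, a consumer group with no committed offsets, and no messages removed by retention; `latest` and `earliest` are the values of Kafka's `auto.offset.reset` setting.

```python
def messages_seen(total_published, published_before_start, offset_reset="latest"):
    """Estimate how many messages a brand-new consumer group receives.

    Assumptions (illustrative only): one partition, no committed offsets
    for the group, and nothing deleted by retention.
    """
    if offset_reset == "latest":
        # The consumer starts at the end of the log as it was when it joined,
        # so everything published before that point is skipped.
        start_offset = published_before_start
    elif offset_reset == "earliest":
        # The consumer starts from the beginning of the log.
        start_offset = 0
    else:
        raise ValueError("offset_reset must be 'latest' or 'earliest'")
    return total_published - start_offset


# 1507 messages in total; the consumer joined after 807 were already published.
print(messages_seen(1507, 807, "latest"))
print(messages_seen(1507, 807, "earliest"))
```

With `latest` this gives the 700 flowfiles I saw; with `earliest` the same consumer would have picked up all 1507.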
On Thu, Jan 12, 2017 at 3:00 AM, Koji Kawamura <ijokaruma...@gmail.com> wrote:

> Hello Raf,
>
> If I understand it correctly, there are two topics, let's say topic A
> and B, and the flow seems like:
>
> 1. ProduceKafka -> topic A
> 2. External process consumes from topic A, processes the data and
>    pushes it to topic B
> 3. topic B -> ConsumeKafka
>
> I tested a flow that looks like the above, and the last ConsumeKafka
> received the expected number of flow files, and the status looks fine.
> Details and result are available here:
> https://gist.github.com/ijokarumawak/29a568f760cb24fb94ae3384037ab9e7
>
> A question to rule out a possible cause:
> Had the ConsumeKafka processor at step 3 already started when the
> external process started sending messages to topic B?
>
> If it started after that, then ConsumeKafka will start receiving
> messages from the 'latest' offset at that point by default (can be
> configured by the 'offset reset' property). For example, if ConsumeKafka
> started after the external process published 807 messages, then
> ConsumeKafka can only consume the 700 messages that are sent after
> ConsumeKafka is connected to the topic.
>
> Otherwise, if it's reproducible, please let us know your
> environment details, such as NiFi clustered/standalone, number of Kafka
> partitions ... etc.
>
> Thanks,
> Koji
>
> On Wed, Jan 11, 2017 at 10:17 PM, Raf Huys <raf.h...@gmail.com> wrote:
> > With the Produce_Kafka_0_10 processor 1507 flowfiles are pushed onto a
> > topic.
> >
> > A separate process mangles these records a bit and pushes them back
> > onto a separate topic. With the Consume_Kafka_0_10 processor I'm
> > ingesting this second topic again.
> >
> > According to Kafka, this Nifi consumer is at offset 1507 (so there's no
> > lag). The total amount of processing time was around 3 minutes (less
> > than 5 for sure).
> >
> > Weird thing is, I cannot correlate these numbers in the Nifi consumer.
> > The status history of the processor returns an aggregated sum of 700
> > flowfiles out (all flowfiles are transferred to a logging processor).
> > Data provenance has 700 messages as well.
> >
> > `Flowfiles in` is 0 (I suppose because there's no ingoing arrow on the
> > consumer) so that's not of any use.
> >
> > `Message Demarcator` is not set, `Max Poll Records` 1000, `Max
> > Uncommitted Time` 1 s have default values, so every incoming message
> > should result in 1 flowFile to my understanding.
> >
> > Not really sure what to think of this. Are there any other places I can
> > look for the total amount of flowfiles (in/out) during a certain amount
> > of time?
> >
> > --
> > Mvg,
> >
> > Raf Huys

--
Mvg,
Raf Huys