New consumer - offset one gets in poll is not the offset one is supposed to commit

2015-07-24 Thread Stevo Slavić
Hello Apache Kafka community,

Say there is only one topic with a single partition and a single message on
it.
Calling poll with the new consumer will return a ConsumerRecord for that
message, and it will have an offset of 0.

After processing the message, the current KafkaConsumer implementation expects
one to commit not offset 0, the offset just processed, but offset 1 - the
next offset/position one would like to consume.

Does this sound strange to you as well?

I'm wondering whether this offset+1 handling for the next position to read
couldn't be done in one place - in the KafkaConsumer implementation, the
broker, or wherever - instead of every user of KafkaConsumer having to do it.
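
To make that concrete, here is a minimal sketch against the new consumer API
(consumer is assumed to be an already-subscribed KafkaConsumer<byte[], byte[]>;
the commit(Map, CommitType) signature is assumed from current trunk and may
still change; process() stands in for application logic):

    import java.util.Collections;
    import org.apache.kafka.clients.consumer.CommitType;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    // One topic, one partition, one message: the record comes back at offset 0.
    ConsumerRecords<byte[], byte[]> records = consumer.poll(1000);
    for (ConsumerRecord<byte[], byte[]> record : records) {
        process(record); // record.offset() == 0 here
        // What must be committed is 1, the next position to read,
        // not 0, the offset just processed:
        consumer.commit(
            Collections.singletonMap(
                new TopicPartition(record.topic(), record.partition()),
                record.offset() + 1),
            CommitType.SYNC);
    }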

Kind regards,
Stevo Slavic.


Re: New consumer - offset one gets in poll is not the offset one is supposed to commit

2015-07-24 Thread Stevo Slavić
Sorry, wrong ML.

On Fri, Jul 24, 2015 at 7:07 PM, Cody Koeninger c...@koeninger.org wrote:

 Are you intending to be mailing the spark list or the kafka list?

 On Fri, Jul 24, 2015 at 11:56 AM, Stevo Slavić ssla...@gmail.com wrote:

 Hello Cody,

 I'm not sure we're talking about the same thing.

 Since you're mentioning streams, I guess you were referring to the current
 high-level consumer, while I'm talking about the new, not yet released
 high-level consumer.

 The poll I was referring to is
 https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L715

 The ConsumerRecord offset I was referring to is
 https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecord.java#L22

 poll returns ConsumerRecords, a per-TopicPartition collection of
 ConsumerRecord. In the example I gave, the ConsumerRecord offset would be 0.

 The commit I was referring to is
 https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L805

 After processing the read ConsumerRecord, commit expects me to submit not
 offset 0 but offset 1...

 Kind regards,
 Stevo Slavic

 On Fri, Jul 24, 2015 at 6:31 PM, Cody Koeninger c...@koeninger.org
 wrote:

 Well... there are only 2 hard problems in computer science: naming
 things, cache invalidation, and off-by-one errors.

 The direct stream implementation isn't asking you to commit anything.
 It's asking you to provide a starting point for the stream on startup.

 Because offset ranges are inclusive start, exclusive end, it's pretty
 natural to use the end of the previous offset range as the beginning of the
 next.


 On Fri, Jul 24, 2015 at 11:13 AM, Stevo Slavić ssla...@gmail.com
 wrote:

 Hello Apache Kafka community,

 Say there is only one topic with a single partition and a single message
 on it.
 Calling poll with the new consumer will return a ConsumerRecord for
 that message, and it will have an offset of 0.

 After processing the message, the current KafkaConsumer implementation
 expects one to commit not offset 0, the offset just processed, but offset
 1 - the next offset/position one would like to consume.

 Does this sound strange to you as well?

 I'm wondering whether this offset+1 handling for the next position to read
 couldn't be done in one place - in the KafkaConsumer implementation, the
 broker, or wherever - instead of every user of KafkaConsumer having to do it.

 Kind regards,
 Stevo Slavic.


Re: New consumer - offset one gets in poll is not the offset one is supposed to commit

2015-07-24 Thread Stevo Slavić
Hello Cody,

I'm not sure we're talking about the same thing.

Since you're mentioning streams, I guess you were referring to the current
high-level consumer, while I'm talking about the new, not yet released
high-level consumer.

The poll I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L715

The ConsumerRecord offset I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerRecord.java#L22

poll returns ConsumerRecords, a per-TopicPartition collection of
ConsumerRecord. In the example I gave, the ConsumerRecord offset would be 0.

The commit I was referring to is
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L805

After processing the read ConsumerRecord, commit expects me to submit not
offset 0 but offset 1...
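
In other words, every caller ends up writing something like this (a sketch;
commit(Map, CommitType) is taken from the trunk source linked above and may
still change; process() is hypothetical application logic):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.CommitType;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.TopicPartition;

    // For each partition, commit the last processed offset + 1 - the next
    // position to read - rather than the offset of the record just processed.
    Map<TopicPartition, Long> toCommit = new HashMap<>();
    for (ConsumerRecord<byte[], byte[]> record : records) {
        process(record);
        toCommit.put(
            new TopicPartition(record.topic(), record.partition()),
            record.offset() + 1);
    }
    consumer.commit(toCommit, CommitType.SYNC);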

Kind regards,
Stevo Slavic

On Fri, Jul 24, 2015 at 6:31 PM, Cody Koeninger c...@koeninger.org wrote:

 Well... there are only 2 hard problems in computer science: naming things,
 cache invalidation, and off-by-one errors.

 The direct stream implementation isn't asking you to commit anything.
 It's asking you to provide a starting point for the stream on startup.

 Because offset ranges are inclusive start, exclusive end, it's pretty
 natural to use the end of the previous offset range as the beginning of the
 next.
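
To make the inclusive-start/exclusive-end point concrete, a sketch with a
hypothetical OffsetRange type (for illustration only, not any particular API):

    // [fromOffset, untilOffset): the start is inclusive, the end exclusive.
    final class OffsetRange {
        final long fromOffset;
        final long untilOffset;
        OffsetRange(long fromOffset, long untilOffset) {
            this.fromOffset = fromOffset;
            this.untilOffset = untilOffset;
        }
        // The exclusive end of one range is exactly the inclusive start of
        // the next - the same value one would record after processing it.
        OffsetRange next(long count) {
            return new OffsetRange(untilOffset, untilOffset + count);
        }
    }

    OffsetRange first = new OffsetRange(0, 1); // the single message at offset 0
    OffsetRange second = first.next(5);        // [1, 6): starts where first ended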


 On Fri, Jul 24, 2015 at 11:13 AM, Stevo Slavić ssla...@gmail.com wrote:

 Hello Apache Kafka community,

 Say there is only one topic with a single partition and a single message on
 it.
 Calling poll with the new consumer will return a ConsumerRecord for
 that message, and it will have an offset of 0.

 After processing the message, the current KafkaConsumer implementation
 expects one to commit not offset 0, the offset just processed, but offset
 1 - the next offset/position one would like to consume.

 Does this sound strange to you as well?

 I'm wondering whether this offset+1 handling for the next position to read
 couldn't be done in one place - in the KafkaConsumer implementation, the
 broker, or wherever - instead of every user of KafkaConsumer having to do it.

 Kind regards,
 Stevo Slavic.


Spark-core and guava

2015-03-26 Thread Stevo Slavić
Hello Apache Spark community,

spark-core 1.3.0 has guava 14.0.1 as a provided dependency (see
http://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.3.0/spark-core_2.10-1.3.0.pom
)

What is supposed to provide guava, and that specific version?

Kind regards,
Stevo Slavic.


Re: Spark-core and guava

2015-03-26 Thread Stevo Slavić
Thanks for the heads up, Sean!
On Mar 26, 2015 1:30 PM, Sean Owen so...@cloudera.com wrote:

 This is a long and complicated story. In short, Spark shades Guava 14
 except for a few classes that were accidentally used in a public API
 (Optional and a few more classes it depends on). So "provided" is more of
 a Maven workaround to achieve a desired effect. It's not "provided" in
 the usual sense.
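
For example, this is the kind of public API leak meant - in Spark 1.x the
Java API exposes Guava's Optional in public signatures such as outer joins
(a sketch; left and right are assumed to be JavaPairRDD<String, Integer>s
defined elsewhere):

    import com.google.common.base.Optional;  // Guava's Optional, left unshaded
    import org.apache.spark.api.java.JavaPairRDD;
    import scala.Tuple2;

    // The result type of leftOuterJoin references Guava's Optional directly,
    // so renaming/shading that class would break this public signature:
    JavaPairRDD<String, Tuple2<Integer, Optional<Integer>>> joined =
        left.leftOuterJoin(right);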

 On Thu, Mar 26, 2015 at 12:24 PM, Stevo Slavić ssla...@gmail.com wrote:
  Hello Apache Spark community,
 
  spark-core 1.3.0 has guava 14.0.1 as a provided dependency (see
 
 http://repo1.maven.org/maven2/org/apache/spark/spark-core_2.10/1.3.0/spark-core_2.10-1.3.0.pom
  )
 
  What is supposed to provide guava, and that specific version?
 
  Kind regards,
  Stevo Slavic.