Re: kafka direct streaming python API fromOffsets

2016-05-03 Thread Saisai Shao
I guess the problem is that py4j automatically translate the python int into java int or long according to the value of the data. If this value is small it will translate to java int, otherwise it will translate into java long. But in java code, the parameter must be long type, so that's the

Re: kafka direct streaming python API fromOffsets

2016-05-03 Thread Tigran Avanesov
Thank you, But now I have this error: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long My offsets are actually not big enough to be long. If I put bigger values, I have no such exception. For me looks like a bug. Any ideas for a workaround? Thank! On

Re: kafka direct streaming python API fromOffsets

2016-05-02 Thread Cody Koeninger
If you're confused about the type of an argument, you're probably better off looking at documentation that includes static types: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.kafka.KafkaUtils$ createDirectStream's fromOffsets parameter takes a map from

kafka direct streaming python API fromOffsets

2016-05-02 Thread Tigran Avanesov
Hi, I'm trying to start consuming messages from a kafka topic (via direct stream) from a given offset. The documentation of createDirectStream says: :param fromOffsets: Per-topic/partition Kafka offsets defining the (inclusive) starting point of the stream. However it expects a dictionary