[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110082#comment-15110082 ]
Mark Grover commented on SPARK-12177:
-------------------------------------

Hi Mario, I may have misunderstood some parts of your previous comment; if so, I apologize in advance.

bq. i think that is not required when client uses a v0.9 jar though consuming only the older high level/low level API and talking to a v0.8 kafka cluster.

Based on what I understand, that's not the case. If one uses the Kafka v9 jar, even through the old consumer API, it can only work with a Kafka v9 broker. So, if we have to support both Kafka v08 and Kafka v09 brokers with Spark (which I believe we do), we have to have both the Kafka v08 and Kafka v09 jars in our assembly. As far as I understand, shipping only the Kafka v09 jar will not help.

bq. 1 thought around not introducing the version in the package name or class name (I see that Flink does it in the class name) was to avoid forcing us to create v0.10/v0.11 packages (and customers to change code and recompile), even if those releases of kafka don't have client-api or otherwise such changes that warrant us to make a new version

I totally agree with you on this note. I was actually thinking of renaming all the v09 packages to something different (like 'new'? But maybe there's a better term), because, as you very aptly pointed out, version-based names would become very confusing as we support later Kafka versions.

bq. That's why 1 earlier idea i mentioned in this JIRA was 'The public API signatures (of KafkaUtils in v0.9 subproject) are different and do not clash (with KafkaUtils in original kafka subproject) and hence can be added to the existing (original kafka subproject) KafkaUtils class.' This also addresses the issues u mention above. Cody mentioned that we need to get others on the same page for this idea, so i guess we really need the committers to chime in here.
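The overload idea in that last quote can be sketched in a few lines. This is a minimal, self-contained illustration, not Spark's actual API: the parameter classes and the string return type are hypothetical stand-ins, used only to show that distinct parameter types let an 0.8-style and an 0.9-style entry point coexist in one KafkaUtils object without clashing.

```scala
// Illustrative stand-ins only -- NOT Spark's real parameter or stream classes.
// The point: distinct parameter types make the two overloads resolve unambiguously.
case class V08Params(zkQuorum: String, groupId: String)          // old high-level consumer config
case class V09Params(bootstrapServers: String, groupId: String)  // new 0.9 consumer config

object KafkaUtils {
  // Existing 0.8-style entry point (ZooKeeper-based consumer).
  def createDirectStream(params: V08Params, topics: Set[String]): String =
    s"v08 stream: ${topics.mkString(",")} via zk ${params.zkQuorum}"

  // New 0.9-style entry point; because V09Params is a different type,
  // this overload can live in the same object without breaking existing callers.
  def createDirectStream(params: V09Params, topics: Set[String]): String =
    s"v09 stream: ${topics.mkString(",")} via brokers ${params.bootstrapServers}"
}
```

Callers pick the broker generation purely by the parameter type they pass, so no v09-suffixed public class names are needed.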
bq. Of course i forgot to answer Nikita's followup question - 'do you mean that we would change the original KafkaUtils by adding new functions for new DirectIS/KafkaRDD but using them from separate module with kafka09 classes'? To be clear, these new public methods added to the original kafka subproject's 'KafkaUtils' will make use of the DirectKafkaInputDStream, KafkaRDD, KafkaRDDPartition, and OffsetRange classes that are in a new v09 package (internal, of course). In short, we don't have a new subproject. (I skipped the KafkaCluster class from the list, because i am thinking it makes more sense to call this class something like 'KafkaClient' instead, going forward.)

At the core of it, I am not 100% sure we can hide/abstract away from our users the fact that we have completely changed the consumer API underneath them. I can think more about it, but would appreciate more thoughts/insights along this direction, especially if you feel strongly about this. Thanks again, Mario!

> Update KafkaDStreams to new Kafka 0.9 Consumer API
> --------------------------------------------------
>
>                 Key: SPARK-12177
>                 URL: https://issues.apache.org/jira/browse/SPARK-12177
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.6.0
>            Reporter: Nikita Tarasenko
>              Labels: consumer, kafka
>
> Kafka 0.9 has already been released, and it introduces a new consumer API that is not compatible with the old one. So, I added the new consumer API, in separate classes in package org.apache.spark.streaming.kafka.v09 with the changed API. I didn't remove the old classes, for backward compatibility: users will not need to change their old Spark applications when they upgrade to the new Spark version. Please review my changes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)