[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374872#comment-16374872
 ] 

Sönke Liebau commented on SPARK-18057:
--

Alright.
I've got a 12h flight to SF ahead of me on Sunday, will put this on the "things 
to keep me from dying of boredom" list and have a look at it.

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>Priority: Major
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-22 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373611#comment-16373611
 ] 

Michael Armbrust commented on SPARK-18057:
--

My only concern is that it is stable and backwards compatible.  I'm fine with 
skipping / waiting.

Regarding the timeline, we can put this in master whenever, but we typically 
don't change dependencies in point releases so this will need to be targeted at 
Spark 2.4.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372627#comment-16372627
 ] 

Sönke Liebau commented on SPARK-18057:
--

Fine with me :)

Any thoughts on which version we want to upgrade to?  I guess we could skip 
0.11 entirely, as was done with 0.9, and make the jump to 1.0 - though I 
really do think that a lot of people are waiting for the delegation token 
feature in 1.1, which is just around the corner, so we might as well wait a 
little while and go straight to that version.
 
Thoughts?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-21 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372138#comment-16372138
 ] 

Michael Armbrust commented on SPARK-18057:
--

We generally tend towards "don't break things that are working for people" 
rather than "clean".  See the RDD API for an example :).

I'm increasingly pro just keeping the name and upgrading the client.  If they 
ever break compatibility again we can have yet another artifact name, but I 
hope it doesn't come to that.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371174#comment-16371174
 ] 

Sönke Liebau commented on SPARK-18057:
--

I agree that we should not rename the existing kafka_10 package, as that would 
probably cause people loads of pain.

It is however tempting to have "clean" naming - would it be an option to add a 
package simply called kafka and update the kafka version in that? We could keep 
the kafka_10 package around for now but deprecate it at some point in time.

I am a bit on the fence about this, as in principle the current Kafka story is: 
"Any client with a version of 0.10.2.0 or later will support brokers of version 
0.10.x or later", so simply upgrading the kafka version in the existing package 
should not break anything, as 0.9.x is not currently supported anyway. 
However, there is a caveat:
{quote}
If the burden of backwards compatibility becomes too large, at some point we 
may need to break it.
{quote}
So there is the possibility of the kafka_10 package becoming relevant again if 
later Kafka versions stop supporting 0.10.x brokers.
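
The compatibility rule quoted above can be written down as a small check. This 
is only an illustrative sketch: the function name and the tuple encoding of 
versions are assumptions made for the example, and the result for clients older 
than 0.10.2.0 is deliberately left as "unknown", because the quoted rule says 
nothing about them.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn a dotted version string such as '0.10.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Thresholds taken from the rule quoted in the comment above.
MIN_CLIENT = parse_version("0.10.2.0")
MIN_BROKER = parse_version("0.10")


def client_supports_broker(client: str, broker: str) -> Optional[bool]:
    """Apply the quoted rule: clients >= 0.10.2.0 work with brokers >= 0.10.x.

    Returns None when the rule does not apply (client older than 0.10.2.0).
    """
    if parse_version(client) < MIN_CLIENT:
        return None  # the quoted rule makes no statement about older clients
    return parse_version(broker) >= MIN_BROKER


print(client_supports_broker("1.0.0", "0.10.0.1"))
print(client_supports_broker("1.0.0", "0.9.0.1"))
```

Under this rule a 1.0.0 client is expected to work against a 0.10.0.1 broker 
but not against a 0.9.x one, which matches the point that 0.9.x is not 
supported by the existing package anyway.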




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370924#comment-16370924
 ] 

Cody Koeninger commented on SPARK-18057:


Just doing the upgrade is probably a good starting point for any
potential new contributor who's interested in Kafka.

Agreed that time indexes would be a great thing to take advantage of
after that, happy to help out with either.






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370780#comment-16370780
 ] 

Michael Armbrust commented on SPARK-18057:
--

+1 to upgrading, and it would also be great to add support for any new features 
(e.g. starting a query based on the time index rather than a specific offset).

I personally don't think that fixing KAFKA-4879 is mandatory, but keeping our 
stress tests running without hanging or losing coverage is.

Regarding naming, I'd probably just stop changing the name and say that 
"kafka-0-10-sql" works with any broker that is 0.10.0+.  We could also get rid 
of it, but that seems like an unnecessary change to me that just causes 
pain for existing users.
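
To make the "start from the time index rather than a specific offset" idea 
concrete, here is a minimal, self-contained sketch of the lookup semantics. It 
does not call Kafka. It models one partition as a list of record timestamps and 
returns the earliest offset whose timestamp is at or after the requested time, 
which is, as far as I know, what the consumer's `offsetsForTimes` call returns. 
The function name, the `base_offset` parameter, and the assumption that 
timestamps are non-decreasing within the partition are all made up for the 
example.

```python
from bisect import bisect_left
from typing import List, Optional


def starting_offset_for_time(
    timestamps_ms: List[int], base_offset: int, target_ms: int
) -> Optional[int]:
    """Return the earliest offset whose timestamp is >= target_ms.

    timestamps_ms[i] is the timestamp of the record at offset base_offset + i.
    The list is assumed to be non-decreasing (an assumption of this sketch).
    Returns None if every record is older than target_ms.
    """
    index = bisect_left(timestamps_ms, target_ms)
    if index == len(timestamps_ms):
        return None  # no record at or after the requested time
    return base_offset + index


# One partition holding offsets 40..43 with these record timestamps.
partition = [1000, 1000, 2000, 3500]
print(starting_offset_for_time(partition, base_offset=40, target_ms=1500))
```

A query configured with a start time would then begin reading at the returned 
offset for each partition, instead of requiring the user to know the offsets 
in advance.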




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370642#comment-16370642
 ] 

Sönke Liebau commented on SPARK-18057:
--

I think in addition to the naming convention issue there was also the question 
of whether or not to wait for 
[KAFKA-4879|https://issues.apache.org/jira/browse/KAFKA-4879], which as I 
understand it causes a stress test in Spark to hang indefinitely. However, I do 
not think that ticket will get fixed anytime soon. It's currently assigned 
to the 2.0 release, which I believe is scheduled to land around October, but 
that will change of course if no one works on it.
Since repeatedly deleting and recreating topics is not really a common use case, 
I'd be in favor of moving forward with updating the version regardless.

On the naming convention, as [~guozhang] mentioned, with 
[KAFKA-4462|https://issues.apache.org/jira/browse/KAFKA-4462] now merged the 
strict dependence on binary protocol versions that Kafka used to impose has 
been lifted to a large degree, so I think an argument could be made that we 
don't need the Kafka version in the artifact name any more.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370460#comment-16370460
 ] 

Cody Koeninger commented on SPARK-18057:


My guess is that DStream-based integrations aren't really on committers' minds.

Happy to help, given clear direction on what artifact naming is likely
to be accepted.






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370434#comment-16370434
 ] 

Sönke Liebau commented on SPARK-18057:
--

The upcoming release of Kafka 1.1 will support delegation tokens, which would 
be a very nice addition to the Spark Streaming Kafka connector - is someone 
actively looking at this currently?





[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-01-30 Thread John Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345963#comment-16345963
 ] 

John Cheng commented on SPARK-18057:


Apache Kafka is now at version 1.0. For people who want to use Spark streaming 
against Kafka brokers on 1.0.0, it is preferable to use the 
`org.apache.kafka:kafka-clients:jar:1.0.0` client.

 

"Most of the discussion on the performance impact of [upgrading to the 0.10.0 
message 
format|https://kafka.apache.org/0110/documentation.html#upgrade_10_performance_impact]
 remains pertinent to the 0.11.0 upgrade. This mainly affects clusters that are 
not secured with TLS since "zero-copy" transfer is already not possible in that 
case. In order to avoid the cost of down-conversion, you should ensure that 
consumer applications are upgraded to the latest 0.11.0 client."




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-30 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070525#comment-16070525
 ] 

Michael Armbrust commented on SPARK-18057:
--

We should upgrade.  Now that Kafka has a good protocol versioning story, I also 
wonder if we should get rid of the version in our artifacts entirely.  When we 
upgrade it would also be good if we can add the new headers to the row that we 
output.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-29 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069300#comment-16069300
 ] 

Helena Edelson commented on SPARK-18057:


IMHO, name it kafka-0-11 to be explicit, and wait until Kafka 0.11.1.0, which per 
https://issues.apache.org/jira/browse/KAFKA-4879 resolves the last blocker to 
upgrading?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-29 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068980#comment-16068980
 ] 

Cody Koeninger commented on SPARK-18057:


Kafka 0.11 is now released.  

Are we upgrading spark artifacts named kafka-0-10 to use kafka 0.11, or are we 
renaming them to kafka-0-11?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-08 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16001474#comment-16001474
 ] 

Shixiong Zhu commented on SPARK-18057:
--

[~helena_e] I didn't mean for Spark. Even in Spark, the required code changes 
are in tests. I meant, as a Spark user, why can't you add the Kafka client as 
a dependency and update the Kafka client? Is it because you have some test code 
similar to Spark's, or are you using the Kafka API directly in your code?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-07 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16000130#comment-16000130
 ] 

Helena Edelson commented on SPARK-18057:


Did that a while ago. My only point is that, ideally, we would not have to 
modify artifacts by adding and excluding dependencies in builds.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-04 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997133#comment-15997133
 ] 

Shixiong Zhu commented on SPARK-18057:
--

[~helena_e] I'm curious why you cannot just update the Kafka client version in 
your pom.xml (or build.sbt, not sure which tool you are using)?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-03 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996166#comment-15996166
 ] 

Helena Edelson commented on SPARK-18057:


With the current 0.10.0.1 version we have several issues happening, forcing us 
into ever tighter situations. Many of these are constraints related to new 
functionality in later Kafka releases around Kafka security and SASL_SSL, and 
related behavior not in previous versions of Kafka. 

Users in our ecosystem cannot delete topics on clusters, so this is not a 
relevant use case for us. It seems only structured streaming kafka does 
deleteTopic, vs spark-streaming-kafka. 

I've had to create an internal fork so that we can use Kafka 0.10.2.0 in Spark, 
which is bad but we are blocked otherwise.

[~ijuma] good to know on the timing. A group of us voted for 
https://issues.apache.org/jira/browse/KAFKA-4879. 




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983853#comment-15983853
 ] 

Ismael Juma commented on SPARK-18057:
-

It's worth noting that no-one is working on that ticket at the moment, so a fix 
may take some time. And even if it lands soon, it's likely to be in 0.11.0.0 
first (0.10.2.1 is being voted and will be out very soon).




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983832#comment-15983832
 ] 

Helena Edelson commented on SPARK-18057:


It is the timeout. I think waiting is better, will be watching that ticket in 
Kafka.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983820#comment-15983820
 ] 

Michael Armbrust commented on SPARK-18057:
--

I guess I'd like to understand more about what problems people are running into 
with the current version.  Are there more pressing issues than hanging when 
topics are deleted?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983809#comment-15983809
 ] 

Shixiong Zhu commented on SPARK-18057:
--

I prefer to just wait. The user can still use Kafka 0.10.2.0 with the current 
Spark Kafka source in their application. The APIs are compatible. Commenting 
tests out means we cannot prevent future changes from breaking them.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983796#comment-15983796
 ] 

Helena Edelson commented on SPARK-18057:


I have a branch off branch-2.2 with the 0.10.2.0 upgrade and changes done. All 
the delete-topic-related tests fail (mainly just in streaming kafka sql).

I can PR with those few tests commented out but that doesn't sound right. Or 
wait to PR?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983396#comment-15983396
 ] 

Shixiong Zhu commented on SPARK-18057:
--

[~guozhang] We have a stress test to test the Spark Kafka connector for various 
cases, and it will try to frequently delete / re-create topics.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982683#comment-15982683
 ] 

Ismael Juma commented on SPARK-18057:
-

Thanks for the clarification [~zsxwing], that's helpful.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982377#comment-15982377
 ] 

Guozhang Wang commented on SPARK-18057:
---

Just adding the related KIP for the recently added client compatibility that 
[~ijuma] mentioned:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-97%3A+Improved+Kafka+Client+RPC+Compatibility+Policy

As for KAFKA-4879: I'm curious to learn how commonly Spark could be hitting this 
issue. For example, would Spark Streaming code try to frequently delete / 
re-create topics?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982180#comment-15982180
 ] 

Shixiong Zhu commented on SPARK-18057:
--

[~ijuma] it's not a regression. In Kafka 0.10.0.1, deleting topics will cause 
KAFKA-4536. In Spark, we just retry as a workaround. In 0.10.2.0, this issue 
has been fixed and then it just exposed KAFKA-4879.
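
The "just retry" workaround mentioned above might look roughly like the 
following. This is a hypothetical sketch, not Spark's actual code: 
`with_retries`, `max_attempts`, and `delay_seconds` are names invented for the 
example. Note that a retry loop only helps when the failing call raises or 
returns. It cannot recover from a call that blocks forever, which is the 
behavior attributed to KAFKA-4879 in this thread.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T], max_attempts: int = 3, delay_seconds: float = 0.0
) -> T:
    """Call action() until it succeeds or max_attempts is exhausted.

    Re-raises the last error if every attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # pause before the next attempt
    assert last_error is not None
    raise last_error


# Hypothetical flaky operation: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_fetch() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


print(with_retries(flaky_fetch, max_attempts=3))
```

Guarding against an indefinite hang would need a different mechanism, such as 
a timeout around the blocking call, rather than a retry.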




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982153#comment-15982153
 ] 

Ismael Juma commented on SPARK-18057:
-

[~helena], about KAFKA-4879, are you suggesting that it's a regression in 
0.10.2? The behaviour described (blocking to ensure offsets for all partitions 
are retrieved) has been there since the new Java consumer was introduced in 
0.9.0.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-24 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982077#comment-15982077
 ] 

Ismael Juma commented on SPARK-18057:
-

Hi. A few clarifications below.

"Based on previous kafka client upgrades I wouldn't expect them to be binary 
compatible, so it's likely to cause someone problems if they were also making 
use of kafka client libraries in their spark job. Still may be the path of 
least resistance."

We do strive for binary compatibility for APIs that are not marked as Unstable. 
The Java consumer was introduced in 0.9.0.0 and the APIs were marked as 
Unstable. There were incompatible changes between 0.9.0.0 and 0.10.0.0 
(KIP-45), which is what the above comment is probably referring to, but no 
other incompatible changes after that.

"For what it's worth, or not, I ran into a wire protocol incompatibility 
between 0.10.0.1 and 0.10.1.1 today. I suspect we'd find the same vs 0.10.2.0. 
It surprised me."

As Michael has said, 0.10.2 clients are the first version that supports older 
brokers (0.10.0 and higher). Before 0.10.2, clients supported newer brokers 
(e.g. 0.8.2 clients support 0.10.2), but not older brokers. I hope that helps.
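
The rule described above can be sketched as a small predicate. This is only 
an illustration of the policy as stated in this thread, not code from Kafka or 
Spark: the class and method names are hypothetical, and comparing versions by 
their first three components (the release line) is a simplifying assumption.

```java
/** Hypothetical model of the client/broker compatibility rule from this thread. */
final class KafkaCompatSketch {

    /** Returns the first three numeric components, e.g. "0.10.2.1" -> {0, 10, 2}. */
    static int[] releaseLine(String version) {
        String[] parts = version.split("\\.");
        int[] line = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            line[i] = Integer.parseInt(parts[i]);
        }
        return line;
    }

    /** Lexicographic comparison of two release lines. */
    static int compareLines(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    /**
     * A broker on the same or a newer release line is treated as supported.
     * An older broker is supported only for clients on 0.10.2 or later,
     * and only if the broker is on 0.10.0 or later.
     */
    static boolean clientCanTalkTo(String client, String broker) {
        int[] c = releaseLine(client);
        int[] b = releaseLine(broker);
        if (compareLines(b, c) >= 0) {
            return true;
        }
        return compareLines(c, new int[] {0, 10, 2}) >= 0
                && compareLines(b, new int[] {0, 10, 0}) >= 0;
    }
}
```

Under this model a 0.10.1.1 client against a 0.10.0.1 broker is rejected, 
which matches the incompatibility quoted above, while a 0.10.2.0 client 
against the same broker is accepted.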




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-22 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979788#comment-15979788
 ] 

Helena Edelson commented on SPARK-18057:


Confirming that https://issues.apache.org/jira/browse/KAFKA-4879 - 
KafkaConsumer.position may hang forever when deleting a topic - is the only 
blocker. I upgraded in my fork with some minor code changes and the 
delete-related tests in spark-sql-kafka-0-10 hang. I can submit this as a PR as 
soon as that is resolved.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979678#comment-15979678
 ] 

Helena Edelson commented on SPARK-18057:


There’s an RC for 0.10.2.1 that’s been open for a while.

- It introduces a backward-compatible protocol (new clients can talk to old 
brokers and vice versa).
- There are many fixes in this RC:
https://issues.apache.org/jira/browse/KAFKA-4198?jql=fixVersion%20%3D%200.10.2.1%20AND%20project%20%3D%20KAFKA




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979382#comment-15979382
 ] 

Michael Armbrust commented on SPARK-18057:
--

Yes, 0.10.2.0 is the first release that promises compatibility from a newer 
client to an older broker.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979225#comment-15979225
 ] 

Sean Owen commented on SPARK-18057:
---

This wasn't true for me: 0.10.1 client failed to connect to 0.10.0 broker. 




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979156#comment-15979156
 ] 

Michael Armbrust commented on SPARK-18057:
--

[~srowen], thanks for reporting, but based on the [release 
notes|http://kafka.apache.org/0102/documentation.html#upgrade_1020_notable] I 
think we are okay as long as we go all the way to 0.10.2.0.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979147#comment-15979147
 ] 

Sean Owen commented on SPARK-18057:
---

For what it's worth, or not, I ran into a wire protocol incompatibility between 
0.10.0.1 and 0.10.1.1 today. I suspect we'd find the same vs 0.10.2.0. It 
surprised me.
It doesn't necessarily block updating, but these maintenance updates to Kafka 
are apparently mutually incompatible, so upgrading is going to force people to 
upgrade their clusters as well to keep working.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979126#comment-15979126
 ] 

Michael Armbrust commented on SPARK-18057:
--

If there are multiple reports of 0.10.2.0 being more stable than 0.10.1.0, then 
we should probably upgrade.  It will need to happen before the first RC of 2.2 
(or only go into master).  Can someone open a PR?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977896#comment-15977896
 ] 

Helena Edelson commented on SPARK-18057:


I think this fix in 0.10.2.0 was a big part of it: 
https://issues.apache.org/jira/browse/KAFKA-4547. I saw that behavior.

Possible concerns:
- https://issues.apache.org/jira/browse/SPARK-18779 - I've seen this
- https://issues.apache.org/jira/browse/KAFKA-4879 - I haven't seen this; noted 
by Michael and [~zsxwing]




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977709#comment-15977709
 ] 

Cody Koeninger commented on SPARK-18057:


People have also been reporting that explicit dependency on 0.10.2.0 was 
working for them where 0.10.1.0 wasn't (e.g. SPARK-20036)




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977690#comment-15977690
 ] 

Helena Edelson commented on SPARK-18057:


Hi [~marmbrus], 0.10.2.0 is out. When I modify the kafka version, tests pass.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-13 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15922715#comment-15922715
 ] 

Michael Armbrust commented on SPARK-18057:
--

So to summarize, it'll be unfortunate if Kafka breaks binary compatibility and 
people are relying on the libraries that are bundled with Spark.  I don't think 
that should stop us from upgrading though, especially since we do provide 
binary compatible APIs for reading/writing from kafka and API protocol 
compatibility is no longer an issue.  If enough users complain, we should 
consider shading our dependency on Kafka.

Given that, as soon as there is a release that fixes [KAFKA-4879], I think we 
should upgrade (assuming no other regressions).  We should probably do the same 
for DStreams, unless there are objections there based on the Kafka library 
binary compatibility problems (I'm not sure if it's more likely for applications 
there to interact directly with the Kafka library).




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905886#comment-15905886
 ] 

Shixiong Zhu commented on SPARK-18057:
--

>  Based on previous kafka client upgrades I wouldn't expect them to be binary 
> compatible, so it's likely to cause someone problems if they were also making 
> use of kafka client libraries in their spark job. Still may be the path of 
> least resistance.

I can confirm the APIs used by the Kafka sink are source compatible, since I 
didn't have to change any core source code (the test code had to change because 
the server APIs changed). Since these are Java APIs, I'm pretty sure they are 
binary compatible as well. So even if we upgrade the Kafka client version, users 
can still downgrade it if they want and just repackage their code with the 
older Kafka client. It's a bit annoying that "--packages" probably won't work, 
but it's acceptable.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905774#comment-15905774
 ] 

Cody Koeninger commented on SPARK-18057:


Based on previous kafka client upgrades I wouldn't expect them to be binary 
compatible, so it's likely to cause someone problems if they were also making 
use of kafka client libraries in their spark job.  Still may be the path of 
least resistance.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Michael Armbrust (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905755#comment-15905755
 ] 

Michael Armbrust commented on SPARK-18057:
--

It seems like we can upgrade the existing Kafka10 artifacts without causing any 
compatibility issues (since 0.10.2.0 is compatible with 0.10.0.0+), so I don't 
think there is any need to make new artifacts or do any refactoring.  I think 
we can just upgrade?




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Cody Koeninger (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905747#comment-15905747
 ] 

Cody Koeninger commented on SPARK-18057:


I think the bigger question is once there's a kafka version you want to upgrade 
to, are you going to just forcibly upgrade, make another set of separate 
artifacts, or refactor common code so that it can use a different / provided 
kafka version.  Ditto for the DStream, unless you're just abandoning it.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Shixiong Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905716#comment-15905716
 ] 

Shixiong Zhu commented on SPARK-18057:
--

I did some investigation yesterday, and found one issue in 0.10.2.0:
https://issues.apache.org/jira/browse/KAFKA-4879 : KafkaConsumer.position may 
hang forever when deleting a topic

Our current tests will just hang forever due to KAFKA-4879. This prevents us 
from upgrading to 0.10.2.0.
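
Since a hang like this stalls a whole test run, one generic mitigation is to 
run the blocking call on a helper thread and give up after a deadline. The 
sketch below is hypothetical and is not what Spark's tests do. It also cannot 
unblock a call that ignores interrupts; it only lets the caller move on.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical guard around a call that may block forever. */
final class TimeoutGuardSketch {

    /**
     * Returns the call's result, or Optional.empty() if it does not finish
     * within timeoutMillis. A null result is also reported as empty.
     */
    static <T> Optional<T> callWithTimeout(Callable<T> call, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "timeout-guard");
            thread.setDaemon(true); // a stuck call must not keep the JVM alive
            return thread;
        });
        Future<T> future = executor.submit(call);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker thread
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The daemon thread is a deliberate choice here: if the guarded call never 
returns, the leaked thread does not prevent the test JVM from exiting.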

I also went through the Kafka tickets between 0.10.0.1 and 0.10.2.0. Let me try 
to summarize the current situation:

The benefits of upgrading Kafka client to 0.10.2.0:
- Forward compatibility
- Reading topics from a timestamp
- The following bug fixes:

Issues for which we already have workarounds:
https://issues.apache.org/jira/browse/KAFKA-4375 : Kafka consumer may swallow 
some interrupts meant for the calling thread
https://issues.apache.org/jira/browse/KAFKA-4387 : KafkaConsumer will enter an 
infinite loop if the polling thread is interrupted, and either commitSync or 
committed is called
https://issues.apache.org/jira/browse/KAFKA-4536 : Kafka clients throw 
NullPointerException on poll when delete the relative topic

Issues related to Kafka record compression
https://issues.apache.org/jira/browse/KAFKA-3937 : Kafka Clients Leak Native 
Memory For Longer Than Needed With Compressed Messages
https://issues.apache.org/jira/browse/KAFKA-4549 : KafkaLZ4OutputStream does 
not write EndMark if flush() is not called before close()

Others:
https://issues.apache.org/jira/browse/KAFKA-2948 : Kafka producer does not cope 
well with topic deletions

KAFKA-4547 prevents us from upgrading to 0.10.1.*.

Finally, IMO, "Reading topics from a timestamp" is pretty useful and is the 
most important reason to upgrade Kafka. However, since the Spark 2.2 code 
freeze is coming, we won't have enough time to deliver this feature to users, 
so it's fine to just wait for KAFKA-4879 to be fixed in the next Kafka release. 
I don't think the next Kafka release will be later than Spark 2.3.
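
For readers unfamiliar with the feature: in the Kafka Java consumer, lookup by 
timestamp is exposed through KafkaConsumer.offsetsForTimes (available from 
0.10.1.0), which is documented as returning the earliest offset whose 
timestamp is greater than or equal to the requested one. The sketch below only 
models that lookup rule over an in-memory array; the class and method names 
are hypothetical and it does not call the Kafka client.

```java
/** Hypothetical in-memory model of a timestamp-to-offset lookup. */
final class TimestampLookupSketch {

    /**
     * timestampsByOffset[i] is the timestamp of the record at offset i.
     * Returns the earliest offset whose timestamp is >= targetTimestamp,
     * or -1 if no such record exists. A linear scan is used because record
     * timestamps are not assumed to be sorted.
     */
    static long offsetForTime(long[] timestampsByOffset, long targetTimestamp) {
        for (int offset = 0; offset < timestampsByOffset.length; offset++) {
            if (timestampsByOffset[offset] >= targetTimestamp) {
                return offset;
            }
        }
        return -1L;
    }
}
```

A source that starts "from a timestamp" would resolve one such offset per 
partition and then begin reading from those offsets.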





[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-02-26 Thread Ofir Manor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884790#comment-15884790
 ] 

Ofir Manor commented on SPARK-18057:


Sure - [~marmbrus] - what do you think? It resolved my Kafka 0.10.1 concern.




[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-02-26 Thread Ofir Manor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884619#comment-15884619
 ] 

Ofir Manor commented on SPARK-18057:


[~c...@koeninger.org] I think my previous issue with upgrading to Kafka 0.10.1 
client has been resolved in Kafka 0.10.2 - since now newer clients DO support 
older brokers:
{quote}
The Java clients (producer and consumer) have acquired the ability to 
communicate with older brokers. Version 0.10.2 clients can talk to version 
0.10.0 or newer brokers. Note that some features are not available or are 
limited when older brokers are used.
{quote}
See https://kafka.apache.org/documentation/#upgrade_1020_notable
So it seems safe to upgrade Spark's Kafka client to 0.10.2, as it shouldn't 
bump up the minimum required version of the Kafka broker.
What do you think?
