[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-12-01 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-736493983 @HeartSaVioR @zsxwing @viirya @xuanyuanking thanks for making this PR better and taking care! This is an

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-12-01 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-736300563 H, Jenkins is not lightning fast nowadays.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-735947655 retest this please

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-735818247 I've just updated the description to reflect the current state of the PR.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-735648409 I think the trait solution is much better; there is a reason I added a comment saying that the wrapper solution is ugly. Working on making it a trait.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-28 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-735355267 @HeartSaVioR Thanks for the new round of review; it's really a huge change and I appreciate your effort! I agree with your suggestions and will apply them starting from Mond

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-25 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-733652656 @zsxwing @viirya @HeartSaVioR @xuanyuanking I've added the requested change and would like to ask you to have a look, please. Since it's quite a heavy change I'm listin

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-23 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-732009691 retest this please

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-22 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-731955328 I would like to see it too. I've just seen the discussion about the cut date, which I thought would be a bit later, so this will be the first task I jump on.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-12 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-726206068 Sure, will start to implement it.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-07 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-723435991 I think that is the way to go, but it's not so simple since the strategy classes are API classes. Their internal behaviour must be subclassed.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-11-06 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-723271000 Option 1 would definitely give more control, but it would make this part of the code horribly complex. I'm not against it but would like to hear other voices before I build back the

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-28 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-717799115 Not sure about the environment specific things but I can give an example what the difference is. Group based ACL command: ``` bin/kafka-acls.sh --add --allow-princ

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-26 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-716492068 @xuanyuanking Thanks for your efforts making this PR better! Let me react to your suggestions: > Should we add some guides of the alternative method for the original usage

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-20 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-712837431 @HeartSaVioR > Personally I wouldn't mind if it's not compatible with 0.10/0.11 as 1.0 is released in Oct 2017 (already 3 years passed), but probably someone would

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-20 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-712798812 @zsxwing @HeartSaVioR We've just had a deep dive to find out the exact version compatibility. I've created an extract covering all the used API calls: https://gist.git

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-15 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-709429484 @zsxwing an important question, and a hard one to answer. In order to provide an exhaustive result I need to sit together with the Kafka guys and go through all the used APIs. Will tak

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-05 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-703649627 retest this please

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-02 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-702748456 > with and without this patch all three operations are used. It's true when we take a look at it globally (driver + executors). > That said, the behavior of no

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-01 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-702491670 @zsxwing you understand it right, actually it removes permissions needed.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-10-01 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-701967347 retest this please

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-701470701 retest this please

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-701424151 There is still an issue w/ Scala 2.13, fixing... ``` Error: ] /home/runner/work/spark/spark/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/Kafka

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-30 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-701271287 Here is the result of the deep dive with the Kafka guys: https://gist.github.com/gaborgsomogyi/06361fa4d96055a5963d133577aae4ab I'm going to write the extract into the mig

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-28 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-699835649 @HeartSaVioR OK, I'm going to have another round w/ the Kafka guys in order to have exhaustive and exact answers. I would personally add this to the "SS migration note". A

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-25 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-698211449

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-24 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-698439672 @Tagar yeah, authorization w/ vanilla and Confluent Kafka still works, and a good alternative is topic-based restriction since `AdminClient` does not use `group.id`. I'

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-24 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-698211449 @Tagar not sure what you mean when you say `this`. ACLs are available in vanilla Kafka too: https://kafka.apache.org/documentation/#security_authz

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-23 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-697198305 @zsxwing > So the Kafka group based authorization does nothing when fetching the data? In case of the current Spark code the answer is yes. It's noop from author

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-22 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-696621270

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-22 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-696668588 @HeartSaVioR I've just had a discussion w/ the Kafka guys about `assign` + `group.id` and here are the findings. Kafka authorizes the client in case of `assign` only i

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-22 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-696621270 @zsxwing it's not explicitly written, but the short answer is that the user needs to migrate the application. In a little more detail: if one wants to authorize with `A

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-17 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-694154890 I've just filed SPARK-32910 and started to work on it as we agreed. I intend to file a PR once this has merged.

[GitHub] [spark] gaborgsomogyi commented on pull request #29729: [SPARK-32032][SS] Avoid infinite wait in driver because of KafkaConsumer.poll(long) API

2020-09-14 Thread GitBox
gaborgsomogyi commented on pull request #29729: URL: https://github.com/apache/spark/pull/29729#issuecomment-691900949 Hmmm, not sure why but my IDE doesn't always recompile things, sigh.