[jira] [Created] (SPARK-30570) Update scalafmt to 1.0.3 with onlyChangedFiles feature

2020-01-19 Thread Cody Koeninger (Jira)
Cody Koeninger created SPARK-30570. Summary: Update scalafmt to 1.0.3 with onlyChangedFiles feature Key: SPARK-30570 URL: https://issues.apache.org/jira/browse/SPARK-30570 Project: Spark

[jira] [Created] (SPARK-26177) Automated formatting for Scala code

2018-11-26 Thread Cody Koeninger (JIRA)
Cody Koeninger created SPARK-26177. Summary: Automated formatting for Scala code Key: SPARK-26177 URL: https://issues.apache.org/jira/browse/SPARK-26177 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-26121) [Structured Streaming] Allow users to define prefix of Kafka's consumer group (group.id)

2018-11-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-26121. Resolution: Fixed Assignee: Anastasios Zouzias Fix Version/s: 3.0.0

[jira] [Commented] (SPARK-25983) spark-sql-kafka-0-10 no longer works with Kafka 0.10.0

2018-11-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682464#comment-16682464 ] Cody Koeninger commented on SPARK-25983: Looks like we need an equivalent warning in

[jira] [Commented] (SPARK-25983) spark-sql-kafka-0-10 no longer works with Kafka 0.10.0

2018-11-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682461#comment-16682461 ] Cody Koeninger commented on SPARK-25983: Documentation already explicitly says not to link to

[jira] [Resolved] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-30 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-25233. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 3

[jira] [Assigned] (SPARK-25233) Give the user the option of specifying a fixed minimum message per partition per batch when using kafka direct API with backpressure

2018-08-30 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-25233. Assignee: Reza Safi > Give the user the option of specifying a fixed minimum message

[jira] [Commented] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-05 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569491#comment-16569491 ] Cody Koeninger commented on SPARK-24987: It was merged to branch-2.3

[jira] [Commented] (SPARK-25026) Binary releases don't contain Kafka integration modules

2018-08-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569283#comment-16569283 ] Cody Koeninger commented on SPARK-25026: I don't think distributions have ever included the

[jira] [Resolved] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-24987. Resolution: Fixed Fix Version/s: 2.4.0 > Kafka Cached Consumer Leaking File

[jira] [Assigned] (SPARK-24987) Kafka Cached Consumer Leaking File Descriptors

2018-08-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-24987. Assignee: Yuval Itzchakov > Kafka Cached Consumer Leaking File Descriptors >

[jira] [Resolved] (SPARK-24713) AppMatser of spark streaming kafka OOM if there are hundreds of topics consumed

2018-07-13 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-24713. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21690

[jira] [Assigned] (SPARK-24713) AppMatser of spark streaming kafka OOM if there are hundreds of topics consumed

2018-07-13 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-24713. Assignee: Yuanbo Liu > AppMatser of spark streaming kafka OOM if there are hundreds

[jira] [Commented] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2018-07-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540138#comment-16540138 ] Cody Koeninger commented on SPARK-19680: A new consumer group is the easiest thing to do, but if

[jira] [Commented] (SPARK-24720) kafka transaction creates Non-consecutive Offsets (due to transaction offset) making streaming fail when failOnDataLoss=true

2018-07-06 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534797#comment-16534797 ] Cody Koeninger commented on SPARK-24720: What's your plan to tell the difference between gaps

[jira] [Resolved] (SPARK-24743) Update the JavaDirectKafkaWordCount example to support the new API of Kafka

2018-07-05 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-24743. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21717

[jira] [Commented] (SPARK-18258) Sinks need access to offset representation

2018-06-27 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525137#comment-16525137 ] Cody Koeninger commented on SPARK-18258: [~Yohan123] This ticket is about giving implementors of

[jira] [Commented] (SPARK-24507) Description in "Level of Parallelism in Data Receiving" section of Spark Streaming Programming Guide in is not relevan for the recent Kafka direct apprach

2018-06-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518287#comment-16518287 ] Cody Koeninger commented on SPARK-24507: You're welcome to submit a doc PR that clarifies that

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-05-31 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496633#comment-16496633 ] Cody Koeninger commented on SPARK-18057: I'd just modify KafkaTestUtils to match the way things

[jira] [Commented] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-05-14 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474549#comment-16474549 ] Cody Koeninger commented on SPARK-24067: [~zsxwing] even in situations where users weren't

[jira] [Resolved] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-05-11 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-24067. Resolution: Fixed Fix Version/s: 2.3.1 Issue resolved by pull request 21300

[jira] [Commented] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-25 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452603#comment-16452603 ] Cody Koeninger commented on SPARK-24067: The original PR

[jira] [Commented] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-25 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452337#comment-16452337 ] Cody Koeninger commented on SPARK-24067: Given the response on the dev list about criteria for

[jira] [Resolved] (SPARK-21168) KafkaRDD should always set kafka clientId.

2018-04-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-21168. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 19887

[jira] [Assigned] (SPARK-21168) KafkaRDD should always set kafka clientId.

2018-04-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-21168. Assignee: liuzhaokun > KafkaRDD should always set kafka clientId. >

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-04-18 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442838#comment-16442838 ] Cody Koeninger commented on SPARK-18057: [~cricket007] here's a branch with spark 2.1.1 / kafka

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-04-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441818#comment-16441818 ] Cody Koeninger commented on SPARK-18057: Ok, if you can figure out what version of spark it is I

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-04-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441793#comment-16441793 ] Cody Koeninger commented on SPARK-18057: Just adding the extra dependency on 0.11 probably won't

[jira] [Resolved] (SPARK-22968) java.lang.IllegalStateException: No current assignment for partition kssh-2

2018-04-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-22968. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21038

[jira] [Assigned] (SPARK-22968) java.lang.IllegalStateException: No current assignment for partition kssh-2

2018-04-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-22968. Assignee: Saisai Shao > java.lang.IllegalStateException: No current assignment for

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-04-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441606#comment-16441606 ] Cody Koeninger commented on SPARK-18057: Out of curiosity, was that a compacted topic? >

[jira] [Commented] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2018-04-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16433039#comment-16433039 ] Cody Koeninger commented on SPARK-19680: [~nerdynick] If you submit a PR to add documentation

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414326#comment-16414326 ] Cody Koeninger commented on SPARK-23739: Ok, the OutOfMemoryError is probably a separate and

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411503#comment-16411503 ] Cody Koeninger commented on SPARK-23739: I meant the version of the org.apache.kafka

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411396#comment-16411396 ] Cody Koeninger commented on SPARK-23739: What version of the org.apache.kafka artifact is in the

[jira] [Resolved] (SPARK-18580) Use spark.streaming.backpressure.initialRate in DirectKafkaInputDStream

2018-03-21 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-18580. Resolution: Fixed Target Version/s: 2.4.0 > Use

[jira] [Assigned] (SPARK-18580) Use spark.streaming.backpressure.initialRate in DirectKafkaInputDStream

2018-03-21 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-18580. Assignee: Oleksandr Konopko > Use spark.streaming.backpressure.initialRate in

[jira] [Resolved] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2018-03-16 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-18371. Resolution: Fixed Fix Version/s: 2.4.0 > Spark Streaming backpressure bug -

[jira] [Assigned] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2018-03-16 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-18371. Assignee: Sebastian Arzt > Spark Streaming backpressure bug - generates a batch with

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-03-05 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387245#comment-16387245 ] Cody Koeninger commented on SPARK-18057: It's probably easiest to keep the KIP discussion on the

[jira] [Commented] (SPARK-19767) API Doc pages for Streaming with Kafka 0.10 not current

2018-03-05 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387217#comment-16387217 ] Cody Koeninger commented on SPARK-19767: [~nafshartous] I think at this point people are more

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370924#comment-16370924 ] Cody Koeninger commented on SPARK-18057: Just doing the upgrade is probably a good starting point

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2018-02-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370460#comment-16370460 ] Cody Koeninger commented on SPARK-18057: My guess is that DStream based integrations aren't

[jira] [Resolved] (SPARK-22561) Dynamically update topics list for spark kafka consumer

2017-11-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-22561. Resolution: Not A Problem > Dynamically update topics list for spark kafka consumer >

[jira] [Commented] (SPARK-22561) Dynamically update topics list for spark kafka consumer

2017-11-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264643#comment-16264643 ] Cody Koeninger commented on SPARK-22561: See SubscribePattern

[jira] [Commented] (SPARK-22486) Support synchronous offset commits for Kafka

2017-11-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248891#comment-16248891 ] Cody Koeninger commented on SPARK-22486: Can you identify a clear use case for this, given that

[jira] [Commented] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2017-11-07 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242438#comment-16242438 ] Cody Koeninger commented on SPARK-19680: If you got an OffsetOutOfRangeException after a job had

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221025#comment-16221025 ] Cody Koeninger commented on SPARK-20928: No, it doesn't exist yet as far as I know. Reason I ask

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220694#comment-16220694 ] Cody Koeninger commented on SPARK-20928: Can you clarify how this impacts sinks having access to

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202923#comment-16202923 ] Cody Koeninger commented on SPARK-20928: If a given sink is handling a result, why does handling

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202905#comment-16202905 ] Cody Koeninger commented on SPARK-20928: I was talking about the specific case of jobs with only

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2017-09-07 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157002#comment-16157002 ] Cody Koeninger commented on SPARK-17147: Patch is there, if anyone wants to test it and provide

[jira] [Commented] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2017-07-06 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076646#comment-16076646 ] Cody Koeninger commented on SPARK-19680: Direct Stream can take a mapping from topicpartition to

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-29 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16068980#comment-16068980 ] Cody Koeninger commented on SPARK-18057: Kafka 0.11 is now released. Are we upgrading spark

[jira] [Commented] (SPARK-21233) Support pluggable offset storage

2017-06-28 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066517#comment-16066517 ] Cody Koeninger commented on SPARK-21233: You already have the choice of where you want to store

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-19 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054690#comment-16054690 ] Cody Koeninger commented on SPARK-20928: Cool, can you label it SPIP so it shows up linked from

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052831#comment-16052831 ] Cody Koeninger commented on SPARK-20928: This needs an improvement proposal. Based on

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-03 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16035964#comment-16035964 ] Cody Koeninger commented on SPARK-20928: For jobs that only have narrow stages, I think it should

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15977709#comment-15977709 ] Cody Koeninger commented on SPARK-18057: People have also been reporting that explicit dependency

[jira] [Commented] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-04-18 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973554#comment-15973554 ] Cody Koeninger commented on SPARK-20036: [~danielnuriyev] what actually happened when you removed

[jira] [Commented] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-04-18 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973507#comment-15973507 ] Cody Koeninger commented on SPARK-20036: I'll submit a PR to add a note to the docs about this.

[jira] [Updated] (SPARK-20287) Kafka Consumer should be able to subscribe to more than one topic partition

2017-04-13 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-20287. What you're describing is closer to the receiver-based implementation, which had a number of

[jira] [Commented] (SPARK-19976) DirectStream API throws OffsetOutOfRange Exception

2017-04-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966208#comment-15966208 ] Cody Koeninger commented on SPARK-19976: What would your expected behavior be when you delete

[jira] [Commented] (SPARK-20037) impossible to set kafka offsets using kafka 0.10 and spark 2.0.0

2017-04-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966199#comment-15966199 ] Cody Koeninger commented on SPARK-20037: I'd be inclined to say this is a duplicate of the issue

[jira] [Commented] (SPARK-20036) impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0

2017-04-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966193#comment-15966193 ] Cody Koeninger commented on SPARK-20036: fixKafkaParams is related to executor consumers, not the

[jira] [Commented] (SPARK-20287) Kafka Consumer should be able to subscribe to more than one topic partition

2017-04-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966180#comment-15966180 ] Cody Koeninger commented on SPARK-20287: The issue here is that the underlying new Kafka consumer

[jira] [Commented] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-27 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943827#comment-15943827 ] Cody Koeninger commented on SPARK-19904: It has been added to apache/spark-website git repo

[jira] [Commented] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15906593#comment-15906593 ] Cody Koeninger commented on SPARK-19904: Up now at

[jira] [Assigned] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger reassigned SPARK-19904. Assignee: Cody Koeninger > SPIP Add Spark Project Improvement Proposal doc to website

[jira] [Updated] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-19904. Issue Type: Improvement (was: Bug) > SPIP Add Spark Project Improvement Proposal doc to

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905774#comment-15905774 ] Cody Koeninger commented on SPARK-18057: Based on previous kafka client upgrades I wouldn't

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-03-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905747#comment-15905747 ] Cody Koeninger commented on SPARK-18057: I think the bigger question is once there's a kafka

[jira] [Commented] (SPARK-19888) Seeing offsets not resetting even when reset policy is configured explicitly

2017-03-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905730#comment-15905730 ] Cody Koeninger commented on SPARK-19888: That stacktrace also shows a concurrent modification

[jira] [Commented] (SPARK-19863) Whether or not use CachedKafkaConsumer need to be configured, when you use DirectKafkaInputDStream to connect the kafka in a Spark Streaming application

2017-03-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905603#comment-15905603 ] Cody Koeninger commented on SPARK-19863: Isn't this basically a duplicate of SPARK-19185 with the

[jira] [Updated] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-10 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-19904. Description: see

[jira] [Created] (SPARK-19904) SPIP Add Spark Project Improvement Proposal doc to website

2017-03-10 Thread Cody Koeninger (JIRA)
Cody Koeninger created SPARK-19904. Summary: SPIP Add Spark Project Improvement Proposal doc to website Key: SPARK-19904 URL: https://issues.apache.org/jira/browse/SPARK-19904 Project: Spark

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2017-02-27 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887091#comment-15887091 ] Cody Koeninger commented on SPARK-17147: Dean, if you guys have any bandwidth to help test out

[jira] [Updated] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-02-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-18057. I think you should ask Michael and / or Ryan what their plan is. > Update structured

[jira] [Comment Edited] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2017-02-22 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878623#comment-15878623 ] Cody Koeninger edited comment on SPARK-19680 at 2/22/17 4:25 PM: The

[jira] [Commented] (SPARK-19680) Offsets out of range with no configured reset policy for partitions

2017-02-22 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878623#comment-15878623 ] Cody Koeninger commented on SPARK-19680: The issue here is likely that you have lost data

[jira] [Resolved] (SPARK-19361) kafka.maxRatePerPartition for compacted topic cause exception

2017-01-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger resolved SPARK-19361. Resolution: Duplicate > kafka.maxRatePerPartition for compacted topic cause exception >

[jira] [Commented] (SPARK-19361) kafka.maxRatePerPartition for compacted topic cause exception

2017-01-26 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839683#comment-15839683 ] Cody Koeninger commented on SPARK-19361: Compacted topics in general don't work with direct

[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0

2017-01-25 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837844#comment-15837844 ] Cody Koeninger commented on SPARK-18057: If you can get committer agreement on the outstanding

[jira] [Commented] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing

2017-01-17 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826527#comment-15826527 ] Cody Koeninger commented on SPARK-19185: I'd expect setting cache capacity to zero to cause

[jira] [Commented] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing

2017-01-15 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823164#comment-15823164 ] Cody Koeninger commented on SPARK-19185: This is a good error report, sorry it's taken me a while

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2016-12-12 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15742285#comment-15742285 ] Cody Koeninger commented on SPARK-17147: If compacted topics are important to you, then you

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2016-12-07 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729055#comment-15729055 ] Cody Koeninger commented on SPARK-17147: This ticket is about createDirectStream. The question

[jira] [Commented] (SPARK-18682) Batch Source for Kafka

2016-12-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15720146#comment-15720146 ] Cody Koeninger commented on SPARK-18682: Isn't this a duplicate of

[jira] [Issue Comment Deleted] (SPARK-18682) Batch Source for Kafka

2016-12-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Koeninger updated SPARK-18682: --- Comment: was deleted (was: Isn't this a duplicate of

[jira] [Commented] (SPARK-18682) Batch Source for Kafka

2016-12-04 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15720145#comment-15720145 ] Cody Koeninger commented on SPARK-18682: Isn't this a duplicate of

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-12-01 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712457#comment-15712457 ] Cody Koeninger commented on SPARK-18506: Yes, amazon linux. No, not spark-ec2, just a spark

[jira] [Commented] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-29 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706814#comment-15706814 ] Cody Koeninger commented on SPARK-18475: Glad you agree it shouldn't be enabled by default. If

[jira] [Commented] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-29 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706758#comment-15706758 ] Cody Koeninger commented on SPARK-18475: Burak hasn't empirically shown that it is of benefit for

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-28 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704372#comment-15704372 ] Cody Koeninger commented on SPARK-18506: 1 x spark master is m3 medium 2 x spark workers are m3

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691386#comment-15691386 ] Cody Koeninger commented on SPARK-18506: I'm really confused by that - did you try a completely

[jira] [Commented] (SPARK-18525) Kafka DirectInputStream cannot be aware of new partition

2016-11-23 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690915#comment-15690915 ] Cody Koeninger commented on SPARK-18525: Easiest thing to do is just restart your streaming job

[jira] [Commented] (SPARK-18525) Kafka DirectInputStream cannot be aware of new partition

2016-11-22 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15686746#comment-15686746 ] Cody Koeninger commented on SPARK-18525: 0.8 works only against defined partitions. Use 0.10 and

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-21 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15685190#comment-15685190 ] Cody Koeninger commented on SPARK-18506: I'd try to isolate aws vs gce as a possible cause before

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15682216#comment-15682216 ] Cody Koeninger commented on SPARK-18506: I tried your example code on an AWS 2-node spark

[jira] [Commented] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-20 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15682171#comment-15682171 ] Cody Koeninger commented on SPARK-18475: An iterator certainly does have an ordering guarantee,
