[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-05 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215310881 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-05 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215310403 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-05 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215308701 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r215092933 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @zsxwing If it means code freeze for 2.4 is just around the corner then sure! We can focus on blockers for releasing 2.4, and revisit this again. Let me reflect @gaborgsomogyi review

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214918569 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/FetchedPoolSuite.scala --- @@ -0,0 +1,299 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214913221 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214916741 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -18,222 +18,247 @@ package

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214916493 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -18,222 +18,247 @@ package

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214917536 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/FetchedPoolSuite.scala --- @@ -0,0 +1,299 @@ +/* --- End diff

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214911381 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214910337 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214917284 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -414,17 +468,37 @@ private[kafka010

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214910482 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214910433 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214907878 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214908731 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214917336 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -414,17 +468,37 @@ private[kafka010

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r214909826 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/FetchedDataPool.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214618743 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRecordToUnsafeRowConverter.scala --- @@ -44,6 +44,11 @@ private

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214635480 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214629265 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214638140 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala --- @@ -1254,6 +1254,9 @@ abstract

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214646749 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala --- @@ -1254,6 +1254,9 @@ abstract

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214622654 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala --- @@ -293,7 +294,12 @@ private[kafka010] class

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214622600 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala --- @@ -115,7 +116,12 @@ private[kafka010] class

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214640616 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSinkSuite.scala --- @@ -59,14 +59,23 @@ class KafkaSinkSuite

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214639674 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala --- @@ -136,6 +142,19 @@ class

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-03 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r214632570 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-09-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 I spent more hours to take a look at how SQL UI can update the metrics information before task ends, and now I guess I may understand what was the concern from @cloud-fan here

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 If batch query also leverages AccumulatorV2 for metrics, IMHO it might not need to redesign metrics API start from scratch. For batch and micro-batch the metrics API work without any concerns

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Retest this, please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 I skimmed about how AccumulatorV2 works, and looks like the values in a task are reported along with CompletionEvent which is triggered when a task ends. Then in continuous mode driver even

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 My 2 cents, the root reason is the lifecycle of reporting query progress is tied to `finishTrigger` and we read updated metrics from executed plan which continuous mode doesn't have both

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 retest this, please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Two tests failed and not relevant to this patch. * org.apache.spark.scheduler.DAGSchedulerSuite.Barrier task failures from the same stage attempt don't trigger multiple stage retries

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 Unfortunate thing is that continuous mode allows different epochs between partitions, hence query progress just doesn't fit so hard to address SPARK-23887 for now. My 2 cents is continuous mode

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 @zsxwing @gatorsmile @cloud-fan As you all know, IMHO, the thing is that continuous mode doesn't support StreamingQueryProgress for now. We allowed these incomplete state because

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 retest this, please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 From the stack trace of test failure, it doesn't look like relevant to the code change I guess. Jenkins shows REGRESSION but the test is added at Aug 25, 2018 so aged less than 7 days, which

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Updated the description of PR to reflect the new approach. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 I just applied a new approach: "separate of concerns". This approach does pooling for both kafka consumers as well as fetched data. Both pools support eviction on idle obje

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Thanks @attilapiros I fixed the missing spots you left comments. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r213871197 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,260

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r213867027 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,260

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r213866613 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,260

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r213866495 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,260

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r213866399 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,260

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-28 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Rebased to resolve conflict. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213267299 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -266,7 +266,9 @@ object SQLConf { .createWithDefault

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213267329 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -868,7 +870,9 @@ object SQLConf { .internal

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-27 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213179259 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -2812,7 +2812,18 @@ See [Input Sources](#input-sources) and [Output Sinks](#output

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-27 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213177623 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -2812,7 +2812,18 @@ See [Input Sources](#input-sources) and [Output Sinks](#output

[GitHub] spark issue #22238: [SPARK-25245][DOCS][SS] Explain regarding limiting modif...

2018-08-27 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22238 Also adding @tdas @zsxwing @jose-torres to cc. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-27 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213129120 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -2812,6 +2812,12 @@ See [Input Sources](#input-sources) and [Output Sinks](#output

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-27 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22238#discussion_r213123711 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -2812,6 +2812,12 @@ See [Input Sources](#input-sources) and [Output Sinks](#output

[GitHub] spark pull request #22238: [SPARK-25245][DOCS][SS] Explain regarding limitin...

2018-08-26 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22238 [SPARK-25245][DOCS][SS] Explain regarding limiting modification on "spark.sql.shuffle.partitions" for structured streaming ## What changes were proposed in this pu

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-24 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 I just addressed eviction to consumer pool as well as added relevant test. This will help closing invalid idle consumers which topic or partition are no longer be assigned to any tasks. I guess

[GitHub] spark issue #22175: [MINOR] Added import to fix compilation

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22175 Didn't notice this PR. Thanks for the quick turnaround! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22178: [MINOR] Fix build failure due to non-direct confl...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR closed the pull request at: https://github.com/apache/spark/pull/22178 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22178 a998e9d829bd499dd7c65f973ea4389e0401b001 resolved this. @tdas was faster :) Closing. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22178 cc. @tdas --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #22178: [MINOR] Fix build failure due to non-direct confl...

2018-08-21 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22178 [MINOR] Fix build failure due to non-direct conflict: removing import affects others ## What changes were proposed in this pull request? Fix the build failure on master branch: some

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas In case of you are not working on the patch, I'm working on the fix and will provide minor PR. --- - To unsubscribe

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas Yeah, I can check with master branch if you would like to let me handle, and please go ahead if you would like to handle it by yourself

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 Thanks all for reviewing and thanks @tdas for merging this in! --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 Thanks all for reviewing and thanks @tdas for merging this in! --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22170: [MINOR] Add .crc files to .gitignore

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22170 @HyukjinKwon Thanks for the reviewing and merging! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22170: [MINOR] Add .crc files to .gitignore

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22170 cc. @tdas --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #22170: [MINOR] Add .crc files to .gitignore

2018-08-21 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22170 [MINOR] Add .crc files to .gitignore ## What changes were proposed in this pull request? Add .crc files to .gitignore so that we don't add .crc files in state checkpoint during tests

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas Thanks for the feedback! Updated the PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Removed the .crc files. Thanks for reviewing! Btw, it might be good to find the way to add `.crc` files to `.gitignore` if `.crc` files are mostly useless all the times. Might worth

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @koeninger Yeah I see what you're saying, then IMHO isolating consumers with query sounds better than others. Adding next offset to the cache key would make consumer moving bucket in cache

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @koeninger I'm not sure but are you saying that an executor cares about multiple queries (multiple jobs) concurrently? I honestly didn't notice it. If that is going to be problem, we

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 If my understanding is right, looks like current approach has same limitation. I guess you're busy, but could you refer some issue number or point out some code lines which was based

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-19 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r29914 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -425,70 +381,36 @@ private[kafka010

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @koeninger I'm not sure I got your point correctly. This patch is based on some assumptions, so please correct me if I'm missing here. Assumptions follow: 1. There's actually

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-08-17 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 cc. @tdas @zsxwing @koeninger @arunmahadevan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-17 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r211053868 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -425,70 +381,36 @@ private[kafka010

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-08-17 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22138 [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer ## What changes were proposed in this pull request? KafkaDataConsumer contains its own logic for caching

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-15 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas Kindly reminder. @zsxwing Could you take a quick look at this and share your thought? I think the patch is ready to merge, but blocked with slightly conflict of view so more voices

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-15 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Kindly reminder. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #21718: [SPARK-24744][STRUCTRURED STREAMING] Set the SparkSessio...

2018-08-15 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21718 @bjkonglu @bethunebtj @wguangliang Update: I thought about splitting execution tasks and data partitions (`spark.sql.shuffle.partitions`), and turned out it can be achieved by calling

[GitHub] spark pull request #22010: [SPARK-21436][CORE] Take advantage of known parti...

2018-08-10 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22010#discussion_r209230417 --- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala --- @@ -95,6 +95,18 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext

[GitHub] spark pull request #22010: [SPARK-21436][CORE] Take advantage of known parti...

2018-08-10 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22010#discussion_r209230438 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -396,7 +396,16 @@ abstract class RDD[T: ClassTag]( * Return a new RDD

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-09 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 retest this, please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Addressed review comments. Please take a look again. Thanks in advance. --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208776234 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -167,6 +165,18 @@ trait WatermarkSupport

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208597100 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StreamingAggregationStateManagerSuite.scala --- @@ -0,0 +1,132

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208597372 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StreamingAggregationStateManagerSuite.scala --- @@ -0,0 +1,132

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208599035 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208592855 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208600386 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208600267 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208600790 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208598475 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StreamingAggregationStateManagerSuite.scala --- @@ -0,0 +1,132

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208596844 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -201,33 +211,37 @@ object WatermarkSupport

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208602231 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208598797 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208596001 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

<    1   2   3   4   5   >