[GitHub] spark issue #23156: [SPARK-24063][SS] Add maximum epoch queue threshold for ...

2018-12-10 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23156 @gaborgsomogyi No problem :) When you have some time, please take a look at my other PRs as well. --- - To unsubscribe

[GitHub] spark issue #23156: [SPARK-24063][SS] Add maximum epoch queue threshold for ...

2018-12-10 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23156 I think @jose-torres previously led the feature. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #23156: [SPARK-24063][SS] Add maximum epoch queue threshold for ...

2018-12-10 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23156 I'd rather not jump into anything regarding continuous mode until the overall design of continuous mode (including aggregation and join) is clarified and stabilized

[GitHub] spark issue #23260: [SPARK-26311][YARN] New feature: custom log URL for stdo...

2018-12-09 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23260 @srowen For now, the executor log URL is **static** in Spark, which forces the Node Manager to stay alive even after the application has finished, in order to provide executor logs in SHS

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-08 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23169 Thanks for addressing review comments. It looks great overall. We may want to document the new config so that we can guide end users to lower the value when they suffer from memory

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-08 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 retest this, please

[GitHub] spark pull request #23260: [SPARK-26311][YARN] New feature: custom log URL f...

2018-12-07 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/23260 [SPARK-26311][YARN] New feature: custom log URL for stdout/stderr ## What changes were proposed in this pull request? This patch proposes adding a new configuration on YARN mode

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing Please also take a look: I guess I addressed glob overlap issue as well.

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi @steveloughran Please take a look at 17b9b9a043ead0d448048c88b30f544228bd230b which just leverages GlobFilter. You may find that when the depth of archive path is more than

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 I'm now also playing with Hadoop glob relevant classes to check whether final destination matches source path glob pattern or not. * Looks like we can leverage `GlobPattern

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi That's really huge... Could you share how you tested? Like which FS (local/HDFS/S3/etc), directory structure, count of files... That would help me understand the impact

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi @steveloughran OK. I'll change the approach to just check against the final path for each move. As @steveloughran stated, it may bring a performance hit for each check when

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-04 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi @steveloughran `GlobExpander` looks like it only handles the `{}` pattern. We still need to deal with `*` and `?`, which can't be expanded like this. It would only work

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23169 @DaveDeCaprio You might have missed rolling back the change in the test. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99632/testReport/org.apache.spark.sql.catalyst.trees

[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-03 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23169 retest this, please

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-03 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing @gaborgsomogyi What we were trying to do is restrict the archive path so that moved files will not overlap with the source path. There may be the same file name with different directory

[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-02 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23195#discussion_r238090190 --- Diff: docs/structured-streaming-kafka-integration.md --- @@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use `--packages

[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-02 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23195#discussion_r238089426 --- Diff: docs/structured-streaming-kafka-integration.md --- @@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use `--packages

[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-02 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23195#discussion_r238090666 --- Diff: docs/structured-streaming-kafka-integration.md --- @@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use `--packages

[GitHub] spark pull request #23195: [SPARK-26236][SS] Add kafka delegation token supp...

2018-12-02 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23195#discussion_r238090476 --- Diff: docs/structured-streaming-kafka-integration.md --- @@ -624,3 +624,57 @@ For experimenting on `spark-shell`, you can also use `--packages

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-12-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing Yeah, it would be ideal if we could restrict `archivePath` to a path that cannot possibly match the source path (glob), so my approach was to find a directory which

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-11-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing Btw, what do you think about addressing background move/deletion (which I had considered and @gaborgsomogyi also suggested) in a separate issue? I guess adding more features would let

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237481604 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +289,65 @@ class FileStreamSource

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-11-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing Thanks for the detailed review! Addressed review comments.

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237342362 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +289,65 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237342346 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -100,6 +101,36 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237342072 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +289,65 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237341854 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +289,65 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237341425 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +289,65 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237340952 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamOptions.scala --- @@ -74,6 +76,39 @@ class FileStreamOptions

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237340938 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -100,6 +101,36 @@ class FileStreamSource

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r237340601 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,12 @@ Here are the details of all the sources in Spark.

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237305318 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1610,6 +1610,12 @@ object SQLConf { "

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237304214 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/SizeLimitedWriter.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237301700 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/SizeLimitedWriter.scala --- @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237309191 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -202,6 +202,26 @@ package object util extends Logging

[GitHub] spark pull request #23169: [SPARK-26103][SQL] Limit the length of debug stri...

2018-11-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/23169#discussion_r237307829 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala --- @@ -595,4 +596,14 @@ class TreeNodeSuite extends

[GitHub] spark issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWi...

2018-11-27 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23142 cc. @tdas @zsxwing

[GitHub] spark issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWi...

2018-11-26 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23142 retest this, please

[GitHub] spark pull request #23142: [SPARK-26170][SS] Add missing metrics in FlatMapG...

2018-11-25 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/23142 [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWithState ## What changes were proposed in this pull request? This patch addresses measuring possible metrics in StateStoreWriter

[GitHub] spark issue #23103: [SPARK-26121] [Structured Streaming] Allow users to defi...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23103 LGTM. Btw, IMHO, the TODOs @zouzias described would be better addressed at once, since documentation is easily forgotten

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi Thanks for taking care, but I guess I can manage it. I'll ask for help when I can't go back to this one. This patch (latest change) hasn't gotten any feedback from committers

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi Thanks for reviewing! I addressed your review comments except asynchronous cleanup, which could be split out into a separate issue

[GitHub] spark issue #22952: [SPARK-20568][SS] Provide option to clean up completed f...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @gaborgsomogyi Yeah, I also thought about the idea (commented above), but I lost focus because of another task. Given that a smaller patch is easier to review and the current patch works well

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r235632761 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,12 @@ Here are the details of all the sources in Spark.

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r235632872 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamOptions.scala --- @@ -74,6 +76,43 @@ class FileStreamOptions

[GitHub] spark pull request #22952: [SPARK-20568][SS] Provide option to clean up comp...

2018-11-22 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r235632809 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala --- @@ -257,16 +258,64 @@ class FileStreamSource

[GitHub] spark issue #23076: [SPARK-26103][SQL] Added maxDepth to limit the length of...

2018-11-18 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/23076 I see both needs: while I think dumping the full plan to a file is a good feature for debugging a specific issue, retaining full plans to render them on the UI page has been

[GitHub] spark issue #22952: [SPARK-20568][SS] Rename files which are completed in pr...

2018-11-16 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 @zsxwing @dongjoon-hyun @steveloughran Thanks all for the valuable feedback! I applied review comments. While I covered the new feature with new UTs, I have yet to test this manually

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-11-16 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 retest this, please

[GitHub] spark pull request #22952: [SPARK-20568][SS] Rename files which are complete...

2018-11-12 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r232869187 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,8 @@ Here are the details of all the sources in Spark.

[GitHub] spark pull request #22952: [SPARK-20568][SS] Rename files which are complete...

2018-11-07 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r231717554 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,8 @@ Here are the details of all the sources in Spark.

[GitHub] spark pull request #22952: [SPARK-20568][SS] Rename files which are complete...

2018-11-07 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r231695749 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,8 @@ Here are the details of all the sources in Spark.

[GitHub] spark pull request #22952: [SPARK-20568][SS] Rename files which are complete...

2018-11-07 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22952#discussion_r231429484 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -530,6 +530,8 @@ Here are the details of all the sources in Spark.

[GitHub] spark issue #22952: [SPARK-20568][SS] Rename files which are completed in pr...

2018-11-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 cc. @zsxwing

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-11-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @zsxwing Given that the Spark 2.4 vote has passed, could we revisit and make progress on this?

[GitHub] spark issue #22952: [SPARK-20568][SS] Rename files which are completed in pr...

2018-11-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22952 I feel the patch is simple enough to skip manual verification against HDFS, but I'll try to spin up an HDFS cluster and test this manually

[GitHub] spark pull request #22952: [SPARK-20568][SS] Rename files which are complete...

2018-11-05 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22952 [SPARK-20568][SS] Rename files which are completed in previous batch ## What changes were proposed in this pull request? This patch adds the option to rename files which are completed

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-11-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-11-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-22 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-18 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-16 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r225752604 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/TokenUtil.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-12 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r224320793 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/KafkaDelegationTokenProvider.scala --- @@ -0,0 +1,66 @@ +/* + * Licensed

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r224323537 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -647,4 +647,42 @@ package object config { .stringConf

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r224334764 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/TokenUtil.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r224322849 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -647,4 +647,42 @@ package object config { .stringConf

[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22598#discussion_r224338353 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/TokenUtil.scala --- @@ -0,0 +1,111 @@ +/* + * Licensed

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222840402 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22633: [SPARK-25644][SS]Fix java foreachBatch in DataStr...

2018-10-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22633#discussion_r222832931 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/streaming/JavaDataStreamReaderWriterSuite.java --- @@ -0,0 +1,64 @@ +/* +* Licensed

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-10-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @gaborgsomogyi Yeah... I'm just waiting for it. Btw, I proposed a solution on SPARK-10816 as well, and it is also waiting for a response. I'm going to work on another item or review others so that I

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Kindly asking for a review. Please disregard this if you're busy fixing bugs on the Spark 2.4 RC. @gaborgsomogyi I guess I left two things for committer decision: 1. define soft boundary

[GitHub] spark pull request #22579: [SPARK-25429][SQL] Use Set instead of Array to im...

2018-09-28 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22579#discussion_r221177626 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala --- @@ -83,7 +83,7 @@ class SQLAppStatusListener

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-27 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Just rebased.

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-21 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @gaborgsomogyi Totally makes sense. Let me address it while the patch is reviewed by committers. I may get recommendations to rename the config or even more, so addressing documentation would

[GitHub] spark pull request #22331: [SPARK-25331][SS] Make FileStreamSink ignore part...

2018-09-21 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22331#discussion_r219399313 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StagingFileCommitProtocol.scala --- @@ -0,0 +1,141

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 According to the discussion on SPARK-10816, I'm putting further improvement on hold and plan to continue the discussion on the JIRA issue. I guess anyone interested in this patch can still review or try

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r219367280 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -18,222 +18,247 @@ package

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 If we are fine with ignoring the optimal delta of state, or OK with addressing it in follow-up issue (it should be addressed in same release version to avoid having state V1, V2, etc...), I

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 @arunmahadevan One thing to be aware of is that the requirement is pretty different from other streaming frameworks like Flink, which normally set a long checkpoint interval

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 Please review the general approach and direction first. I'm planning to spend time to rewrite streaming part to tightly integrate logic with state so that updating state is going

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 retest this, please

[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-09-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22482 The patch is a bit huge, so I'm not sure whether it would be better to squash the commits into one before reviewing. Two TODOs are left, hence marking the patch as WIP, but closer to be a complete

[GitHub] spark pull request #22482: WIP - [SPARK-10816][SS] Support session window na...

2018-09-19 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/spark/pull/22482 WIP - [SPARK-10816][SS] Support session window natively ## What changes were proposed in this pull request? This patch proposes native support of session window, like Spark has been

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 retest this, please

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-19 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r218955883 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-19 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r218777053 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,243

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 > just wondering why org.apache.spark.sql.kafka010.CachedKafkaProducer uses com.google.common.cache.LoadingCache? Because KafkaProducer is thread-safe unless it enables transact

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-17 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 The vote for Spark 2.4 is now in progress. If we are not in stand-by mode for any blocker issues for Spark 2.4 RC, I'd be really happy if someone could revisit this and continue reviewing

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 Regarding metrics in FetchedDataPool, I just added basic metrics so that tests can leverage them for verification. I was adding numActive as well as numIdle, but tracking and measuring them needs more

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-07 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215867141 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -18,222 +18,247 @@ package

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-06 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215818860 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/FetchedPoolSuite.scala --- @@ -0,0 +1,299 @@ +/* + * Licensed

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-06 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215637613 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaDataConsumer.scala --- @@ -18,222 +18,247 @@ package

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-06 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215635068 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-05 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215313888 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/InternalKafkaConsumerPool.scala --- @@ -0,0 +1,241

[GitHub] spark pull request #22138: [SPARK-25151][SS] Apply Apache Commons Pool to Ka...

2018-09-05 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22138#discussion_r215313215 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/FetchedPoolSuite.scala --- @@ -0,0 +1,299 @@ +/* + * Licensed
