[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208594904 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208592556 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -871,6 +871,16 @@ object SQLConf { .intConf

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r208591232 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/package.scala --- @@ -81,4 +85,221 @@ package object state

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-08 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 My series of patches could be possible based on two metrics: `size for memory usage of latest version` and `size for total memory usage of loaded versions`. SPARK-24717 (#21700) enabled

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 Added javadoc as well. Most of the contents are from StateStore, but I didn't copy the note to implementation for state store since it is duplicated. Please let me know if we want to add

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-07 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Done running perf. test with 4 more tests: > BenchmarkMovingAggregationsListenerKeyMuchBigger rate: 16 version | input rows per second | processed r

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 Thanks @zsxwing for merging and thanks all for reviewing! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 Thanks @HyukjinKwon for merging, and thanks all for reviewing!

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Kind reminder. I'll take the doc step when you say it's OK to go.

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas Kind reminder.

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 @HyukjinKwon Could you take this forward given that the patch is minor and CI test is passed? Thanks in advance

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-08-06 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r207881161 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -513,6 +515,125 @@ class StreamSuite extends StreamTest

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-08-05 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 retest this please

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 @zsxwing Addressed review comments. Please take a look again.

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-08-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 @zsxwing OK, I also wonder whether `debug` is applicable for streaming, but I wanted to fill the gap earlier. Will remove `debug` for streaming. Will update shortly. Thanks

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas I found the spare time to run performance tests though I've run only one app for now... I couldn't run the tests concurrently. Please let me know if you are not confident

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r207096556 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r207096219 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 Retest this, please

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 Test failure looks unrelated. Jenkins, retest this, please

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas I've applied your review comments except documentation. (Will add WIP to the PR's title if it sounds clearer) There may be something you can add the review comments and so I'd like

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206790470 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/MemoryStateStore.scala --- @@ -0,0 +1,53 @@ +/* + * Licensed

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206791736 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206784385 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/statefulOperators.scala --- @@ -201,33 +200,37 @@ object WatermarkSupport

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206791325 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StatefulOperatorsHelperSuite.scala --- @@ -0,0 +1,121

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206786014 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala --- @@ -53,7 +53,35 @@ class

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206780521 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206779898 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206781209 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206790358 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/MemoryStateStore.scala --- @@ -0,0 +1,53 @@ +/* + * Licensed

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206778127 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206780754 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206778355 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206790505 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StatefulOperatorsHelperSuite.scala --- @@ -0,0 +1,121

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206788634 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StatefulOperatorsHelperSuite.scala --- @@ -0,0 +1,121

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206778971 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206778077 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulOperatorsHelper.scala --- @@ -0,0 +1,137

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206775357 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -871,6 +871,16 @@ object SQLConf { .intConf

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r206776398 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -871,6 +871,16 @@ object SQLConf { .intConf

[GitHub] spark pull request #21622: [SPARK-24637][SS] Add metrics regarding state and...

2018-08-01 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21622#discussion_r206766835 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MetricsReporter.scala --- @@ -39,6 +42,23 @@ class MetricsReporter

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 @zsxwing Kind reminder.

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 Pinging @tdas and @zsxwing for reviewing. It's a small one.

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 @tdas Thanks for the review! Addressed review comments.

[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21469#discussion_r206755595 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala --- @@ -48,12 +49,24 @@ class StateOperatorProgress private[sql

[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21469#discussion_r206755538 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala --- @@ -48,12 +49,24 @@ class StateOperatorProgress private[sql

[GitHub] spark pull request #21469: [SPARK-24441][SS] Expose total estimated size of ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21469#discussion_r206754359 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala --- @@ -81,10 +81,10 @@ class SQLMetric(val metricType

[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21357 @tdas The rationale of this patch is to group functions which deal with delta and snapshot files into one so that the difference between delta file and snapshot file will be clearly

[GitHub] spark pull request #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStorePr...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR closed the pull request at: https://github.com/apache/spark/pull/21357

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-31 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @tdas Thanks for the detailed review! I'll follow up your comments and update the patch. Btw, If my memory is right, I tried out increasing "rate" while benchmarking

[GitHub] spark pull request #21721: [SPARK-24748][SS] Support for reporting custom me...

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21721#discussion_r206392487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala --- @@ -163,7 +163,8 @@ class SourceProgress protected[sql]( val

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206388466 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/TextSocketStreamSuite.scala --- @@ -300,6 +301,100 @@ class

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206386593 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206385714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206357959 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206371107 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206388213 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/TextSocketStreamSuite.scala --- @@ -300,6 +301,100 @@ class

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r206384495 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,295

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21721 Looks like we would also need to add SourceProgress and SinkProgress into mima exclude list.

[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21199 @arunmahadevan Thanks for rebasing. I'll take a look.

[GitHub] spark pull request #21721: [SPARK-24748][SS] Support for reporting custom me...

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21721#discussion_r206349942 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/SupportsCustomWriterMetrics.java --- @@ -0,0 +1,45

[GitHub] spark issue #20859: [SPARK-23702][SS] Forbid watermarks on both sides of sta...

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/20859 How would we like to handle this patch? I guess we add feature on handling multiple watermarks in #21701 so based on the direction this patch might be going to be abandoned. IMHO I'm not 100

[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/18410 Looks like this PR is not needed, since `CompactibleFileStreamLog` also takes care of metadata log. https://github.com/apache/spark/commits/master/sql/core/src/main/scala/org/apache

[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source

2018-07-30 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21199 @arunmahadevan Sorry I forgot to review this so far. Could you fix merge conflicts? I'd pull the code to the local and review since the code diff is not small

[GitHub] spark issue #20675: [SPARK-23033][SS][Follow Up] Task level retry for contin...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/20675 Looks like the patch is outdated, and when continuous query supports shuffled stateful operators, implementing task level retry is not that trivial. To get correct result of aggregation, when

[GitHub] spark pull request #21721: [SPARK-24748][SS] Support for reporting custom me...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21721#discussion_r206010782 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/SupportsCustomWriterMetrics.java --- @@ -0,0 +1,45

[GitHub] spark pull request #21721: [SPARK-24748][SS] Support for reporting custom me...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21721#discussion_r206010370 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala --- @@ -143,18 +150,50 @@ trait ProgressReporter

[GitHub] spark pull request #21721: [SPARK-24748][SS] Support for reporting custom me...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21721#discussion_r206009928 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala --- @@ -143,18 +150,50 @@ trait ProgressReporter

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 retest this, please

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 retest this, please

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 Thanks @zsxwing for reviewing! Addressed review comments. Please take a look again. Thanks in advance

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r205961782 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala --- @@ -116,6 +175,30 @@ package object debug

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r205961840 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala --- @@ -88,23 +100,70 @@ package object debug

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r205961761 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -829,6 +955,18 @@ class StreamSuite extends StreamTest

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r205961813 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala --- @@ -88,23 +100,70 @@ package object debug

[GitHub] spark pull request #21222: [SPARK-24161][SS] Enable debug package feature on...

2018-07-29 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21222#discussion_r205961764 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -513,6 +514,131 @@ class StreamSuite extends StreamTest

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-20 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 Add tests for StatefulOperatorsHelper itself as well. (Sorry for pushing commits multiple times which trigger multiple builds. It might be ideal if older test builds are terminated once newer

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 Now I'd like to propose changing default behavior to apply new path but keeping backward compatibility, so applied it to the patch. I'm still open on decision to apply it as advanced option

[GitHub] spark issue #21700: [SPARK-24717][SS] Split out max retain version of state ...

2018-07-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21700 My pleasure. Thanks for spending your time to review thoughtfully and merge this!

[GitHub] spark issue #21700: [SPARK-24717][SS] Split out max retain version of state ...

2018-07-18 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21700 @tdas Addressed review comments. Please take a look again. Thanks in advance!

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-18 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r203577783 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -270,11 +273,43 @@ private

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-18 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r203577561 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala --- @@ -64,21 +66,143 @@ class StateStoreSuite

[GitHub] spark issue #21700: [SPARK-24717][SS] Split out max retain version of state ...

2018-07-17 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21700 @tdas Thanks for the detailed review! Addressed review comments.

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-17 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r202933053 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -270,11 +273,42 @@ private

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-17 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r202932784 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala --- @@ -64,21 +64,122 @@ class StateStoreSuite

[GitHub] spark issue #21357: [SPARK-24311][SS] Refactor HDFSBackedStateStoreProvider ...

2018-07-16 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21357 retest this, please

[GitHub] spark issue #21222: [SPARK-24161][SS] Enable debug package feature on struct...

2018-07-16 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21222 retest this, please

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-12 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @arunmahadevan @jose-torres https://issues.apache.org/jira/browse/SPARK-24763?focusedCommentId=16541367&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment

[GitHub] spark issue #21700: [SPARK-24717][SS] Split out max retain version of state ...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21700 @jose-torres Addressed review comment. Please take a look again.

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r201866930 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala --- @@ -99,43 +102,84 @@ class StateStoreSuite

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r201866279 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala --- @@ -64,6 +64,63 @@ class StateStoreSuite

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 I think reducing state memory size is worth doing: as described in the commit above, we already optimized HDFSBackedStateStoreProvider to reduce state store disk

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r201848371 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -825,6 +825,16 @@ object SQLConf { .intConf

[GitHub] spark issue #21733: [SPARK-24763][SS] Remove redundant key data from value i...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733 @arunmahadevan I'm actually in favor of changing default behavior, just not 100% sure the result would be promising for exhaustive use cases. I might need to prepare more kinds of key

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r201829938 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -825,6 +825,16 @@ object SQLConf { .intConf

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-11 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 retest this, please

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-10 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r201562690 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -239,8 +241,9 @@ private

[GitHub] spark pull request #21700: [SPARK-24717][SS] Split out max retain version of...

2018-07-10 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21700#discussion_r201515401 --- Diff: sql/core/src/main/java/org/apache/spark/sql/streaming/state/BoundedSortedMap.java --- @@ -0,0 +1,145 @@ +/* + * Licensed

[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-10 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21469 Now I'm thinking about removing "metricProviderLoaderCountOfVersionsInMap" and also removing StateStoreCustomAverageMetric, since the value doesn't look correct with stream-stream

[GitHub] spark issue #21622: [SPARK-24637][SS] Add metrics regarding state and waterm...

2018-07-10 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21622 Thanks for reviewing, @arunmahadevan and @jose-torres! Could we finalize the review on #21469 so there is a chance to include "providerLoadedMapSizeBytes" here? Or is it OK to handle it w

[GitHub] spark pull request #21733: [SPARK-24763][SS] Remove redundant key data from ...

2018-07-10 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/21733#discussion_r201483544 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -825,6 +825,16 @@ object SQLConf { .intConf
