Re: Support SqlStreaming in spark

2019-03-28 Thread uncleGen
Hi all, I have rewritten the design doc based on the previous discussion. https://docs.google.com/document/d/19degwnIIcuMSELv6BQ_1VQI5AIVcvGeqOm5xE2-aRA0 Would be interested to hear what others think. Regards, Genmao Yu -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

[GitHub] spark pull request #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-28 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/22575#discussion_r237372804 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StreamTableDDLCommandSuite.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18099: [SPARK-18406][CORE][Backport-2.1] Race between end-of-ta...

2018-08-27 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/18099 same issue in spark 2.2.1 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] kafka pull request #3897: KAFKA-5929: Save pre-assignment to file to avoid t...

2017-09-26 Thread uncleGen
GitHub user uncleGen reopened a pull request: https://github.com/apache/kafka/pull/3897 KAFKA-5929: Save pre-assignment to file to avoid too long text to display when do topic partition reassign When do partition reassign - before pr Pre-assignment will be printed directly

[GitHub] kafka pull request #3894: KAFKA-5928: Avoid redundant requests to zookeeper ...

2017-09-26 Thread uncleGen
GitHub user uncleGen reopened a pull request: https://github.com/apache/kafka/pull/3894 KAFKA-5928: Avoid redundant requests to zookeeper when reassign topic partition We mistakenly request topic level information according to partitions config in the assignment json file

[GitHub] kafka pull request #3894: KAFKA-5928: Avoid redundant requests to zookeeper ...

2017-09-24 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/kafka/pull/3894 ---

[GitHub] kafka pull request #3897: KAFKA-5929: Save pre-assignment to file to avoid t...

2017-09-19 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/kafka/pull/3897 KAFKA-5929: Save pre-assignment to file to avoid too long text to display when do topic partition reassign You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request #16656: [SPARK-18116][DStream] Report stream input inform...

2017-09-18 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/16656

[GitHub] kafka pull request #3894: KAFKA-5928: Avoid redundant requests to zookeeper ...

2017-09-18 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/kafka/pull/3894 KAFKA-5928: Avoid redundant requests to zookeeper when reassign topic partition We mistakenly request topic level information according to partitions config in the assignment json file

[GitHub] spark pull request #17395: [SPARK-20065][SS][WIP] Avoid to output empty parq...

2017-06-19 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/17395 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...

2017-05-11 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17052 @HyukjinKwon Sorry! Busy for this period of time. Let me resolve this conflict.

[GitHub] spark issue #17913: [SPARK-20672][SS] Keep the `isStreaming` property in tri...

2017-05-10 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17913 @zsxwing Great! Close this pr then.

[GitHub] spark pull request #17913: [SPARK-20672][SS] Keep the `isStreaming` property...

2017-05-10 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/17913

[GitHub] spark pull request #17917: [SPARK-20600][SS] KafkaRelation should be pretty ...

2017-05-10 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17917#discussion_r115659920 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala --- @@ -143,4 +143,6 @@ private[kafka010] class

[GitHub] spark pull request #17913: [SPARK-20672][SS] Keep the `isStreaming` property...

2017-05-10 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17913#discussion_r115659132 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingRelation.scala --- @@ -48,7 +48,7 @@ case class StreamingRelation

[GitHub] spark issue #17913: [SPARK-20672][SS] Keep the `isStreaming` property in tri...

2017-05-10 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17913 cc @zsxwing

[GitHub] spark issue #17913: [SPARK-20672][SS] Keep the `isStreaming` property in tri...

2017-05-09 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17913 retest this please.

[GitHub] spark pull request #17896: [SPARK-20373][SQL][SS] Batch queries with 'Datase...

2017-05-09 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17896#discussion_r115415803 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2457,6 +2457,19 @@ object CleanupAliases extends Rule

[GitHub] spark pull request #17896: [SPARK-20373][SQL][SS] Batch queries with 'Datase...

2017-05-09 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17896#discussion_r115415668 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2457,6 +2457,19 @@ object CleanupAliases extends Rule

[GitHub] spark pull request #17913: [SPARK-20672][SS] Keep the `isStreaming` property...

2017-05-09 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17913#discussion_r115415483 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingRelation.scala --- @@ -64,8 +64,20 @@ case class

[GitHub] spark issue #17896: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataF...

2017-05-09 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17896 Depends upon: [SPARK-20672](https://issues.apache.org/jira/browse/SPARK-20672)

[GitHub] spark pull request #17913: [SPARK-20672][SS] Keep the `isStreaming` property...

2017-05-09 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17913 [SPARK-20672][SS] Keep the `isStreaming` property in triggerLogicalPlan in Structured Streaming ## What changes were proposed in this pull request? In Structured Streaming

[GitHub] spark issue #17896: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataF...

2017-05-08 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17896 retest this please.

[GitHub] spark issue #17896: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataF...

2017-05-08 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17896 retest this please.

[GitHub] spark issue #17395: [SPARK-20065][SS][WIP] Avoid to output empty parquet fil...

2017-05-07 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17395 @HyukjinKwon Sorry for the long absence. I will be online for the coming period. Please give me some time.

[GitHub] spark issue #17896: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataF...

2017-05-07 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17896 cc @zsxwing and @tdas

[GitHub] spark pull request #17896: [SPARK-20373][SQL][SS] Batch queries with 'Datase...

2017-05-07 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17896 [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWatermark()` does not execute ## What changes were proposed in this pull request? Any Dataset/DataFrame batch query

[GitHub] spark pull request #17463: [SPARK-20131][DStream][Test] Flaky Test: org.apac...

2017-04-23 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/17463

[GitHub] spark issue #17463: [SPARK-20131][DStream][Test] Flaky Test: org.apache.spar...

2017-04-01 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17463 @srowen It's hard to say it's because shutting down SparkContext is the slow part, and we can improve this case by avoiding stopping SparkContext in a separate thread. cc @zsxwing

[GitHub] spark pull request #17463: [SPARK-20131][DStream][Test] Flaky Test: org.apac...

2017-03-28 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17463 [SPARK-20131][DStream][Test] Flaky Test: org.apache.spark.streaming.StreamingContextSuite ## What changes were proposed in this pull request? do not stop the `SparkContext` in thread

[GitHub] spark issue #17395: [SPARK-20065][SS] Avoid to output empty parquet files

2017-03-23 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17395 Let me change this pr into WIP based on the discussion with @HyukjinKwon

[GitHub] spark pull request #17395: [SPARK-20065][SS] Avoid to output empty parquet f...

2017-03-23 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17395#discussion_r107821138 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -292,7 +292,10 @@ object FileFormatWriter

[GitHub] spark pull request #17395: [SPARK-20065][SS] Avoid to output empty parquet f...

2017-03-23 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17395#discussion_r107650070 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -292,7 +292,10 @@ object FileFormatWriter

[GitHub] spark pull request #17395: [SPARK-20065][SS] Avoid to output empty parquet f...

2017-03-23 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17395#discussion_r107637615 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala --- @@ -292,7 +292,10 @@ object FileFormatWriter

[GitHub] spark issue #16972: [SPARK-19556][CORE][WIP] Broadcast data is not encrypted...

2017-03-23 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16972 Closing it until I have a better solution.

[GitHub] spark pull request #16972: [SPARK-19556][CORE][WIP] Broadcast data is not en...

2017-03-23 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/16972

[GitHub] spark pull request #17395: [SPARK-20065][SS] Avoid to output empty parquet f...

2017-03-23 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17395 [SPARK-20065][SS] Avoid to output empty parquet files ## Problem Description Reported by Silvio Fiorito I've got a Kafka topic which I'm querying, running a windowed aggregation

[GitHub] spark issue #17371: [SPARK-19903][PYSPARK][SS] window operator miss the `wat...

2017-03-21 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17371 @viirya Great, you gave a clearer explanation. > I am thinking, should we create new expression id for the watermarking column with withWatermark? So we must write the query l

[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...

2017-03-21 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17052 retest this please.

[GitHub] spark pull request #17371: [SPARK-19903][PYSPARK][SS] window operator miss t...

2017-03-21 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17371#discussion_r107101883 --- Diff: python/pyspark/sql/functions.py --- @@ -1163,7 +1163,10 @@ def check_string_field(field, fieldName): raise TypeError("%s s

[GitHub] spark pull request #17371: [SPARK-19903][PYSPARK][SS] window operator miss t...

2017-03-21 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17371#discussion_r107095629 --- Diff: python/pyspark/sql/functions.py --- @@ -1163,7 +1163,10 @@ def check_string_field(field, fieldName): raise TypeError("%s s

[GitHub] spark pull request #17371: [SPARK-19903][PYSPARK][SS] window operator miss t...

2017-03-21 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17371 [SPARK-19903][PYSPARK][SS] window operator miss the `watermark` metadata of time column ## What changes were proposed in this pull request? reproduce code: ``` import sys

[GitHub] spark issue #17052: [SPARK-19690][SS] Join a streaming DataFrame with a batc...

2017-03-19 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17052 retest this please.

[GitHub] spark issue #17352: [SPARK-20021][PySpark] Miss backslash in python code

2017-03-19 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17352 @felixcheung

[GitHub] spark pull request #17352: Miss backslash in python code

2017-03-19 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17352 Miss backslash in python code ## What changes were proposed in this pull request? Add backslash for line continuation in python code. ## How was this patch tested
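
For readers unfamiliar with the issue named in this pull request, here is a minimal sketch of backslash line continuation in Python. It is not the code changed by the PR; the variable names and the string being processed are made up for illustration.

```python
# Hypothetical input string, used only to demonstrate the syntax.
text = "  Structured Streaming  "

# A trailing backslash joins the current line with the next one.
# Leaving one out here would be a syntax error, because a line
# cannot start with ".lower()" on its own.
cleaned = text \
    .strip() \
    .lower() \
    .replace(" ", "_")

# Inside parentheses Python joins lines implicitly, so no
# backslashes are needed for the same method chain.
cleaned_parens = (
    text
    .strip()
    .lower()
    .replace(" ", "_")
)

print(cleaned)
print(cleaned == cleaned_parens)
```

Both forms build the same value; the parenthesized form is often preferred because a forgotten backslash cannot break it.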

[GitHub] spark issue #17216: [SPARK-19873][SS] Record num shuffle partitions in offse...

2017-03-16 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17216 Does this PR mix in some test file?

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-16 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 @srowen Could you please take a look and help merge?

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-14 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 ping @viirya

[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

2017-03-13 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17267#discussion_r105827541 --- Diff: python/pyspark/sql/utils.py --- @@ -24,7 +24,7 @@ def __init__(self, desc, stackTrace): self.stackTrace = stackTrace

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-13 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 Thanks @HyukjinKwon, good catch! I missed that case. Thanks @viirya for your suggestion.

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-13 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 IMHO, yes. And @viirya is the original author.

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-12 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 @viirya Thanks for your review. cc @srowen

[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...

2017-03-12 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105575677 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala --- @@ -606,6 +607,24 @@ class KafkaSourceSuite

[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...

2017-03-12 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 Maybe @viirya can give some suggestion.

[GitHub] spark pull request #17267: [SPARK-19926][PYSPARK] Make pyspark exception mor...

2017-03-12 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17267 [SPARK-19926][PYSPARK] Make pyspark exception more readable ## What changes were proposed in this pull request? Exception in pyspark is a little difficult to read. before pr

[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...

2017-03-12 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105553128 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -128,18 +123,18 @@ private[kafka010

[GitHub] spark pull request #17257: [DOCS][SS] fix structured streaming python exampl...

2017-03-11 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17257 [DOCS][SS] fix structured streaming python example ## What changes were proposed in this pull request? - SS python example: `TypeError: 'xxx' object is not callable` - some other doc

[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...

2017-03-10 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17209#discussion_r105528025 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala --- @@ -450,10 +445,22 @@ private[kafka010

[GitHub] spark issue #17209: [SPARK-19853][SS] uppercase kafka topics fail when start...

2017-03-10 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17209 @zsxwing Done, but I forgot to push. I will update it as soon as I am connected to the internet.

[GitHub] spark issue #17202: [SPARK-19861][SS] watermark should not be a negative tim...

2017-03-09 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17202 cc @srowen and @zsxwing

[GitHub] spark issue #17221: [SPARK-19859][SS][Follow-up] The new watermark should ov...

2017-03-08 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17221 cc @zsxwing

[GitHub] spark issue #17209: [SPARK-19853][SS] uppercase kafka topics fail when start...

2017-03-08 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17209 cc @zsxwing

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r105075392 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,8 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17221: [SPARK-19859][SS][Follow-up] The new watermark sh...

2017-03-08 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17221 [SPARK-19859][SS][Follow-up] The new watermark should override the old one. ## What changes were proposed in this pull request? A follow up to SPARK-19859: - extract

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r105069930 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17216: [SPARK-19873][SS] Record num shuffle partitions i...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17216#discussion_r105069281 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -380,7 +382,20 @@ class StreamExecution

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r105067664 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104930545 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104904202 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17209: [SPARK-19853][SS] uppercase kafka topics fail whe...

2017-03-08 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17209 [SPARK-19853][SS] uppercase kafka topics fail when startingOffsets are SpecificOffsets ## What changes were proposed in this pull request? When using the KafkaSource with Structured

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104899893 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,11 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17202#discussion_r104888633 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -576,6 +576,8 @@ class Dataset[T] private[sql]( val parsedDelay

[GitHub] spark issue #17202: [SPARK-19861][SS] watermark should not be a negative tim...

2017-03-07 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17202 cc @zsxwing

[GitHub] spark pull request #17202: [SPARK-19861][SS] watermark should not be a negat...

2017-03-07 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17202 [SPARK-19861][SS] watermark should not be a negative time. ## What changes were proposed in this pull request? watermark should not be a negative time. ## How was this patch
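
To illustrate the rule stated in this pull request (a watermark delay must not be negative), here is a small standalone sketch in Python. It is not the patch itself, which per the review threads in this listing touches `sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala`. The function `parse_delay_ms`, the `_UNIT_MS` table, and the simplified "number unit" grammar are all assumptions made for this example and are not Spark APIs.

```python
import re

# Milliseconds per unit. This is an illustrative subset chosen for the
# example, not the interval grammar that Spark actually accepts.
_UNIT_MS = {
    "millisecond": 1,
    "second": 1_000,
    "minute": 60_000,
    "hour": 3_600_000,
    "day": 86_400_000,
}


def parse_delay_ms(delay: str) -> int:
    """Parse a delay such as '10 seconds' and reject negative values.

    Hypothetical helper: it mimics the idea of validating the delay
    threshold before it is used as a watermark.
    """
    match = re.fullmatch(r"\s*(-?\d+)\s+([a-z]+?)s?\s*", delay.lower())
    if match is None or match.group(2) not in _UNIT_MS:
        raise ValueError(f"cannot parse delay: {delay!r}")
    millis = int(match.group(1)) * _UNIT_MS[match.group(2)]
    if millis < 0:
        raise ValueError(f"delay must not be negative: {delay!r}")
    return millis


print(parse_delay_ms("10 seconds"))
```

Failing fast at parse time keeps an invalid threshold from reaching the query plan, which is the behavior the PR title asks for.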

[GitHub] spark pull request #17052: [SPARK-19690][SS] Join a streaming DataFrame with...

2017-03-07 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17052#discussion_r104651490 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Source.scala --- @@ -57,10 +60,31 @@ trait Source { def getBatch(start

[GitHub] spark pull request #17144: [SPARK-19803][TEST] flaky BlockManagerReplication...

2017-03-07 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17144#discussion_r104635754 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -494,7 +494,9 @@ class

[GitHub] spark issue #17141: [SPARK-19800][SS][WIP] Implement one kind of streaming s...

2017-03-07 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17141 ping @tdas and @zsxwing

[GitHub] spark pull request #17144: [SPARK-19803][TEST] flaky BlockManagerReplication...

2017-03-07 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17144#discussion_r104630604 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -494,7 +494,9 @@ class

[GitHub] spark pull request #17144: [SPARK-19803][TEST] flaky BlockManagerReplication...

2017-03-07 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17144#discussion_r104619252 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -494,7 +494,9 @@ class

[GitHub] spark issue #17144: [SPARK-19803][TEST] flaky BlockManagerReplicationSuite t...

2017-03-06 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17144 cc @srowen

[GitHub] spark pull request #17144: [SPARK-19803][TEST] flaky BlockManagerReplication...

2017-03-06 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17144#discussion_r104570057 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -494,7 +494,9 @@ class

[GitHub] spark pull request #17167: [SPARK-19822][TEST] CheckpointSuite.testCheckpoin...

2017-03-05 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17167#discussion_r104310809 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala --- @@ -152,11 +152,9 @@ trait DStreamCheckpointTester { self

[GitHub] spark issue #17145: [SPARK-19805][TEST] Log the row type when query result d...

2017-03-04 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17145 cc @srowen

[GitHub] spark issue #17167: [SPARK-19822][TEST] CheckpointSuite.testCheckpointedOper...

2017-03-04 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17167 cc @zsxwing

[GitHub] spark issue #17167: [SPARK-19822][TEST] CheckpointSuite.testCheckpointedOper...

2017-03-04 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17167 cc @srowen

[GitHub] spark issue #16656: [SPARK-18116][DStream] Report stream input information a...

2017-03-04 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16656 ping @zsxwing

[GitHub] spark pull request #17167: [SPARK-19822][TEST] CheckpointSuite.testCheckpoin...

2017-03-04 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17167 [SPARK-19822][TEST] CheckpointSuite.testCheckpointedOperation: should not check checkpointFilesOfLatestTime by the PATH string. ## What changes were proposed in this pull request? https

[GitHub] spark issue #17144: [SPARK-19803][TEST] flaky BlockManagerReplicationSuite t...

2017-03-04 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17144 @kayousterhout sure, I have been working on that flaky test.

[GitHub] spark pull request #17145: [SPARK-19805][TEST] Log the row type when query r...

2017-03-04 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17145#discussion_r104304108 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala --- @@ -312,13 +312,23 @@ object QueryTest { sparkAnswer: Seq[Row

[GitHub] spark pull request #17145: [SPARK-19805][TEST] Log the row type when query r...

2017-03-03 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/17145#discussion_r104157326 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala --- @@ -312,13 +312,23 @@ object QueryTest { sparkAnswer: Seq[Row

[GitHub] spark issue #17145: [SPARK-19805][TEST] Log the row type when query result d...

2017-03-03 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17145 cc @srowen

[GitHub] spark issue #17144: [SPARK-19803][TEST] flaky BlockManagerReplicationSuite t...

2017-03-03 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17144 cc @kayousterhout

[GitHub] spark issue #17144: [SPARK-19803][TEST] flaky BlockManagerReplicationSuite t...

2017-03-03 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17144 test crash. retest this please.

[GitHub] spark issue #17145: [SPARK-19805][TEST] Log the row type when query result d...

2017-03-03 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17145 retest this please.

[GitHub] spark issue #17145: [SPARK-19805][TEST] Log the row type when query type dos...

2017-03-02 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17145 unrelated failure: ` org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false`. retest this please.

[GitHub] spark issue #14731: [SPARK-17159] [streaming]: optimise check for new files ...

2017-03-02 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/14731 @srowen Waiting for your final OK

[GitHub] spark issue #17144: [SPARK-19803][TEST] flaky BlockManagerReplicationSuite t...

2017-03-02 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17144 one more flaky test? `org.apache.spark.streaming.CheckpointSuite.recovery with map and reduceByKey operations` I will check it later. retest this please.

[GitHub] spark issue #17080: [SPARK-19739][CORE] propagate S3 session token to cluster

2017-03-02 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17080 cc @vanzin

[GitHub] spark pull request #17145: [SPARK-19805][TEST] Log the row type when query t...

2017-03-02 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17145 [SPARK-19805][TEST] Log the row type when query type does not match ## What changes were proposed in this pull request? before pr: ``` == Results == !== Correct Answer
