date:20200126

[GitHub] [spark] SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression 
allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578629266
 
 
   **[Test build #117434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117434/testReport)** for PR 27058 at commit [`7a74aae`](https://github.com/apache/spark/commit/7a74aae09f8f696102c5b92b850d572d64fd9cb1).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter 
expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578625013
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support 
Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578625016
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22193/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support 
Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578625013
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter 
expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578625016
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22193/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578621322
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117433/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578621317
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578621322
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117433/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578621317
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in 
DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578621153
 
 
   **[Test build #117433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117433/testReport)** for PR 27021 at commit [`6c87a41`](https://github.com/apache/spark/commit/6c87a41df7555085bd1271ef86414f5f0452314f).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class PartitionIterator[T](reader: PartitionReader[T]) extends Iterator[T]`
 * `class MetricsHandler extends Logging with Serializable`



[GitHub] [spark] SparkQA removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578619376
 
 
   **[Test build #117433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117433/testReport)** for PR 27021 at commit [`6c87a41`](https://github.com/apache/spark/commit/6c87a41df7555085bd1271ef86414f5f0452314f).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support 
Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578620873
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22191/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27058: [SPARK-30276][SQL] Support 
Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578620862
 
 
   Build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578620910
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578620910
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578620914
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22192/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter 
expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578620862
 
 
   Build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27058: [SPARK-30276][SQL] Support Filter 
expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578620873
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22191/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578620914
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22192/
   Test PASSed.



[GitHub] [spark] xuzikun2003 commented on a change in pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-01-26 Thread GitBox

xuzikun2003 commented on a change in pull request #27019: [SPARK-30027][SQL] 
Support codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#discussion_r371088520
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ##
 @@ -329,6 +328,39 @@ case class HashAggregateExec(
 }
   }
 
+  private def generateEvalCodeForAggFuncs(
+  ctx: CodegenContext,
+  input: Seq[ExprCode],
+  inputAttrs: Seq[Attribute],
+  boundUpdateExprs: Seq[Seq[Expression]],
+  aggNames: Seq[String],
+  aggCodeBlocks: Seq[Block],
+  subExprs: SubExprCodes): String = {
+val aggCodes = if (conf.codegenSplitAggregateFunc &&
+  aggCodeBlocks.map(_.length).sum > conf.methodSplitThreshold) {
+  val maybeSplitCodes = splitAggregateExpressions(
+ctx, aggNames, boundUpdateExprs, aggCodeBlocks, subExprs.states)
+
+  maybeSplitCodes.getOrElse(aggCodeBlocks.map(_.code))
+} else {
+  aggCodeBlocks.map(_.code)
+}
+
+aggCodes.zip(aggregateExpressions.map(ae => (ae.mode, ae.filter))).map {
+  case (aggCode, (Partial | Complete, Some(condition))) =>
+// Note: wrap in "do { } while(false);", so the generated checks can jump out
+// with "continue;"
+s"""
+   |do {
+   |  ${generatePredicateCode(ctx, condition, inputAttrs, input)}
+   |  $aggCode
+   |} while(false);
 
 Review comment:
   Got it. Thanks for the explanation. 
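
   The `do { } while(false);` wrapper discussed in the diff above can be illustrated with a small standalone Java sketch. It is not the code Spark generates: the class name, the method, and the `v > 0` filter are hypothetical stand-ins. It only shows why `continue;` inside such a wrapper skips the guarded aggregate update without skipping the rest of the row's processing.

```java
// Illustrative sketch only; names and the filter condition are assumptions.
final class DoWhileFalseSketch {
    // Returns {SUM(v), SUM(v) FILTER (WHERE v > 0)} for the given rows.
    static long[] aggregate(int[] rows) {
        long total = 0;     // unfiltered aggregate
        long positive = 0;  // filtered aggregate
        for (int v : rows) {
            do {
                // "continue" jumps to the while(false) check, so it leaves
                // only this block instead of moving on to the next row.
                if (!(v > 0)) continue;
                positive += v;
            } while (false);
            // Still reached for every row, including rejected ones.
            total += v;
        }
        return new long[] {total, positive};
    }
}
```

   Without the wrapper, the same `continue;` would jump to the next iteration of the enclosing row loop and the unfiltered aggregate would miss the rejected rows.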



[GitHub] [spark] SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in 
DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578619376
 
 
   **[Test build #117433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117433/testReport)** for PR 27021 at commit [`6c87a41`](https://github.com/apache/spark/commit/6c87a41df7555085bd1271ef86414f5f0452314f).



[GitHub] [spark] gatorsmile commented on issue #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL conversion USING "show create table"

2020-01-26 Thread GitBox

gatorsmile commented on issue #24938: [SPARK-27946][SQL] Hive DDL to Spark DDL 
conversion USING "show create table"
URL: https://github.com/apache/spark/pull/24938#issuecomment-578618967
 
 
   ping @viirya Do you think we can finish it before the code freeze?



[GitHub] [spark] SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

SparkQA commented on issue #27058: [SPARK-30276][SQL] Support Filter expression 
allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#issuecomment-578618570
 
 
   **[Test build #117432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117432/testReport)** for PR 27058 at commit [`71ba1f4`](https://github.com/apache/spark/commit/71ba1f46229cb9443658818b1f94b2973fbc37ce).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578617191
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117430/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578617183
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578617191
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117430/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578617183
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-01-26 Thread GitBox

beliefer commented on a change in pull request #27058: [SPARK-30276][SQL] 
Support Filter expression allows simultaneous use of DISTINCT
URL: https://github.com/apache/spark/pull/27058#discussion_r371085072
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
 ##
 @@ -148,24 +207,106 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
 val distinctAggs = exprs.flatMap { _.collect {
   case ae: AggregateExpression if ae.isDistinct => ae
 }}
-// We need at least two distinct aggregates for this rule because aggregation
-// strategy can handle a single distinct group.
+// This rule serves two purposes:
+// One is to rewrite when there exists at least two distinct aggregates. We need at least
+// two distinct aggregates for this rule because aggregation strategy can handle a single
+// distinct group.
+// Another is to expand distinct aggregates which exists filter clause so that we can
+// evaluate the filter locally.
 // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-distinctAggs.size > 1
+distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
   }
 
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+  val expandAggregate = extractFiltersInDistinctAggregate(a)
+  rewriteDistinctAggregate(expandAggregate)
   }
 
-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregate(a: Aggregate): Aggregate = {
 
 Review comment:
   OK
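
   Read literally, the new check in the diff above is true whenever at least one distinct aggregate is present, so the `exists(_.filter.isDefined)` clause cannot change its result. The Java sketch below mirrors only that boolean logic. Each distinct aggregate is modeled by its optional filter alone, and the class and method names are hypothetical, not part of Spark.

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; models a distinct aggregate by its optional filter.
final class RewriteCheckSketch {
    // Same shape as: distinctAggs.size >= 1 || distinctAggs.exists(_.filter.isDefined)
    static boolean checkAsWritten(List<Optional<String>> distinctAggFilters) {
        return distinctAggFilters.size() >= 1
            || distinctAggFilters.stream().anyMatch(Optional::isPresent);
    }
}
```

   Under this reading, a single distinct aggregate without a filter already passes the check, which is consistent with the comment in the diff that the check can produce false positives.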



[GitHub] [spark] SparkQA removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588187
 
 
   **[Test build #117430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117430/testReport)** for PR 27365 at commit [`a5d975c`](https://github.com/apache/spark/commit/a5d975cf64b50458f716d235e754ccf9bd2b27c4).



[GitHub] [spark] SparkQA commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

SparkQA commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source 
files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578616313
 
 
   **[Test build #117430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117430/testReport)** for PR 27365 at commit [`a5d975c`](https://github.com/apache/spark/commit/a5d975cf64b50458f716d235e754ccf9bd2b27c4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615281
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117431/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615276
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615276
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in 
DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615143
 
 
   **[Test build #117431 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117431/testReport)**
 for PR 27021 at commit 
[`1be7a33`](https://github.com/apache/spark/commit/1be7a334d8eaf34d62f0a2039461e841fe740bb2).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class PartitionIterator[T](reader: PartitionReader[T]) extends 
Iterator[T] `
 * `class MetricsHandler extends Logging with Serializable `



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615281
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117431/
   Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578614124
 
 
   **[Test build #117431 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117431/testReport)**
 for PR 27021 at commit 
[`1be7a33`](https://github.com/apache/spark/commit/1be7a334d8eaf34d62f0a2039461e841fe740bb2).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615024
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22190/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615024
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22190/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615014
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578615014
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

SparkQA commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in 
DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578614124
 
 
   **[Test build #117431 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117431/testReport)**
 for PR 27021 at commit 
[`1be7a33`](https://github.com/apache/spark/commit/1be7a334d8eaf34d62f0a2039461e841fe740bb2).



[GitHub] [spark] sandeep-katta commented on issue #27021: [SPARK-30362][Core] Update InputMetrics in DataSourceRDD

2020-01-26 Thread GitBox

sandeep-katta commented on issue #27021: [SPARK-30362][Core] Update 
InputMetrics in DataSourceRDD
URL: https://github.com/apache/spark/pull/27021#issuecomment-578611669
 
 
   @rdblue  please review this, I have tested the changes from my end.



[GitHub] [spark] HeartSaVioR commented on a change in pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-01-26 Thread GitBox

HeartSaVioR commented on a change in pull request #27019: [SPARK-30027][SQL] 
Support codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#discussion_r371073779
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ##
 @@ -329,6 +328,39 @@ case class HashAggregateExec(
 }
   }
 
+  private def generateEvalCodeForAggFuncs(
+  ctx: CodegenContext,
+  input: Seq[ExprCode],
+  inputAttrs: Seq[Attribute],
+  boundUpdateExprs: Seq[Seq[Expression]],
+  aggNames: Seq[String],
+  aggCodeBlocks: Seq[Block],
+  subExprs: SubExprCodes): String = {
+val aggCodes = if (conf.codegenSplitAggregateFunc &&
+  aggCodeBlocks.map(_.length).sum > conf.methodSplitThreshold) {
+  val maybeSplitCodes = splitAggregateExpressions(
+ctx, aggNames, boundUpdateExprs, aggCodeBlocks, subExprs.states)
+
+  maybeSplitCodes.getOrElse(aggCodeBlocks.map(_.code))
+} else {
+  aggCodeBlocks.map(_.code)
+}
+
+aggCodes.zip(aggregateExpressions.map(ae => (ae.mode, ae.filter))).map {
+  case (aggCode, (Partial | Complete, Some(condition))) =>
+// Note: wrap in "do { } while(false);", so the generated checks can jump out
+// with "continue;"
+s"""
+   |do {
+   |  ${generatePredicateCode(ctx, condition, inputAttrs, input)}
+   |  $aggCode
+   |} while(false);
 
 Review comment:
   The NOTE in the code comment above should be enough to explain why, right? 
The block still executes only once, but the generated code can exit this 
specific block instead of exiting the function/method in the middle of the code.



[GitHub] [spark] HeartSaVioR commented on a change in pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-01-26 Thread GitBox

HeartSaVioR commented on a change in pull request #27019: [SPARK-30027][SQL] 
Support codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#discussion_r371073779
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ##
 @@ -329,6 +328,39 @@ case class HashAggregateExec(
 }
   }
 
+  private def generateEvalCodeForAggFuncs(
+  ctx: CodegenContext,
+  input: Seq[ExprCode],
+  inputAttrs: Seq[Attribute],
+  boundUpdateExprs: Seq[Seq[Expression]],
+  aggNames: Seq[String],
+  aggCodeBlocks: Seq[Block],
+  subExprs: SubExprCodes): String = {
+val aggCodes = if (conf.codegenSplitAggregateFunc &&
+  aggCodeBlocks.map(_.length).sum > conf.methodSplitThreshold) {
+  val maybeSplitCodes = splitAggregateExpressions(
+ctx, aggNames, boundUpdateExprs, aggCodeBlocks, subExprs.states)
+
+  maybeSplitCodes.getOrElse(aggCodeBlocks.map(_.code))
+} else {
+  aggCodeBlocks.map(_.code)
+}
+
+aggCodes.zip(aggregateExpressions.map(ae => (ae.mode, ae.filter))).map {
+  case (aggCode, (Partial | Complete, Some(condition))) =>
+// Note: wrap in "do { } while(false);", so the generated checks can jump out
+// with "continue;"
+s"""
+   |do {
+   |  ${generatePredicateCode(ctx, condition, inputAttrs, input)}
+   |  $aggCode
+   |} while(false);
 
 Review comment:
   The NOTE in the code comment above should be enough to explain why, right? 
The block still executes only once, but the generated code can exit this 
specific block via `continue` instead of exiting the function/method in the 
middle of the code.
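
   To make the point concrete, here is a minimal Java sketch of the idiom. It 
is not the code Spark generates; the class name, the `value > 0` predicate, and 
the two counters are hypothetical stand-ins for a filtered and an unfiltered 
aggregate.

```java
public class DoWhileFalseDemo {
    // Returns {filteredSum, rowCount} for the given rows.
    static long[] aggregate(int[] rows) {
        long filteredSum = 0; // stands in for an aggregate with a filter (hypothetical: value > 0)
        long rowCount = 0;    // stands in for an aggregate without a filter
        for (int value : rows) {
            do {
                // Predicate check: "continue" jumps to the while(false) test,
                // which ends only this block, not the enclosing for loop.
                if (!(value > 0)) continue;
                filteredSum += value;
            } while (false);
            // Still reached for every row, including the filtered-out ones.
            rowCount++;
        }
        return new long[] {filteredSum, rowCount};
    }

    public static void main(String[] args) {
        long[] result = aggregate(new int[] {3, -1, 4, 0});
        System.out.println(result[0] + " " + result[1]); // prints "7 4"
    }
}
```

   Without the `do { } while (false)` wrapper, the same `continue` would bind 
to the enclosing `for` loop and skip `rowCount++` for the filtered-out rows.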



[GitHub] [spark] xuzikun2003 commented on a change in pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-01-26 Thread GitBox

xuzikun2003 commented on a change in pull request #27019: [SPARK-30027][SQL] 
Support codegen for aggregate filters in HashAggregateExec
URL: https://github.com/apache/spark/pull/27019#discussion_r371071649
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ##
 @@ -329,6 +328,39 @@ case class HashAggregateExec(
 }
   }
 
+  private def generateEvalCodeForAggFuncs(
+  ctx: CodegenContext,
+  input: Seq[ExprCode],
+  inputAttrs: Seq[Attribute],
+  boundUpdateExprs: Seq[Seq[Expression]],
+  aggNames: Seq[String],
+  aggCodeBlocks: Seq[Block],
+  subExprs: SubExprCodes): String = {
+val aggCodes = if (conf.codegenSplitAggregateFunc &&
+  aggCodeBlocks.map(_.length).sum > conf.methodSplitThreshold) {
+  val maybeSplitCodes = splitAggregateExpressions(
+ctx, aggNames, boundUpdateExprs, aggCodeBlocks, subExprs.states)
+
+  maybeSplitCodes.getOrElse(aggCodeBlocks.map(_.code))
+} else {
+  aggCodeBlocks.map(_.code)
+}
+
+aggCodes.zip(aggregateExpressions.map(ae => (ae.mode, ae.filter))).map {
+  case (aggCode, (Partial | Complete, Some(condition))) =>
+// Note: wrap in "do { } while(false);", so the generated checks can jump out
+// with "continue;"
+s"""
+   |do {
+   |  ${generatePredicateCode(ctx, condition, inputAttrs, input)}
+   |  $aggCode
+   |} while(false);
 
 Review comment:
   I don't understand what effect "while(false)" has here. Could you explain 
why it is needed?



[GitHub] [spark] dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update 
testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578601545
 
 
   @zero323, it seems that you missed my point. My advice was the following.
   >  I'd like to recommend you to mention what you've done clearly. That's 
enough.
   
   Let me rephrase: "In the PR description, state that you didn't run the full 
test suite; in particular, the Arrow tests are skipped." That was my request.



[GitHub] [spark] HeartSaVioR edited a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

HeartSaVioR edited a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF 
in source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588303
 
 
   > I think it's also fine to have git enforce it. Is there any downside to 
that?
   
   I don't think there's any outstanding downside; if there were considerable 
downsides, someone would have complained by now. Only 3 files used CR/LF and 
all the others already use LF. (cmd/bat files are enforced to have CR/LF as 
EOL, as they're only used on Windows.)



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588611
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588614
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22189/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588614
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22189/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588611
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] HeartSaVioR commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

HeartSaVioR commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files, and enforce the EOL for java/scala/xml/py/R files to LF in 
gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588303
 
 
   > I think it's also fine to have git enforce it. Is there any downside to 
that?
   
   I don't think so; if it had considerable downsides, someone would have 
complained by now. Only 3 files used CR/LF and all the others already use LF. 
(cmd/bat files are enforced to have CR/LF as EOL, as they're only used on 
Windows.)



[GitHub] [spark] SparkQA commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes

2020-01-26 Thread GitBox

SparkQA commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source 
files, and enforce the EOL for java/scala/xml/py/R files to LF in gitattributes
URL: https://github.com/apache/spark/pull/27365#issuecomment-578588187
 
 
   **[Test build #117430 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117430/testReport)**
 for PR 27365 at commit 
[`a5d975c`](https://github.com/apache/spark/commit/a5d975cf64b50458f716d235e754ccf9bd2b27c4).



[GitHub] [spark] HeartSaVioR commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

HeartSaVioR commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578586936
 
 
   @dongjoon-hyun `^M` in the PR description is the CR of a CR/LF line ending, 
so you may want to type CTRL+V -> CTRL+M in a bash shell to enter it. I'll 
update the PR description.
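
   As a small illustration (a sketch only, not part of the PR): `^M` is the 
caret notation for the carriage-return byte `\r`, so a CR/LF line ending is the 
pair `\r\n`. The hypothetical helper class below counts such pairs in a string 
and converts them to LF.

```java
public class EolDemo {
    // Counts CR/LF pairs ("\r\n") in the text.
    static int countCrLf(String text) {
        int count = 0;
        int from = 0;
        int idx;
        while ((idx = text.indexOf("\r\n", from)) >= 0) {
            count++;
            from = idx + 2; // continue searching after this pair
        }
        return count;
    }

    // Converts CR/LF line endings to LF; lines that already end in LF are untouched.
    static String toLf(String text) {
        return text.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String sample = "a\r\nb\nc\r\n";
        System.out.println(countCrLf(sample));                 // prints 2
        System.out.println(toLf(sample).equals("a\nb\nc\n"));  // prints true
    }
}
```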



[GitHub] [spark] zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat
to >= 2.0.0
URL: https://github.com/apache/spark/pull/27359#issuecomment-578581851

> Please note that I'm supporting your effort on this PR. Otherwise, I'll
not chime in here to add comments.

Thank you, I appreciate that.

In general, the full reproducible setup is defined by the Dockerfile shown at
the beginning, but I'll put it here for reference:

```
FROM rocker/verse:3.4.3
RUN apt-get update \
&& apt-get install -y --no-install-recommends gpg openjdk-8-jdk-headless \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

RUN wget -qO- https://keybase.io/zero323/pgp_keys.asc | gpg --import
RUN git clone --depth 1 --branch SPARK-23435 https://github.com/zero323/spark.git
WORKDIR spark
RUN git rev-parse HEAD
RUN git verify-commit -v HEAD
RUN build/mvn -DskipTests -Phive -Psparkr clean package
RUN R --version
RUN R -e "install.packages(c('e1071', 'praise'))"
RUN R -e "install.packages('testthat', repos='https://cloud.r-project.org/'); packageVersion('testthat'); sessionInfo()"
RUN R/create-rd.sh
RUN R/create-docs.sh
RUN R/check-cran.sh
RUN R/run-tests.sh
```
It can be re-run to confirm that it reflects the current state of things.

As shown in the screencast, the build is done directly from the head of this
branch (the signature is verified), with no changes to the codebase beyond what
is proposed in this PR (and we don't touch any Arrow-related components here at
all).

As for skipping the Arrow tests: that's the default behavior defined in the
respective tests, for example here

https://github.com/apache/spark/blob/43d9c7e7e57749ee611e0c97781a71a0645b5e9b/R/pkg/tests/fulltests/test_sparkSQL_arrow.R#L25

and the following lines. So it is neither a failure nor the result of any
source modification.

Can we make arrow tests run? Possibly, but:

- The R Arrow package is not present in the snapshot repositories used by
rocker images. Installing testthat from https://cloud.r-project.org already
pushed things a lot. Additionally, some transitive dependencies have hidden
version bounds.
- C++ Arrow bindings would require external system repositories, which can
break dependencies for R.
- Using other images (let's say the official R-base) is not an option, as we
need TeX as well as the OpenSSL and Curl dev libraries, and this will either
break or require an update of R beyond 2.4 (at least it did for other build
configurations I considered).

At this point Spark has no coverage for any intermediate R version (Jenkins
runs 3.1, and then there is a gap of almost eight years' worth of releases up
to 3.6 on AppVeyor), not to mention version-OS combinations. That's troubling
and, as the work related to this PR has shown, can miss obvious errors, but it
is not something that can really be addressed by running ad-hoc tests outside
the project infrastructure.

Anyway... If you have specific concerns about the process used here, and you
suspect that the proposed changes can lead to problems in the future, I'll do
my best to address these.


[GitHub] [spark] dongjoon-hyun edited a comment on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun edited a comment on issue #27359: [SPARK-23435][SPARKR][TESTS] 
Update testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578573228
 
 
   We cannot say `We're good` when we know something is wrong. I'd like to 
recommend you to mention what you've done clearly. That's enough. Please note 
that I'm supporting your effort on this PR. Otherwise, I'll not chime in here 
to add comments.
   > So skipped Arrow tests are expected.
   
   Especially regarding the following.
   > I don't think that really affects the results though, as primary concern 
was CRAN tests and overall process, and Arrow related code hasn't been touched.



[GitHub] [spark] dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update 
testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578573228
 
 
   We cannot say `We're good` when we know something is wrong. I'd like to 
recommend you to mention what you've done clearly. That's enough. Please note 
that I'm supporting your effort on this PR. Otherwise, I'll not chime in here 
to add comments.
   > So skipped Arrow tests are expected.



[GitHub] [spark] BryanCutler commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

BryanCutler commented on a change in pull request #27358: 
[SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to 
Pandas conversion
URL: https://github.com/apache/spark/pull/27358#discussion_r371049435
 
 

 ##
 File path: python/pyspark/sql/pandas/conversion.py
 ##
 @@ -109,7 +109,11 @@ def toPandas(self):
 # values, but we should use datetime.date to match the behavior with when
 # Arrow optimization is disabled.
 pdf = table.to_pandas(date_as_object=True)
-return _check_dataframe_localize_timestamps(pdf, timezone)
+for field in self.schema:
+if isinstance(field.dataType, TimestampType):
+pdf[field.name] = \
 
 Review comment:
   Thanks @viirya !



[GitHub] [spark] zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat 
to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578570778
 
 
   > @zero323 . Thank you for the screencast. However, it skipped all arrow 
related tests. Please playback the screencast.
   
   Unfortunately, the R arrow package is not standalone (unlike the Python one): it 
requires system packages with C++ bindings (installing the `arrow` package is not 
sufficient). That's dependency hell, as these R images (they're still more 
stable than the R-base ones) are not really designed for updates. So skipped Arrow 
tests are expected.
   
   I don't think that really affects the results though, as the primary concern was 
the CRAN tests and the overall process, and Arrow-related code hasn't been touched.



[GitHub] [spark] dongjoon-hyun edited a comment on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun edited a comment on issue #27359: [SPARK-23435][SPARKR][TESTS] 
Update testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578569365
 
 
   @zero323 . Thank you for the screencast. However, it skipped all arrow 
related tests. Please playback the screencast.



[GitHub] [spark] dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update 
testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578569365
 
 
   @zero323 . Thank you for the screencast. However, it skipped all arrow 
related tests.



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

dongjoon-hyun commented on a change in pull request #27359: 
[SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#discussion_r371046762
 
 

 ##
 File path: R/pkg/tests/run-all.R
 ##
 @@ -60,11 +59,23 @@ if (identical(Sys.getenv("NOT_CRAN"), "true")) {
  if (identical(Sys.getenv("NOT_CRAN"), "true")) {
    # set random seed for predictable results. mostly for base's sample() in tree and classification
    set.seed(42)
-   # for testthat 1.0.2 later, change reporter from "summary" to default_reporter()
-   testthat:::run_tests("SparkR",
-                        file.path(sparkRDir, "pkg", "tests", "fulltests"),
-                        NULL,
-                        "summary")
+
+   # To be removed once testthat 1.x is removed from all builds
 
 Review comment:
   +1. BTW, we had better file a new JIRA and turn this comment into an IDed TODO 
like the following.
   ```
   # TODO(SPARK-X) To be removed once testthat 1.x is removed from all builds
   ```



[GitHub] [spark] zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat 
to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578568461
 
 
   > Please update the PR description. For example, the followings?
   
   All done @dongjoon-hyun 



[GitHub] [spark] dongjoon-hyun edited a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

dongjoon-hyun edited a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578568350
 
 
   @HeartSaVioR . When I follow the directions in the PR description on the 
master branch, the result is different. Did I miss something?
   ```
   $ git log --oneline -n1
   43d9c7e7e5 (HEAD -> master, apache/master, apache/HEAD) [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion
   
   $ grep -IUrl --color "^M" . | grep "\.java\|\.scala\|\.xml\|\.py\|\.R" | grep -v "/target/" | grep -v "/build/" | grep -v "/dist/" | grep -v "dependency-reduced-pom.xml" | grep -v ".pyc"
   ./python/pyspark/_globals.py
   ./python/pyspark/heapq3.py
   ./python/pyspark/mllib/linalg/__init__.py
   ./python/pyspark/shuffle.py
   ./python/pyspark/ml/linalg/__init__.py
   ./R/pkg/vignettes/sparkr-vignettes.Rmd
   ./examples/src/main/python/logistic_regression.py
   ./dev/github_jira_sync.py
   ```
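
The search above can also be sketched in Python, which avoids typing a literal carriage return into the shell. This is a minimal sketch, not the procedure from the PR description: the function name `find_files_with_cr` is hypothetical, it matches file names by suffix (stricter than the substring match of the `grep` pipeline, so `.Rmd` files are not reported), and it does not replicate `grep -I`'s skipping of binary files.

```python
import os

# Extensions and skipped directories mirror the grep pipeline quoted above;
# the names below are illustrative assumptions, not part of Spark.
SOURCE_EXTENSIONS = (".java", ".scala", ".xml", ".py", ".R")
SKIPPED_DIRS = {"target", "build", "dist"}


def find_files_with_cr(root):
    """Return sorted paths under `root` whose bytes contain a carriage return."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune build output directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in SKIPPED_DIRS]
        for name in filenames:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            if name == "dependency-reduced-pom.xml":
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if b"\r" in handle.read():
                    matches.append(path)
    return sorted(matches)
```

Calling `find_files_with_cr(".")` from the repository root would list candidate files; reading in binary mode matters because text mode would silently translate line endings.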



[GitHub] [spark] dongjoon-hyun commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

dongjoon-hyun commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578568350
 
 
   @HeartSaVioR . When I follow the directions in the PR description, the result 
is different. Did I miss something?
   ```
   $ git log --oneline -n1
   43d9c7e7e5 (HEAD -> master, apache/master, apache/HEAD) [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion
   
   $ grep -IUrl --color "^M" . | grep "\.java\|\.scala\|\.xml\|\.py\|\.R" | grep -v "/target/" | grep -v "/build/" | grep -v "/dist/" | grep -v "dependency-reduced-pom.xml" | grep -v ".pyc"
   ./python/pyspark/_globals.py
   ./python/pyspark/heapq3.py
   ./python/pyspark/mllib/linalg/__init__.py
   ./python/pyspark/shuffle.py
   ./python/pyspark/ml/linalg/__init__.py
   ./R/pkg/vignettes/sparkr-vignettes.Rmd
   ./examples/src/main/python/logistic_regression.py
   ./dev/github_jira_sync.py
   ```



[GitHub] [spark] zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

2020-01-26 Thread GitBox

zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat 
to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578568240
 
 
   @HyukjinKwon 
   
   > @zero323, do you mind if I ask to check R 3.4.x latest and testthat latest 
combination
   
   I'd say we're good:
   
   
[![asciicast](https://asciinema.org/a/xiIOy6OcntE6hxNXQwI7vcUl0.svg)](https://asciinema.org/a/xiIOy6OcntE6hxNXQwI7vcUl0)
   
   In addition to local builds, this gives us:
   
   - R 3.1.x, `testthat` 1.0.2 on Linux (Jenkins)
   - R 3.4.3, `testthat` 2.3.1 on Linux (docker build)
   - R 3.6.2, `testthat` 2.3.1 on Windows



[GitHub] [spark] viirya commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

viirya commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] 
Prevent unnecessary copies of data during Arrow to Pandas conversion
URL: https://github.com/apache/spark/pull/27358#discussion_r371046394
 
 

 ##
 File path: python/pyspark/sql/pandas/conversion.py
 ##
 @@ -109,7 +109,11 @@ def toPandas(self):
     # values, but we should use datetime.date to match the behavior with when
     # Arrow optimization is disabled.
     pdf = table.to_pandas(date_as_object=True)
-    return _check_dataframe_localize_timestamps(pdf, timezone)
+    for field in self.schema:
+        if isinstance(field.dataType, TimestampType):
+            pdf[field.name] = \
 
 Review comment:
   ok. looks good then. thanks!



[GitHub] [spark] BryanCutler commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

BryanCutler commented on a change in pull request #27358: 
[SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to 
Pandas conversion
URL: https://github.com/apache/spark/pull/27358#discussion_r371042017
 
 

 ##
 File path: python/pyspark/sql/pandas/conversion.py
 ##
 @@ -109,7 +109,11 @@ def toPandas(self):
     # values, but we should use datetime.date to match the behavior with when
     # Arrow optimization is disabled.
     pdf = table.to_pandas(date_as_object=True)
-    return _check_dataframe_localize_timestamps(pdf, timezone)
+    for field in self.schema:
+        if isinstance(field.dataType, TimestampType):
+            pdf[field.name] = \
 
 Review comment:
   Yeah, for the case of timestamps, making a copy is unavoidable. This is just 
to prevent non-timestamp columns from also being copied when assigned 
back to the DataFrame.
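
The idea in this reply can be illustrated without pandas. The following is a library-free analogy only: a plain dict of lists stands in for the DataFrame, a fixed `timedelta` offset stands in for timezone localization, and the function name `localize_timestamp_columns` is hypothetical. It does not reproduce the pandas copy semantics discussed in the PR; it only shows the shape of the change, where timestamp columns alone are rebuilt and every other column is left untouched.

```python
from datetime import datetime, timedelta


def localize_timestamp_columns(columns, timestamp_fields, offset):
    """Rebuild only the named timestamp columns; leave the rest untouched."""
    for name in timestamp_fields:
        # A new list is unavoidable here because every value changes.
        columns[name] = [value + offset for value in columns[name]]
    return columns


ids = [1, 2, 3]
columns = {"id": ids, "ts": [datetime(2020, 1, 26, 12, 0)]}
result = localize_timestamp_columns(columns, ["ts"], timedelta(hours=-8))

# The non-timestamp column is the very same object, so nothing was copied.
same_object = result["id"] is ids
```

Here `same_object` is `True`, while `result["ts"]` is a freshly built list holding the shifted values.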



[GitHub] [spark] siknezevic commented on issue #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-01-26 Thread GitBox

siknezevic commented on issue #27246: [SPARK-30536][CORE][SQL] Sort-merge join 
operator spilling performance improvements
URL: https://github.com/apache/spark/pull/27246#issuecomment-578559734
 
 
   I fixed the issues in ExternalAppendOnlyUnsafeRowArray. Next, in the coming 
days, I will push a new PR for lazy spill reader initialization.



[GitHub] [spark] viirya commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

viirya commented on a change in pull request #27358: [SPARK-30640][PYTHON][SQL] 
Prevent unnecessary copies of data during Arrow to Pandas conversion
URL: https://github.com/apache/spark/pull/27358#discussion_r371040215
 
 

 ##
 File path: python/pyspark/sql/pandas/conversion.py
 ##
 @@ -109,7 +109,11 @@ def toPandas(self):
     # values, but we should use datetime.date to match the behavior with when
     # Arrow optimization is disabled.
     pdf = table.to_pandas(date_as_object=True)
-    return _check_dataframe_localize_timestamps(pdf, timezone)
+    for field in self.schema:
+        if isinstance(field.dataType, TimestampType):
+            pdf[field.name] = \
 
 Review comment:
   Is it different? Doesn't this also assign the series back to the DataFrame?



[GitHub] [spark] github-actions[bot] commented on issue #18898: [SPARK-21245][ML] Resolve code duplication for classification/regression summarizers

2020-01-26 Thread GitBox

github-actions[bot] commented on issue #18898: [SPARK-21245][ML] Resolve code 
duplication for classification/regression summarizers
URL: https://github.com/apache/spark/pull/18898#issuecomment-578557489
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!



[GitHub] [spark] github-actions[bot] closed pull request #23327: [SPARK-26222][SQL] Track file listing time

2020-01-26 Thread GitBox

github-actions[bot] closed pull request #23327: [SPARK-26222][SQL] Track file 
listing time
URL: https://github.com/apache/spark/pull/23327
 
 
   



[GitHub] [spark] github-actions[bot] commented on issue #20690: [SPARK-23532][Standalone]Improve data locality when launching new executors for dynamic allocation

2020-01-26 Thread GitBox

github-actions[bot] commented on issue #20690: [SPARK-23532][Standalone]Improve 
data locality when launching new executors for dynamic allocation
URL: https://github.com/apache/spark/pull/20690#issuecomment-578557483
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!



[GitHub] [spark] BryanCutler closed pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

BryanCutler closed pull request #27358: [SPARK-30640][PYTHON][SQL] Prevent 
unnecessary copies of data during Arrow to Pandas conversion
URL: https://github.com/apache/spark/pull/27358
 
 
   



[GitHub] [spark] BryanCutler commented on issue #27358: [SPARK-30640][PYTHON][SQL] Prevent unnecessary copies of data during Arrow to Pandas conversion

2020-01-26 Thread GitBox

BryanCutler commented on issue #27358: [SPARK-30640][PYTHON][SQL] Prevent 
unnecessary copies of data during Arrow to Pandas conversion
URL: https://github.com/apache/spark/pull/27358#issuecomment-578553730
 
 
   This is a pretty minor change, so I'm gonna go ahead and merge



[GitHub] [spark] AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578546109
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117429/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to 
seed when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578546105
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to 
seed when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578546109
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117429/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578546105
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

SparkQA commented on issue #27354: [SPARK-30633][SQL] Append L to seed when 
type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578545881
 
 
   **[Test build #117429 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117429/testReport)**
 for PR 27354 at commit 
[`abe0be5`](https://github.com/apache/spark/commit/abe0be5e514eec1b014849300b5db12c12443a39).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578524901
 
 
   **[Test build #117429 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117429/testReport)**
 for PR 27354 at commit 
[`abe0be5`](https://github.com/apache/spark/commit/abe0be5e514eec1b014849300b5db12c12443a39).



[GitHub] [spark] AmplabJenkins removed a comment on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27355: [SPARK-30625][SQL] Support 
`escape` as third parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578543478
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117428/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27355: [SPARK-30625][SQL] Support 
`escape` as third parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578543475
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27355: [SPARK-30625][SQL] Support `escape` as 
third parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578543478
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117428/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27355: [SPARK-30625][SQL] Support `escape` as 
third parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578543475
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

SparkQA commented on issue #27355: [SPARK-30625][SQL] Support `escape` as third 
parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578543163
 
 
   **[Test build #117428 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117428/testReport)**
 for PR 27355 at commit 
[`39e4bd2`](https://github.com/apache/spark/commit/39e4bd264b26c7840d7f1815b926c28837a50889).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on issue #27355: [SPARK-30625][SQL] Support `escape` as third parameter of the `like` function

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27355: [SPARK-30625][SQL] Support `escape` 
as third parameter of the `like` function
URL: https://github.com/apache/spark/pull/27355#issuecomment-578522206
 
 
   **[Test build #117428 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/117428/testReport)**
 for PR 27355 at commit 
[`39e4bd2`](https://github.com/apache/spark/commit/39e4bd264b26c7840d7f1815b926c28837a50889).



[GitHub] [spark] asfgit closed pull request #26957: [SPARK-30314] Add identifier and catalog information to DataSourceV2Relation

2020-01-26 Thread GitBox

asfgit closed pull request #26957: [SPARK-30314] Add identifier and catalog 
information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957
 
 
   



[GitHub] [spark] dilipbiswal commented on issue #27289: [SPARK-30581][DOC] Document SORT BY Clause of SELECT statement in SQLReference

2020-01-26 Thread GitBox

dilipbiswal commented on issue #27289: [SPARK-30581][DOC] Document SORT BY 
Clause of SELECT statement in SQLReference
URL: https://github.com/apache/spark/pull/27289#issuecomment-578528636
 
 
   @maropu I tried documenting this in the main description section like 
this:
   
   `The SORT BY clause is used to return the result rows sorted within each 
partition in the user-specified order. When there is more than one partition, 
SORT BY may return a result that is only partially ordered. This differs from 
the ORDER BY clause, which guarantees a total order of the output.`
   
   What do you think?
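
   The distinction in the proposed wording can be illustrated with a minimal 
sketch in plain Python rather than Spark. The two-partition data and the helper 
names `sort_by` and `order_by` are hypothetical and only mimic the semantics 
described above; they are not Spark APIs.

```python
from typing import List


def sort_by(partitions: List[List[int]]) -> List[List[int]]:
    """Sort rows inside each partition independently (SORT BY-like)."""
    return [sorted(p) for p in partitions]


def order_by(partitions: List[List[int]]) -> List[int]:
    """Sort all rows across partitions into one total order (ORDER BY-like)."""
    return sorted(row for p in partitions for row in p)


# Hypothetical data split across two partitions.
partitions = [[5, 1, 9], [4, 8, 2]]

within = sort_by(partitions)
# Reading the partitions back one after another gives a partial order only.
concatenated = [row for p in within for row in p]
total = order_by(partitions)

print(within)        # [[1, 5, 9], [2, 4, 8]]
print(concatenated)  # [1, 5, 9, 2, 4, 8]
print(total)         # [1, 2, 4, 5, 8, 9]
```

   Each partition is sorted on its own, so the concatenated output is ordered 
only within each partition, whereas the total sort orders every row.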
   



[GitHub] [spark] SparkQA removed a comment on issue #27292: [SPARK-30582][WEBUI] Spark UI is not showing Aggregated Metrics by Executor in stage page

2020-01-26 Thread GitBox

SparkQA removed a comment on issue #27292: [SPARK-30582][WEBUI] Spark UI is not 
showing Aggregated Metrics by Executor in stage page
URL: https://github.com/apache/spark/pull/27292#issuecomment-578518568
 
 
   **[Test build #4994 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4994/testReport)**
 for PR 27292 at commit 
[`1efc3f5`](https://github.com/apache/spark/commit/1efc3f55e55e40b0fb1527938317482f0fb78cfa).



[GitHub] [spark] SparkQA commented on issue #27292: [SPARK-30582][WEBUI] Spark UI is not showing Aggregated Metrics by Executor in stage page

2020-01-26 Thread GitBox

SparkQA commented on issue #27292: [SPARK-30582][WEBUI] Spark UI is not showing 
Aggregated Metrics by Executor in stage page
URL: https://github.com/apache/spark/pull/27292#issuecomment-578526618
 
 
   **[Test build #4994 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4994/testReport)**
 for PR 27292 at commit 
[`1efc3f5`](https://github.com/apache/spark/commit/1efc3f55e55e40b0fb1527938317482f0fb78cfa).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to 
seed when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578525325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578525328
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22188/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27354: [SPARK-30633][SQL] Append L to 
seed when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578525328
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/22188/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578525325
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578525218
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578525218
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

AmplabJenkins removed a comment on issue #27365: [MINOR][SQL] Convert CRLF into 
LF in source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578525220
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117425/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in source files

2020-01-26 Thread GitBox

AmplabJenkins commented on issue #27365: [MINOR][SQL] Convert CRLF into LF in 
source files
URL: https://github.com/apache/spark/pull/27365#issuecomment-578525220
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/117425/
   Test PASSed.



[GitHub] [spark] patrickcording commented on issue #27354: [SPARK-30633][SQL] Append L to seed when type is LongType

2020-01-26 Thread GitBox

patrickcording commented on issue #27354: [SPARK-30633][SQL] Append L to seed 
when type is LongType
URL: https://github.com/apache/spark/pull/27354#issuecomment-578525003
 
 
   @srowen, @dongjoon-hyun, I extended the first test to also run with integer 
seeds and with a mix of integer and long seeds. I also extended `testHash` to 
explicitly use a long seed when hashing all sorts of inputs.


