[GitHub] [spark] AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer 
cancel job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530889195
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110519/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer 
cancel job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530889190
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25743: [SPARK-29036][SQL]SparkThriftServer 
cancel job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530878720
 
 
   **[Test build #110519 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110519/testReport)**
 for PR 25743 at commit 
[`202f5ee`](https://github.com/apache/spark/commit/202f5eef963820af574bcdfad62da4e00255d8ba).





[GitHub] [spark] SparkQA commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
SparkQA commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel 
job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530889036
 
 
   **[Test build #110519 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110519/testReport)**
 for PR 25743 at commit 
[`202f5ee`](https://github.com/apache/spark/commit/202f5eef963820af574bcdfad62da4e00255d8ba).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] dilipbiswal commented on issue #25554: [SPARK-28796][DOC]Document DROP DATABASE statement in SQL Reference

2019-09-12 Thread GitBox
dilipbiswal commented on issue #25554: [SPARK-28796][DOC]Document DROP DATABASE 
statement in SQL Reference
URL: https://github.com/apache/spark/pull/25554#issuecomment-530887863
 
 
   LGTM





[GitHub] [spark] cloud-fan commented on a change in pull request #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25766: [SPARK-29061][SQL] 
Prints bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#discussion_r323816539
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/debug/DebuggingSuite.scala
 ##
 @@ -90,4 +90,30 @@ class DebuggingSuite extends SharedSparkSession {
 | id LongType: {}
 |""".stripMargin))
   }
+
+  test("Prints bytecode statistics in debugCodegen") {
+Seq(("SELECT sum(v) FROM VALUES(1) t(v)", (0, 0)),
+  // We expect HashAggregate uses an inner class for fast hash maps
+  // in partial aggregates with keys.
 
 Review comment:
   I'd like to avoid end-to-end tests in this case. It's highly coupled with 
how we codegen these operators and is easy to break if we change the 
implementation in the future.
   
   Can we add a unit test that calls `CodeGenerator.compile` directly?





[GitHub] [spark] cloud-fan commented on a change in pull request #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25766: [SPARK-29061][SQL] 
Prints bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#discussion_r323814030
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ##
 @@ -1336,11 +1347,13 @@ object CodeGenerator extends Logging {
 val codeAttr = 
Utils.classForName("org.codehaus.janino.util.ClassFile$CodeAttribute")
 val codeAttrField = codeAttr.getDeclaredField("code")
 codeAttrField.setAccessible(true)
-val codeSizes = classes.flatMap { case (_, classBytes) =>
-  
CodegenMetrics.METRIC_GENERATED_CLASS_BYTECODE_SIZE.update(classBytes.length)
+val codeStats = classes.map { case (_, classBytes) =>
 
 Review comment:
   I would like to make the code more readable, by
   ```
   val (classSizes, maxMethodSizes, constPoolSize) = classes.map { ... }.unzip3
   ByteCodeStats(
     maxClassCodeSize = classSizes.max,
     maxMethodCodeSize = maxMethodSizes.max,
     maxConstPoolSize = constPoolSize.max,
     // Minus 2 for `GeneratedClass` and an outer-most generated class
     numInnerClasses = classSizes.size - 2)
   ```
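
The suggestion above can be tried outside Spark with a minimal, self-contained sketch. The `ByteCodeStats` case class below is a hypothetical stand-in whose field names are taken from the review comment, and `summarize` together with its input triples are assumptions made for illustration; how the per-class numbers are actually collected in `CodeGenerator` is not shown.

```scala
// Hypothetical stand-in for the ByteCodeStats class discussed in this PR;
// field names follow the review comment, everything else is assumed.
final case class ByteCodeStats(
    maxClassCodeSize: Int,
    maxMethodCodeSize: Int,
    maxConstPoolSize: Int,
    numInnerClasses: Int)

object ByteCodeStatsSketch {
  // Each input triple is (class size, largest method size, constant pool size)
  // for one compiled class. Obtaining those numbers is out of scope here.
  def summarize(perClass: Seq[(Int, Int, Int)]): ByteCodeStats = {
    require(perClass.nonEmpty, "need at least one compiled class")
    // unzip3 splits the sequence of triples into three parallel sequences.
    val (classSizes, maxMethodSizes, constPoolSizes) = perClass.unzip3
    ByteCodeStats(
      maxClassCodeSize = classSizes.max,
      maxMethodCodeSize = maxMethodSizes.max,
      maxConstPoolSize = constPoolSizes.max,
      // Minus 2 for `GeneratedClass` and the outer-most generated class,
      // as in the review comment.
      numInnerClasses = classSizes.size - 2)
  }
}
```

With three compiled classes as input, `numInnerClasses` comes out as 1, since two of the three are counted as `GeneratedClass` and the outer-most generated class.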





[GitHub] [spark] cloud-fan commented on issue #25751: [SPARK-29042][Core] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-12 Thread GitBox
cloud-fan commented on issue #25751: [SPARK-29042][Core] Sampling-based RDD 
with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25751#issuecomment-530882523
 
 
   Makes sense, LGTM.





[GitHub] [spark] cloud-fan commented on a change in pull request #25751: [SPARK-29042][Core] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25751: [SPARK-29042][Core] 
Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25751#discussion_r323810510
 
 

 ##
 File path: 
core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
 ##
 @@ -2779,6 +2779,45 @@ class DAGSchedulerSuite extends SparkFunSuite with 
LocalSparkContext with TimeLi
   .contains("Spark cannot rollback the ShuffleMapStage 1"))
   }
 
+  test("SPARK-29042: Sampled RDD with unordered input should be 
indeterminate") {
+val shuffleMapRdd1 = new MyRDD(sc, 2, Nil, indeterminate = false)
+
+val shuffleDep1 = new ShuffleDependency(shuffleMapRdd1, new 
HashPartitioner(2))
+val shuffleId1 = shuffleDep1.shuffleId
+val shuffleMapRdd2 = new MyRDD(sc, 2, List(shuffleDep1), tracker = 
mapOutputTracker)
+
+assert(shuffleMapRdd2.outputDeterministicLevel == 
DeterministicLevel.UNORDERED)
+
+val sampledRdd = shuffleMapRdd2.sample(true, 0.3, 1000L)
+assert(sampledRdd.outputDeterministicLevel == 
DeterministicLevel.INDETERMINATE)
 
 Review comment:
   I think we can stop here. We already have enough test coverage for reruns 
when the RDD is INDETERMINATE. We just need to prove that the sampled RDD with 
unordered input is INDETERMINATE.





[GitHub] [spark] alaazbair edited a comment on issue #25682: [SPARK-28842][DOC]Cleanup the formatting/trailing spaces in the K8s integration testing guide

2019-09-12 Thread GitBox
alaazbair edited a comment on issue #25682: [SPARK-28842][DOC]Cleanup the 
formatting/trailing spaces in the K8s integration testing guide
URL: https://github.com/apache/spark/pull/25682#issuecomment-528546021
 
 
   @holdenk Could you please review my PR?





[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] 
Support passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#discussion_r323808715
 
 

 ##
 File path: 
external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroDataSourceV2.scala
 ##
 @@ -35,7 +36,10 @@ class AvroDataSourceV2 extends FileDataSourceV2 {
 AvroTable(tableName, sparkSession, options, paths, None, 
fallbackFileFormat)
   }
 
-  override def getTable(options: CaseInsensitiveStringMap, schema: 
StructType): Table = {
+  override def getTable(
+  options: CaseInsensitiveStringMap,
+  schema: StructType,
+  partitions: Array[Transform]): Table = {
 
 Review comment:
   Or we can make table properties case insensitive.





[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] 
Support passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#discussion_r323808342
 
 

 ##
 File path: 
external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroDataSourceV2.scala
 ##
 @@ -35,7 +36,10 @@ class AvroDataSourceV2 extends FileDataSourceV2 {
 AvroTable(tableName, sparkSession, options, paths, None, 
fallbackFileFormat)
   }
 
-  override def getTable(options: CaseInsensitiveStringMap, schema: 
StructType): Table = {
+  override def getTable(
+  options: CaseInsensitiveStringMap,
+  schema: StructType,
+  partitions: Array[Transform]): Table = {
 
 Review comment:
   But we do have a problem here. Table properties are case sensitive while 
scan options are case insensitive.
   
   Think about 2 cases:
   1. `spark.read.format("myFormat").options(...).schema(...).load()`.
   We need to get the table with the user-specified options and schema. When 
scanning the table, we need to use the user-specified options as scan options. 
The problem is that `DataFrameReader.options` specifies both table properties 
and scan options in this case.
   2. `CREATE TABLE t USING myFormat TABLEPROP ...` and then 
`spark.read.options(...).table("t")`.
   In this case, `DataFrameReader.options` only specifies scan options.
   
   Ideally, `TableProvider.getTable` takes table properties, which should be 
case sensitive. However, `DataFrameReader.options` also specifies scan options, 
which should be case insensitive.
   
   I don't have a good idea now. Maybe it's OK to treat this as a special table 
that accepts case-insensitive table properties.
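
The mismatch described above can be illustrated with a small, self-contained sketch. `CaseInsensitiveOptions` and `OptionsVsProperties` are hypothetical names invented for this example and are not Spark's `CaseInsensitiveStringMap`; the sketch only assumes that table properties behave like a plain case-sensitive map and that scan options are looked up with lower-cased keys.

```scala
import java.util.Locale

// Hypothetical illustration only; this is not Spark's CaseInsensitiveStringMap.
final class CaseInsensitiveOptions private (lowerCased: Map[String, String]) {
  // Lookups ignore the case of the requested key.
  def get(key: String): Option[String] =
    lowerCased.get(key.toLowerCase(Locale.ROOT))
  def size: Int = lowerCased.size
}

object CaseInsensitiveOptions {
  // Keys that differ only by case collapse into one entry; the later one wins.
  def apply(entries: (String, String)*): CaseInsensitiveOptions =
    new CaseInsensitiveOptions(
      entries.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }.toMap)
}

object OptionsVsProperties {
  // Table properties modeled as a plain, case-sensitive map:
  // "Path" and "path" are two distinct keys.
  val tableProperties: Map[String, String] =
    Map("Path" -> "/data/a", "path" -> "/data/b")

  // The same entries viewed as scan options: only one key survives.
  val scanOptions: CaseInsensitiveOptions =
    CaseInsensitiveOptions(tableProperties.toSeq: _*)
}
```

Under these assumptions, passing case-sensitive table properties through a case-insensitive container silently drops one of the two entries, which is the kind of information loss the comment is concerned about.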





[GitHub] [spark] viirya commented on issue #25751: [SPARK-29042][Core] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-12 Thread GitBox
viirya commented on issue #25751: [SPARK-29042][Core] Sampling-based RDD with 
unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25751#issuecomment-530879739
 
 
   It is a problem in ML applications.
   
   In ML, sampling is used to prepare training data, and the ML algorithm fits 
the model on the sampled data. If rerun sample tasks produce different output 
during model fitting, the ML results will be unreliable and buggy.
   
   Each sample is random, but once you have sampled, the output should be 
determinate.
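
A minimal sketch of why input order matters, assuming a simplified fixed-seed Bernoulli sampler. `SampleOrderSketch` is a hypothetical helper written for this illustration and is not Spark's sampler implementation: the kept positions depend only on the seed and the partition size, so a rerun that receives the same rows in a different order keeps different rows.

```scala
import scala.util.Random

object SampleOrderSketch {
  // Indices kept by a fixed-seed Bernoulli sampler over n elements.
  // They depend only on the seed and n, never on the element values.
  def keptPositions(n: Int, fraction: Double, seed: Long): Seq[Int] = {
    val rng = new Random(seed)
    (0 until n).filter(_ => rng.nextDouble() < fraction)
  }

  // Sample a partition whose rows arrive in the given order.
  def sample[A](input: Seq[A], fraction: Double, seed: Long): Seq[A] = {
    val indexed = input.toIndexedSeq
    keptPositions(indexed.size, fraction, seed).map(indexed)
  }
}
```

With the same seed and the same arrival order the sample is reproducible. If a rerun task sees the rows in another order (for example after a shuffle fetch), the same positions are kept but they now point at different rows.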
   
   





[GitHub] [spark] SparkQA commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
SparkQA commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel 
job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530878720
 
 
   **[Test build #110519 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110519/testReport)**
 for PR 25743 at commit 
[`202f5ee`](https://github.com/apache/spark/commit/202f5eef963820af574bcdfad62da4e00255d8ba).





[GitHub] [spark] AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer 
cancel job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530877843
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15494/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25743: 
[SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread 
interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530877843
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15494/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25743: [SPARK-29036][SQL]SparkThriftServer 
cancel job after execute() thread interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530877832
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25743: 
[SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread 
interrupted
URL: https://github.com/apache/spark/pull/25743#issuecomment-530877832
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530870831
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110515/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530870821
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530870821
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530801110
 
 
   **[Test build #110515 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110515/testReport)**
 for PR 25766 at commit 
[`fa4234c`](https://github.com/apache/spark/commit/fa4234c0cbdb8aaeb1360d7565f6db5eebe87f30).





[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530870831
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110515/
   Test FAILed.





[GitHub] [spark] SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530870441
 
 
   **[Test build #110515 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110515/testReport)**
 for PR 25766 at commit 
[`fa4234c`](https://github.com/apache/spark/commit/fa4234c0cbdb8aaeb1360d7565f6db5eebe87f30).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class ByteCodeStats(`
 * `   * Returns the bytecode statistics (max class bytecode size, max 
method bytecode size,`





[GitHub] [spark] mgaido91 commented on a change in pull request #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
mgaido91 commented on a change in pull request #25766: [SPARK-29061][SQL] 
Prints bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#discussion_r323786212
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/debug/package.scala
 ##
 @@ -81,11 +82,14 @@ package object debug {
   def writeCodegen(append: String => Unit, plan: SparkPlan): Unit = {
 val codegenSeq = codegenStringSeq(plan)
 append(s"Found ${codegenSeq.size} WholeStageCodegen subtrees.\n")
-for (((subtree, code), i) <- codegenSeq.zipWithIndex) {
-  append(s"== Subtree ${i + 1} / ${codegenSeq.size} ==\n")
+for (((subtree, code, codeStats), i) <- codegenSeq.zipWithIndex) {
+  val codeStatsStr = s"maxClassCodeSize:${codeStats.maxClassCodeSize} " +
 
 Review comment:
   thank you





[GitHub] [spark] kiszk commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` to org.apache.spark.sql.types package

2019-09-12 Thread GitBox
kiszk commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` to 
org.apache.spark.sql.types package
URL: https://github.com/apache/spark/pull/25022#issuecomment-530862520
 
 
   ping @priyankagargnitk





[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] support data source v2 in CREATE TABLE USING

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] 
support data source v2 in CREATE TABLE USING
URL: https://github.com/apache/spark/pull/25651#discussion_r323785172
 
 

 ##
 File path: 
external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroDataSourceV2.scala
 ##
 @@ -35,7 +36,10 @@ class AvroDataSourceV2 extends FileDataSourceV2 {
 AvroTable(tableName, sparkSession, options, paths, None, 
fallbackFileFormat)
   }
 
-  override def getTable(options: CaseInsensitiveStringMap, schema: 
StructType): Table = {
+  override def getTable(
+  options: CaseInsensitiveStringMap,
+  schema: StructType,
+  partitions: Array[Transform]): Table = {
 
 Review comment:
   Read options should be passed in `Table.newScanBuilder`. The `options` here 
are the table properties.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25734: [SPARK-28939][SQL][2.4] 
Propagate SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530859446
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25734: [SPARK-28939][SQL][2.4] 
Propagate SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530859453
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110512/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate 
SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530859453
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110512/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate 
SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530859446
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25734: [SPARK-28939][SQL][2.4] Propagate 
SQLConf for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530767235
 
 
   **[Test build #110512 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110512/testReport)** for PR 25734 at commit [`1b145e2`](https://github.com/apache/spark/commit/1b145e2158679dc27fce07a8ddf17f6341175afe).





[GitHub] [spark] SparkQA commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf for plans executed by toRdd

2019-09-12 Thread GitBox
SparkQA commented on issue #25734: [SPARK-28939][SQL][2.4] Propagate SQLConf 
for plans executed by toRdd
URL: https://github.com/apache/spark/pull/25734#issuecomment-530858897
 
 
   **[Test build #110512 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110512/testReport)** for PR 25734 at commit [`1b145e2`](https://github.com/apache/spark/commit/1b145e2158679dc27fce07a8ddf17f6341175afe).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25774: [SPARK-29069][SQL] 
ResolveInsertInto should not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530852507
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15493/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25774: [SPARK-29069][SQL] 
ResolveInsertInto should not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530852496
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
SparkQA commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should 
not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530853522
 
 
   **[Test build #110518 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110518/testReport)** for PR 25774 at commit [`2753af5`](https://github.com/apache/spark/commit/2753af5c2adbbb0c27decda3afb7a06e8ff1a31f).





[GitHub] [spark] AmplabJenkins commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto 
should not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530852507
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15493/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto 
should not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530852496
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] cloud-fan commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
cloud-fan commented on issue #25774: [SPARK-29069][SQL] ResolveInsertInto 
should not do table lookup
URL: https://github.com/apache/spark/pull/25774#issuecomment-530850602
 
 
   cc @brkyvz @rdblue @gengliangwang @HyukjinKwon 





[GitHub] [spark] cloud-fan commented on a change in pull request #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25774: [SPARK-29069][SQL] 
ResolveInsertInto should not do table lookup
URL: https://github.com/apache/spark/pull/25774#discussion_r323770221
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -671,6 +671,15 @@ class Analyzer(
             case scala.Right(tableOpt) => tableOpt
           }
           v2TableOpt.map(DataSourceV2Relation.create).getOrElse(u)
+
+      case i @ InsertIntoStatement(u: UnresolvedRelation, _, _, _, _) if i.query.resolved =>
 
 Review comment:
   Similar to `ResolveRelations`, `ResolveTables` should handle both `UnresolvedRelation` and `InsertIntoStatement(UnresolvedRelation, ...)`.





[GitHub] [spark] cloud-fan commented on a change in pull request #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25774: [SPARK-29069][SQL] 
ResolveInsertInto should not do table lookup
URL: https://github.com/apache/spark/pull/25774#discussion_r323770329
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ##
 @@ -785,41 +794,28 @@ class Analyzer(
 
   object ResolveInsertInto extends Rule[LogicalPlan] {
     override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
-      case i @ InsertIntoStatement(u: UnresolvedRelation, _, _, _, _) if i.query.resolved =>
-        lookupV2Relation(u.multipartIdentifier) match {
-          case scala.Left((_, _, Some(v2Table: Table))) =>
-            resolveV2Insert(i, v2Table)
-          case scala.Right(Some(v2Table: Table)) =>
-            resolveV2Insert(i, v2Table)
-          case _ =>
-            i
+      case i @ InsertIntoStatement(r: DataSourceV2Relation, _, _, _, _) if i.query.resolved =>
+        // ifPartitionNotExists is append with validation, but validation is not supported
 
 Review comment:
   just indentation changes.





[GitHub] [spark] cloud-fan opened a new pull request #25774: [SPARK-29069][SQL] ResolveInsertInto should not do table lookup

2019-09-12 Thread GitBox
cloud-fan opened a new pull request #25774: [SPARK-29069][SQL] 
ResolveInsertInto should not do table lookup
URL: https://github.com/apache/spark/pull/25774
 
 
   
   
   ### What changes were proposed in this pull request?
   
   
   It's clearer to do table lookup only in the `ResolveTables` rule (for v2 tables) and the `ResolveRelations` rule (for v1 tables). `ResolveInsertInto` should only resolve an `InsertIntoStatement` whose relation is already resolved.
   
   ### Why are the changes needed?
   
   to make the code simpler
   
   ### Does this PR introduce any user-facing change?
   
   no
   
   ### How was this patch tested?
   
   existing tests
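
   The split described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the plan types, the `catalog` map, and the `ToyAnalyzer` object are simplified stand-ins invented for illustration, not Spark's Catalyst classes or the code in this PR.

```scala
// Toy model of the rule split; these types are simplified stand-ins,
// not Spark's real Catalyst classes.
sealed trait Plan
case class UnresolvedRelation(name: String) extends Plan
case class V2Relation(name: String) extends Plan
case class InsertInto(table: Plan, query: Plan) extends Plan
case class AppendData(table: V2Relation, query: Plan) extends Plan
case class Values(rows: Int) extends Plan

object ToyAnalyzer {
  // Hypothetical catalog: the only place where table names are looked up.
  val catalog: Map[String, V2Relation] = Map("t" -> V2Relation("t"))

  // Returns the resolved relation, or the unresolved one if the name is unknown.
  private def lookup(u: UnresolvedRelation): Plan = catalog.getOrElse(u.name, u)

  // Lookup rule: handles a bare relation and a relation used as an insert target.
  def resolveTables(plan: Plan): Plan = plan match {
    case u: UnresolvedRelation => lookup(u)
    case InsertInto(u: UnresolvedRelation, q) => InsertInto(lookup(u), q)
    case other => other
  }

  // Insert rule: performs no lookup; it only rewrites inserts whose target
  // has already been resolved by the lookup rule.
  def resolveInsertInto(plan: Plan): Plan = plan match {
    case InsertInto(r: V2Relation, q) => AppendData(r, q)
    case other => other
  }

  def analyze(plan: Plan): Plan = resolveInsertInto(resolveTables(plan))
}
```

   In this sketch, an insert into an unknown table passes through `resolveInsertInto` unchanged, because that rule never consults the catalog itself.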





[GitHub] [spark] SparkQA commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
SparkQA commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530843065
 
 
   **[Test build #110517 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110517/testReport)** for PR 25690 at commit [`5b2766a`](https://github.com/apache/spark/commit/5b2766a2259e7f8b776f97e4b152562069796e18).





[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530842326
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15492/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530842315
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530842326
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15492/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530842315
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
SparkQA commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530839668
 
 
   **[Test build #110516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110516/testReport)** for PR 25690 at commit [`5b2766a`](https://github.com/apache/spark/commit/5b2766a2259e7f8b776f97e4b152562069796e18).





[GitHub] [spark] wangyum commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
wangyum commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530839400
 
 
   retest this please





[GitHub] [spark] AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] 
Extend `EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530837299
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] 
Extend `EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530837311
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110511/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530837311
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110511/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530837299
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] srowen commented on issue #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-09-12 Thread GitBox
srowen commented on issue #22282: [SPARK-23539][SS] Add support for Kafka 
headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#issuecomment-530836992
 
 
   I think it's OK @dongjinleekr, it just needs a rebase now.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530836651
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110514/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530836638
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530759715
 
 
   **[Test build #110511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110511/testReport)** for PR 25772 at commit [`766110a`](https://github.com/apache/spark/commit/766110ad07bc1a9911b80a179033df6ad9c924fb).





[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530836651
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110514/
   Test FAILed.





[GitHub] [spark] SparkQA commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
SparkQA commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` 
benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530836603
 
 
   **[Test build #110511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110511/testReport)** for PR 25772 at commit [`766110a`](https://github.com/apache/spark/commit/766110ad07bc1a9911b80a179033df6ad9c924fb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530836638
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
SparkQA commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530836226
 
 
   **[Test build #110514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110514/testReport)** for PR 25690 at commit [`5b2766a`](https://github.com/apache/spark/commit/5b2766a2259e7f8b776f97e4b152562069796e18).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to 
local file
URL: https://github.com/apache/spark/pull/25690#issuecomment-530788398
 
 
   **[Test build #110514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110514/testReport)** for PR 25690 at commit [`5b2766a`](https://github.com/apache/spark/commit/5b2766a2259e7f8b776f97e4b152562069796e18).





[GitHub] [spark] srowen commented on a change in pull request #25770: [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics

2019-09-12 Thread GitBox
srowen commented on a change in pull request #25770: [SPARK-29064][CORE] Add 
PrometheusResource to export Executor metrics
URL: https://github.com/apache/spark/pull/25770#discussion_r323752067
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ui/SparkUI.scala
 ##
 @@ -66,6 +66,7 @@ private[spark] class SparkUI private (
 addStaticHandler(SparkUI.STATIC_RESOURCE_DIR)
 attachHandler(createRedirectHandler("/", "/jobs/", basePath = basePath))
 attachHandler(ApiRootResource.getServletHandler(this))
+attachHandler(PrometheusResource.getServletHandler(this))
 
 Review comment:
   Should this be more optional, like only attached if one configures something 
to use Prometheus?
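
   One way to read this suggestion, as a minimal sketch: gate the extra handler behind a boolean setting so that it is attached only when a user opts in. Everything below (`Handler`, `UiSketch`, the `example.ui.prometheus.enabled` key and the `/prometheus` path) is a hypothetical stand-in for illustration, not code from this PR or Spark's actual API.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's real classes.
final case class Handler(path: String)

object UiSketch {
  // Assumed config key, made up for this sketch.
  val PrometheusEnabledKey = "example.ui.prometheus.enabled"

  /** Returns the handlers that would be attached for the given settings. */
  def handlersFor(conf: Map[String, String]): Seq[Handler] = {
    // Handlers that are always attached, regardless of configuration.
    val always = Seq(Handler("/static"), Handler("/api"))
    // Opt-in: the Prometheus handler is added only when explicitly enabled.
    val enabled = conf.get(PrometheusEnabledKey).exists(_.trim.equalsIgnoreCase("true"))
    if (enabled) always :+ Handler("/prometheus") else always
  }
}
```

   With an empty configuration, `UiSketch.handlersFor` returns only the two default handlers; the third appears only when the key is set to `true`.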



[GitHub] [spark] srowen commented on a change in pull request #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver

2019-09-12 Thread GitBox
srowen commented on a change in pull request #25769: [SPARK-29032][CORE] Add 
PrometheusServlet to monitor Master/Worker/Driver
URL: https://github.com/apache/spark/pull/25769#discussion_r323751789
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/metrics/MetricsConfig.scala
 ##
 @@ -43,6 +43,12 @@ private[spark] class MetricsConfig(conf: SparkConf) extends 
Logging {
 prop.setProperty("*.sink.servlet.path", "/metrics/json")
 prop.setProperty("master.sink.servlet.path", "/metrics/master/json")
 prop.setProperty("applications.sink.servlet.path", 
"/metrics/applications/json")
+
+prop.setProperty("*.sink.prometheusServlet.class",
 
 Review comment:
   Actually one last question - does this cause prometheusServlet to always be 
configured and available? I had thought this would be more opt-in, i.e. that it 
wouldn't be on by default unless configured. Just checking whether that's true.



[GitHub] [spark] cloud-fan commented on issue #25764: [SPARK-29060][SQL] Add tree traversal helper for adaptive spark plans

2019-09-12 Thread GitBox
cloud-fan commented on issue #25764: [SPARK-29060][SQL] Add tree traversal 
helper for adaptive spark plans
URL: https://github.com/apache/spark/pull/25764#issuecomment-530835150
 
 
   thanks, merging to master!



[GitHub] [spark] cloud-fan closed pull request #25764: [SPARK-29060][SQL] Add tree traversal helper for adaptive spark plans

2019-09-12 Thread GitBox
cloud-fan closed pull request #25764: [SPARK-29060][SQL] Add tree traversal 
helper for adaptive spark plans
URL: https://github.com/apache/spark/pull/25764
 
 
   



[GitHub] [spark] srowen commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver

2019-09-12 Thread GitBox
srowen commented on issue #25769: [SPARK-29032][CORE] Add PrometheusServlet to 
monitor Master/Worker/Driver
URL: https://github.com/apache/spark/pull/25769#issuecomment-530834494
 
 
   @itsvikramagr the difference is that that change actually added Prometheus 
and all its dependencies to Spark. This just uses Prometheus if it's present.



[GitHub] [spark] cloud-fan commented on a change in pull request #25764: [SPARK-29060][SQL] Add tree traversal helper for adaptive spark plans

2019-09-12 Thread GitBox
cloud-fan commented on a change in pull request #25764: [SPARK-29060][SQL] Add 
tree traversal helper for adaptive spark plans
URL: https://github.com/apache/spark/pull/25764#discussion_r323748844
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanHelper.scala
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.adaptive
+
+import org.apache.spark.sql.execution.SparkPlan
+
+/**
+ * This class provides utility methods related to tree traversal of an 
[[AdaptiveSparkPlanExec]]
+ * plan. Unlike their counterparts in 
[[org.apache.spark.sql.catalyst.trees.TreeNode]] or
+ * [[org.apache.spark.sql.catalyst.plans.QueryPlan]], these methods traverse 
down leaf nodes of
+ * adaptive plans, i.e., [[AdaptiveSparkPlanExec]] and [[QueryStageExec]].
+ */
+trait AdaptiveSparkPlanHelper {
+
+  /**
+   * Find the first [[SparkPlan]] that satisfies the condition specified by 
`f`.
+   * The condition is recursively applied to this node and all of its children 
(pre-order).
+   */
+  def find(p: SparkPlan)(f: SparkPlan => Boolean): Option[SparkPlan] = if 
(f(p)) {
+Some(p)
+  } else {
+allChildren(p).foldLeft(Option.empty[SparkPlan]) { (l, r) => 
l.orElse(find(r)(f)) }
+  }
+
+  /**
+   * Runs the given function on this node and then recursively on children.
+   * @param f the function to be applied to each node in the tree.
+   */
+  def foreach(p: SparkPlan)(f: SparkPlan => Unit): Unit = {
+f(p)
+allChildren(p).foreach(foreach(_)(f))
+  }
+
+  /**
+   * Runs the given function recursively on children then on this node.
+   * @param f the function to be applied to each node in the tree.
+   */
+  def foreachUp(p: SparkPlan)(f: SparkPlan => Unit): Unit = {
+allChildren(p).foreach(foreachUp(_)(f))
+f(p)
+  }
+
+  /**
+   * Returns a Seq containing the result of applying the given function to each
+   * node in this tree in a preorder traversal.
+   * @param f the function to be applied.
+   */
+  def map[A](p: SparkPlan)(f: SparkPlan => A): Seq[A] = {
+val ret = new collection.mutable.ArrayBuffer[A]()
+foreach(p)(ret += f(_))
+ret
+  }
+
+  /**
+   * Returns a Seq by applying a function to all nodes in this tree and using 
the elements of the
+   * resulting collections.
+   */
+  def flatMap[A](p: SparkPlan)(f: SparkPlan => TraversableOnce[A]): Seq[A] = {
+val ret = new collection.mutable.ArrayBuffer[A]()
+foreach(p)(ret ++= f(_))
+ret
+  }
+
+  /**
+   * Returns a Seq containing the result of applying a partial function to all 
elements in this
+   * tree on which the function is defined.
+   */
+  def collect[B](p: SparkPlan)(pf: PartialFunction[SparkPlan, B]): Seq[B] = {
+val ret = new collection.mutable.ArrayBuffer[B]()
+val lifted = pf.lift
+foreach(p)(node => lifted(node).foreach(ret.+=))
+ret
+  }
+
+  /**
+   * Returns a Seq containing the leaves in this tree.
+   */
+  def collectLeaves(p: SparkPlan): Seq[SparkPlan] = {
+collect(p) { case plan if allChildren(plan).isEmpty => plan }
+  }
+
+  /**
+   * Finds and returns the first [[SparkPlan]] of the tree for which the given 
partial function
+   * is defined (pre-order), and applies the partial function to it.
+   */
+  def collectFirst[B](p: SparkPlan)(pf: PartialFunction[SparkPlan, B]): 
Option[B] = {
+val lifted = pf.lift
+lifted(p).orElse {
+  allChildren(p).foldLeft(Option.empty[B]) { (l, r) => 
l.orElse(collectFirst(r)(pf)) }
+}
+  }
+
+  /**
+   * Returns a sequence containing the result of applying a partial function 
to all elements in this
+   * plan, also considering all the plans in its (nested) subqueries
+   */
+  def collectInPlanAndSubqueries[B](p: SparkPlan)(f: 
PartialFunction[SparkPlan, B]): Seq[B] = {
+(p +: subqueriesAll(p)).flatMap(collect(_)(f))
+  }
+
+  /**
+   * Returns a sequence containing the subqueries in this plan, also including 
the (nested)
+   * subqueries in its children
+   */
+  def subqueriesAll(p: SparkPlan): Seq[SparkPlan] = {
 
 Review comment:
   nvm, this is consistent with `QueryPlan.subqueriesAll`
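
   For readers following the quoted hunk, here is a minimal toy sketch of the pre-order traversal it describes. `allChildren` is not shown in the hunk, so the sketch assumes it exposes the plan wrapped by an adaptive leaf node; `Node`, `Op`, `Wrapper` and `TraversalSketch` are hypothetical names, not Spark classes.

```scala
// Toy model, not Spark's classes: `Wrapper` plays the role of a leaf node
// (such as an adaptive plan or a query stage) that hides an inner plan.
sealed trait Node { def name: String; def children: Seq[Node] }
final case class Op(name: String, children: Seq[Node] = Nil) extends Node
final case class Wrapper(name: String, inner: Node) extends Node {
  val children: Seq[Node] = Nil // looks like a leaf to an ordinary traversal
}

object TraversalSketch {
  // Assumption: the helper's `allChildren` exposes the hidden inner plan.
  def allChildren(n: Node): Seq[Node] = n match {
    case Wrapper(_, inner) => Seq(inner)
    case other => other.children
  }

  // Pre-order find with the same shape as the quoted `find`:
  // `orElse` takes its argument by name, so later siblings are skipped after a hit.
  def find(n: Node)(f: Node => Boolean): Option[Node] =
    if (f(n)) Some(n)
    else allChildren(n).foldLeft(Option.empty[Node]) { (l, r) => l.orElse(find(r)(f)) }

  // Names of all nodes in pre-order visiting order.
  def names(n: Node): Seq[String] = n.name +: allChildren(n).flatMap(names)
}
```

   In this model, a search that starts at `Op("project", Seq(Wrapper("stage", Op("scan"))))` reaches `scan` even though `Wrapper.children` is empty.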


[GitHub] [spark] srowen commented on issue #25759: [SPARK-19147][CORE] netty throw NPE

2019-09-12 Thread GitBox
srowen commented on issue #25759: [SPARK-19147][CORE] netty throw NPE
URL: https://github.com/apache/spark/pull/25759#issuecomment-530829156
 
 
   Also please improve the title of this PR



[GitHub] [spark] srowen commented on a change in pull request #25759: [SPARK-19147][CORE] netty throw NPE

2019-09-12 Thread GitBox
srowen commented on a change in pull request #25759: [SPARK-19147][CORE] netty 
throw NPE
URL: https://github.com/apache/spark/pull/25759#discussion_r323742872
 
 

 ##
 File path: 
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
 ##
 @@ -192,7 +192,20 @@ public TransportClient createClient(String remoteHost, 
int remotePort)
   logger.info("Found inactive connection to {}, creating a new one.", 
resolvedAddress);
 }
   }
-  clientPool.clients[clientIndex] = createClient(resolvedAddress);
+  try {
+clientPool.clients[clientIndex] = createClient(resolvedAddress);
+  } catch (Exception e) {
+// createClient() is called by task and close() is called by executor.
+// When stop the executor, close() will set workerGroup = null,
+// NPE will occur in createClient which generate many exception in log.
+// For exception occurs after close(), treated it as an expected 
Exception
+// and transform it to InterruptedException which can be processed by 
Executor.
+// See SPARK-19147
+if (workerGroup == null) {
+  throw new InterruptedException(e.getMessage());
 
 Review comment:
   This is still going to generate an exception in the logs, no? Should it just 
be a log warning?
   I think this is too indirect. Why not throw `IllegalStateException` in 
`createClient` in this case and catch it specifically?
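
   A minimal sketch of what this suggestion might look like, written in Scala for brevity although the file under review is Java. `ClientFactorySketch` and `tryCreate` are hypothetical names, and the string-based "client" is a placeholder; this is a toy model, not the real `TransportClientFactory`.

```scala
// Toy model of the suggestion, not the real TransportClientFactory.
final class ClientFactorySketch {
  // Stands in for `workerGroup`; becomes None after close().
  @volatile private var workerGroup: Option[String] = Some("worker-group")

  def close(): Unit = {
    workerGroup = None
  }

  /** Fails with IllegalStateException, rather than an NPE, once closed. */
  def createClient(address: String): String = {
    val group = workerGroup.getOrElse(
      throw new IllegalStateException("factory is closed"))
    s"client($address via $group)"
  }
}

object ClientFactorySketch {
  /** Catches only the specific exception; any other failure still propagates. */
  def tryCreate(factory: ClientFactorySketch, address: String): Option[String] =
    try Some(factory.createClient(address))
    catch {
      case _: IllegalStateException =>
        // Expected during shutdown: a log warning could go here instead of a stack trace.
        None
    }
}
```

   The design point is that the closed state is reported by an explicit, narrowly typed exception, so the caller does not have to infer it from a generic `Exception` plus a null check.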
   



[GitHub] [spark] MaxGekk edited a comment on issue #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-12 Thread GitBox
MaxGekk edited a comment on issue #25716: [SPARK-29012][SQL] Support special 
timestamp values
URL: https://github.com/apache/spark/pull/25716#issuecomment-530826650
 
 
   I have some performance-related concerns about using the config. In the 
current implementation, the decision is pretty cheap: it just compares the first 
byte. With the config, we would need to retrieve it and compare its value with 
another string, which can add visible overhead even if PostgreSQL compatibility 
mode is turned off here 
https://github.com/apache/spark/pull/25716/files#diff-da60f07e1826788aaeb07f295fae4b8aR223
   
   Are you absolutely sure about using this config in the PR?



[GitHub] [spark] MaxGekk edited a comment on issue #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-12 Thread GitBox
MaxGekk edited a comment on issue #25716: [SPARK-29012][SQL] Support special 
timestamp values
URL: https://github.com/apache/spark/pull/25716#issuecomment-530826650
 
 
   I have some performance-related concerns about using the config. In the 
current implementation, the decision is pretty cheap: it just compares the first 
byte. With the config, we would need to retrieve it and compare its value with 
another string, which can add visible overhead even if PostgreSQL compatibility 
mode is turned off.
   
   Are you absolutely sure about using this config in the PR?



[GitHub] [spark] MaxGekk commented on issue #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-12 Thread GitBox
MaxGekk commented on issue #25716: [SPARK-29012][SQL] Support special timestamp 
values
URL: https://github.com/apache/spark/pull/25716#issuecomment-530826650
 
 
   I have some performance-related concerns about using the config. In the 
current implementation, the decision is pretty cheap: it just compares the first 
byte. With the config, we would need to retrieve it and compare its value with 
another string, which can add visible overhead even if PostgreSQL compatibility 
mode is turned off.



[GitHub] [spark] wangyum commented on a change in pull request #25743: [SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread interrupted

2019-09-12 Thread GitBox
wangyum commented on a change in pull request #25743: 
[SPARK-29036][SQL]SparkThriftServer cancel job after execute() thread 
interrupted
URL: https://github.com/apache/spark/pull/25743#discussion_r323740056
 
 

 ##
 File path: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
 ##
 @@ -267,6 +267,9 @@ private[hive] class SparkExecuteStatementOperation(
   // Actually do need to catch Throwable as some failures don't inherit 
from Exception and
   // HiveServer will silently swallow them.
   case e: Throwable =>
+if (statementId != null) {
 
 Review comment:
   Could we add a comment explaining why we need this change?



[GitHub] [spark] AmplabJenkins removed a comment on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25754: [SPARK-29048] Improve 
performance on Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530825209
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25754: [SPARK-29048] Improve 
performance on Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530825216
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110510/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25754: [SPARK-29048] Improve performance on 
Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530825216
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110510/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25754: [SPARK-29048] Improve performance on 
Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530825209
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] cloud-fan commented on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec

2019-09-12 Thread GitBox
cloud-fan commented on issue #25710: [SPARK-29008][SQL] Define an individual 
method for each common subexpression in HashAggregateExec
URL: https://github.com/apache/spark/pull/25710#issuecomment-530824963
 
 
   LGTM, cc @rednaxelafx to take another look



[GitHub] [spark] SparkQA removed a comment on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25754: [SPARK-29048] Improve performance on 
Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530751633
 
 
   **[Test build #110510 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110510/testReport)**
 for PR 25754 at commit 
[`ab3e5d4`](https://github.com/apache/spark/commit/ab3e5d4e5119ec05553c9a2f8cdf3b6544f699ed).



[GitHub] [spark] SparkQA commented on issue #25754: [SPARK-29048] Improve performance on Column.isInCollection() with a large size collection

2019-09-12 Thread GitBox
SparkQA commented on issue #25754: [SPARK-29048] Improve performance on 
Column.isInCollection() with a large size collection
URL: https://github.com/apache/spark/pull/25754#issuecomment-530824372
 
 
   **[Test build #110510 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110510/testReport)**
 for PR 25754 at commit 
[`ab3e5d4`](https://github.com/apache/spark/commit/ab3e5d4e5119ec05553c9a2f8cdf3b6544f699ed).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] 
Extend `EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530812586
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25772: [SPARK-29065][SQL][TEST] 
Extend `EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530812593
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110509/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530812593
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110509/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530812586
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
SparkQA removed a comment on issue #25772: [SPARK-29065][SQL][TEST] Extend 
`EXTRACT` benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530737271
 
 
   **[Test build #110509 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110509/testReport)**
 for PR 25772 at commit 
[`e54c41a`](https://github.com/apache/spark/commit/e54c41af5f0f7bde357c151dfd7ebdb060fda83a).



[GitHub] [spark] SparkQA commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
SparkQA commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` 
benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530811832
 
 
   **[Test build #110509 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110509/testReport)**
 for PR 25772 at commit 
[`e54c41a`](https://github.com/apache/spark/commit/e54c41af5f0f7bde357c151dfd7ebdb060fda83a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] maropu commented on a change in pull request #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec

2019-09-12 Thread GitBox
maropu commented on a change in pull request #25710: [SPARK-29008][SQL] Define 
an individual method for each common subexpression in HashAggregateExec
URL: https://github.com/apache/spark/pull/25710#discussion_r323718636
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala
 ##
 @@ -419,4 +419,27 @@ class WholeStageCodegenSuite extends QueryTest with 
SharedSparkSession {
   }
 }
   }
+
+  test("Give up splitting subexpression code if a parameter length goes over 
the limit") {
+withSQLConf(
+SQLConf.CODEGEN_SPLIT_AGGREGATE_FUNC.key -> "false",
 
 Review comment:
   Yea, we need to. If that flag is true, `HashAggregateExec` throws an 
exception in this test: 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L327



[GitHub] [spark] maropu commented on issue #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-12 Thread GitBox
maropu commented on issue #25716: [SPARK-29012][SQL] Support special timestamp 
values
URL: https://github.com/apache/spark/pull/25716#issuecomment-530805052
 
 
   How about holding this pr until this weekend for the @gengliangwang work? I 
personally think we don't have any reason to rush to merge this.



[GitHub] [spark] maropu commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
maropu commented on issue #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` 
benchmark
URL: https://github.com/apache/spark/pull/25772#issuecomment-530803392
 
 
   Thanks! Merged to master.





[GitHub] [spark] maropu closed pull request #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` benchmark

2019-09-12 Thread GitBox
maropu closed pull request #25772: [SPARK-29065][SQL][TEST] Extend `EXTRACT` 
benchmark
URL: https://github.com/apache/spark/pull/25772
 
 
   





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530802868
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530802874
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15491/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530802874
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15491/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530802868
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] maropu commented on a change in pull request #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
maropu commented on a change in pull request #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#discussion_r323711557
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ##
 @@ -1353,19 +1370,17 @@ object CodeGenerator extends Logging {
 byteCodeSize
   }
 }
-Some(stats)
+(classCodeSize, methodCodeSizes.max, constPoolSize)
   } catch {
 case NonFatal(e) =>
   logWarning("Error calculating stats of compiled class.", e)
-  None
+  (classCodeSize, -1, -1)
   }
-}.flatten
-
-if (codeSizes.nonEmpty) {
-  codeSizes.max
-} else {
-  0
 }
+
+ByteCodeStats(codeStats.reduce[(Int, Int, Int)] { case (v1, v2) =>
+  (Math.max(v1._1, v2._1), Math.max(v1._2, v2._2), Math.max(v1._3, v2._3))
 
 Review comment:
   Currently, this PR prints statistics per whole-stage codegen entry, so the 
current code looks OK to me.
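
   As a minimal sketch of the reduction in the hunk above, the snippet below takes per-class tuples and keeps the element-wise maximum. The `CodeStatsReduce` object, the `Stats` alias, and the sample numbers are assumptions for illustration; only the `reduce` body mirrors the diff.

```scala
object CodeStatsReduce {
  // (class code size, max method code size, constant pool size),
  // following the tuple order used in the diff.
  type Stats = (Int, Int, Int)

  // Element-wise maximum across all per-class tuples. Assumes a non-empty
  // input, because reduce throws on an empty collection.
  def maxStats(perClass: Seq[Stats]): Stats =
    perClass.reduce[Stats] { case (v1, v2) =>
      (Math.max(v1._1, v2._1), Math.max(v1._2, v2._2), Math.max(v1._3, v2._3))
    }
}
```

   With the made-up inputs `Seq((100, 40, 12), (80, 64, 9), (120, -1, -1))`, where the last entry models the `-1` fallback from the `catch` branch, `maxStats` returns `(120, 64, 12)`: the fallback values never win as long as at least one class reports real sizes.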





[GitHub] [spark] maropu commented on a change in pull request #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
maropu commented on a change in pull request #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#discussion_r323710328
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ##
 @@ -1353,19 +1370,17 @@ object CodeGenerator extends Logging {
 byteCodeSize
   }
 }
-Some(stats)
+(classCodeSize, methodCodeSizes.max, constPoolSize)
 
 Review comment:
   How about the latest code? I added a new metric (# of inner classes), so 
is using a tuple in that part OK?
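
   For comparison with the tuple, here is a hedged sketch of what a named alternative could look like once a fourth metric is involved. `StatsSketch`, its field names, and the choice to sum the inner-class count are all assumptions made for illustration, not the PR's actual code.

```scala
// Hypothetical named alternative to a 4-tuple; not the PR's ByteCodeStats.
final case class StatsSketch(
    maxClassCodeSize: Int,
    maxMethodCodeSize: Int,
    maxConstPoolSize: Int,
    numInnerClasses: Int) {

  // Sizes keep the maximum; the inner-class count is summed here purely as
  // an illustrative assumption about how such a metric might be merged.
  def combine(other: StatsSketch): StatsSketch = StatsSketch(
    Math.max(maxClassCodeSize, other.maxClassCodeSize),
    Math.max(maxMethodCodeSize, other.maxMethodCodeSize),
    Math.max(maxConstPoolSize, other.maxConstPoolSize),
    numInnerClasses + other.numInnerClasses)
}
```

   The trade-off is the usual one: a tuple is shorter inside a local `reduce`, while named fields avoid positional mix-ups (`_2` vs. `_3`) as more metrics are added.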





[GitHub] [spark] SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
SparkQA commented on issue #25766: [SPARK-29061][SQL] Prints bytecode 
statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530801110
 
 
   **[Test build #110515 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110515/testReport)** for PR 25766 at commit [`fa4234c`](https://github.com/apache/spark/commit/fa4234c0cbdb8aaeb1360d7565f6db5eebe87f30).





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530800419
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/15490/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints bytecode statistics in debugCodegen

2019-09-12 Thread GitBox
AmplabJenkins removed a comment on issue #25766: [SPARK-29061][SQL] Prints 
bytecode statistics in debugCodegen
URL: https://github.com/apache/spark/pull/25766#issuecomment-530800415
 
 
   Merged build finished. Test PASSed.

