[GitHub] dongjoon-hyun commented on issue #23456: [SPARK-26538][SQL] Set default precision and scale for elements of postgres numeric array
dongjoon-hyun commented on issue #23456: [SPARK-26538][SQL] Set default precision and scale for elements of postgres numeric array URL: https://github.com/apache/spark/pull/23456#issuecomment-454298685 Hi, @a-shkarupin. Yep. `alsh` is added to the Spark contributor group and SPARK-26538 is assigned to you. If you are not in the contributor group, we cannot assign you an issue. Since you are added now, there is no problem in assigning. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] Deegue commented on a change in pull request #23506: [SPARK-26577][SQL] Add input optimizer when reading Hive table by SparkSQL
Deegue commented on a change in pull request #23506: [SPARK-26577][SQL] Add input optimizer when reading Hive table by SparkSQL URL: https://github.com/apache/spark/pull/23506#discussion_r247790569
## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ##
@@ -311,6 +309,36 @@ class HadoopTableReader(
     // Only take the value (skip the key) because Hive works only with values.
     rdd.map(_._2)
   }
+
+  /**
+   * If `spark.sql.hive.fileInputFormat.enabled` is true, this function will optimize the input
+   * method(including format and the size of splits) while reading Hive tables.
+   */
+  private def getAndOptimizeInput(
+    inputClassName: String): Class[InputFormat[Writable, Writable]] = {
+
+    var ifc = Utils.classForName(inputClassName)
+      .asInstanceOf[java.lang.Class[InputFormat[Writable, Writable]]]
+    if (conf.getConf(HiveUtils.HIVE_FILE_INPUT_FORMAT_ENABLED)) {
+      hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize",

Review comment:
Oops, I tried that before, but now it works. My mistake. I'll remove the redundant code and rename `spark.sql.hive.fileInputFormat.enabled` to `spark.sql.hive.inputFormat.optimizer.enabled`.
[GitHub] AmplabJenkins commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
AmplabJenkins commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454296772 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7084/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
AmplabJenkins removed a comment on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454296772 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7084/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
AmplabJenkins removed a comment on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454296771 Merged build finished. Test PASSed.
[GitHub] rxin commented on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min
rxin commented on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454296822 I'm kind of wondering whether it'd make sense to add these. It adds a lot of code which would incur some maintenance cost, and users can trivially implement these themselves, or just use the untyped version, and we don't need to spend time discussing whether these should follow SQL semantics or Scala semantics.
[GitHub] AmplabJenkins commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
AmplabJenkins commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454296771 Merged build finished. Test PASSed.
[GitHub] SparkQA commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
SparkQA commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454296607 **[Test build #101237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101237/testReport)** for PR 23548 at commit [`71e69a4`](https://github.com/apache/spark/commit/71e69a421b9230505f58f85837a0092b4c139b09).
[GitHub] ueshin commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
ueshin commented on issue #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548#issuecomment-454295559 cc @HyukjinKwon
[GitHub] ueshin opened a new pull request #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame.
ueshin opened a new pull request #23548: [SPARK-26620][PYTHON] Make `DataFrameReader.json()` and `csv()` in Python should accept DataFrame. URL: https://github.com/apache/spark/pull/23548
## What changes were proposed in this pull request?
Currently `DataFrameReader.json()` and `csv()` in Python don't accept a `DataFrame`, but they should accept one if its schema contains only one String column.
## How was this patch tested?
Added corresponding tests and ran the existing tests.
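The acceptance rule described in the PR text (a schema with only one String column) can be sketched as a small predicate. This is only an illustration, not the PR's implementation: the `Schema` alias, the `(name, type_name)` tuples, and both function names are hypothetical stand-ins and are not part of the PySpark API.

```python
from typing import List, Tuple

# Hypothetical stand-in for a DataFrame schema: (column name, type name) pairs.
Schema = List[Tuple[str, str]]


def is_single_string_column(schema: Schema) -> bool:
    """Return True only when the schema has exactly one column and it is a string."""
    return len(schema) == 1 and schema[0][1] == "string"


def check_reader_input(schema: Schema) -> None:
    """Raise TypeError for schemas this sketch rejects (illustrative only)."""
    if not is_single_string_column(schema):
        raise TypeError("expected a DataFrame with exactly one string column")
```

Under this sketch, `[("value", "string")]` passes the check, while an empty schema, a single non-string column, or two string columns are rejected.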
[GitHub] Ngone51 commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched
Ngone51 commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched URL: https://github.com/apache/spark/pull/22806#discussion_r247785397
## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ##
@@ -286,6 +286,44 @@ private[spark] class TaskSchedulerImpl(
     }
   }
+
+  /**
+   * SPARK-25250: Whenever any Task gets successfully completed, we simply mark the
+   * corresponding partition id as completed in all attempts for that particular stage and
+   * additionally, for a Result Stage, we also kill the remaining task attempts running on the
+   * same partition. As a result, we do not see any Killed tasks due to
+   * TaskCommitDenied Exceptions showing up in the UI. When this method is called from
+   * DAGScheduler.scala on a task completion event being fired, it is assumed that the new
+   * TaskSet has already been created and registered. However, a small possibility does exist
+   * that when this method gets called, possibly the new TaskSet might have not been added
+   * to taskSetsByStageIdAndAttempt. In such a case, we might still hit the same issue. However,
+   * the above scenario has not yet been reproduced.
+   */
+  override def completeTasks(partitionId: Int, stageId: Int, killTasks: Boolean): Unit = {
+    taskSetsByStageIdAndAttempt.getOrElse(stageId, Map()).values.foreach { tsm =>
+      tsm.partitionToIndex.get(partitionId) match {
+        case Some(index) =>
+          tsm.markPartitionAsAlreadyCompleted(index)

Review comment:
> also increasing for completed tasks.

Do you mean the failed tasks from zombie TaskSets?

> We should call `markPartitionCompleted()` only for running tasks and call `markPartitionAsAlreadyCompleted()` for non-running tasks.

I don't think we need `markPartitionAsAlreadyCompleted()` any more, and I think just calling `markPartitionCompleted()` for the active TaskSet would be OK.

> For running tasks, we should call markPartitionCompleted() only when we successfully kill them.

Why?

> If the scheduler backend does not support killing tasks, we call markPartitionAsAlreadyCompleted().

If the scheduler backend does not support killing tasks, we just wait for `TaskSetManager` to handle the running tasks when they finish. And we could also call `markPartitionCompleted()` before this.

> whether or not it is a good idea to call markPartitionCompleted().

Hmm, I think the race condition has nothing to do with calling `markPartitionCompleted()`; it's about whether we're calling it in `DAGScheduler` scope or not, as @squito explained above. But I'm quite surprised you saw the race condition again this time.
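To make the bookkeeping under discussion concrete, here is a toy model of "mark the partition as completed in every attempt of the stage". It is a heavily simplified sketch and not Spark's scheduler: `ToyTaskSetManager`, `complete_tasks`, and the nested-dict registry are hypothetical names that only mirror the shape of the quoted diff, and task killing is left out entirely.

```python
from typing import Dict, Set


class ToyTaskSetManager:
    """Hypothetical, simplified stand-in for one attempt of a stage's task set."""

    def __init__(self, partition_to_index: Dict[int, int]) -> None:
        self.partition_to_index = partition_to_index
        self.completed_indices: Set[int] = set()

    def mark_partition_completed(self, index: int) -> None:
        self.completed_indices.add(index)


# Registry shape assumed for this sketch: stage id -> attempt id -> manager.
Registry = Dict[int, Dict[int, ToyTaskSetManager]]


def complete_tasks(registry: Registry, stage_id: int, partition_id: int) -> None:
    """Mark partition_id as completed in every registered attempt of stage_id."""
    for tsm in registry.get(stage_id, {}).values():
        index = tsm.partition_to_index.get(partition_id)
        if index is not None:
            tsm.mark_partition_completed(index)
```

The model also shows the gap that the doc comment in the diff describes: an attempt that is registered only after `complete_tasks` has run never sees the completion, so it would still schedule that partition.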
[GitHub] gatorsmile commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
gatorsmile commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454292594 LGTM
[GitHub] felixcheung commented on a change in pull request #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
felixcheung commented on a change in pull request #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#discussion_r247785036
## File path: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/SimpleWritableDataSource.scala ##
@@ -68,22 +65,50 @@ class SimpleWritableDataSource extends DataSourceV2
       new CSVReaderFactory(serializableConf)
     }
-    override def readSchema(): StructType = writeSchema
+    override def readSchema(): StructType = tableSchema
   }
-  override def getTable(options: DataSourceOptions): Table = {
-    val path = new Path(options.get("path").get())
-    val conf = SparkContext.getActive.get.hadoopConfiguration
-    new SimpleBatchTable {
-      override def newScanBuilder(options: DataSourceOptions): ScanBuilder = {
-        new MyScanBuilder(path.toUri.toString, conf)
+  class MyWriteBuilder(path: String) extends WriteBuilder with SupportsSaveMode {
+    private var queryId: String = _
+    private var mode: SaveMode = _
+
+    override def withQueryId(queryId: String): WriteBuilder = {
+      this.queryId = queryId
+      this
+    }
+
+    override def mode(mode: SaveMode): WriteBuilder = {
+      this.mode = mode
+      this
+    }
+
+    override def buildForBatch(): BatchWrite = {
+      assert(mode != null)
+
+      val hadoopPath = new Path(path)
+      val hadoopConf = SparkContext.getActive.get.hadoopConfiguration
+      val fs = hadoopPath.getFileSystem(hadoopConf)
+
+      if (mode == SaveMode.ErrorIfExists) {
+        if (fs.exists(hadoopPath)) {
+          throw new RuntimeException("data already exists.")
+        }
+      }
+      if (mode == SaveMode.Ignore) {
+        if (fs.exists(hadoopPath)) {
+          return null
+        }
+      }
+      if (mode == SaveMode.Overwrite) {
+        fs.delete(hadoopPath, true)

Review comment:
IMO this is semantically different from what I expect: overwrite means that once all the data is there, it replaces the existing dir. This is implemented as: remove the existing dir, then place the data there. The failure-mode behavior is different.
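The failure-mode difference raised in the review comment can be illustrated with two local-filesystem strategies. This is a minimal sketch under stated assumptions, not the data source's code and not Hadoop's `FileSystem` API: both function names are hypothetical, and the final swap here is a delete followed by a rename, which is not atomic.

```python
import os
import shutil
import tempfile
from typing import Callable


def overwrite_delete_first(target: str, write: Callable[[str], None]) -> None:
    """Delete the existing directory, then write. A failed write leaves no old data."""
    shutil.rmtree(target, ignore_errors=True)
    os.makedirs(target)
    write(target)


def overwrite_swap_last(target: str, write: Callable[[str], None]) -> None:
    """Write into a staging directory; touch the target only after the write succeeds."""
    parent = os.path.dirname(os.path.abspath(target))
    staging = tempfile.mkdtemp(dir=parent)
    try:
        write(staging)
    except Exception:
        # The write failed: discard the staging data and keep the old target intact.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    shutil.rmtree(target, ignore_errors=True)
    os.rename(staging, target)
```

If `write` raises, `overwrite_delete_first` has already destroyed the previous contents, whereas `overwrite_swap_last` leaves them in place. That is the distinction the reviewer points at.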
[GitHub] AmplabJenkins removed a comment on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
AmplabJenkins removed a comment on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454288929 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
AmplabJenkins removed a comment on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454288932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7083/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
AmplabJenkins commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454288932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7083/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
AmplabJenkins commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454288929 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454288535 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101235/ Test FAILed.
[GitHub] SparkQA commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
SparkQA commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454288730 **[Test build #101236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101236/testReport)** for PR 23540 at commit [`ddeee64`](https://github.com/apache/spark/commit/ddeee64b5701a2c4c4e1ff17613a5badbd406a38).
[GitHub] SparkQA removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
SparkQA removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454284395 **[Test build #101235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101235/testReport)** for PR 23547 at commit [`25eaf30`](https://github.com/apache/spark/commit/25eaf307589a920023fa0098ee13162ad0f3a653).
[GitHub] AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454288532 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454288532 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454288535 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101235/ Test FAILed.
[GitHub] SparkQA commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
SparkQA commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454288516 **[Test build #101235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101235/testReport)** for PR 23547 at commit [`25eaf30`](https://github.com/apache/spark/commit/25eaf307589a920023fa0098ee13162ad0f3a653).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] attilapiros commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
attilapiros commented on issue #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#issuecomment-454287636 @squito I found no more leaks in core (regarding TransportClient/Server). When we talked, I had just missed some cleanup done in afterEach/afterAll (my first logging for checking the number of created/closed instances was only in org.apache.spark.SparkFunSuite#withFixture). Regarding ThreadAudit I cannot say (making the tests longer is something we should avoid, but I do not know whether that time would be significant compared to the whole). I am very curious whether this already helps @NiharS with https://github.com/apache/spark/pull/22114. What about adding his change here, waiting for Jenkins to test it, and then removing that commit?
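The measurement pitfall described above (counting created and closed instances only around the test body, so cleanup done in afterEach/afterAll is missed) can be shown with a small counter. This is a hypothetical sketch and not Spark's test harness: `TrackedResource`, `leaked_count`, and `run_with_audit` are illustrative names only.

```python
from typing import Callable, Tuple


class TrackedResource:
    """Hypothetical resource that counts how many instances were created and closed."""

    created = 0
    closed = 0

    def __init__(self) -> None:
        TrackedResource.created += 1
        self._open = True

    def close(self) -> None:
        # Count each instance at most once, even if close() is called twice.
        if self._open:
            self._open = False
            TrackedResource.closed += 1


def leaked_count() -> int:
    return TrackedResource.created - TrackedResource.closed


def run_with_audit(test_body: Callable[[], None],
                   after_each: Callable[[], None]) -> Tuple[int, int]:
    """Return (apparent leaks right after the body, leaks after cleanup has run)."""
    before = leaked_count()
    test_body()
    leaked_in_body = leaked_count() - before
    after_each()
    leaked_after_cleanup = leaked_count() - before
    return leaked_in_body, leaked_after_cleanup
```

A test that opens a resource in its body and closes it in its cleanup hook reports one apparent leak at the first checkpoint and zero at the second, which matches the false positive described in the comment.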
[GitHub] AmplabJenkins removed a comment on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min
AmplabJenkins removed a comment on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454286575 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min
AmplabJenkins removed a comment on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454286584 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101219/ Test PASSed.
AmplabJenkins commented on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454286584 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101219/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
AmplabJenkins commented on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454286575 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
SparkQA removed a comment on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454248602 **[Test build #101219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101219/testReport)** for PR 23304 at commit [`04c0a22`](https://github.com/apache/spark/commit/04c0a22a77efa690c85f0852c724433452d411b2). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
SparkQA commented on issue #23304: [SPARK-26353][SQL]Add typed aggregate functions: max&&min URL: https://github.com/apache/spark/pull/23304#issuecomment-454286186 **[Test build #101219 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101219/testReport)** for PR 23304 at commit [`04c0a22`](https://github.com/apache/spark/commit/04c0a22a77efa690c85f0852c724433452d411b2). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TypedMinDouble[IN, OUT] extends Aggregator[IN, MutableDouble, OUT] ` * `class JavaTypedMinDouble[IN](override val f: IN => Double)` * `class ScalaTypedMinDouble[IN](override val f: IN => Double)` * `trait TypedMaxDouble[IN, OUT] extends Aggregator[IN, MutableDouble, OUT] ` * `class JavaTypedMaxDouble[IN](override val f: IN => Double)` * `class ScalaTypedMaxDouble[IN](override val f: IN => Double)` * `trait TypedMinLong[IN, OUT] extends Aggregator[IN, MutableLong, OUT] ` * `class JavaTypedMinLong[IN](override val f: IN => Long) extends TypedMinLong[IN, java.lang.Long] ` * `class ScalaTypedMinLong[IN](override val f: IN => Long) extends TypedMinLong[IN, Option[Long]] ` * `trait TypedMaxLong[IN, OUT] extends Aggregator[IN, MutableLong, OUT] ` * `class JavaTypedMaxLong[IN](override val f: IN => Long) extends TypedMaxLong[IN, java.lang.Long] ` * `class ScalaTypedMaxLong[IN](override val f: IN => Long) extends TypedMaxLong[IN, Option[Long]] ` * `class MutableLong(var value: Long) extends Serializable` * `class MutableDouble(var value: Double) extends Serializable` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454285585 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins removed a comment on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454285586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7082/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] Ngone51 commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched
Ngone51 commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched URL: https://github.com/apache/spark/pull/22806#discussion_r247779264 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ## @@ -286,6 +286,44 @@ private[spark] class TaskSchedulerImpl( } } + /** + * SPARK-25250: Whenever any Task gets successfully completed, we simply mark the + * corresponding partition id as completed in all attempts for that particular stage and + * additionally, for a Result Stage, we also kill the remaining task attempts running on the + * same partition. As a result, we do not see any Killed tasks due to + * TaskCommitDenied Exceptions showing up in the UI. When this method is called from + * DAGScheduler.scala on a task completion event being fired, it is assumed that the new + * TaskSet has already been created and registered. However, a small possibility does exist + * that when this method gets called, possibly the new TaskSet might have not been added Review comment: I think @squito has a good point here. Previously, I was wondering: what if the active TaskSet has not been created yet when we mark the partition as completed for all TaskSets, does this fix still work? Now I realize that it works whether or not the active TaskSet has been created: * created: obviously fine. * not created: when `DAGScheduler` calls `submitMissingTasks`, it will figure out which missing partitions to compute (taking into account the partitions that were completed by tasks from a previous stage attempt). So the newly created TaskSet also knows about the completed partitions. All of this relies on the event loop, which runs as a single thread. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
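The two cases in the comment above can be modelled with a small sketch. This is a simplified, hypothetical model and not Spark's scheduler: `SchedulerSketch`, `TaskSetSketch`, and `onTaskSuccess` are invented names, and the single-threaded event loop is represented simply by calling the methods one after another, so no two events ever interleave.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a TaskSet: one attempt at computing a set of partitions.
final class TaskSetSketch(val stageId: Int, val attempt: Int, val partitions: Set[Int]) {
  val completed = mutable.Set.empty[Int]

  def markCompleted(p: Int): Unit = {
    if (partitions.contains(p)) completed += p
  }

  def pending: Set[Int] = partitions -- completed
}

// Hypothetical stand-in for the scheduler state touched by the event loop.
final class SchedulerSketch(numPartitions: Int) {
  // Stage-level record of finished partitions, updated on every success event.
  private val finished = mutable.Set.empty[Int]
  private val taskSets = mutable.ArrayBuffer.empty[TaskSetSketch]

  // Task-success event: record it for the stage and for every registered attempt.
  def onTaskSuccess(partition: Int): Unit = {
    finished += partition
    taskSets.foreach(_.markCompleted(partition))
  }

  // Plays the role of submitMissingTasks: only unfinished partitions are submitted,
  // so an attempt created after a late completion never sees that partition.
  def submitMissingTasks(stageId: Int): TaskSetSketch = {
    val missing = (0 until numPartitions).toSet -- finished
    val ts = new TaskSetSketch(stageId, taskSets.size, missing)
    taskSets += ts
    ts
  }
}
```

In this model a completion that arrives before the next attempt is registered is covered by the stage-level record, and a completion that arrives afterwards is covered by the broadcast to all attempts, which is the argument made in the comment.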
[GitHub] AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454285585 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
AmplabJenkins commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454285586 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7082/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] eatoncys removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
eatoncys removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454283506 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] SparkQA commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
SparkQA commented on issue #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547#issuecomment-454284395 **[Test build #101235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101235/testReport)** for PR 23547 at commit [`25eaf30`](https://github.com/apache/spark/commit/25eaf307589a920023fa0098ee13162ad0f3a653). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] rednaxelafx opened a new pull request #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders
rednaxelafx opened a new pull request #23547: [WIP][SQL] Don't always trust primitive types being not-nullable from Scala type information when creating Encoders URL: https://github.com/apache/spark/pull/23547 ## What changes were proposed in this pull request? TBD ## How was this patch tested? TBD (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] asfgit closed pull request #23533: [CORE][MINOR] Fix some typos about MemoryMode
asfgit closed pull request #23533: [CORE][MINOR] Fix some typos about MemoryMode URL: https://github.com/apache/spark/pull/23533 As this is a foreign pull request (from a fork), the diff has been sent to your commit mailing list, comm...@spark.apache.org This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] felixcheung commented on issue #23499: [SPARK-26254][CORE] Extract Hive + Kafka dependencies from Core.
felixcheung commented on issue #23499: [SPARK-26254][CORE] Extract Hive + Kafka dependencies from Core. URL: https://github.com/apache/spark/pull/23499#issuecomment-454283710 (I’d like to know about shading too) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] eatoncys commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
eatoncys commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454283506 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] HyukjinKwon commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode
HyukjinKwon commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode URL: https://github.com/apache/spark/pull/23533#issuecomment-454283105 Merged to master. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
cloud-fan commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282941 Hi @rdblue , thanks for the review! It will be great to finish all the write operations soon, and adding overwrite is good as the next step! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] attilapiros commented on a change in pull request #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests
attilapiros commented on a change in pull request #23540: [SPARK-26615][Core] Fixing transport server/client resource leaks in the core unittests URL: https://github.com/apache/spark/pull/23540#discussion_r24099 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala ## @@ -140,18 +138,8 @@ class BlockManagerSuite extends SparkFunSuite with Matchers with BeforeAndAfterE override def afterEach(): Unit = { try { conf = null - if (store != null) { -store.stop() -store = null - } - if (store2 != null) { -store2.stop() -store2 = null - } - if (store3 != null) { -store3.stop() -store3 = null - } + allStores.foreach(_.stop()) + allStores.clear() Review comment: There are more leaks; some are very well hidden :) The one that is not so good at hiding (and which can be easily corrected using the old cleanup strategy) is [BlockManagerSuite.scala#L1397](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala#L1397) And there are two `makeBlockManager` calls where exceptions are expected: - [BlockManagerSuite.scala#L1391](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala#L1391) - [BlockManagerSuite.scala#L1384](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala#L1384) The exception comes from [`blockManager.initialize("app-id")`](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala#L109) which goes further down to [`registerWithExternalShuffleServer()`](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L264). But at that point we already have an initialised blockTransferService and a shuffleClient. So `makeBlockManager` creates a BlockManager which should be cleaned up, but because of the exception that instance cannot be saved to the old store variables. 
This is where the old cleanup strategy was broken. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
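The failure mode described above, and the idea behind the `allStores` buffer in the diff, can be illustrated with a minimal sketch. It assumes that the instance is registered before the call that may throw; whether the actual patch orders things this way is not shown in the quoted diff. `FakeStore`, `CleanupSketch`, and `makeStore` are hypothetical stand-ins, not Spark classes.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a resource such as a BlockManager: construction
// acquires something, and initialize() may throw afterwards.
final class FakeStore(val name: String, failOnInit: Boolean) {
  var stopped = false

  def initialize(): Unit = {
    if (failOnInit) throw new IllegalStateException(s"$name failed to initialize")
  }

  def stop(): Unit = { stopped = true }
}

object CleanupSketch {
  // Every store ever constructed, including ones whose initialize() threw.
  val allStores = ArrayBuffer.empty[FakeStore]

  // Register first, initialize second (an assumption of this sketch): the instance
  // is tracked even when the caller never receives a reference to it.
  def makeStore(name: String, failOnInit: Boolean = false): FakeStore = {
    val store = new FakeStore(name, failOnInit)
    allStores += store
    store.initialize()
    store
  }

  // Mirrors the afterEach body quoted in the diff: stop everything, then forget it.
  def afterEach(): Unit = {
    allStores.foreach(_.stop())
    allStores.clear()
  }
}
```

With per-variable cleanup (`store`, `store2`, `store3`), a store whose `initialize()` throws is never assigned to any variable and so is never stopped; tracking instances at creation time avoids relying on the caller's assignment.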
[GitHub] AmplabJenkins removed a comment on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
AmplabJenkins removed a comment on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282569 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7081/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
AmplabJenkins removed a comment on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282566 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
AmplabJenkins commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282569 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7081/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
AmplabJenkins commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282566 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] SparkQA commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write)
SparkQA commented on issue #23208: [SPARK-25530][SQL] data source v2 API refactor (batch write) URL: https://github.com/apache/spark/pull/23208#issuecomment-454282305 **[Test build #101234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101234/testReport)** for PR 23208 at commit [`693fb98`](https://github.com/apache/spark/commit/693fb986e54ef8f7d09d6c028d18df50d2db117e). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r247552872 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ## @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { /** * Returns a copy of this node where `f` has been applied to all the nodes children. */ - def mapChildren(f: BaseType => BaseType): BaseType = { -if (children.nonEmpty) { + def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): BaseType = { Review comment: You are right @mgaido91, `BroadcastExchangeExec` in the recursive term causes some issues there. We can't prepare them until we have the result of the anchor term. This is why I introduced `doNotPrepareInAdvance`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454280693 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101217/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r247552872 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ## @@ -289,13 +294,13 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product { /** * Returns a copy of this node where `f` has been applied to all the nodes children. */ - def mapChildren(f: BaseType => BaseType): BaseType = { -if (children.nonEmpty) { + def mapChildren(f: BaseType => BaseType, forceCopy: Boolean = false): BaseType = { Review comment: You are right @mgaido91, `BroadcastExchangeExec` in the recursive term causes some issues there. We can't prepare them in `RecursiveTable` until we have the result of the anchor term. This is why I introduced `doNotPrepareInAdvance`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454280693 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101217/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454280383 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101233/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454280688 Build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454280688 Build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454280380 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] SparkQA removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
SparkQA removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454241464 **[Test build #101217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101217/testReport)** for PR 23010 at commit [`780aa48`](https://github.com/apache/spark/commit/780aa48ae9c1eea27d63cf2051ca550bee0c3cef). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] SparkQA commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
SparkQA commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454280524
**[Test build #101217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101217/testReport)** for PR 23010 at commit [`780aa48`](https://github.com/apache/spark/commit/780aa48ae9c1eea27d63cf2051ca550bee0c3cef).
* This patch **fails Spark unit tests**.
* This patch **does not merge cleanly**.
* This patch adds the following public classes _(experimental)_:
  * `case class Empty2Null(child: Expression) extends UnaryExpression with String2StringExpression`
[GitHub] SparkQA removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
SparkQA removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454278429 **[Test build #101233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101233/testReport)** for PR 19788 at commit [`c6ebe0e`](https://github.com/apache/spark/commit/c6ebe0e979d0705bafec0dca7df90c8e31a29ffd).
[GitHub] SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454280368
**[Test build #101233 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101233/testReport)** for PR 19788 at commit [`c6ebe0e`](https://github.com/apache/spark/commit/c6ebe0e979d0705bafec0dca7df90c8e31a29ffd).
* This patch **fails Java style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454280383 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101233/ Test FAILed.
[GitHub] AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454280380 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454280021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101230/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
AmplabJenkins commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454280021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101230/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454280017 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
AmplabJenkins commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454280017 Merged build finished. Test PASSed.
[GitHub] SparkQA removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
SparkQA removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454268343 **[Test build #101230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101230/testReport)** for PR 21632 at commit [`7d2f131`](https://github.com/apache/spark/commit/7d2f131a6110e044e85565774de1a7991568d2bc).
[GitHub] SparkQA commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
SparkQA commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454279752
**[Test build #101230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101230/testReport)** for PR 21632 at commit [`7d2f131`](https://github.com/apache/spark/commit/7d2f131a6110e044e85565774de1a7991568d2bc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees
AmplabJenkins removed a comment on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-454268468 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7078/ Test FAILed.
[GitHub] SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454278429 **[Test build #101233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101233/testReport)** for PR 19788 at commit [`c6ebe0e`](https://github.com/apache/spark/commit/c6ebe0e979d0705bafec0dca7df90c8e31a29ffd).
[GitHub] AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454277711 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454277716 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7080/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454277711 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
AmplabJenkins commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454277716 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7080/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode
SparkQA removed a comment on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode URL: https://github.com/apache/spark/pull/23533#issuecomment-454232682 **[Test build #4514 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4514/testReport)** for PR 23533 at commit [`4975d5a`](https://github.com/apache/spark/commit/4975d5a0c1e2a2681c8244d87044eeae873bbb7c).
[GitHub] SparkQA commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode
SparkQA commented on issue #23533: [CORE][MINOR] Fix some typos about MemoryMode URL: https://github.com/apache/spark/pull/23533#issuecomment-454277494
**[Test build #4514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4514/testReport)** for PR 23533 at commit [`4975d5a`](https://github.com/apache/spark/commit/4975d5a0c1e2a2681c8244d87044eeae873bbb7c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] SparkQA commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types
SparkQA commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-454276618 **[Test build #101232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101232/testReport)** for PR 23010 at commit [`f9701fb`](https://github.com/apache/spark/commit/f9701fbe660ceee706ca7cfa989d82876a343ba2).
[GitHub] AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454276046 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101216/ Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454276043 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454276043 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
AmplabJenkins commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454276046 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101216/ Test FAILed.
[GitHub] SparkQA removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
SparkQA removed a comment on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454240348 **[Test build #101216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101216/testReport)** for PR 19788 at commit [`aa6134b`](https://github.com/apache/spark/commit/aa6134b98006027bda7e984b6332e316da10ae07).
[GitHub] SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs
SparkQA commented on issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs URL: https://github.com/apache/spark/pull/19788#issuecomment-454275871
**[Test build #101216 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101216/testReport)** for PR 19788 at commit [`aa6134b`](https://github.com/apache/spark/commit/aa6134b98006027bda7e984b6332e316da10ae07).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] cloud-fan edited a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
cloud-fan edited a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-454275695 LGTM with only one comment: https://github.com/apache/spark/pull/23512#discussion_r247770929
[GitHub] cloud-fan commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
cloud-fan commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-454275695 LGTM
[GitHub] cloud-fan commented on a change in pull request #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
cloud-fan commented on a change in pull request #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#discussion_r247770929
## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala ##
@@ -78,8 +78,8 @@ class DateExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     }
     checkEvaluation(DayOfYear(Literal.create(null, DateType)), null)
-    checkEvaluation(DayOfYear(Literal(new Date(sdf.parse("1582-10-15 13:10:15").getTime))), 288)
-    checkEvaluation(DayOfYear(Literal(new Date(sdf.parse("1582-10-04 13:10:15").getTime))), 277)
+    checkEvaluation(DayOfYear(Cast(Literal("1582-10-15 13:10:15"), DateType)), 288)
Review comment: can we remove `sdf` from this test suite then?
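The expected values in the test lines quoted above (288 for 1582-10-15, 277 for 1582-10-04) can be cross-checked outside Spark. The following is a minimal sketch, not part of the pull request: it relies on Python's `datetime.date`, which uses the proleptic Gregorian calendar, and the helper name `day_of_year` is an assumption made for illustration.

```python
from datetime import date


def day_of_year(year: int, month: int, day: int) -> int:
    """Day of the year under the proleptic Gregorian calendar.

    Hypothetical helper for illustration; datetime.date extends the
    Gregorian calendar backwards, so no Julian cutover is applied.
    """
    return date(year, month, day).timetuple().tm_yday


# The two dates used in the quoted test lines.
oct_15 = day_of_year(1582, 10, 15)
oct_04 = day_of_year(1582, 10, 4)
print(oct_15, oct_04)
```

Under these assumptions the two results agree with the constants in the quoted `checkEvaluation` calls.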
[GitHub] cloud-fan commented on issue #23543: [SPARK-25935][SQL] Allow null rows for bad records from JSON parsers
cloud-fan commented on issue #23543: [SPARK-25935][SQL] Allow null rows for bad records from JSON parsers URL: https://github.com/apache/spark/pull/23543#issuecomment-454274586 Good catch! I've updated the title.
[GitHub] cloud-fan commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible
cloud-fan commented on issue #23510: [SPARK-26590][CORE] make fetch-block-to-disk backward compatible URL: https://github.com/apache/spark/pull/23510#issuecomment-454274318
> so the client accepts it, but will the client stream to disk still, or will it fallback to still fetching to memory?

When the old server returns a normal chunk fetch response, the new client processes it as if the client had sent a normal chunk fetch request, and puts the data in memory.

> It seems it should be possible to stream to disk, as the server is really sending virtually the same bytes either way (just a different header, more or less)

AFAIK the stream response is very different from the chunk fetch response. The chunk fetch response sends the data in one message, so the data is already in the client's memory by the time the client receives the message. The stream response is only a notice that small messages will follow; the real data is sent via many small messages, so the client has a chance to write it to disk incrementally.
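The difference described above can be sketched in a few lines. This is an illustrative model only, not Spark's network code: the function names `handle_chunk_fetch`, `handle_stream`, and `small_messages` are hypothetical, and the sketch assumes a chunk fetch delivers the whole payload in one message while a stream delivers it as a sequence of small parts.

```python
import os
import tempfile
from typing import Iterable, Iterator


def handle_chunk_fetch(message: bytes) -> bytes:
    """Chunk-fetch style (hypothetical): the payload arrives as one message,
    so it is already fully in memory when the handler sees it."""
    return message


def handle_stream(parts: Iterable[bytes], path: str) -> int:
    """Stream style (hypothetical): write each small message to disk as it
    arrives, holding one part in memory at a time. Returns bytes written."""
    written = 0
    with open(path, "wb") as out:
        for part in parts:
            out.write(part)
            written += len(part)
    return written


def small_messages(data: bytes, size: int) -> Iterator[bytes]:
    """Hypothetical helper: split data into parts of at most `size` bytes."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


payload = bytes(range(256)) * 4  # 1024 bytes of sample data

# One message: the whole payload sits in memory.
in_memory = handle_chunk_fetch(payload)

# Many small messages: the payload is written to a temporary file piece by piece.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
try:
    total = handle_stream(small_messages(payload, 100), tmp_path)
    with open(tmp_path, "rb") as f:
        on_disk = f.read()
finally:
    os.remove(tmp_path)
```

In this model the single-message handler has nothing left to decide once it runs, which mirrors the point that an old server's chunk fetch response ends up in memory on the new client.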
[GitHub] AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454274125 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101215/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
AmplabJenkins removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454274122 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454274122 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
AmplabJenkins commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454274125 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101215/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
SparkQA removed a comment on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454239280 **[Test build #101215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101215/testReport)** for PR 23396 at commit [`5ab965e`](https://github.com/apache/spark/commit/5ab965e7362471e6c833c1cf411e8576ab8a5f5e).
[GitHub] SparkQA commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support
SparkQA commented on issue #23396: [SPARK-26397][SQL] Driver side only metrics support URL: https://github.com/apache/spark/pull/23396#issuecomment-454273781 **[Test build #101215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101215/testReport)** for PR 23396 at commit [`5ab965e`](https://github.com/apache/spark/commit/5ab965e7362471e6c833c1cf411e8576ab8a5f5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path URL: https://github.com/apache/spark/pull/23383#issuecomment-454273350 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7079/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
AmplabJenkins commented on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path URL: https://github.com/apache/spark/pull/23383#issuecomment-454273350 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/7079/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
AmplabJenkins removed a comment on issue #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path URL: https://github.com/apache/spark/pull/23383#issuecomment-454273349 Merged build finished. Test PASSed.