[GitHub] [spark] cloud-fan closed pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


cloud-fan closed pull request #29539:
URL: https://github.com/apache/spark/pull/29539


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


cloud-fan commented on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681634549


   thanks, merging to master!






[GitHub] [spark] viirya commented on a change in pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


viirya commented on a change in pull request #29535:
URL: https://github.com/apache/spark/pull/29535#discussion_r478139414



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
##
@@ -42,7 +42,9 @@ class UnresolvedException[TreeType <: TreeNode[_]](tree: 
TreeType, function: Str
  * @param multipartIdentifier table name

Review comment:
   Add `options` to param doc?








[GitHub] [spark] cloud-fan commented on a change in pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


cloud-fan commented on a change in pull request #29535:
URL: https://github.com/apache/spark/pull/29535#discussion_r478181061



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -822,7 +822,9 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def table(tableName: String): DataFrame = {
 assertNoSpecifiedSchema("table")
-sparkSession.table(tableName)
+val multipartIdentifier =
+  sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)
+Dataset.ofRows(sparkSession, UnresolvedRelation(multipartIdentifier, 
extraOptions))

Review comment:
   `toMap` is required to get the original map (keys are not lowercased)
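
A minimal sketch of why the original key casing matters here. `CaseInsensitiveOptions` below is a hypothetical stand-in written only for illustration; it is not Spark's `CaseInsensitiveMap`, and it only assumes the behaviour described in the comment above (lookups ignore case, `toMap` returns the keys as the caller wrote them).

```scala
import scala.collection.JavaConverters._

// Hypothetical stand-in for a case-insensitive options map; NOT Spark's class.
// Lookups ignore key case, while toMap hands back the keys exactly as supplied.
final class CaseInsensitiveOptions(original: Map[String, String]) {
  private val lowerCased: Map[String, String] =
    original.map { case (k, v) => k.toLowerCase(java.util.Locale.ROOT) -> v }

  // Case-insensitive lookup.
  def get(key: String): Option[String] =
    lowerCased.get(key.toLowerCase(java.util.Locale.ROOT))

  // The original map, keys not lowercased.
  def toMap: Map[String, String] = original
}

object ToMapDemo {
  def main(args: Array[String]): Unit = {
    val extraOptions = new CaseInsensitiveOptions(Map("MergeSchema" -> "true"))
    // Converting via toMap keeps "MergeSchema" as written before handing the
    // options to a Java-side map type.
    val javaMap: java.util.Map[String, String] = extraOptions.toMap.asJava
    println(extraOptions.get("mergeschema"))
    println(javaMap.containsKey("MergeSchema"))
  }
}
```

If the lowercased view were passed on instead, any consumer that later reads the keys case-sensitively would no longer see what the user typed.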








[GitHub] [spark] cloud-fan commented on a change in pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


cloud-fan commented on a change in pull request #29535:
URL: https://github.com/apache/spark/pull/29535#discussion_r478180602



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -822,7 +822,9 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def table(tableName: String): DataFrame = {
 assertNoSpecifiedSchema("table")
-sparkSession.table(tableName)
+val multipartIdentifier =
+  sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)
+Dataset.ofRows(sparkSession, UnresolvedRelation(multipartIdentifier, 
extraOptions))

Review comment:
   we should create `CaseInsensitiveStringMap` here:
   ```
   new CaseInsensitiveStringMap(extraOptions.toMap)
   ```

##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala
##
@@ -822,7 +822,9 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
*/
   def table(tableName: String): DataFrame = {
 assertNoSpecifiedSchema("table")
-sparkSession.table(tableName)
+val multipartIdentifier =
+  sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)
+Dataset.ofRows(sparkSession, UnresolvedRelation(multipartIdentifier, 
extraOptions))

Review comment:
   we should create `CaseInsensitiveStringMap` here:
   ```
   new CaseInsensitiveStringMap(extraOptions.toMap.asJava)
   ```








[GitHub] [spark] Ngone51 commented on a change in pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


Ngone51 commented on a change in pull request #28911:
URL: https://github.com/apache/spark/pull/28911#discussion_r478179838



##
File path: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##
@@ -463,54 +466,73 @@ final class ShuffleBlockFetcherIterator(
* track in-memory are the ManagedBuffer references themselves.
*/
   private[this] def fetchHostLocalBlocks(hostLocalDirManager: 
HostLocalDirManager): Unit = {
-val cachedDirsByExec = hostLocalDirManager.getCachedHostLocalDirs()
-val (hostLocalBlocksWithCachedDirs, hostLocalBlocksWithMissingDirs) =
-  hostLocalBlocksByExecutor
-.map { case (hostLocalBmId, bmInfos) =>
-  (hostLocalBmId, bmInfos, 
cachedDirsByExec.get(hostLocalBmId.executorId))
-}.partition(_._3.isDefined)
-val bmId = blockManager.blockManagerId
-val immutableHostLocalBlocksWithoutDirs =
-  hostLocalBlocksWithMissingDirs.map { case (hostLocalBmId, bmInfos, _) =>
-hostLocalBmId -> bmInfos
-  }.toMap
-if (immutableHostLocalBlocksWithoutDirs.nonEmpty) {
+val cachedDirsByExec = hostLocalDirManager.getCachedHostLocalDirs
+val (hostLocalBlocksWithCachedDirs, hostLocalBlocksWithMissingDirs) = {
+  val (hasCache, noCache) = hostLocalBlocksByExecutor.partition { case 
(hostLocalBmId, _) =>
+cachedDirsByExec.contains(hostLocalBmId.executorId)
+  }
+  (hasCache.toMap, noCache.toMap)
+}
+
+if (hostLocalBlocksWithMissingDirs.nonEmpty) {
   logDebug(s"Asynchronous fetching host-local blocks without cached 
executors' dir: " +
-s"${immutableHostLocalBlocksWithoutDirs.mkString(", ")}")
-  val execIdsWithoutDirs = 
immutableHostLocalBlocksWithoutDirs.keys.map(_.executorId).toArray
-  hostLocalDirManager.getHostLocalDirs(execIdsWithoutDirs) {
-case Success(dirs) =>
-  immutableHostLocalBlocksWithoutDirs.foreach { case (hostLocalBmId, 
blockInfos) =>
-blockInfos.takeWhile { case (blockId, _, mapIndex) =>
-  fetchHostLocalBlock(
-blockId,
-mapIndex,
-dirs.get(hostLocalBmId.executorId),
-hostLocalBmId)
-}
-  }
-  logDebug(s"Got host-local blocks (without cached executors' dir) in 
" +
-s"${Utils.getUsedTimeNs(startTimeNs)}")
-
-case Failure(throwable) =>
-  logError(s"Error occurred while fetching host local blocks", 
throwable)
-  val (hostLocalBmId, blockInfoSeq) = 
immutableHostLocalBlocksWithoutDirs.head
-  val (blockId, _, mapIndex) = blockInfoSeq.head
-  results.put(FailureFetchResult(blockId, mapIndex, hostLocalBmId, 
throwable))
+s"${hostLocalBlocksWithMissingDirs.mkString(", ")}")
+
+  // If the external shuffle service is enabled, we'll fetch the local 
directories for
+  // multiple executors from the external shuffle service, which located 
at the same host
+  // with the executors, in once. Otherwise, we'll fetch the local 
directories from those
+  // executors directly one by one. The fetch requests won't be too much 
since one host is
+  // almost impossible to have many executors at the same time practically.
+  val dirFetchRequests = if (blockManager.externalShuffleServiceEnabled) {
+val host = blockManager.blockManagerId.host
+val port = blockManager.externalShuffleServicePort
+Seq((host, port, hostLocalBlocksWithMissingDirs.keys.toArray))
+  } else {
+hostLocalBlocksWithMissingDirs.keys.map(bmId => (bmId.host, bmId.port, 
Array(bmId))).toSeq
+  }
+
+  dirFetchRequests.foreach { case (host, port, bmIds) =>
+hostLocalDirManager.getHostLocalDirs(host, port, 
bmIds.map(_.executorId)) {
+  case Success(dirsByExecId) =>
+fetchMultipleHostLocalBlocks(
+  hostLocalBlocksWithMissingDirs.filterKeys(bmIds.contains),
+  dirsByExecId,
+  cached = false)
+
+  case Failure(throwable) =>
+logError("Error occurred while fetching host local blocks", 
throwable)
+val bmId = bmIds.head
+val blockInfoSeq = hostLocalBlocksWithMissingDirs(bmId)
+val (blockId, _, mapIndex) = blockInfoSeq.head
+results.put(FailureFetchResult(blockId, mapIndex, bmId, throwable))
+}
   }
 }
+
 if (hostLocalBlocksWithCachedDirs.nonEmpty) {
   logDebug(s"Synchronous fetching host-local blocks with cached executors' 
dir: " +
   s"${hostLocalBlocksWithCachedDirs.mkString(", ")}")
-  hostLocalBlocksWithCachedDirs.foreach { case (_, blockInfos, localDirs) 
=>
-blockInfos.foreach { case (blockId, _, mapIndex) =>
-  if (!fetchHostLocalBlock(blockId, mapIndex, localDirs.get, bmId)) {

Review comment:
   @attilapiros  This looks like a bug before. The `bmId` is for the 
current executor but blocks can be other executors on the sa

[GitHub] [spark] cloud-fan commented on a change in pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


cloud-fan commented on a change in pull request #29535:
URL: https://github.com/apache/spark/pull/29535#discussion_r478179037



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
##
@@ -42,7 +42,9 @@ class UnresolvedException[TreeType <: TreeNode[_]](tree: 
TreeType, function: Str
  * @param multipartIdentifier table name
  */
 case class UnresolvedRelation(
-multipartIdentifier: Seq[String]) extends LeafNode with NamedRelation {
+multipartIdentifier: Seq[String],
+options: CaseInsensitiveMap[String] = 
CaseInsensitiveMap[String](Map.empty))

Review comment:
   can we use `CaseInsensitiveStringMap` since it's only for v2?








[GitHub] [spark] cloud-fan commented on pull request #29547: [SPARK-32705][SQL] Fix serialization issue for EmptyHashedRelation

2020-08-26 Thread GitBox


cloud-fan commented on pull request #29547:
URL: https://github.com/apache/spark/pull/29547#issuecomment-681618946


   thanks, merging to master!






[GitHub] [spark] cloud-fan closed pull request #29547: [SPARK-32705][SQL] Fix serialization issue for EmptyHashedRelation

2020-08-26 Thread GitBox


cloud-fan closed pull request #29547:
URL: https://github.com/apache/spark/pull/29547


   






[GitHub] [spark] Ngone51 commented on pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


Ngone51 commented on pull request #28911:
URL: https://github.com/apache/spark/pull/28911#issuecomment-681619106


   Thank you for the review. I've addressed the comments. Please take another 
look when you have time.






[GitHub] [spark] cloud-fan commented on pull request #25419: [SPARK-28698][SQL] Support user-specified output schema in `to_avro`

2020-08-26 Thread GitBox


cloud-fan commented on pull request #25419:
URL: https://github.com/apache/spark/pull/25419#issuecomment-681618395


   @dirrao in general only bug fixes can go to earlier branches.






[GitHub] [spark] Ngone51 commented on a change in pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


Ngone51 commented on a change in pull request #28911:
URL: https://github.com/apache/spark/pull/28911#discussion_r478175273



##
File path: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
##
@@ -466,42 +464,51 @@ final class ShuffleBlockFetcherIterator(
 val cachedDirsByExec = hostLocalDirManager.getCachedHostLocalDirs()
 val (hostLocalBlocksWithCachedDirs, hostLocalBlocksWithMissingDirs) =
   hostLocalBlocksByExecutor
-.map { case (hostLocalBmId, bmInfos) =>
-  (hostLocalBmId, bmInfos, 
cachedDirsByExec.get(hostLocalBmId.executorId))
+.map { case (hostLocalBmId, blockInfos) =>
+  (hostLocalBmId, blockInfos, 
cachedDirsByExec.get(hostLocalBmId.executorId))
 }.partition(_._3.isDefined)
-val bmId = blockManager.blockManagerId
 val immutableHostLocalBlocksWithoutDirs =
-  hostLocalBlocksWithMissingDirs.map { case (hostLocalBmId, bmInfos, _) =>
-hostLocalBmId -> bmInfos
+  hostLocalBlocksWithMissingDirs.map { case (hostLocalBmId, blockInfos, _) 
=>
+hostLocalBmId -> blockInfos
   }.toMap
 if (immutableHostLocalBlocksWithoutDirs.nonEmpty) {
   logDebug(s"Asynchronous fetching host-local blocks without cached 
executors' dir: " +
 s"${immutableHostLocalBlocksWithoutDirs.mkString(", ")}")
-  val execIdsWithoutDirs = 
immutableHostLocalBlocksWithoutDirs.keys.map(_.executorId).toArray
-  hostLocalDirManager.getHostLocalDirs(execIdsWithoutDirs) {
-case Success(dirs) =>
-  immutableHostLocalBlocksWithoutDirs.foreach { case (hostLocalBmId, 
blockInfos) =>
-blockInfos.takeWhile { case (blockId, _, mapIndex) =>
-  fetchHostLocalBlock(
-blockId,
-mapIndex,
-dirs.get(hostLocalBmId.executorId),
-hostLocalBmId)
+  val dirFetchRequests = if (blockManager.externalShuffleServiceEnabled) {
+val host = blockManager.blockManagerId.host
+val port = blockManager.externalShuffleServicePort
+Seq((host, port, immutableHostLocalBlocksWithoutDirs.keys.toArray))
+  } else {
+immutableHostLocalBlocksWithoutDirs.keys
+  .map(bmId => (bmId.host, bmId.port, Array(bmId))).toSeq
+  }
+
+  dirFetchRequests.foreach { case (host, port, bmIds) =>
+hostLocalDirManager.getHostLocalDirs(host, port, 
bmIds.map(_.executorId)) {
+  case Success(dirsByExecId) =>

Review comment:
   yes
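
The request-building step in the hunk above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `BmId` is a hypothetical stand-in for Spark's `BlockManagerId` carrying just the three fields the hunk uses, and `buildRequests` with its parameters is a made-up helper name, not part of the PR.

```scala
// Hypothetical stand-in for Spark's BlockManagerId; only the fields used here.
final case class BmId(executorId: String, host: String, port: Int)

object DirFetchRequestsDemo {
  // Builds (host, port, executors) tuples describing where to ask for local dirs.
  def buildRequests(
      missing: Seq[BmId],
      externalShuffleServiceEnabled: Boolean,
      localHost: String,
      shuffleServicePort: Int): Seq[(String, Int, Array[BmId])] = {
    if (externalShuffleServiceEnabled) {
      // One batched request: the shuffle service on this host knows every executor's dirs.
      Seq((localHost, shuffleServicePort, missing.toArray))
    } else {
      // No shuffle service: ask each executor directly, one request per executor.
      missing.map(id => (id.host, id.port, Array(id)))
    }
  }

  def main(args: Array[String]): Unit = {
    val ids = Seq(BmId("1", "host-a", 7001), BmId("2", "host-a", 7002))
    println(buildRequests(ids, externalShuffleServiceEnabled = true, "host-a", 7337).length)
    println(buildRequests(ids, externalShuffleServiceEnabled = false, "host-a", 7337).length)
  }
}
```

Keeping both branches in the same tuple shape is what lets a single `foreach` over the requests serve both modes.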








[GitHub] [spark] cloud-fan closed pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load() and DataStreamW

2020-08-26 Thread GitBox


cloud-fan closed pull request #29543:
URL: https://github.com/apache/spark/pull/29543


   






[GitHub] [spark] cloud-fan commented on pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load() and DataS

2020-08-26 Thread GitBox


cloud-fan commented on pull request #29543:
URL: https://github.com/apache/spark/pull/29543#issuecomment-681617465


   thanks, merging to master!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29555: [SPARK-32693][SQL] Compare two dataframes with same schema except nullable property

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29555:
URL: https://github.com/apache/spark/pull/29555#issuecomment-681600157










[GitHub] [spark] AmplabJenkins commented on pull request #29555: [SPARK-32693][SQL] Compare two dataframes with same schema except nullable property

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29555:
URL: https://github.com/apache/spark/pull/29555#issuecomment-681600157










[GitHub] [spark] SparkQA commented on pull request #29555: [SPARK-32693][SQL] Compare two dataframes with same schema except nullable property

2020-08-26 Thread GitBox


SparkQA commented on pull request #29555:
URL: https://github.com/apache/spark/pull/29555#issuecomment-681597858


   **[Test build #127946 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127946/testReport)**
 for PR 29555 at commit 
[`20b2ba1`](https://github.com/apache/spark/commit/20b2ba1404f1b780b501b1266b4458b099c0670b).






[GitHub] [spark] viirya opened a new pull request #29555: [SPARK-32693][SQL] Compare two dataframes with same schema except nullable property

2020-08-26 Thread GitBox


viirya opened a new pull request #29555:
URL: https://github.com/apache/spark/pull/29555


   
   
   ### What changes were proposed in this pull request?
   
   
   This PR changes the key data type check in `HashJoin` to use `sameType`.
   
   ### Why are the changes needed?
   
   
   Looking at the resolution condition of `SetOperation`, it requires only that 
each left data type be `sameType` as the corresponding right one. Logically, the 
`EqualTo` expression in an equi-join likewise requires only that the left data 
type be `sameType` as the right data type. Requiring the left key data types in 
`HashJoin` to be exactly the same as the right key data types therefore does not 
look reasonable.
   
   It produces inconsistent results when doing `except` between two dataframes.
   
   If the two dataframes have no nested fields, `HashJoin` passes the key type 
check even when their fields differ in nullability, because it checks the fields 
individually and so the nullable property is ignored.
   
   If the two dataframes have nested fields such as structs, `HashJoin` fails the 
key type check, because it now compares two struct types and the nullable 
property affects the comparison.
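
The difference between exact type equality and a nullability-insensitive comparison can be sketched with a toy model. Everything below (`DType`, `Field`, `StructT`, `asNullable`, and this `sameType`) is a hypothetical miniature written for illustration; it is not Spark's `DataType` hierarchy, and it only assumes that `sameType` means "equal once nullability is ignored".

```scala
// Hypothetical miniature type model; it is NOT Spark's DataType hierarchy.
sealed trait DType
case object IntT extends DType
case object StringT extends DType
final case class Field(name: String, dataType: DType, nullable: Boolean)
final case class StructT(fields: List[Field]) extends DType

object NullabilityDemo {
  // Rewrite a type so that every (nested) field is marked nullable.
  def asNullable(t: DType): DType = t match {
    case StructT(fs) =>
      StructT(fs.map(f => f.copy(dataType = asNullable(f.dataType), nullable = true)))
    case other => other
  }

  // Equality that ignores nullability, in the spirit of the `sameType` check.
  def sameType(a: DType, b: DType): Boolean = asNullable(a) == asNullable(b)

  def main(args: Array[String]): Unit = {
    // Two struct keys that differ only in the nullability of the nested field.
    val left = StructT(List(Field("id", IntT, nullable = false)))
    val right = StructT(List(Field("id", IntT, nullable = true)))
    println(left == right)          // exact equality sees the nullable flag
    println(sameType(left, right))  // nullability-insensitive comparison
  }
}
```

With flat keys the nullable flag lives outside the compared type, which is why only the nested case trips the exact-equality check.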
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   Yes. It makes the `except` operation between dataframes consistent.
   
   ### How was this patch tested?
   
   
   Unit test.






[GitHub] [spark] dirrao edited a comment on pull request #25419: [SPARK-28698][SQL] Support user-specified output schema in `to_avro`

2020-08-26 Thread GitBox


dirrao edited a comment on pull request #25419:
URL: https://github.com/apache/spark/pull/25419#issuecomment-681580344










[GitHub] [spark] dirrao commented on pull request #25419: [SPARK-28698][SQL] Support user-specified output schema in `to_avro`

2020-08-26 Thread GitBox


dirrao commented on pull request #25419:
URL: https://github.com/apache/spark/pull/25419#issuecomment-681580344


   is it possible to get the bug fix back-ported to 2.4.4?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #28911:
URL: https://github.com/apache/spark/pull/28911#issuecomment-681543539










[GitHub] [spark] AmplabJenkins commented on pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #28911:
URL: https://github.com/apache/spark/pull/28911#issuecomment-681543539










[GitHub] [spark] SparkQA commented on pull request #28911: [SPARK-32077][CORE] Support host-local shuffle data reading when external shuffle service is disabled

2020-08-26 Thread GitBox


SparkQA commented on pull request #28911:
URL: https://github.com/apache/spark/pull/28911#issuecomment-681541690


   **[Test build #127945 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127945/testReport)**
 for PR 28911 at commit 
[`2aa71f6`](https://github.com/apache/spark/commit/2aa71f659bb3be4ee7a7ae71beaf953fc77c4433).






[GitHub] [spark] huaxingao commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


huaxingao commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681480921


   @cloud-fan @viirya Could you please check one more time? Thanks!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681461215










[GitHub] [spark] AmplabJenkins commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681461215










[GitHub] [spark] SparkQA removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681189913


   **[Test build #127943 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127943/testReport)**
 for PR 29535 at commit 
[`0bc7698`](https://github.com/apache/spark/commit/0bc76982c4873b184096fce33981321dec751d84).






[GitHub] [spark] SparkQA commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681456113


   **[Test build #127943 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127943/testReport)**
 for PR 29535 at commit 
[`0bc7698`](https://github.com/apache/spark/commit/0bc76982c4873b184096fce33981321dec751d84).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681342930










[GitHub] [spark] AmplabJenkins commented on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681342930










[GitHub] [spark] SparkQA commented on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


SparkQA commented on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681339756


   **[Test build #127941 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127941/testReport)**
 for PR 27019 at commit 
[`a43aa25`](https://github.com/apache/spark/commit/a43aa25943e4a253764d5885f0d28c2f8743baa9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `trait GeneratePredicateHelper extends PredicateHelper `
 * `case class FilterExec(condition: Expression, child: SparkPlan)`






[GitHub] [spark] SparkQA removed a comment on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681173648


   **[Test build #127941 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127941/testReport)**
 for PR 27019 at commit 
[`a43aa25`](https://github.com/apache/spark/commit/a43aa25943e4a253764d5885f0d28c2f8743baa9).






[GitHub] [spark] c21 commented on pull request #29547: [SPARK-32705][SQL] Fix serialization issue for EmptyHashedRelation

2020-08-26 Thread GitBox


c21 commented on pull request #29547:
URL: https://github.com/apache/spark/pull/29547#issuecomment-681328705


   > I considered adding a UT for this issue, too. But the issue was a wrong 
usage of a Scala object with the Externalizable interface. Since that has been 
removed, I think the Scala case object itself can guarantee serialization. 
Besides, I've run the entire TPCDS to make sure of the E2E coverage, so I think 
it should be enough. ^_^
   
   @leanken - cool, thanks for testing this end to end. :)






[GitHub] [spark] SparkQA removed a comment on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681314628


   **[Test build #127944 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127944/testReport)**
 for PR 29539 at commit 
[`d3888d4`](https://github.com/apache/spark/commit/d3888d45ef8c5cedd5a15944be7100b5fcdb9dd8).






[GitHub] [spark] AmplabJenkins commented on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681323342










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681323342










[GitHub] [spark] SparkQA commented on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


SparkQA commented on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681323197


   **[Test build #127944 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127944/testReport)**
 for PR 29539 at commit 
[`d3888d4`](https://github.com/apache/spark/commit/d3888d45ef8c5cedd5a15944be7100b5fcdb9dd8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] yaooqinn commented on a change in pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


yaooqinn commented on a change in pull request #29539:
URL: https://github.com/apache/spark/pull/29539#discussion_r478008853



##
File path: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/SparkMetadataOperationSuite.scala
##
@@ -333,4 +333,31 @@ class SparkMetadataOperationSuite extends HiveThriftJdbcTest {
   assert(pos === 17, "all columns should have been verified")
 }
   }
+
+  test("get columns operation should handle interval column properly") {

Review comment:
   done, thank you guys for the check~








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681315044










[GitHub] [spark] AmplabJenkins commented on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681315044










[GitHub] [spark] SparkQA commented on pull request #29539: [SPARK-32696][SQL][test-hive1.2][test-hadoop2.7]Get columns operation should handle interval column properly

2020-08-26 Thread GitBox


SparkQA commented on pull request #29539:
URL: https://github.com/apache/spark/pull/29539#issuecomment-681314628


   **[Test build #127944 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127944/testReport)**
 for PR 29539 at commit 
[`d3888d4`](https://github.com/apache/spark/commit/d3888d45ef8c5cedd5a15944be7100b5fcdb9dd8).






[GitHub] [spark] manuzhang edited a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union

2020-08-26 Thread GitBox


manuzhang edited a comment on pull request #28947:
URL: https://github.com/apache/spark/pull/28947#issuecomment-681295842


   @cloud-fan @LantaoJin 
   Is there any progress, or any suggestion on how we can move forward with 
this improvement? We've seen a lot of skew joins that AQE does not handle 
because of a union.
   
   cc @maryannxue @JkSelf 






[GitHub] [spark] manuzhang commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union

2020-08-26 Thread GitBox


manuzhang commented on pull request #28947:
URL: https://github.com/apache/spark/pull/28947#issuecomment-681295842


   @cloud-fan @LantaoJin 
   Is there any progress, or any suggestion on how we can move forward with 
this improvement? We've seen a lot of skew joins that AQE does not handle 
because of a union.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29554: [SPARK-32707] Support File Based spark.authenticate secret in Mesos Backend

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29554:
URL: https://github.com/apache/spark/pull/29554#issuecomment-681290492


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29554: [SPARK-32707] Support File Based spark.authenticate secret in Mesos Backend

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29554:
URL: https://github.com/apache/spark/pull/29554#issuecomment-681292248


   Can one of the admins verify this patch?






[GitHub] [spark] leanken commented on pull request #29547: [SPARK-32705][SQL] Fix serialization issue for EmptyHashedRelation

2020-08-26 Thread GitBox


leanken commented on pull request #29547:
URL: https://github.com/apache/spark/pull/29547#issuecomment-681291807


   @cloud-fan Test passed, ready to merge.






[GitHub] [spark] leanken commented on pull request #29547: [SPARK-32705][SQL] Fix serialization issue for EmptyHashedRelation

2020-08-26 Thread GitBox


leanken commented on pull request #29547:
URL: https://github.com/apache/spark/pull/29547#issuecomment-681291235


   > Thanks @leanken for the fix. Just wondering is it possible to add a unit 
test for the issue?
   
   I considered adding a UT for this issue, too. But the issue was a wrong 
usage of a Scala object with the Externalizable interface. Since that has been 
removed, I think the Scala case object itself can guarantee serialization. 
Besides, I've run the entire TPCDS to make sure of the E2E coverage, so I think 
it should be enough. ^_^
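
The point about case objects can be checked in isolation. The following is a minimal sketch, assuming plain Java serialization; `Relation`, `EmptyRelation` and `SerializationCheck` are hypothetical stand-ins for illustration and are not Spark's actual classes.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-ins for illustration; these are not Spark's classes.
sealed trait Relation
case object EmptyRelation extends Relation

object SerializationCheck {
  // Write a value with plain Java serialization and read it back.
  def roundTrip(value: AnyRef): AnyRef = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject() finally in.close()
  }
}
```

Calling `SerializationCheck.roundTrip(EmptyRelation)` needs no hand-written `writeExternal`/`readExternal`, because a case object is `Serializable` by default. For a top-level case object the result is also expected to be the same singleton instance, although that depends on how the object is compiled and is worth verifying in the target build.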






[GitHub] [spark] AmplabJenkins commented on pull request #29554: [SPARK-32707] Support File Based spark.authenticate secret in Mesos Backend

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29554:
URL: https://github.com/apache/spark/pull/29554#issuecomment-681290492


   Can one of the admins verify this patch?






[GitHub] [spark] farhan5900 opened a new pull request #29554: [SPARK-32707] Support File Based spark.authenticate secret in Mesos Backend

2020-08-26 Thread GitBox


farhan5900 opened a new pull request #29554:
URL: https://github.com/apache/spark/pull/29554


   
   ### What changes were proposed in this pull request?
   This PR adds support for a file-based authentication secret in the Mesos 
backend by using the existing configs 
`spark.authenticate.secret.file`, `spark.authenticate.secret.driver.file` and 
`spark.authenticate.secret.executor.file`.
   
   ### Why are the changes needed?
   Passing a secret in a file is more secure than passing it as plain text.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added a unit test case.
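
As a rough illustration of the idea, the sketch below resolves a secret from a file path when one is configured. It is not the code added by this PR: `SecretResolver` is a hypothetical helper, and both the precedence (file first, then the plain-text `spark.authenticate.secret`) and the whitespace trimming are assumptions made for the example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical helper for illustration; not the code added by the PR.
object SecretResolver {
  val SecretKey = "spark.authenticate.secret"
  val SecretFileKey = "spark.authenticate.secret.file"

  // Assumed precedence: use the file-based secret when a path is configured,
  // otherwise fall back to the plain-text setting.
  def resolve(conf: Map[String, String]): Option[String] =
    conf.get(SecretFileKey)
      .map { path =>
        // Trimming surrounding whitespace is an assumption of this sketch.
        new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8).trim
      }
      .orElse(conf.get(SecretKey))
}
```

With a file-based setting, the secret value itself does not have to appear in the configuration map, only the path to it.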






[GitHub] [spark] manuzhang edited a comment on pull request #29540: [SPARK-32698][SQL] Do not fall back to default parallelism if the minimum number of coalesced partitions is not set in AQE

2020-08-26 Thread GitBox


manuzhang edited a comment on pull request #29540:
URL: https://github.com/apache/spark/pull/29540#issuecomment-681198316


   @Dooyoung-Hwang Thanks for the comments.
   
   > At first glance, performance is more important than having a small 
number of files.
   > We had better be careful because we don't want any performance regression.
   
   I thought so too, but let me ask how much of a performance difference counts 
as a regression. Will a user notice a 5-minute increase in running time if the 
whole job takes more than 30 minutes? Probably not, in my experience, but 
**users have immediately reported the increased number of small files compared 
with the previous day because that had an impact on their downstream jobs in 
the pipeline**. 
   
   Moreover, my argument is about the uncertainty and complexity that the 
fallback mechanism has brought in. Even if the job runs slower or even crashes 
due to coalescing, we can suggest that the user tune down `advisoryTargetSize` 
or tune up executor memory. Once it works, it's done. It won't change across 
several runs as it does with the fallback to default parallelism. With the 
fallback, meanwhile, we have to explain why `advisoryTargetSize` isn't working 
and why it's different from yesterday.
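
The effect described here can be shown with a simplified model. The sketch below is not Spark's implementation: `CoalesceSketch`, `coalesce` and `effectiveTarget` are hypothetical names, and the rule that the fallback shrinks the target so that at least `minPartitions` partitions come out is an assumption made for illustration.

```scala
// Simplified, hypothetical model of partition coalescing; not Spark's code.
object CoalesceSketch {
  // Greedily merge adjacent partitions while the merged size stays within
  // targetSize. Returns the sizes of the coalesced partitions.
  def coalesce(sizes: Seq[Long], targetSize: Long): Seq[Long] =
    sizes.foldLeft(Vector.empty[Long]) { (merged, size) =>
      merged.lastOption match {
        case Some(last) if last + size <= targetSize => merged.init :+ (last + size)
        case _ => merged :+ size
      }
    }

  // Assumed fallback model: cap the advisory target so that the total input
  // is spread over at least minPartitions partitions.
  def effectiveTarget(sizes: Seq[Long], advisoryTargetSize: Long, minPartitions: Int): Long = {
    val perPartition = math.ceil(sizes.sum.toDouble / minPartitions).toLong
    math.min(advisoryTargetSize, math.max(perPartition, 1L))
  }
}
```

In this model, eight partitions of size 10 with an advisory target of 64 coalesce into two partitions when only the advisory target applies. With a minimum of 4 partitions the effective target drops to 20 and four partitions come out; with a minimum of 8, nothing is coalesced. If the minimum follows the default parallelism, the number of output files can therefore differ between runs even though the data and the advisory target are unchanged.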






[GitHub] [spark] manuzhang commented on pull request #29540: [SPARK-32698][SQL] Do not fall back to default parallelism if the minimum number of coalesced partitions is not set in AQE

2020-08-26 Thread GitBox


manuzhang commented on pull request #29540:
URL: https://github.com/apache/spark/pull/29540#issuecomment-681198316


   @Dooyoung-Hwang Thanks for the comments.
   
   > At first glance, performance is more important than having a small 
number of files.
   > We had better be careful because we don't want any performance regression.
   
   I thought so too, but let me ask how much of a performance difference counts 
as a regression. Will a user notice a 5-minute increase in running time if the 
whole job takes more than 30 minutes? Probably not, in my experience, but 
**users have immediately reported the increased number of small files compared 
with the previous day because that had an impact on their downstream jobs in 
the pipeline**. 
   
   Moreover, my argument is about the uncertainty and complexity that the 
fallback mechanism has brought in. Even if the job runs slower or even crashes 
due to coalescing, we can suggest that the user tune down `advisoryTargetSize` 
or tune up executor memory. Once it works, it's done. It won't change across 
several runs as it does when falling back to default parallelism. With the 
fallback, meanwhile, we have to explain why `advisoryTargetSize` isn't working 
and why it's different from yesterday.






[GitHub] [spark] manuzhang edited a comment on pull request #29540: [SPARK-32698][SQL] Do not fall back to default parallelism if the minimum number of coalesced partitions is not set in AQE

2020-08-26 Thread GitBox


manuzhang edited a comment on pull request #29540:
URL: https://github.com/apache/spark/pull/29540#issuecomment-681198316


   @Dooyoung-Hwang Thanks for the comments.
   
   > At first glance, performance is more important than having a small 
number of files.
   > We had better be careful because we don't want any performance regression.
   
   I thought so too, but let me ask how much of a performance difference counts 
as a regression. Will a user notice a 5-minute increase in running time if the 
whole job takes more than 30 minutes? Probably not, in my experience, but 
**users have immediately reported the increased number of small files compared 
with the previous day because that had an impact on their downstream jobs in 
the pipeline**. 
   
   Moreover, my argument is about the uncertainty and complexity that the 
fallback mechanism has brought in. Even if the job runs slower or even crashes 
due to coalescing, we can suggest that the user tune down `advisoryTargetSize` 
or tune up executor memory. Once it works, it's done. It won't change across 
several runs as it does when falling back to default parallelism. With the 
fallback, meanwhile, we have to explain why `advisoryTargetSize` isn't working 
and why it's different from yesterday.






[GitHub] [spark] github-actions[bot] closed pull request #25609: [SPARK-28896][K8S] Support defining HADOOP_CONF_DIR and config map at the same time

2020-08-26 Thread GitBox


github-actions[bot] closed pull request #25609:
URL: https://github.com/apache/spark/pull/25609


   






[GitHub] [spark] github-actions[bot] commented on pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2020-08-26 Thread GitBox


github-actions[bot] commented on pull request #25651:
URL: https://github.com/apache/spark/pull/25651#issuecomment-681195134


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] github-actions[bot] commented on pull request #28557: [SPARK-31737][SQL] SparkSQL can't recognize the modified length of Hive varchar

2020-08-26 Thread GitBox


github-actions[bot] commented on pull request #28557:
URL: https://github.com/apache/spark/pull/28557#issuecomment-681195109


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] github-actions[bot] commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-08-26 Thread GitBox


github-actions[bot] commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-681195120


   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681190302










[GitHub] [spark] AmplabJenkins commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681190302










[GitHub] [spark] SparkQA commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681189913


   **[Test build #127943 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127943/testReport)**
 for PR 29535 at commit 
[`0bc7698`](https://github.com/apache/spark/commit/0bc76982c4873b184096fce33981321dec751d84).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681183029


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127942/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681181596


   **[Test build #127942 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127942/testReport)**
 for PR 29535 at commit 
[`05b55cb`](https://github.com/apache/spark/commit/05b55cbb5107597fc85687c355ecb0b983605f8e).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681183024


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681183024










[GitHub] [spark] SparkQA commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681183010


   **[Test build #127942 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127942/testReport)**
 for PR 29535 at commit 
[`05b55cb`](https://github.com/apache/spark/commit/05b55cbb5107597fc85687c355ecb0b983605f8e).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681181901










[GitHub] [spark] AmplabJenkins commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681181901










[GitHub] [spark] SparkQA commented on pull request #29535: [SPARK-32592][SQL] Make DataFrameReader.table take the specified options

2020-08-26 Thread GitBox


SparkQA commented on pull request #29535:
URL: https://github.com/apache/spark/pull/29535#issuecomment-681181596


   **[Test build #127942 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127942/testReport)** for PR 29535 at commit [`05b55cb`](https://github.com/apache/spark/commit/05b55cbb5107597fc85687c355ecb0b983605f8e).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681178909










[GitHub] [spark] AmplabJenkins commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681178909










[GitHub] [spark] SparkQA removed a comment on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681173625


   **[Test build #127940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127940/testReport)** for PR 29553 at commit [`844fe5c`](https://github.com/apache/spark/commit/844fe5c87fff19042e3abf84be590ce9deb0eac7).






[GitHub] [spark] SparkQA commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


SparkQA commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681178757


   **[Test build #127940 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127940/testReport)** for PR 29553 at commit [`844fe5c`](https://github.com/apache/spark/commit/844fe5c87fff19042e3abf84be590ce9deb0eac7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681174039










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681173969










[GitHub] [spark] AmplabJenkins commented on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681174039










[GitHub] [spark] AmplabJenkins commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681173969










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681171525


   Can one of the admins verify this patch?






[GitHub] [spark] SparkQA commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


SparkQA commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681173625


   **[Test build #127940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127940/testReport)** for PR 29553 at commit [`844fe5c`](https://github.com/apache/spark/commit/844fe5c87fff19042e3abf84be590ce9deb0eac7).






[GitHub] [spark] SparkQA commented on pull request #27019: [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec

2020-08-26 Thread GitBox


SparkQA commented on pull request #27019:
URL: https://github.com/apache/spark/pull/27019#issuecomment-681173648


   **[Test build #127941 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127941/testReport)** for PR 27019 at commit [`a43aa25`](https://github.com/apache/spark/commit/a43aa25943e4a253764d5885f0d28c2f8743baa9).






[GitHub] [spark] maropu commented on pull request #29544: [SPARK-32704][SQL] Logging plan changes for execution

2020-08-26 Thread GitBox


maropu commented on pull request #29544:
URL: https://github.com/apache/spark/pull/29544#issuecomment-681173222


   cc: @cloud-fan @viirya 






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681171240


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681171525


   Can one of the admins verify this patch?






[GitHub] [spark] viirya commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


viirya commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681171481


   ok to test






[GitHub] [spark] AmplabJenkins commented on pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29553:
URL: https://github.com/apache/spark/pull/29553#issuecomment-681171240


   Can one of the admins verify this patch?






[GitHub] [spark] Louiszr opened a new pull request #29553: [SPARK-32092][ML][PySpark][FOLLOWUP] Fixed CrossValidatorModel.copy() to copy models instead of list

2020-08-26 Thread GitBox


Louiszr opened a new pull request #29553:
URL: https://github.com/apache/spark/pull/29553


   
   
   ### What changes were proposed in this pull request?
   
   Fixed `CrossValidatorModel.copy()` so that it correctly calls `.copy()` on the models instead of lists of models.
   
   ### Why are the changes needed?
   
   `copy()` was first changed in #29445. The issue was found in the CI run of #29524 and fixed there. This PR introduces the exact same change so that `CrossValidatorModel.copy()` and its related tests are aligned between branch `master` and branch `branch-3.0`.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Updated `test_copy` to make sure `copy()` is called on models instead of lists of models.
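
The distinction described above can be illustrated with a minimal sketch. It does not use PySpark and is not the code from this PR: `ToyModel`, `copy_sub_models_on_lists`, and `copy_sub_models_on_models` are hypothetical names, and the nested-list layout (one inner list of models per fold) is an assumption made for illustration. The point is only that calling `.copy()` on a Python list returns a new list that still holds the same model objects, whereas calling `.copy()` on each model returns independent models.

```python
class ToyModel:
    """Hypothetical stand-in for a fitted model; not a PySpark class."""

    def __init__(self, params):
        self.params = dict(params)

    def copy(self, extra=None):
        # Return a new model whose params are this model's params plus `extra`.
        merged = dict(self.params)
        merged.update(extra or {})
        return ToyModel(merged)


def copy_sub_models_on_lists(sub_models):
    # Calls list.copy() on each inner list: the lists are new,
    # but the model objects inside them are shared with the original.
    return [fold.copy() for fold in sub_models]


def copy_sub_models_on_models(sub_models):
    # Calls .copy() on each model inside each inner list,
    # so every model in the result is a distinct object.
    return [[model.copy() for model in fold] for fold in sub_models]
```

A test in the spirit of the updated `test_copy` could then check object identity, for example that `copy_sub_models_on_models(subs)[0][0] is not subs[0][0]`.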






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29543:
URL: https://github.com/apache/spark/pull/29543#issuecomment-681168945










[GitHub] [spark] AmplabJenkins commented on pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load() and D

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29543:
URL: https://github.com/apache/spark/pull/29543#issuecomment-681168945










[GitHub] [spark] SparkQA commented on pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load() and DataStr

2020-08-26 Thread GitBox


SparkQA commented on pull request #29543:
URL: https://github.com/apache/spark/pull/29543#issuecomment-681168362


   **[Test build #127932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127932/testReport)** for PR 29543 at commit [`261b609`](https://github.com/apache/spark/commit/261b6097907097a77b3c85606510e19c85a648d0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load() and

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29543:
URL: https://github.com/apache/spark/pull/29543#issuecomment-681050019


   **[Test build #127932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127932/testReport)** for PR 29543 at commit [`261b609`](https://github.com/apache/spark/commit/261b6097907097a77b3c85606510e19c85a648d0).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681166024










[GitHub] [spark] AmplabJenkins commented on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681166024










[GitHub] [spark] SparkQA commented on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


SparkQA commented on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681165324


   **[Test build #127937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127937/testReport)** for PR 29552 at commit [`47075de`](https://github.com/apache/spark/commit/47075dee4d22d0617b59121168c1a72498f3051d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681092628


   **[Test build #127937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127937/testReport)** for PR 29552 at commit [`47075de`](https://github.com/apache/spark/commit/47075dee4d22d0617b59121168c1a72498f3051d).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681163259










[GitHub] [spark] AmplabJenkins commented on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681163259










[GitHub] [spark] SparkQA removed a comment on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


SparkQA removed a comment on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681099313


   **[Test build #127939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127939/testReport)** for PR 29552 at commit [`1062cb9`](https://github.com/apache/spark/commit/1062cb999ba13f839f71472a013ad5dde7da156a).






[GitHub] [spark] SparkQA commented on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


SparkQA commented on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681162454


   **[Test build #127939 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127939/testReport)** for PR 29552 at commit [`1062cb9`](https://github.com/apache/spark/commit/1062cb999ba13f839f71472a013ad5dde7da156a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29550: [SPARK-32481][TESTS][FOLLOWUP] Use different directory name for MacOS

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29550:
URL: https://github.com/apache/spark/pull/29550#issuecomment-681157807










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins removed a comment on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681157961










[GitHub] [spark] AmplabJenkins commented on pull request #29552: [SPARK-32481][CORE][SQL][test-hadoop2.7][test-hive1.2] Support truncate table to move data to trash

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29552:
URL: https://github.com/apache/spark/pull/29552#issuecomment-681157961










[GitHub] [spark] AmplabJenkins commented on pull request #29550: [SPARK-32481][TESTS][FOLLOWUP] Use different directory name for MacOS

2020-08-26 Thread GitBox


AmplabJenkins commented on pull request #29550:
URL: https://github.com/apache/spark/pull/29550#issuecomment-681157807









