[GitHub] [spark] AmplabJenkins removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28999:
URL: https://github.com/apache/spark/pull/28999#issuecomment-654056075


   Merged build finished. Test FAILed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654055697


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125033/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-654056087


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28969: [SPARK-32150][BUILD] Upgrade to ZStd 1.4.5-4

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28969:
URL: https://github.com/apache/spark/pull/28969#issuecomment-654007651


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124934/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #29006:
URL: https://github.com/apache/spark/pull/29006#issuecomment-654056389










[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-654056588










[GitHub] [spark] HeartSaVioR commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-06 Thread GitBox


HeartSaVioR commented on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-654065873


   Thanks for the links. That's all I wanted to see.
   
   > This is redundant code duplicating the package-private JDK counterpart. As the code is not a perfect match, it could even happen that one method returns a slightly different (but semantically equal) path.
   
   Yeah, I just wanted to see which code the JDK would run to normalize the path by itself (so the comment `here the old createNormalizedInternedPathname was as good as it could imitate the java.io.FileSystem#normalize()` is the answer for me), and honestly I didn't know the method name would be just "normalize". (I should have just tried finding it myself. My bad.)
   
   For sure, I prefer to follow the normalization provided by the JDK, which at least doesn't use a regex, which would be slower than char manipulation. That said, I agree that we can feel confident excluding the test part as well, since the code is replaced with the JDK one, which we tend to trust.
   
   That said, assuming we never create weird file names containing separators, the only thing the normalization affects is localDirs - we could probably pay the cost only once per entry to normalize it, and avoid normalizing in all further calls.






[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-06 Thread GitBox


SparkQA commented on pull request #28961:
URL: https://github.com/apache/spark/pull/28961#issuecomment-654072848


   **[Test build #125045 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125045/testReport)** for PR 28961 at commit [`3811ae9`](https://github.com/apache/spark/commit/3811ae93c2966d87619624670e338b7c6d34b7d4).






[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver

2020-07-06 Thread GitBox


SparkQA commented on pull request #28991:
URL: https://github.com/apache/spark/pull/28991#issuecomment-654077152


   **[Test build #125039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125039/testReport)** for PR 28991 at commit [`262d306`](https://github.com/apache/spark/commit/262d3060cd5794f094ebf05e18b7154aaeb8511c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HeartSaVioR edited a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-06 Thread GitBox


HeartSaVioR edited a comment on pull request #28967:
URL: https://github.com/apache/spark/pull/28967#issuecomment-654065873


   Thanks for the links. That's all I wanted to see.
   
   > This is redundant code duplicating the package-private JDK counterpart. As the code is not a perfect match, it could even happen that one method returns a slightly different (but semantically equal) path.
   
   Yeah, I just wanted to see which code the JDK would run to normalize the path by itself (so the comment `here the old createNormalizedInternedPathname was as good as it could imitate the java.io.FileSystem#normalize()` is the answer for me), and honestly I didn't know the method name would be just "normalize". (I should have just tried finding it myself. My bad.)
   
   For sure, I prefer to follow the normalization provided by the JDK, which at least doesn't use a regex, which would be slower than char manipulation. That said, I agree that we can feel confident excluding the test part as well, since the code is replaced with the JDK one, which we tend to trust.
   
   That said, assuming we never create weird file names containing separators, the only thing the normalization affects is localDirs - we could probably pay the cost only once per entry to normalize it, and avoid normalizing in all further calls. (I meant the path being changed during normalization. The normalization check itself can't be avoided, as the JDK will do it.)
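
   The "normalize each localDirs entry once" idea can be sketched as follows. This is only an illustration under stated assumptions: the class `NormalizedLocalDirs` and its method `pathFor` are hypothetical names, not Spark code. It relies on `java.io.File` normalizing the path it is constructed with, which `getPath()` then returns.

```java
import java.io.File;

// Hypothetical helper, not Spark code: normalizes each local dir once up front.
final class NormalizedLocalDirs {
  private final String[] dirs;

  NormalizedLocalDirs(String[] localDirs) {
    dirs = new String[localDirs.length];
    for (int i = 0; i < localDirs.length; i++) {
      // java.io.File normalizes the path in its constructor (duplicate
      // separators collapsed, trailing separator dropped); getPath() returns it.
      dirs[i] = new File(localDirs[i]).getPath();
    }
  }

  // Builds the full path from the pre-normalized dir and interns the result,
  // so equal paths share a single String instance.
  String pathFor(int dirIndex, int subDirId, String filename) {
    String path = dirs[dirIndex] + File.separator
        + String.format("%02x", subDirId) + File.separator + filename;
    // The JDK still runs its own normalization check here; for an already
    // normalized path it has nothing to rewrite.
    return new File(path).getPath().intern();
  }
}
```

   The final round trip through `File` is kept on purpose, so a filename that unexpectedly contains separators is still normalized.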






[GitHub] [spark] attilapiros commented on a change in pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-06 Thread GitBox


attilapiros commented on a change in pull request #28967:
URL: https://github.com/apache/spark/pull/28967#discussion_r450045017



##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java
##
@@ -45,34 +31,16 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
 int hash = JavaUtils.nonNegativeHash(filename);
 String localDir = localDirs[hash % localDirs.length];
 int subDirId = (hash / localDirs.length) % subDirsPerLocalDir;
-return new File(createNormalizedInternedPathname(
-localDir, String.format("%02x", subDirId), filename));
-  }
-
-  /**
-   * This method is needed to avoid the situation when multiple File instances for the
-   * same pathname "foo/bar" are created, each with a separate copy of the "foo/bar" String.
-   * According to measurements, in some scenarios such duplicate strings may waste a lot
-   * of memory (~ 10% of the heap). To avoid that, we intern the pathname, and before that
-   * we make sure that it's in a normalized form (contains no "//", "///" etc.) Otherwise,
-   * the internal code in java.io.File would normalize it later, creating a new "foo/bar"
-   * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File
-   * uses, since it is in the package-private class java.io.FileSystem.
-   *
-   * On Windows, separator "\" is used instead of "/".
-   *
-   * "\\" is a legal character in path name on Unix-like OS, but illegal on Windows.
-   */
-  @VisibleForTesting
-  static String createNormalizedInternedPathname(String dir1, String dir2, String fname) {
-String pathname = dir1 + File.separator + dir2 + File.separator + fname;
-Matcher m = MULTIPLE_SEPARATORS.matcher(pathname);
-pathname = m.replaceAll(Matcher.quoteReplacement(File.separator));
-// A single trailing slash needs to be taken care of separately
-if (pathname.length() > 1 && pathname.charAt(pathname.length() - 1) == File.separatorChar) {
-  pathname = pathname.substring(0, pathname.length() - 1);
-}
-return pathname.intern();
+final String notNormalizedPath =
+  localDir + File.separator + String.format("%02x", subDirId) + File.separator + filename;
+// Interning the normalized path as according to measurements, in some scenarios such
+// duplicate strings may waste a lot of memory (~ 10% of the heap).
+// Unfortunately, we cannot just call the normalization code that java.io.File
+// uses, since it is in the package-private class java.io.FileSystem.
+// So we are creating a File just to get the normalized path back to intern it.
+// Finally a new File is built and returned with this interned normalized path.
+final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();

Review comment:
   I am sorry, but this is not how interning a String works. The `intern()` method gives back a reference to the String in the pool; when the receiver is not already that pooled instance, this is a different reference. So `this` cannot be re-referenced, as several places could refer to it. This example makes it clear:
   
   ```java
   public class InternExample {
     public static void main(String args[]) {
       String s1 = new String("hello");
       String s2 = "hello";
       String s3 = s1.intern(); // returns string from pool, now it will be same as s2
       System.out.println(s1 == s2); // false because reference is different
       System.out.println(s2 == s3); // true because reference is same
     }
   }
   ```
   
   You can run it here: https://www.javatpoint.com/java-string-intern.
   
   








[GitHub] [spark] SparkQA commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir

2020-07-06 Thread GitBox


SparkQA commented on pull request #28997:
URL: https://github.com/apache/spark/pull/28997#issuecomment-654083343


   **[Test build #125048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125048/testReport)** for PR 28997 at commit [`d8ffb71`](https://github.com/apache/spark/commit/d8ffb712b0820bfb4dbcae2c908a94a3a22c6397).






[GitHub] [spark] sidedoorleftroad commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir

2020-07-06 Thread GitBox


sidedoorleftroad commented on pull request #28997:
URL: https://github.com/apache/spark/pull/28997#issuecomment-654083284


   The code has been modified.
   Submitting code for the first time is nerve-racking.
   Thanks a lot!
   @dongjoon-hyun @HeartSaVioR 






[GitHub] [spark] HeartSaVioR commented on a change in pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service

2020-07-06 Thread GitBox


HeartSaVioR commented on a change in pull request #28967:
URL: https://github.com/apache/spark/pull/28967#discussion_r450052572



##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java
##
@@ -45,34 +31,16 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
 int hash = JavaUtils.nonNegativeHash(filename);
 String localDir = localDirs[hash % localDirs.length];
 int subDirId = (hash / localDirs.length) % subDirsPerLocalDir;
-return new File(createNormalizedInternedPathname(
-localDir, String.format("%02x", subDirId), filename));
-  }
-
-  /**
-   * This method is needed to avoid the situation when multiple File instances for the
-   * same pathname "foo/bar" are created, each with a separate copy of the "foo/bar" String.
-   * According to measurements, in some scenarios such duplicate strings may waste a lot
-   * of memory (~ 10% of the heap). To avoid that, we intern the pathname, and before that
-   * we make sure that it's in a normalized form (contains no "//", "///" etc.) Otherwise,
-   * the internal code in java.io.File would normalize it later, creating a new "foo/bar"
-   * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File
-   * uses, since it is in the package-private class java.io.FileSystem.
-   *
-   * On Windows, separator "\" is used instead of "/".
-   *
-   * "\\" is a legal character in path name on Unix-like OS, but illegal on Windows.
-   */
-  @VisibleForTesting
-  static String createNormalizedInternedPathname(String dir1, String dir2, String fname) {
-String pathname = dir1 + File.separator + dir2 + File.separator + fname;
-Matcher m = MULTIPLE_SEPARATORS.matcher(pathname);
-pathname = m.replaceAll(Matcher.quoteReplacement(File.separator));
-// A single trailing slash needs to be taken care of separately
-if (pathname.length() > 1 && pathname.charAt(pathname.length() - 1) == File.separatorChar) {
-  pathname = pathname.substring(0, pathname.length() - 1);
-}
-return pathname.intern();
+final String notNormalizedPath =
+  localDir + File.separator + String.format("%02x", subDirId) + File.separator + filename;
+// Interning the normalized path as according to measurements, in some scenarios such
+// duplicate strings may waste a lot of memory (~ 10% of the heap).
+// Unfortunately, we cannot just call the normalization code that java.io.File
+// uses, since it is in the package-private class java.io.FileSystem.
+// So we are creating a File just to get the normalized path back to intern it.
+// Finally a new File is built and returned with this interned normalized path.
+final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();

Review comment:
   Ah yes, you're right. My bad, I was a bit confused. Thanks for correcting me! So we have to do the normalization check twice, but the second one would just be a string scan.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28997:
URL: https://github.com/apache/spark/pull/28997#issuecomment-654083726










[GitHub] [spark] AmplabJenkins commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28997:
URL: https://github.com/apache/spark/pull/28997#issuecomment-654083726










[GitHub] [spark] AmplabJenkins commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-654097454










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-654097454










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28912:
URL: https://github.com/apache/spark/pull/28912#issuecomment-654097261










[GitHub] [spark] SparkQA commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window

2020-07-06 Thread GitBox


SparkQA commented on pull request #28987:
URL: https://github.com/apache/spark/pull/28987#issuecomment-654107835


   **[Test build #125050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125050/testReport)** for PR 28987 at commit [`70da8b5`](https://github.com/apache/spark/commit/70da8b5299096942af5926b4222990e135ceffaa).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] cloud-fan commented on a change in pull request #28975: [SPARK-32148][SS] Fix stream-stream join issue on missing to copy reused unsafe row

2020-07-06 Thread GitBox


cloud-fan commented on a change in pull request #28975:
URL: https://github.com/apache/spark/pull/28975#discussion_r450086968



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala
##
@@ -259,6 +269,9 @@ class SymmetricHashJoinStateManager(
   return null
 }
 
+// Make a copy on value row, as below cleanup logic may update the value row silently.
+currentValue = currentValue.copy(value = currentValue.value.copy())

Review comment:
   After seeing the new changes, I think the first version looks better. The caller side is nested and we still have necessary copies for v1 format. What do you think? @viirya
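
   To illustrate why the copy in this hunk matters, here is a minimal sketch. It assumes a hypothetical `MutableRow` class standing in for a reused unsafe row; none of these names are Spark APIs. When a reader updates one shared row in place, any reference kept from an earlier step sees the later values unless it was copied first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row object that a reader reuses between calls.
final class MutableRow {
  int value;
  MutableRow(int value) { this.value = value; }
  MutableRow copy() { return new MutableRow(value); }
}

final class ReusedRowDemo {
  // Collects rows from a simulated reader that updates one shared row in place.
  // With copyRows == false every list element aliases that shared row.
  static List<MutableRow> collect(int[] data, boolean copyRows) {
    MutableRow reused = new MutableRow(0);
    List<MutableRow> out = new ArrayList<>();
    for (int v : data) {
      reused.value = v; // in-place update of the shared row
      out.add(copyRows ? reused.copy() : reused);
    }
    return out;
  }

  public static void main(String[] args) {
    int[] data = {1, 2, 3};
    // Without copying, all elements reflect the last update: prints "3 3 3".
    for (MutableRow r : collect(data, false)) System.out.print(r.value + " ");
    System.out.println();
    // With copying, each element keeps its own value: prints "1 2 3".
    for (MutableRow r : collect(data, true)) System.out.print(r.value + " ");
    System.out.println();
  }
}
```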








[GitHub] [spark] cloud-fan commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile

2020-07-06 Thread GitBox


cloud-fan commented on pull request #29006:
URL: https://github.com/apache/spark/pull/29006#issuecomment-654041468


   If a PR wants to test Hive 1.2, must it add both [test-hive1.2] and [test-hadoop2.7]?






[GitHub] [spark] TJX2014 commented on a change in pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence

2020-07-06 Thread GitBox


TJX2014 commented on a change in pull request #28926:
URL: https://github.com/apache/spark/pull/28926#discussion_r450010508



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -2612,6 +2612,11 @@ object Sequence {
   val stepDays = step.days
   val stepMicros = step.microseconds
 
+  if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
+throw new IllegalArgumentException(

Review comment:
   I am OK with both, thank you for your attention.  :-)








[GitHub] [spark] cloud-fan commented on a change in pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence

2020-07-06 Thread GitBox


cloud-fan commented on a change in pull request #28926:
URL: https://github.com/apache/spark/pull/28926#discussion_r450009801



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -2612,6 +2612,11 @@ object Sequence {
   val stepDays = step.days
   val stepMicros = step.microseconds
 
+  if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
+throw new IllegalArgumentException(
+  "sequence step must be a day interval if start and end values are dates")
+  }
+
   if (stepMonths == 0 && stepMicros == 0 && scale == MICROS_PER_DAY) {

Review comment:
   We can probably add comments for each branch. For example, this branch 
is for adding days to date start/end.
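
   The check added in this hunk can be sketched in isolation. This is a minimal Java sketch, not the Spark implementation (which is Scala): `MICROS_PER_DAY`, the `scale` comparison, and the step fields mirror the diff, while the class `SequenceStepCheck` and the method `validate` are hypothetical names introduced for illustration.

```java
// Hypothetical standalone version of the validation shown in the diff.
final class SequenceStepCheck {
  // One day expressed in microseconds.
  static final long MICROS_PER_DAY = 24L * 60 * 60 * 1_000_000;

  // Rejects a step that has neither a months nor a days component when the
  // scale equals MICROS_PER_DAY (the diff's condition for date start/end).
  static void validate(long scale, int stepMonths, int stepDays) {
    if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
      throw new IllegalArgumentException(
          "sequence step must be a day interval if start and end values are dates");
    }
  }
}
```

   For example, `validate(MICROS_PER_DAY, 0, 1)` passes, while `validate(MICROS_PER_DAY, 0, 0)` throws.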








[GitHub] [spark] AmplabJenkins commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654043310










[GitHub] [spark] maropu commented on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR

2020-07-06 Thread GitBox


maropu commented on pull request #28912:
URL: https://github.com/apache/spark/pull/28912#issuecomment-654050670


   Ah, I see. We need the tag `[test-hadoop2.7]`, too...






[GitHub] [spark] SparkQA removed a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-654024002


   **[Test build #125029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125029/testReport)** for PR 28683 at commit [`9af1413`](https://github.com/apache/spark/commit/9af1413c72bc857c465847db483f32e87fff167b).






[GitHub] [spark] SparkQA removed a comment on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28912:
URL: https://github.com/apache/spark/pull/28912#issuecomment-654000969


   **[Test build #125019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125019/testReport)** for PR 28912 at commit [`4aaa34b`](https://github.com/apache/spark/commit/4aaa34bb85e9d702514978ce31bc740f325294f7).






[GitHub] [spark] SparkQA removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-654033039


   **[Test build #125032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125032/testReport)** for PR 27428 at commit [`73dc600`](https://github.com/apache/spark/commit/73dc600015d98a2a64b76293fa17bed89254dd2c).






[GitHub] [spark] SparkQA removed a comment on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-653991792










[GitHub] [spark] SparkQA removed a comment on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28808:
URL: https://github.com/apache/spark/pull/28808#issuecomment-654009976


   **[Test build #125024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125024/testReport)** for PR 28808 at commit [`3e116f5`](https://github.com/apache/spark/commit/3e116f5f3284b22eec712990d271751b799f4ece).






[GitHub] [spark] SparkQA removed a comment on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #29006:
URL: https://github.com/apache/spark/pull/29006#issuecomment-653994905


   **[Test build #125016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125016/testReport)** for PR 29006 at commit [`5fed5a2`](https://github.com/apache/spark/commit/5fed5a2c5f74672414123d7e221e0c11b90356c2).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28986:
URL: https://github.com/apache/spark/pull/28986#issuecomment-654008236


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124930/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28999:
URL: https://github.com/apache/spark/pull/28999#issuecomment-654009975


   **[Test build #125021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125021/testReport)** for PR 28999 at commit [`87a3752`](https://github.com/apache/spark/commit/87a3752e8c93240231e00deea467e63a915ce17a).






[GitHub] [spark] SparkQA removed a comment on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28997:
URL: https://github.com/apache/spark/pull/28997#issuecomment-653953206


   **[Test build #124981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124981/testReport)** for PR 28997 at commit [`2834c30`](https://github.com/apache/spark/commit/2834c30cbfeb1b1da74f0e3cd6a5e00bfdaa628f).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654054494


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29007: [SPARK-XXXXX][SQL][DOCS] consistency in argument naming for time functions

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #29007:
URL: https://github.com/apache/spark/pull/29007#issuecomment-654028341


   **[Test build #125030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125030/testReport)** for PR 29007 at commit [`524023b`](https://github.com/apache/spark/commit/524023bfb22fa781e3a9321f8118992bb971da76).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-654055192


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29004: [SPARK-32178][TESTS] Disable test-dependencies.sh from Jenkins jobs

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #29004:
URL: https://github.com/apache/spark/pull/29004#issuecomment-653988306










[GitHub] [spark] SparkQA removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-653959224


   **[Test build #124985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124985/testReport)** for PR 28904 at commit [`8034ca4`](https://github.com/apache/spark/commit/8034ca4bb8401f2070b8bb8048722255d145fdec).






[GitHub] [spark] SparkQA removed a comment on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #27983:
URL: https://github.com/apache/spark/pull/27983#issuecomment-653992215


   **[Test build #125013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125013/testReport)** for PR 27983 at commit [`f52f376`](https://github.com/apache/spark/commit/f52f376cb890043145517d36d05bbbe7cf4937cd).






[GitHub] [spark] SparkQA removed a comment on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28986:
URL: https://github.com/apache/spark/pull/28986#issuecomment-653951023










[GitHub] [spark] SparkQA removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654047169


   **[Test build #125035 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125035/testReport)** for PR 28988 at commit [`04f6bb6`](https://github.com/apache/spark/commit/04f6bb640d449d9a17bb34ea8f5f04faf3697b0a).






[GitHub] [spark] SparkQA removed a comment on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-654009977


   **[Test build #125022 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125022/testReport)** for PR 28996 at commit [`5e4f670`](https://github.com/apache/spark/commit/5e4f67002955fa0536498df6657b5db5b17b0d56).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654054483


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125035/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-654055192










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654054501


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125036/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-06 Thread GitBox


SparkQA commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-654054349


   **[Test build #125029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125029/testReport)** for PR 28683 at commit [`9af1413`](https://github.com/apache/spark/commit/9af1413c72bc857c465847db483f32e87fff167b).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] dilipbiswal commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints

2020-07-06 Thread GitBox


dilipbiswal commented on pull request #28683:
URL: https://github.com/apache/spark/pull/28683#issuecomment-654054782


   retest this please






[GitHub] [spark] SparkQA removed a comment on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28998:
URL: https://github.com/apache/spark/pull/28998#issuecomment-654012820


   **[Test build #125026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125026/testReport)** for PR 28998 at commit [`e06fb0c`](https://github.com/apache/spark/commit/e06fb0cb1fccee2e0159dbbe5618dcf53d1a4304).






[GitHub] [spark] SparkQA removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-653980498


   **[Test build #124999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124999/testReport)** for PR 29002 at commit [`d724179`](https://github.com/apache/spark/commit/d72417972ef3d49891f6cc34c5468113ca3abdb7).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654054472


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654047198










[GitHub] [spark] SparkQA removed a comment on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28676:
URL: https://github.com/apache/spark/pull/28676#issuecomment-654010078


   **[Test build #125025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125025/testReport)** for PR 28676 at commit [`c5f4803`](https://github.com/apache/spark/commit/c5f48032907dc5b550d410984254015e4e3ae235).






[GitHub] [spark] SparkQA commented on pull request #28963: [SPARK-32145][SQL][test-hive1.2][test-hadoop2.7] ThriftCLIService.GetOperationStatus should include exception's stack trace to the error mess

2020-07-06 Thread GitBox


SparkQA commented on pull request #28963:
URL: https://github.com/apache/spark/pull/28963#issuecomment-654062701


   **[Test build #125043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125043/testReport)** for PR 28963 at commit [`d72074a`](https://github.com/apache/spark/commit/d72074aff830c670685278eb40b08df1237f7c66).






[GitHub] [spark] HeartSaVioR commented on a change in pull request #28975: [SPARK-32148][SS] Fix stream-stream join issue on missing to copy reused unsafe row

2020-07-06 Thread GitBox


HeartSaVioR commented on a change in pull request #28975:
URL: https://github.com/apache/spark/pull/28975#discussion_r450042481



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala
##
@@ -259,6 +269,9 @@ class SymmetricHashJoinStateManager(
   return null
 }
 
+// Make a copy on value row, as below cleanup logic may update the value row silently.
+currentValue = currentValue.copy(value = currentValue.value.copy())

Review comment:
   Yes. That wasn't necessary for format V1, as the original row was stored in the state store, and the state store (strictly speaking, the implementation of the HDFS state store provider) makes sure these rows are copies.
   
   For other places, the row can propagate to callers outside of the state manager, and it looks like these callers don't need to copy the row. (It's super tricky for me to determine whether the copy is necessary when the code is not in a simple loop or stream.)
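
   To illustrate the hazard discussed above, here is a minimal, self-contained sketch of what can go wrong when a caller holds on to a row object that the producer reuses. `MutableRow`, `ReusedRowDemo`, and `reusingIterator` are hypothetical names invented for this illustration; they are not Spark classes, and the sketch does not reproduce the actual state manager logic.

```scala
// Hypothetical stand-in for a reusable row buffer; this is NOT Spark's UnsafeRow.
final class MutableRow(var value: Int) {
  // Returns an independent snapshot of the current contents.
  def copy(): MutableRow = new MutableRow(value)
}

object ReusedRowDemo {
  // Hands out the same MutableRow instance on every call to next(),
  // overwriting its contents in place (assumed behaviour for this sketch).
  def reusingIterator(values: Seq[Int]): Iterator[MutableRow] = {
    val shared = new MutableRow(0)
    values.iterator.map { v => shared.value = v; shared }
  }

  def main(args: Array[String]): Unit = {
    // Holding references without copying: every element aliases `shared`.
    val aliased = reusingIterator(Seq(1, 2, 3)).toList.map(_.value)
    // Copying before holding on to the row preserves each value.
    val copied = reusingIterator(Seq(1, 2, 3)).map(_.copy()).toList.map(_.value)
    println(s"aliased = $aliased") // aliased = List(3, 3, 3)
    println(s"copied  = $copied")  // copied  = List(1, 2, 3)
  }
}
```

   In a simple loop that consumes each row before asking for the next one, the copy is unnecessary; it only matters once a reference outlives the next mutation, which is why the need for it is hard to judge from the call site alone.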








[GitHub] [spark] AmplabJenkins commented on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28998:
URL: https://github.com/apache/spark/pull/28998#issuecomment-654081005










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28999:
URL: https://github.com/apache/spark/pull/28999#issuecomment-654080398










[GitHub] [spark] yaooqinn commented on pull request #28963: [SPARK-32145][SQL][test-hive1.2][test-hadoop2.7] ThriftCLIService.GetOperationStatus should include exception's stack trace to the error mes

2020-07-06 Thread GitBox


yaooqinn commented on pull request #28963:
URL: https://github.com/apache/spark/pull/28963#issuecomment-654105199


   kindly ping @juliuszsompolski @cloud-fan thanks






[GitHub] [spark] SparkQA removed a comment on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28987:
URL: https://github.com/apache/spark/pull/28987#issuecomment-654093081


   **[Test build #125050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125050/testReport)** for PR 28987 at commit [`70da8b5`](https://github.com/apache/spark/commit/70da8b5299096942af5926b4222990e135ceffaa).






[GitHub] [spark] AmplabJenkins commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28987:
URL: https://github.com/apache/spark/pull/28987#issuecomment-654108351










[GitHub] [spark] SparkQA commented on pull request #28965: [SPARK-32124][CORE][FOLLOW-UP] Use the invalid value Int.MinValue to fill the map index when the event logs from the old Spark version

2020-07-06 Thread GitBox


SparkQA commented on pull request #28965:
URL: https://github.com/apache/spark/pull/28965#issuecomment-654123803


   **[Test build #125058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125058/testReport)** for PR 28965 at commit [`9dfa5c0`](https://github.com/apache/spark/commit/9dfa5c0d2a5cbbaefe3f062549e69b911ee6a2ea).






[GitHub] [spark] AmplabJenkins commented on pull request #28965: [SPARK-32124][CORE][FOLLOW-UP] Use the invalid value Int.MinValue to fill the map index when the event logs from the old Spark version

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28965:
URL: https://github.com/apache/spark/pull/28965#issuecomment-654124343










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28926:
URL: https://github.com/apache/spark/pull/28926#issuecomment-654141703










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654140900


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28987:
URL: https://github.com/apache/spark/pull/28987#issuecomment-654148412










[GitHub] [spark] AmplabJenkins commented on pull request #28969: [SPARK-32150][BUILD] Upgrade to ZStd 1.4.5-4

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28969:
URL: https://github.com/apache/spark/pull/28969#issuecomment-654148531










[GitHub] [spark] AmplabJenkins commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28987:
URL: https://github.com/apache/spark/pull/28987#issuecomment-654148412










[GitHub] [spark] AmplabJenkins commented on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28808:
URL: https://github.com/apache/spark/pull/28808#issuecomment-654157398










[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #27366:
URL: https://github.com/apache/spark/pull/27366#issuecomment-654157384










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28808:
URL: https://github.com/apache/spark/pull/28808#issuecomment-654157398










[GitHub] [spark] SparkQA commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-06 Thread GitBox


SparkQA commented on pull request #27366:
URL: https://github.com/apache/spark/pull/27366#issuecomment-654157015


   **[Test build #125065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125065/testReport)** for PR 27366 at commit [`0a133ad`](https://github.com/apache/spark/commit/0a133ad87a27cd53efc2f13e4cae04ec4affef94).






[GitHub] [spark] cloud-fan commented on a change in pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-06 Thread GitBox


cloud-fan commented on a change in pull request #27428:
URL: https://github.com/apache/spark/pull/27428#discussion_r450007512



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala
##
@@ -148,24 +204,105 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
 val distinctAggs = exprs.flatMap { _.collect {
   case ae: AggregateExpression if ae.isDistinct => ae
 }}
-// We need at least two distinct aggregates for this rule because aggregation
-// strategy can handle a single distinct group.
+// We need at least two distinct aggregates or a single distinct aggregate with a filter for
+// this rule because aggregation strategy can handle a single distinct group without a filter.
 // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-distinctAggs.size > 1
+distinctAggs.size > 1 || distinctAggs.exists(_.filter.isDefined)
   }
 
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+  val expandAggregate = extractFiltersInDistinctAggregates(a)
+  rewriteDistinctAggregates(expandAggregate)
   }
 
-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregates(a: Aggregate): Aggregate = {
+val aggExpressions = collectAggregateExprs(a)
+val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+if (distinctAggExpressions.exists(_.filter.isDefined)) {
+  // Constructs pairs between old and new expressions for regular aggregates. Because we
+  // will construct a new `Aggregate` and the children of the distinct aggregates will be
+  // changed to generated ones, we need to create new references to avoid collisions between
+  // distinct and regular aggregate children.
+  val regularAggExprs = regularAggExpressions.filter(_.children.exists(!_.foldable))
+  val regularFunChildren = regularAggExprs
+.flatMap(_.aggregateFunction.children.filter(!_.foldable))
+  val regularFilterAttrs = regularAggExprs.flatMap(_.filterAttributes)
+  val regularAggChildren = (regularFunChildren ++ regularFilterAttrs).distinct
+  val regularAggChildrenMap = regularAggChildren.map {
+case ne: NamedExpression => ne -> ne
+case other => other -> Alias(other, other.toString)()
+  }
+  val namedRegularAggChildren = regularAggChildrenMap.map(_._2)
+  val regularAggChildAttrLookup = regularAggChildrenMap.map { kv =>
+(kv._1, kv._2.toAttribute)
+  }.toMap
+  val regularAggPairs = regularAggExprs.map {
+case ae @ AggregateExpression(af, _, _, filter, _) =>
+  val newChildren = af.children.map(c => regularAggChildAttrLookup.getOrElse(c, c))
+  val raf = af.withNewChildren(newChildren).asInstanceOf[AggregateFunction]
+  val filterOpt = filter.map(_.transform {
+case a: Attribute => regularAggChildAttrLookup.getOrElse(a, a)
+  })
+  val aggExpr = ae.copy(aggregateFunction = raf, filter = filterOpt)
+  (ae, aggExpr)
+  }
 
-// Collect all aggregate expressions.
-val aggExpressions = a.aggregateExpressions.flatMap { e =>
-  e.collect {
-case ae: AggregateExpression => ae
+  // Constructs pairs between old and new expressions for distinct aggregates, too.
+  val distinctAggExprs = distinctAggExpressions.filter(e => e.children.exists(!_.foldable))
+  val (projections, distinctAggPairs) = distinctAggExprs.map {
+case ae @ AggregateExpression(af, _, _, filter, _) =>
+  // First, In order to reduce costs, it is better to handle the filter clause locally.
+  // e.g. COUNT (DISTINCT a) FILTER (WHERE id > 1), evaluate expression
+  // If(id > 1) 'a else null first, and use the result as output.
+  // Second, If at least two DISTINCT aggregate expression which may references the
+  // same attributes. We need to construct the generated attributes so as the output not
+  // lost. e.g. SUM (DISTINCT a), COUNT (DISTINCT a) FILTER (WHERE id > 1) will output
+  // attribute '_gen_distinct-1 and attribute '_gen_distinct-2 instead of two 'a.
+  // Note: The illusionary mechanism may result in at least two distinct groups, so we
+  // still need to call `rewrite`.
+  val unfoldableChildren = af.children.filter(!_.foldable)
+  // Expand projection
+  val projectionMap = unfoldableChildren.map {
+case e if filter.isDefined =>
+  val ife = If(filter.get, e, nullify(e))
+  e -> Alias(ife, s"_gen_distinct_${NamedExpression.newExprId.id}")()
+case e => e -> Alias(e, 
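
The comments in the diff above describe evaluating the filter locally: for `COUNT (DISTINCT a) FILTER (WHERE id > 1)`, each row is first projected to `a` when `id > 1` and to null otherwise, and the distinct count then runs over the projected values. The following is only a minimal plain-Scala sketch of that idea, not Spark's implementation; `Row`, `project` and `countDistinct` are hypothetical names introduced for illustration, and null is modelled with `Option`.

```scala
// Hypothetical row type for illustration only; not a Spark class.
final case class Row(id: Int, a: Option[Int])

object FilteredDistinctSketch {
  // Models the projection `If(id > 1, a, null)`: rows that fail the
  // filter contribute None (standing in for SQL NULL).
  def project(rows: Seq[Row]): Seq[Option[Int]] =
    rows.map(r => if (r.id > 1) r.a else None)

  // COUNT(DISTINCT x) ignores NULLs, so drop None before de-duplicating.
  def countDistinct(values: Seq[Option[Int]]): Int =
    values.flatten.distinct.size

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Row(1, Some(10)), Row(2, Some(10)), Row(3, Some(20)),
      Row(4, Some(20)), Row(5, None))

    // Reference semantics: filter the rows first, then count distinct values.
    val direct = rows.filter(_.id > 1).flatMap(_.a).distinct.size
    // Rewritten form: project with the filter folded in, then count distinct.
    val rewritten = countDistinct(project(rows))

    assert(direct == rewritten)
    println(s"direct=$direct rewritten=$rewritten")
  }
}
```

Under these assumptions both forms agree because a row rejected by the filter and a row whose `a` is null are indistinguishable to a distinct count.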

[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #27366:
URL: https://github.com/apache/spark/pull/27366#issuecomment-654044546










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #27366:
URL: https://github.com/apache/spark/pull/27366#issuecomment-654044546


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #29008:
URL: https://github.com/apache/spark/pull/29008#issuecomment-654042478


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #29008:
URL: https://github.com/apache/spark/pull/29008#issuecomment-654044389


   Can one of the admins verify this patch?






[GitHub] [spark] viirya commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


viirya commented on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654044476


   retest this please...






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28964:
URL: https://github.com/apache/spark/pull/28964#issuecomment-654049253


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125034/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties

2020-07-06 Thread GitBox


SparkQA commented on pull request #28964:
URL: https://github.com/apache/spark/pull/28964#issuecomment-654048965


   **[Test build #125034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125034/testReport)** for PR 28964 at commit [`91c7c42`](https://github.com/apache/spark/commit/91c7c428429a20f491c6d0be8b8cb83549f09e4d).
* This patch **fails to generate documentation**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #28964:
URL: https://github.com/apache/spark/pull/28964#issuecomment-654049241


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28964:
URL: https://github.com/apache/spark/pull/28964#issuecomment-654049241










[GitHub] [spark] SparkQA removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties

2020-07-06 Thread GitBox


SparkQA removed a comment on pull request #28964:
URL: https://github.com/apache/spark/pull/28964#issuecomment-654037792


   **[Test build #125034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125034/testReport)** for PR 28964 at commit [`91c7c42`](https://github.com/apache/spark/commit/91c7c428429a20f491c6d0be8b8cb83549f09e4d).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29003: [SPARK-32177][WEBUI] Remove the weird line from near the Spark logo on mouseover in the WebUI

2020-07-06 Thread GitBox


AmplabJenkins removed a comment on pull request #29003:
URL: https://github.com/apache/spark/pull/29003#issuecomment-654048895


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression

2020-07-06 Thread GitBox


SparkQA commented on pull request #28808:
URL: https://github.com/apache/spark/pull/28808#issuecomment-654054385


   **[Test build #125024 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125024/testReport)** for PR 28808 at commit [`3e116f5`](https://github.com/apache/spark/commit/3e116f5f3284b22eec712990d271751b799f4ece).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.

2020-07-06 Thread GitBox


SparkQA commented on pull request #28986:
URL: https://github.com/apache/spark/pull/28986#issuecomment-654054369










[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654054494










[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC

2020-07-06 Thread GitBox


SparkQA commented on pull request #28953:
URL: https://github.com/apache/spark/pull/28953#issuecomment-654054354










[GitHub] [spark] SparkQA commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-06 Thread GitBox


SparkQA commented on pull request #27428:
URL: https://github.com/apache/spark/pull/27428#issuecomment-654054386


   **[Test build #125032 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125032/testReport)** for PR 27428 at commit [`73dc600`](https://github.com/apache/spark/commit/73dc600015d98a2a64b76293fa17bed89254dd2c).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp

2020-07-06 Thread GitBox


SparkQA commented on pull request #28998:
URL: https://github.com/apache/spark/pull/28998#issuecomment-654054348


   **[Test build #125026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125026/testReport)** for PR 28998 at commit [`e06fb0c`](https://github.com/apache/spark/commit/e06fb0cb1fccee2e0159dbbe5618dcf53d1a4304).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-06 Thread GitBox


SparkQA commented on pull request #29002:
URL: https://github.com/apache/spark/pull/29002#issuecomment-654054383


   **[Test build #124999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124999/testReport)** for PR 29002 at commit [`d724179`](https://github.com/apache/spark/commit/d72417972ef3d49891f6cc34c5468113ca3abdb7).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


SparkQA commented on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654054352


   **[Test build #125035 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125035/testReport)** for PR 28988 at commit [`04f6bb6`](https://github.com/apache/spark/commit/04f6bb640d449d9a17bb34ea8f5f04faf3697b0a).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations

2020-07-06 Thread GitBox


AmplabJenkins commented on pull request #28988:
URL: https://github.com/apache/spark/pull/28988#issuecomment-654054472










[GitHub] [spark] SparkQA commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5

2020-07-06 Thread GitBox


SparkQA commented on pull request #28957:
URL: https://github.com/apache/spark/pull/28957#issuecomment-654054389










[GitHub] [spark] SparkQA commented on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-06 Thread GitBox


SparkQA commented on pull request #28676:
URL: https://github.com/apache/spark/pull/28676#issuecomment-654054368


   **[Test build #125025 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125025/testReport)** for PR 28676 at commit [`c5f4803`](https://github.com/apache/spark/commit/c5f48032907dc5b550d410984254015e4e3ae235).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile

2020-07-06 Thread GitBox


SparkQA commented on pull request #29006:
URL: https://github.com/apache/spark/pull/29006#issuecomment-654054355


   **[Test build #125016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125016/testReport)** for PR 29006 at commit [`5fed5a2`](https://github.com/apache/spark/commit/5fed5a2c5f74672414123d7e221e0c11b90356c2).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue

2020-07-06 Thread GitBox


SparkQA commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-654054397


   **[Test build #124985 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124985/testReport)** for PR 28904 at commit [`8034ca4`](https://github.com/apache/spark/commit/8034ca4bb8401f2070b8bb8048722255d145fdec).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code

2020-07-06 Thread GitBox


SparkQA commented on pull request #27983:
URL: https://github.com/apache/spark/pull/27983#issuecomment-654054370


   **[Test build #125013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125013/testReport)** for PR 27983 at commit [`f52f376`](https://github.com/apache/spark/commit/f52f376cb890043145517d36d05bbbe7cf4937cd).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR

2020-07-06 Thread GitBox


SparkQA commented on pull request #28912:
URL: https://github.com/apache/spark/pull/28912#issuecomment-654054351


   **[Test build #125019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125019/testReport)** for PR 28912 at commit [`4aaa34b`](https://github.com/apache/spark/commit/4aaa34bb85e9d702514978ce31bc740f325294f7).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

2020-07-06 Thread GitBox


SparkQA commented on pull request #28996:
URL: https://github.com/apache/spark/pull/28996#issuecomment-654054390


   **[Test build #125022 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125022/testReport)** for PR 28996 at commit [`5e4f670`](https://github.com/apache/spark/commit/5e4f67002955fa0536498df6657b5db5b17b0d56).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.





