[GitHub] [spark] AmplabJenkins removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate
AmplabJenkins removed a comment on pull request #28999: URL: https://github.com/apache/spark/pull/28999#issuecomment-654056075 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654055697 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125033/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls
AmplabJenkins removed a comment on pull request #28996: URL: https://github.com/apache/spark/pull/28996#issuecomment-654056087 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28969: [SPARK-32150][BUILD] Upgrade to ZStd 1.4.5-4
AmplabJenkins removed a comment on pull request #28969: URL: https://github.com/apache/spark/pull/28969#issuecomment-654007651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124934/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile
AmplabJenkins commented on pull request #29006: URL: https://github.com/apache/spark/pull/29006#issuecomment-654056389
[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-654056588
[GitHub] [spark] HeartSaVioR commented on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service
HeartSaVioR commented on pull request #28967: URL: https://github.com/apache/spark/pull/28967#issuecomment-654065873

Thanks for the links. That's all I wanted to see.

> This is redundant code duplicating the package-private JDK counterpart. As the code is not a perfect match, it could even happen that one method produces a slightly different (but semantically equal) path.

Yeah, I just wanted to see which code the JDK itself would run to normalize the path (so the comment `here the old createNormalizedInternedPathname was as good as it could imitate the java.io.FileSystem#normalize()` is the answer for me), and honestly I didn't know the method name would be just "normalize". (I should have just tried finding it myself. My bad.)

For sure, I prefer to follow the normalization provided by the JDK, which at least doesn't use regex, which would be slower than char manipulation. I also agree that we can feel confident excluding the test part, since the code is replaced with the JDK one, which we tend to trust.

That said, assuming we never create weird file names containing separators, the only input the normalization actually affects is localDirs. We could probably pay the normalization cost only once per entry and avoid normalizing on all further calls.
[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
SparkQA commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-654072848 **[Test build #125045 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125045/testReport)** for PR 28961 at commit [`3811ae9`](https://github.com/apache/spark/commit/3811ae93c2966d87619624670e338b7c6d34b7d4).
[GitHub] [spark] SparkQA commented on pull request #28991: [SPARK-26533][SQL][test-hive1.2][test-hadoop2.7] Support query auto timeout cancel on thriftserver
SparkQA commented on pull request #28991: URL: https://github.com/apache/spark/pull/28991#issuecomment-654077152 **[Test build #125039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125039/testReport)** for PR 28991 at commit [`262d306`](https://github.com/apache/spark/commit/262d3060cd5794f094ebf05e18b7154aaeb8511c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] HeartSaVioR edited a comment on pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service
HeartSaVioR edited a comment on pull request #28967: URL: https://github.com/apache/spark/pull/28967#issuecomment-654065873

Thanks for the links. That's all I wanted to see.

> This is redundant code duplicating the package-private JDK counterpart. As the code is not a perfect match, it could even happen that one method produces a slightly different (but semantically equal) path.

Yeah, I just wanted to see which code the JDK itself would run to normalize the path (so the comment `here the old createNormalizedInternedPathname was as good as it could imitate the java.io.FileSystem#normalize()` is the answer for me), and honestly I didn't know the method name would be just "normalize". (I should have just tried finding it myself. My bad.)

For sure, I prefer to follow the normalization provided by the JDK, which at least doesn't use regex, which would be slower than char manipulation. I also agree that we can feel confident excluding the test part, since the code is replaced with the JDK one, which we tend to trust.

That said, assuming we never create weird file names containing separators, the only input the normalization actually affects is localDirs. We could probably pay the normalization cost only once per entry and avoid normalizing on all further calls. (I meant the path actually being changed during normalization. The normalization check itself can't be avoided, as the JDK will do it.)
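To make the regex versus character-scan point in the comment above concrete, here is a minimal sketch. It is neither the JDK's `normalize` nor Spark's code: `SeparatorCollapseSketch`, `collapseWithRegex` and `collapseWithScan` are hypothetical names, and both helpers only collapse runs of a separator and drop a single trailing one. No timing is measured here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeparatorCollapseSketch {
    // Regex-based variant, in the spirit of the removed helper (hypothetical name).
    static String collapseWithRegex(String path, char sep) {
        String s = String.valueOf(sep);
        Pattern multiple = Pattern.compile(Pattern.quote(s) + "{2,}");
        String out = multiple.matcher(path).replaceAll(Matcher.quoteReplacement(s));
        // A single trailing separator is handled separately.
        if (out.length() > 1 && out.charAt(out.length() - 1) == sep) {
            out = out.substring(0, out.length() - 1);
        }
        return out;
    }

    // Character-scan variant: one pass over the string, no regex (hypothetical name).
    static String collapseWithScan(String path, char sep) {
        StringBuilder sb = new StringBuilder(path.length());
        boolean prevWasSep = false;
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            boolean isSep = (c == sep);
            if (!(isSep && prevWasSep)) {
                sb.append(c); // skip a separator that directly follows another one
            }
            prevWasSep = isSep;
        }
        int len = sb.length();
        if (len > 1 && sb.charAt(len - 1) == sep) {
            sb.setLength(len - 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "/data//spark///0a/block.data/";
        System.out.println(collapseWithRegex(raw, '/'));
        System.out.println(collapseWithScan(raw, '/'));
    }
}
```

Both helpers should agree on any input; which one is faster in practice would need a benchmark.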
[GitHub] [spark] attilapiros commented on a change in pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service
attilapiros commented on a change in pull request #28967: URL: https://github.com/apache/spark/pull/28967#discussion_r450045017

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java ##
@@ -45,34 +31,16 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
     int hash = JavaUtils.nonNegativeHash(filename);
     String localDir = localDirs[hash % localDirs.length];
     int subDirId = (hash / localDirs.length) % subDirsPerLocalDir;
-    return new File(createNormalizedInternedPathname(
-        localDir, String.format("%02x", subDirId), filename));
-  }
-
-  /**
-   * This method is needed to avoid the situation when multiple File instances for the
-   * same pathname "foo/bar" are created, each with a separate copy of the "foo/bar" String.
-   * According to measurements, in some scenarios such duplicate strings may waste a lot
-   * of memory (~ 10% of the heap). To avoid that, we intern the pathname, and before that
-   * we make sure that it's in a normalized form (contains no "//", "///" etc.) Otherwise,
-   * the internal code in java.io.File would normalize it later, creating a new "foo/bar"
-   * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File
-   * uses, since it is in the package-private class java.io.FileSystem.
-   *
-   * On Windows, separator "\" is used instead of "/".
-   *
-   * "\\" is a legal character in path name on Unix-like OS, but illegal on Windows.
-   */
-  @VisibleForTesting
-  static String createNormalizedInternedPathname(String dir1, String dir2, String fname) {
-    String pathname = dir1 + File.separator + dir2 + File.separator + fname;
-    Matcher m = MULTIPLE_SEPARATORS.matcher(pathname);
-    pathname = m.replaceAll(Matcher.quoteReplacement(File.separator));
-    // A single trailing slash needs to be taken care of separately
-    if (pathname.length() > 1 && pathname.charAt(pathname.length() - 1) == File.separatorChar) {
-      pathname = pathname.substring(0, pathname.length() - 1);
-    }
-    return pathname.intern();
+    final String notNormalizedPath =
+      localDir + File.separator + String.format("%02x", subDirId) + File.separator + filename;
+    // Interning the normalized path as according to measurements, in some scenarios such
+    // duplicate strings may waste a lot of memory (~ 10% of the heap).
+    // Unfortunately, we cannot just call the normalization code that java.io.File
+    // uses, since it is in the package-private class java.io.FileSystem.
+    // So we are creating a File just to get the normalized path back to intern it.
+    // Finally a new File is built and returned with this interned normalized path.
+    final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();

Review comment:

I am sorry, but this is not how interning a String works. The `intern()` method gives back a reference to the String held in the pool, which can be a different object from the one it was called on. It cannot re-point `this`, as there could be several places that refer to it.

This example makes it clear:

```java
public class InternExample {
    public static void main(String[] args) {
        String s1 = new String("hello");
        String s2 = "hello";
        String s3 = s1.intern(); // returns the string from the pool; now it is the same as s2
        System.out.println(s1 == s2); // false because the references are different
        System.out.println(s2 == s3); // true because the reference is the same
    }
}
```

You can run it here: https://www.javatpoint.com/java-string-intern.
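The line under review in this thread can be tried in isolation. The following is a minimal sketch, assuming only the standard `java.io.File` and `String.intern()` behavior; the class `NormalizedInternSketch`, the helper `normalizedInterned` and the sample file name are hypothetical and not part of the PR.

```java
import java.io.File;

public class NormalizedInternSketch {
    // Hypothetical helper mirroring the reviewed line: let java.io.File normalize
    // the path, then intern the normalized String so equal paths share one object.
    static String normalizedInterned(String rawPath) {
        return new File(rawPath).getPath().intern();
    }

    public static void main(String[] args) {
        String sep = File.separator;
        // Two differently "dirty" spellings of the same relative path.
        String a = normalizedInterned("tmp" + sep + sep + "0a" + sep + "shuffle_0_0_0.data");
        String b = normalizedInterned("tmp" + sep + "0a" + sep + sep + "shuffle_0_0_0.data");

        System.out.println(a.equals(b)); // same characters after normalization
        System.out.println(a == b);      // same object, because both were interned
    }
}
```

The identity check matters here only because the goal stated in the code comment is to avoid keeping many duplicate copies of the same path on the heap.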
[GitHub] [spark] SparkQA commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir
SparkQA commented on pull request #28997: URL: https://github.com/apache/spark/pull/28997#issuecomment-654083343 **[Test build #125048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125048/testReport)** for PR 28997 at commit [`d8ffb71`](https://github.com/apache/spark/commit/d8ffb712b0820bfb4dbcae2c908a94a3a22c6397).
[GitHub] [spark] sidedoorleftroad commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir
sidedoorleftroad commented on pull request #28997: URL: https://github.com/apache/spark/pull/28997#issuecomment-654083284 The code has been updated. Submitting code for the first time is nerve-racking. Thanks a lot! @dongjoon-hyun @HeartSaVioR
[GitHub] [spark] HeartSaVioR commented on a change in pull request #28967: [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service
HeartSaVioR commented on a change in pull request #28967: URL: https://github.com/apache/spark/pull/28967#discussion_r450052572

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java ##
@@ -45,34 +31,16 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
     int hash = JavaUtils.nonNegativeHash(filename);
     String localDir = localDirs[hash % localDirs.length];
     int subDirId = (hash / localDirs.length) % subDirsPerLocalDir;
-    return new File(createNormalizedInternedPathname(
-        localDir, String.format("%02x", subDirId), filename));
-  }
-
-  /**
-   * This method is needed to avoid the situation when multiple File instances for the
-   * same pathname "foo/bar" are created, each with a separate copy of the "foo/bar" String.
-   * According to measurements, in some scenarios such duplicate strings may waste a lot
-   * of memory (~ 10% of the heap). To avoid that, we intern the pathname, and before that
-   * we make sure that it's in a normalized form (contains no "//", "///" etc.) Otherwise,
-   * the internal code in java.io.File would normalize it later, creating a new "foo/bar"
-   * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File
-   * uses, since it is in the package-private class java.io.FileSystem.
-   *
-   * On Windows, separator "\" is used instead of "/".
-   *
-   * "\\" is a legal character in path name on Unix-like OS, but illegal on Windows.
-   */
-  @VisibleForTesting
-  static String createNormalizedInternedPathname(String dir1, String dir2, String fname) {
-    String pathname = dir1 + File.separator + dir2 + File.separator + fname;
-    Matcher m = MULTIPLE_SEPARATORS.matcher(pathname);
-    pathname = m.replaceAll(Matcher.quoteReplacement(File.separator));
-    // A single trailing slash needs to be taken care of separately
-    if (pathname.length() > 1 && pathname.charAt(pathname.length() - 1) == File.separatorChar) {
-      pathname = pathname.substring(0, pathname.length() - 1);
-    }
-    return pathname.intern();
+    final String notNormalizedPath =
+      localDir + File.separator + String.format("%02x", subDirId) + File.separator + filename;
+    // Interning the normalized path as according to measurements, in some scenarios such
+    // duplicate strings may waste a lot of memory (~ 10% of the heap).
+    // Unfortunately, we cannot just call the normalization code that java.io.File
+    // uses, since it is in the package-private class java.io.FileSystem.
+    // So we are creating a File just to get the normalized path back to intern it.
+    // Finally a new File is built and returned with this interned normalized path.
+    final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();

Review comment:

Ah yes, you're right. My bad, I was a bit confused. Thanks for correcting me! So we have to do the normalization check twice, but the second one would be just a scan over the string.
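The "second check" mentioned in this reply can be illustrated as follows. This is a sketch under assumptions: `SecondNormalizationSketch` and `fileFromInternedPath` are hypothetical names, and the code relies only on `java.io.File` normalizing its argument in the constructor. Whether the second `File` keeps the very same `String` object is a JDK implementation detail, so the sketch checks only that the path text is unchanged.

```java
import java.io.File;

public class SecondNormalizationSketch {
    // Hypothetical helper: normalize once via a temporary File, intern the result,
    // then build the File that is actually returned from the interned path.
    static File fileFromInternedPath(String rawPath) {
        String normalizedInterned = new File(rawPath).getPath().intern();
        // This constructor normalizes again; on an already normalized path
        // there is nothing left to change.
        return new File(normalizedInterned);
    }

    public static void main(String[] args) {
        String sep = File.separator;
        String raw = "tmp" + sep + sep + "0a" + sep + "block.data";
        String onceNormalized = new File(raw).getPath();
        File result = fileFromInternedPath(raw);

        // The second pass leaves the path text as it was after the first pass.
        System.out.println(result.getPath().equals(onceNormalized));
    }
}
```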
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir
AmplabJenkins removed a comment on pull request #28997: URL: https://github.com/apache/spark/pull/28997#issuecomment-654083726
[GitHub] [spark] AmplabJenkins commented on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir
AmplabJenkins commented on pull request #28997: URL: https://github.com/apache/spark/pull/28997#issuecomment-654083726
[GitHub] [spark] AmplabJenkins commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins commented on pull request #27428: URL: https://github.com/apache/spark/pull/27428#issuecomment-654097454
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
AmplabJenkins removed a comment on pull request #27428: URL: https://github.com/apache/spark/pull/27428#issuecomment-654097454
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR
AmplabJenkins removed a comment on pull request #28912: URL: https://github.com/apache/spark/pull/28912#issuecomment-654097261
[GitHub] [spark] SparkQA commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window
SparkQA commented on pull request #28987: URL: https://github.com/apache/spark/pull/28987#issuecomment-654107835 **[Test build #125050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125050/testReport)** for PR 28987 at commit [`70da8b5`](https://github.com/apache/spark/commit/70da8b5299096942af5926b4222990e135ceffaa).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #28975: [SPARK-32148][SS] Fix stream-stream join issue on missing to copy reused unsafe row
cloud-fan commented on a change in pull request #28975: URL: https://github.com/apache/spark/pull/28975#discussion_r450086968

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala ##
@@ -259,6 +269,9 @@ class SymmetricHashJoinStateManager(
           return null
         }
+        // Make a copy on value row, as below cleanup logic may update the value row silently.
+        currentValue = currentValue.copy(value = currentValue.value.copy())

Review comment:

After seeing the new changes, I think the first version looks better. The caller side is nested, and we still have necessary copies for the v1 format. What do you think? @viirya
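The hazard that the added `copy()` guards against is the general one of holding a reference to a mutable object that its producer later overwrites. The following is a minimal Java sketch of that hazard only; it does not use Spark's `UnsafeRow` or the state manager, and `ReusedRowSketch`, `MutableRow` and `collect` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusedRowSketch {
    // Hypothetical stand-in for a mutable row that a producer reuses between records.
    static final class MutableRow {
        long value;
        MutableRow(long value) { this.value = value; }
        MutableRow copy() { return new MutableRow(value); }
    }

    // Collects one row per input; the producer overwrites a single shared object.
    static List<MutableRow> collect(long[] inputs, boolean copyEachRow) {
        MutableRow reused = new MutableRow(0);
        List<MutableRow> out = new ArrayList<>();
        for (long v : inputs) {
            reused.value = v; // silently changes every reference kept without a copy
            out.add(copyEachRow ? reused.copy() : reused);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] inputs = {1, 2, 3};
        System.out.println(collect(inputs, false).get(0).value); // sees the last write
        System.out.println(collect(inputs, true).get(0).value);  // keeps the first value
    }
}
```

Copying costs an allocation per row, which is presumably why the thread weighs where the copy should happen.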
[GitHub] [spark] cloud-fan commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile
cloud-fan commented on pull request #29006: URL: https://github.com/apache/spark/pull/29006#issuecomment-654041468 So if a PR wants to test Hive 1.2, it must add both [test-hive1.2] and [test-hadoop2.7]?
[GitHub] [spark] TJX2014 commented on a change in pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence
TJX2014 commented on a change in pull request #28926: URL: https://github.com/apache/spark/pull/28926#discussion_r450010508

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ##
@@ -2612,6 +2612,11 @@ object Sequence {
       val stepDays = step.days
       val stepMicros = step.microseconds
+      if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
+        throw new IllegalArgumentException(

Review comment:

I am OK with both; thank you for your attention. :-)
[GitHub] [spark] cloud-fan commented on a change in pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence
cloud-fan commented on a change in pull request #28926: URL: https://github.com/apache/spark/pull/28926#discussion_r450009801

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ##
@@ -2612,6 +2612,11 @@ object Sequence {
       val stepDays = step.days
       val stepMicros = step.microseconds
+      if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
+        throw new IllegalArgumentException(
+          "sequence step must be a day interval if start and end values are dates")
+      }
+
       if (stepMonths == 0 && stepMicros == 0 && scale == MICROS_PER_DAY) {

Review comment:

We can probably add comments for each branch. For example, this branch is for adding days to date start/end.
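The guard added in this hunk can be read on its own. Below is a Java transcription as a minimal sketch, assuming (as the error message suggests) that `scale == MICROS_PER_DAY` identifies date start/end values; the class `SequenceStepCheckSketch` and the method `requireDayStepForDates` are hypothetical names, while the condition and the message are taken from the diff.

```java
public class SequenceStepCheckSketch {
    // Microseconds in one day: 24 h * 60 min * 60 s * 1,000,000 us.
    static final long MICROS_PER_DAY = 24L * 60 * 60 * 1_000_000L;

    // Hypothetical standalone version of the guard from the diff: for date
    // start/end values, reject a step with neither a month nor a day component,
    // i.e. a step made only of time fields.
    static void requireDayStepForDates(long scale, int stepMonths, int stepDays) {
        if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
            throw new IllegalArgumentException(
                "sequence step must be a day interval if start and end values are dates");
        }
    }

    public static void main(String[] args) {
        requireDayStepForDates(MICROS_PER_DAY, 0, 1); // a one-day step passes
        try {
            requireDayStepForDates(MICROS_PER_DAY, 0, 0); // e.g. an hours-only step
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```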
[GitHub] [spark] AmplabJenkins commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
AmplabJenkins commented on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654043310
[GitHub] [spark] maropu commented on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR
maropu commented on pull request #28912: URL: https://github.com/apache/spark/pull/28912#issuecomment-654050670 Ah, I see. We need the tag `[test-hadoop2.7]`, too...
[GitHub] [spark] SparkQA removed a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints
SparkQA removed a comment on pull request #28683: URL: https://github.com/apache/spark/pull/28683#issuecomment-654024002 **[Test build #125029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125029/testReport)** for PR 28683 at commit [`9af1413`](https://github.com/apache/spark/commit/9af1413c72bc857c465847db483f32e87fff167b).
[GitHub] [spark] SparkQA removed a comment on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR
SparkQA removed a comment on pull request #28912: URL: https://github.com/apache/spark/pull/28912#issuecomment-654000969 **[Test build #125019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125019/testReport)** for PR 28912 at commit [`4aaa34b`](https://github.com/apache/spark/commit/4aaa34bb85e9d702514978ce31bc740f325294f7).
[GitHub] [spark] SparkQA removed a comment on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA removed a comment on pull request #27428: URL: https://github.com/apache/spark/pull/27428#issuecomment-654033039 **[Test build #125032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125032/testReport)** for PR 27428 at commit [`73dc600`](https://github.com/apache/spark/commit/73dc600015d98a2a64b76293fa17bed89254dd2c).
[GitHub] [spark] SparkQA removed a comment on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5
SparkQA removed a comment on pull request #28957: URL: https://github.com/apache/spark/pull/28957#issuecomment-653991792
[GitHub] [spark] SparkQA removed a comment on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression
SparkQA removed a comment on pull request #28808: URL: https://github.com/apache/spark/pull/28808#issuecomment-654009976 **[Test build #125024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125024/testReport)** for PR 28808 at commit [`3e116f5`](https://github.com/apache/spark/commit/3e116f5f3284b22eec712990d271751b799f4ece).
[GitHub] [spark] SparkQA removed a comment on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile
SparkQA removed a comment on pull request #29006: URL: https://github.com/apache/spark/pull/29006#issuecomment-653994905 **[Test build #125016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125016/testReport)** for PR 29006 at commit [`5fed5a2`](https://github.com/apache/spark/commit/5fed5a2c5f74672414123d7e221e0c11b90356c2).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.
AmplabJenkins removed a comment on pull request #28986: URL: https://github.com/apache/spark/pull/28986#issuecomment-654008236 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124930/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate
SparkQA removed a comment on pull request #28999: URL: https://github.com/apache/spark/pull/28999#issuecomment-654009975 **[Test build #125021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125021/testReport)** for PR 28999 at commit [`87a3752`](https://github.com/apache/spark/commit/87a3752e8c93240231e00deea467e63a915ce17a).
[GitHub] [spark] SparkQA removed a comment on pull request #28997: [SPARK-32172][CORE]Use createDirectory instead of mkdir
SparkQA removed a comment on pull request #28997: URL: https://github.com/apache/spark/pull/28997#issuecomment-653953206 **[Test build #124981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124981/testReport)** for PR 28997 at commit [`2834c30`](https://github.com/apache/spark/commit/2834c30cbfeb1b1da74f0e3cd6a5e00bfdaa628f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654054494 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #29007: [SPARK-XXXXX][SQL][DOCS] consistency in argument naming for time functions
SparkQA removed a comment on pull request #29007: URL: https://github.com/apache/spark/pull/29007#issuecomment-654028341 **[Test build #125030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125030/testReport)** for PR 29007 at commit [`524023b`](https://github.com/apache/spark/commit/524023bfb22fa781e3a9321f8118992bb971da76).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints
AmplabJenkins removed a comment on pull request #28683: URL: https://github.com/apache/spark/pull/28683#issuecomment-654055192 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #29004: [SPARK-32178][TESTS] Disable test-dependencies.sh from Jenkins jobs
SparkQA removed a comment on pull request #29004: URL: https://github.com/apache/spark/pull/29004#issuecomment-653988306
[GitHub] [spark] SparkQA removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
SparkQA removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-653959224 **[Test build #124985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124985/testReport)** for PR 28904 at commit [`8034ca4`](https://github.com/apache/spark/commit/8034ca4bb8401f2070b8bb8048722255d145fdec).
[GitHub] [spark] SparkQA removed a comment on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
SparkQA removed a comment on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-653992215 **[Test build #125013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125013/testReport)** for PR 27983 at commit [`f52f376`](https://github.com/apache/spark/commit/f52f376cb890043145517d36d05bbbe7cf4937cd).
[GitHub] [spark] SparkQA removed a comment on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.
SparkQA removed a comment on pull request #28986: URL: https://github.com/apache/spark/pull/28986#issuecomment-653951023
[GitHub] [spark] SparkQA removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
SparkQA removed a comment on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654047169 **[Test build #125035 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125035/testReport)** for PR 28988 at commit [`04f6bb6`](https://github.com/apache/spark/commit/04f6bb640d449d9a17bb34ea8f5f04faf3697b0a).
[GitHub] [spark] SparkQA removed a comment on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls
SparkQA removed a comment on pull request #28996: URL: https://github.com/apache/spark/pull/28996#issuecomment-654009977 **[Test build #125022 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125022/testReport)** for PR 28996 at commit [`5e4f670`](https://github.com/apache/spark/commit/5e4f67002955fa0536498df6657b5db5b17b0d56).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
AmplabJenkins removed a comment on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654054483 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125035/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints
AmplabJenkins commented on pull request #28683: URL: https://github.com/apache/spark/pull/28683#issuecomment-654055192
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654054501 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125036/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints
SparkQA commented on pull request #28683: URL: https://github.com/apache/spark/pull/28683#issuecomment-654054349 **[Test build #125029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125029/testReport)** for PR 28683 at commit [`9af1413`](https://github.com/apache/spark/commit/9af1413c72bc857c465847db483f32e87fff167b). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dilipbiswal commented on pull request #28683: [SPARK-31875][SQL] Provide a option to disable user supplied Hints
dilipbiswal commented on pull request #28683: URL: https://github.com/apache/spark/pull/28683#issuecomment-654054782 retest this please
[GitHub] [spark] SparkQA removed a comment on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp
SparkQA removed a comment on pull request #28998: URL: https://github.com/apache/spark/pull/28998#issuecomment-654012820 **[Test build #125026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125026/testReport)** for PR 28998 at commit [`e06fb0c`](https://github.com/apache/spark/commit/e06fb0cb1fccee2e0159dbbe5618dcf53d1a4304).
[GitHub] [spark] SparkQA removed a comment on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread
SparkQA removed a comment on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-653980498 **[Test build #124999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124999/testReport)** for PR 29002 at commit [`d724179`](https://github.com/apache/spark/commit/d72417972ef3d49891f6cc34c5468113ca3abdb7).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
AmplabJenkins removed a comment on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654054472 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654047198
[GitHub] [spark] SparkQA removed a comment on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning
SparkQA removed a comment on pull request #28676: URL: https://github.com/apache/spark/pull/28676#issuecomment-654010078 **[Test build #125025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125025/testReport)** for PR 28676 at commit [`c5f4803`](https://github.com/apache/spark/commit/c5f48032907dc5b550d410984254015e4e3ae235).
[GitHub] [spark] SparkQA commented on pull request #28963: [SPARK-32145][SQL][test-hive1.2][test-hadoop2.7] ThriftCLIService.GetOperationStatus should include exception's stack trace to the error mess
SparkQA commented on pull request #28963: URL: https://github.com/apache/spark/pull/28963#issuecomment-654062701 **[Test build #125043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125043/testReport)** for PR 28963 at commit [`d72074a`](https://github.com/apache/spark/commit/d72074aff830c670685278eb40b08df1237f7c66).
[GitHub] [spark] HeartSaVioR commented on a change in pull request #28975: [SPARK-32148][SS] Fix stream-stream join issue on missing to copy reused unsafe row
HeartSaVioR commented on a change in pull request #28975:
URL: https://github.com/apache/spark/pull/28975#discussion_r450042481

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala ##
@@ -259,6 +269,9 @@ class SymmetricHashJoinStateManager(
 return null
 }
+// Make a copy on value row, as below cleanup logic may update the value row silently.
+currentValue = currentValue.copy(value = currentValue.value.copy())

Review comment:
Yes. That wasn't necessary for format V1, as the original row was stored into the state store, and the state store (strictly speaking, the implementation of the HDFS state store provider) makes sure these rows are copies. For other places, it can propagate to the callers outside of the state manager, and it looks like these callers don't need to copy the row. (It's super tricky for me to determine whether the copy is necessary or not if the code is not in a simple loop or stream.)
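The hazard discussed in the review comment above, a retained reference to a row that the producer later overwrites, can be illustrated with a minimal sketch. This is not the code from the pull request: `MutableRow` and `ReusedRowDemo` are hypothetical names invented for illustration, and `MutableRow` is only a stand-in for a reusable row buffer, not Spark's `UnsafeRow`.

```scala
// Hypothetical stand-in for a mutable, reusable row buffer (not Spark's UnsafeRow).
final class MutableRow(var value: Int) {
  // Returns an independent snapshot of the current contents.
  def copy(): MutableRow = new MutableRow(value)
}

object ReusedRowDemo {
  // Returns an iterator that writes every input into ONE shared MutableRow,
  // mimicking a producer that reuses its row object between calls to next().
  def reusingIterator(values: Seq[Int]): Iterator[MutableRow] = {
    val shared = new MutableRow(0)
    values.iterator.map { v =>
      shared.value = v
      shared
    }
  }

  // Retains references to the shared row, so every retained element aliases
  // the same object and reflects only the last value written.
  def collectWithoutCopy(values: Seq[Int]): List[Int] =
    reusingIterator(values).toList.map(_.value)

  // Copies each row before retaining it, so later writes cannot leak in.
  def collectWithCopy(values: Seq[Int]): List[Int] =
    reusingIterator(values).map(_.copy()).toList.map(_.value)

  def main(args: Array[String]): Unit = {
    println(collectWithoutCopy(Seq(1, 2, 3))) // List(3, 3, 3)
    println(collectWithCopy(Seq(1, 2, 3)))    // List(1, 2, 3)
  }
}
```

In this sketch the copy is only needed because the rows are retained past the next call to `next()`; a consumer that fully processes each row inside a simple loop before advancing would not need it, which matches the reviewer's remark that the need for a copy depends on how the row escapes.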
[GitHub] [spark] AmplabJenkins commented on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp
AmplabJenkins commented on pull request #28998: URL: https://github.com/apache/spark/pull/28998#issuecomment-654081005
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28999: [MINOR][SQL] Re-use GetTimestamp in ParseToDate
AmplabJenkins removed a comment on pull request #28999: URL: https://github.com/apache/spark/pull/28999#issuecomment-654080398
[GitHub] [spark] yaooqinn commented on pull request #28963: [SPARK-32145][SQL][test-hive1.2][test-hadoop2.7] ThriftCLIService.GetOperationStatus should include exception's stack trace to the error mes
yaooqinn commented on pull request #28963: URL: https://github.com/apache/spark/pull/28963#issuecomment-654105199 kindly ping @juliuszsompolski @cloud-fan thanks
[GitHub] [spark] SparkQA removed a comment on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window
SparkQA removed a comment on pull request #28987: URL: https://github.com/apache/spark/pull/28987#issuecomment-654093081 **[Test build #125050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125050/testReport)** for PR 28987 at commit [`70da8b5`](https://github.com/apache/spark/commit/70da8b5299096942af5926b4222990e135ceffaa).
[GitHub] [spark] AmplabJenkins commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window
AmplabJenkins commented on pull request #28987: URL: https://github.com/apache/spark/pull/28987#issuecomment-654108351
[GitHub] [spark] SparkQA commented on pull request #28965: [SPARK-32124][CORE][FOLLOW-UP] Use the invalid value Int.MinValue to fill the map index when the event logs from the old Spark version
SparkQA commented on pull request #28965: URL: https://github.com/apache/spark/pull/28965#issuecomment-654123803 **[Test build #125058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125058/testReport)** for PR 28965 at commit [`9dfa5c0`](https://github.com/apache/spark/commit/9dfa5c0d2a5cbbaefe3f062549e69b911ee6a2ea).
[GitHub] [spark] AmplabJenkins commented on pull request #28965: [SPARK-32124][CORE][FOLLOW-UP] Use the invalid value Int.MinValue to fill the map index when the event logs from the old Spark version
AmplabJenkins commented on pull request #28965: URL: https://github.com/apache/spark/pull/28965#issuecomment-654124343
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28926: [SPARK-32133][SQL] Forbid time field steps for date start/end in Sequence
AmplabJenkins removed a comment on pull request #28926: URL: https://github.com/apache/spark/pull/28926#issuecomment-654141703
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function
AmplabJenkins removed a comment on pull request #29009: URL: https://github.com/apache/spark/pull/29009#issuecomment-654140900 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window
AmplabJenkins removed a comment on pull request #28987: URL: https://github.com/apache/spark/pull/28987#issuecomment-654148412
[GitHub] [spark] AmplabJenkins commented on pull request #28969: [SPARK-32150][BUILD] Upgrade to ZStd 1.4.5-4
AmplabJenkins commented on pull request #28969: URL: https://github.com/apache/spark/pull/28969#issuecomment-654148531
[GitHub] [spark] AmplabJenkins commented on pull request #28987: [SPARK-32162][PYTHON][TESTS] Improve error message of Pandas grouped map test with window
AmplabJenkins commented on pull request #28987: URL: https://github.com/apache/spark/pull/28987#issuecomment-654148412
[GitHub] [spark] AmplabJenkins commented on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression
AmplabJenkins commented on pull request #28808: URL: https://github.com/apache/spark/pull/28808#issuecomment-654157398
[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-654157384
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression
AmplabJenkins removed a comment on pull request #28808: URL: https://github.com/apache/spark/pull/28808#issuecomment-654157398
[GitHub] [spark] SparkQA commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
SparkQA commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-654157015 **[Test build #125065 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125065/testReport)** for PR 27366 at commit [`0a133ad`](https://github.com/apache/spark/commit/0a133ad87a27cd53efc2f13e4cae04ec4affef94).
[GitHub] [spark] cloud-fan commented on a change in pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
cloud-fan commented on a change in pull request #27428: URL: https://github.com/apache/spark/pull/27428#discussion_r450007512

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteDistinctAggregates.scala ##

@@ -148,24 +204,105 @@ object RewriteDistinctAggregates extends Rule[LogicalPlan] {
     val distinctAggs = exprs.flatMap { _.collect {
       case ae: AggregateExpression if ae.isDistinct => ae
     }}
-    // We need at least two distinct aggregates for this rule because aggregation
-    // strategy can handle a single distinct group.
+    // We need at least two distinct aggregates or a single distinct aggregate with a filter for
+    // this rule because aggregation strategy can handle a single distinct group without a filter.
     // This check can produce false-positives, e.g., SUM(DISTINCT a) & COUNT(DISTINCT a).
-    distinctAggs.size > 1
+    distinctAggs.size > 1 || distinctAggs.exists(_.filter.isDefined)
   }
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) => rewrite(a)
+    case a: Aggregate if mayNeedtoRewrite(a.aggregateExpressions) =>
+      val expandAggregate = extractFiltersInDistinctAggregates(a)
+      rewriteDistinctAggregates(expandAggregate)
   }
-  def rewrite(a: Aggregate): Aggregate = {
+  private def extractFiltersInDistinctAggregates(a: Aggregate): Aggregate = {
+    val aggExpressions = collectAggregateExprs(a)
+    val (distinctAggExpressions, regularAggExpressions) = aggExpressions.partition(_.isDistinct)
+    if (distinctAggExpressions.exists(_.filter.isDefined)) {
+      // Constructs pairs between old and new expressions for regular aggregates. Because we
+      // will construct a new `Aggregate` and the children of the distinct aggregates will be
+      // changed to generated ones, we need to create new references to avoid collisions between
+      // distinct and regular aggregate children.
+      val regularAggExprs = regularAggExpressions.filter(_.children.exists(!_.foldable))
+      val regularFunChildren = regularAggExprs
+        .flatMap(_.aggregateFunction.children.filter(!_.foldable))
+      val regularFilterAttrs = regularAggExprs.flatMap(_.filterAttributes)
+      val regularAggChildren = (regularFunChildren ++ regularFilterAttrs).distinct
+      val regularAggChildrenMap = regularAggChildren.map {
+        case ne: NamedExpression => ne -> ne
+        case other => other -> Alias(other, other.toString)()
+      }
+      val namedRegularAggChildren = regularAggChildrenMap.map(_._2)
+      val regularAggChildAttrLookup = regularAggChildrenMap.map { kv =>
+        (kv._1, kv._2.toAttribute)
+      }.toMap
+      val regularAggPairs = regularAggExprs.map {
+        case ae @ AggregateExpression(af, _, _, filter, _) =>
+          val newChildren = af.children.map(c => regularAggChildAttrLookup.getOrElse(c, c))
+          val raf = af.withNewChildren(newChildren).asInstanceOf[AggregateFunction]
+          val filterOpt = filter.map(_.transform {
+            case a: Attribute => regularAggChildAttrLookup.getOrElse(a, a)
+          })
+          val aggExpr = ae.copy(aggregateFunction = raf, filter = filterOpt)
+          (ae, aggExpr)
+      }
-    // Collect all aggregate expressions.
-    val aggExpressions = a.aggregateExpressions.flatMap { e =>
-      e.collect {
-        case ae: AggregateExpression => ae
+      // Constructs pairs between old and new expressions for distinct aggregates, too.
+      val distinctAggExprs = distinctAggExpressions.filter(e => e.children.exists(!_.foldable))
+      val (projections, distinctAggPairs) = distinctAggExprs.map {
+        case ae @ AggregateExpression(af, _, _, filter, _) =>
+          // First, In order to reduce costs, it is better to handle the filter clause locally.
+          // e.g. COUNT (DISTINCT a) FILTER (WHERE id > 1), evaluate expression
+          // If(id > 1) 'a else null first, and use the result as output.
+          // Second, If at least two DISTINCT aggregate expression which may references the
+          // same attributes. We need to construct the generated attributes so as the output not
+          // lost. e.g. SUM (DISTINCT a), COUNT (DISTINCT a) FILTER (WHERE id > 1) will output
+          // attribute '_gen_distinct-1 and attribute '_gen_distinct-2 instead of two 'a.
+          // Note: The illusionary mechanism may result in at least two distinct groups, so we
+          // still need to call `rewrite`.
+          val unfoldableChildren = af.children.filter(!_.foldable)
+          // Expand projection
+          val projectionMap = unfoldableChildren.map {
+            case e if filter.isDefined =>
+              val ife = If(filter.get, e, nullify(e))
+              e -> Alias(ife, s"_gen_distinct_${NamedExpression.newExprId.id}")()
+            case e => e -> Alias(e,
[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-654044546
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-654044546 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div
AmplabJenkins removed a comment on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-654042478 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #29008: [SPARK-31579][SQL] replaced floorDiv to Div
AmplabJenkins commented on pull request #29008: URL: https://github.com/apache/spark/pull/29008#issuecomment-654044389 Can one of the admins verify this patch?
[GitHub] [spark] viirya commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
viirya commented on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654044476 retest this please...
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties
AmplabJenkins removed a comment on pull request #28964: URL: https://github.com/apache/spark/pull/28964#issuecomment-654049253 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125034/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties
SparkQA commented on pull request #28964: URL: https://github.com/apache/spark/pull/28964#issuecomment-654048965 **[Test build #125034 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125034/testReport)** for PR 28964 at commit [`91c7c42`](https://github.com/apache/spark/commit/91c7c428429a20f491c6d0be8b8cb83549f09e4d). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties
AmplabJenkins removed a comment on pull request #28964: URL: https://github.com/apache/spark/pull/28964#issuecomment-654049241 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties
AmplabJenkins commented on pull request #28964: URL: https://github.com/apache/spark/pull/28964#issuecomment-654049241
[GitHub] [spark] SparkQA removed a comment on pull request #28964: [SPARK-32144][SQL] Retain EXTERNAL property in hive table properties
SparkQA removed a comment on pull request #28964: URL: https://github.com/apache/spark/pull/28964#issuecomment-654037792 **[Test build #125034 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125034/testReport)** for PR 28964 at commit [`91c7c42`](https://github.com/apache/spark/commit/91c7c428429a20f491c6d0be8b8cb83549f09e4d).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29003: [SPARK-32177][WEBUI] Remove the weird line from near the Spark logo on mouseover in the WebUI
AmplabJenkins removed a comment on pull request #29003: URL: https://github.com/apache/spark/pull/29003#issuecomment-654048895 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28808: [SPARK-31975][SQL] Show AnalysisException when WindowFunction is used without WindowExpression
SparkQA commented on pull request #28808: URL: https://github.com/apache/spark/pull/28808#issuecomment-654054385 **[Test build #125024 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125024/testReport)** for PR 28808 at commit [`3e116f5`](https://github.com/apache/spark/commit/3e116f5f3284b22eec712990d271751b799f4ece). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28986: [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors.
SparkQA commented on pull request #28986: URL: https://github.com/apache/spark/pull/28986#issuecomment-654054369
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654054494
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-654054354
[GitHub] [spark] SparkQA commented on pull request #27428: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT
SparkQA commented on pull request #27428: URL: https://github.com/apache/spark/pull/27428#issuecomment-654054386 **[Test build #125032 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125032/testReport)** for PR 27428 at commit [`73dc600`](https://github.com/apache/spark/commit/73dc600015d98a2a64b76293fa17bed89254dd2c). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28998: [SPARK-32173][SQL] Deduplicate code in FromUTCTimestamp and ToUTCTimestamp
SparkQA commented on pull request #28998: URL: https://github.com/apache/spark/pull/28998#issuecomment-654054348 **[Test build #125026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125026/testReport)** for PR 28998 at commit [`e06fb0c`](https://github.com/apache/spark/commit/e06fb0cb1fccee2e0159dbbe5618dcf53d1a4304). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread
SparkQA commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-654054383 **[Test build #124999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124999/testReport)** for PR 29002 at commit [`d724179`](https://github.com/apache/spark/commit/d72417972ef3d49891f6cc34c5468113ca3abdb7). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
SparkQA commented on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654054352 **[Test build #125035 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125035/testReport)** for PR 28988 at commit [`04f6bb6`](https://github.com/apache/spark/commit/04f6bb640d449d9a17bb34ea8f5f04faf3697b0a). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28988: [SPARK-32163][SQL] Nested pruning should work even with cosmetic variations
AmplabJenkins commented on pull request #28988: URL: https://github.com/apache/spark/pull/28988#issuecomment-654054472
[GitHub] [spark] SparkQA commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5
SparkQA commented on pull request #28957: URL: https://github.com/apache/spark/pull/28957#issuecomment-654054389
[GitHub] [spark] SparkQA commented on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning
SparkQA commented on pull request #28676: URL: https://github.com/apache/spark/pull/28676#issuecomment-654054368 **[Test build #125025 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125025/testReport)** for PR 28676 at commit [`c5f4803`](https://github.com/apache/spark/commit/c5f48032907dc5b550d410984254015e4e3ae235). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #29006: [SPARK-32058][BUILD][SQL][test-hive1.2][FOLLOWUP] Set hadoop 2.7.4 for hive 1.2 profile
SparkQA commented on pull request #29006: URL: https://github.com/apache/spark/pull/29006#issuecomment-654054355 **[Test build #125016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125016/testReport)** for PR 29006 at commit [`5fed5a2`](https://github.com/apache/spark/commit/5fed5a2c5f74672414123d7e221e0c11b90356c2). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-654054397 **[Test build #124985 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124985/testReport)** for PR 28904 at commit [`8034ca4`](https://github.com/apache/spark/commit/8034ca4bb8401f2070b8bb8048722255d145fdec). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
SparkQA commented on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-654054370 **[Test build #125013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125013/testReport)** for PR 27983 at commit [`f52f376`](https://github.com/apache/spark/commit/f52f376cb890043145517d36d05bbbe7cf4937cd). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28912: [SPARK-32057][SQL][test-hive1.2][test-hadoop2.7] ExecuteStatement: cancel and close should not transiently ERROR
SparkQA commented on pull request #28912: URL: https://github.com/apache/spark/pull/28912#issuecomment-654054351 **[Test build #125019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125019/testReport)** for PR 28912 at commit [`4aaa34b`](https://github.com/apache/spark/commit/4aaa34bb85e9d702514978ce31bc740f325294f7). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls
SparkQA commented on pull request #28996: URL: https://github.com/apache/spark/pull/28996#issuecomment-654054390 **[Test build #125022 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125022/testReport)** for PR 28996 at commit [`5e4f670`](https://github.com/apache/spark/commit/5e4f67002955fa0536498df6657b5db5b17b0d56). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.