[GitHub] spark pull request #22070: Fix typos detected by github.com/client9/misspell
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22070#discussion_r209242281 --- Diff: examples/src/main/python/sql/arrow.py --- @@ -95,12 +95,12 @@ def grouped_map_pandas_udf_example(spark): ("id", "v")) @pandas_udf("id long, v double", PandasUDFType.GROUPED_MAP) -def substract_mean(pdf): +def subtract_mean(pdf): --- End diff -- And this method name change should be OK as it's just in an example --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
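For context on the renamed example: a grouped-map UDF receives all rows of one group and returns transformed rows, and `subtract_mean` centers each value on its group mean. The sketch below shows the same computation in plain Python, without Spark or pandas; the function name `subtract_mean_by_group` and the sample rows are made up for illustration and are not part of the PR.

```python
from collections import defaultdict
from statistics import mean


def subtract_mean_by_group(rows):
    """Return (id, v - mean of v within that id) for every input row."""
    # Collect the values of each group (hypothetical in-memory stand-in
    # for the per-group DataFrame a grouped-map UDF would receive).
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    group_means = {key: mean(values) for key, values in groups.items()}
    # Emit rows in their original order, centered on the group mean.
    return [(key, value - group_means[key]) for key, value in rows]


rows = [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)]
centered = subtract_mean_by_group(rows)
```

After centering, the values within each `id` sum to zero, which is a quick sanity check for this kind of transformation.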
[GitHub] spark pull request #22070: Fix typos detected by github.com/client9/misspell
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22070#discussion_r209242073 --- Diff: common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java --- @@ -231,17 +231,17 @@ public boolean release(int decrement) { * data into memory at once, and can avoid ballooning memory usage when transferring large * messages such as shuffle blocks. * - * The {@link #transfered()} counter also behaves a little funny, in that it won't go forward + * The {@link #transferred()} counter also behaves a little funny, in that it won't go forward * until a whole chunk has been written. This is done because the code can't use the actual * number of bytes written to the channel as the transferred count (see {@link #count()}). * Instead, once an encrypted chunk is written to the output (including its header), the - * size of the original block will be added to the {@link #transfered()} amount. + * size of the original block will be added to the {@link #transferred()} amount. */ @Override public long transferTo(final WritableByteChannel target, final long position) throws IOException { - Preconditions.checkArgument(position == transfered(), "Invalid position."); + Preconditions.checkArgument(position == transferred(), "Invalid position."); --- End diff -- Although these are method name changes here, you're right that the old method is deprecated and the new, correctly spelled one can be freely used now.
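The situation described in this comment, a misspelled method kept as a deprecated alias of the correctly spelled one, can be sketched as follows. This is a Python illustration of the general pattern, not the Java code under review; the class `FileRegionSketch` and its `add` method are hypothetical.

```python
import warnings


class FileRegionSketch:
    """Hypothetical stand-in for a region that tracks transferred bytes."""

    def __init__(self):
        self._transferred = 0

    def add(self, nbytes):
        # Record that nbytes more bytes have been written.
        self._transferred += nbytes

    def transferred(self):
        """Correctly spelled accessor; new callers should use this."""
        return self._transferred

    def transfered(self):
        """Deprecated misspelled alias, kept so old callers keep working."""
        warnings.warn("transfered() is deprecated; use transferred()",
                      DeprecationWarning, stacklevel=2)
        return self.transferred()


region = FileRegionSketch()
region.add(4096)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = region.transfered()
```

Because the alias only delegates, switching call sites to the new spelling cannot change behavior, which is why such renames are low risk.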
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #4237 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4237/testReport)** for PR 22068 at commit [`74aa80c`](https://github.com/apache/spark/commit/74aa80cb63c6ea98f0b9106f0724748931317c05).
[GitHub] spark issue #22064: [MINOR][BUILD] Add ECCN notice required by http://www.ap...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22064 **[Test build #4236 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4236/testReport)** for PR 22064 at commit [`878e5ca`](https://github.com/apache/spark/commit/878e5ca274a3b9e5fe37f4e0c2ed4b499bc81676).
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94550/ Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test PASSed.
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22011 **[Test build #94549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94549/testReport)** for PR 22011 at commit [`ea2330b`](https://github.com/apache/spark/commit/ea2330baa61e427665ba824c3c42d1e4ec1a7934). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22037 retest this please
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94549/ Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94558/testReport)** for PR 22037 at commit [`24dbada`](https://github.com/apache/spark/commit/24dbada0823e47b50892a34d19e1b8e2a63af7c3).
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Merged build finished. Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2040/ Test PASSed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Merged build finished. Test FAILed.
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #94550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94550/testReport)** for PR 22068 at commit [`74aa80c`](https://github.com/apache/spark/commit/74aa80cb63c6ea98f0b9106f0724748931317c05). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22066: [SPARK-25084][SQL] "distribute by" on multiple columns m...
Github user yucai commented on the issue: https://github.com/apache/spark/pull/22066 @cloud-fan @gatorsmile The PR is ready; kindly help review.
[GitHub] spark issue #22066: [SPARK-25084][SQL] "distribute by" on multiple columns m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22066 **[Test build #94557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94557/testReport)** for PR 22066 at commit [`931fa28`](https://github.com/apache/spark/commit/931fa28861f15ef1c31a51787f3bd59f2284de89).
[GitHub] spark pull request #22010: [SPARK-21436][CORE] Take advantage of known parti...
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22010#discussion_r209230417 --- Diff: core/src/test/scala/org/apache/spark/rdd/RDDSuite.scala --- @@ -95,6 +95,18 @@ class RDDSuite extends SparkFunSuite with SharedSparkContext { assert(!deserial.toString().isEmpty()) } + test("distinct with known partioner does not cause shuffle") { --- End diff -- nit: partioner -> partitioner
[GitHub] spark pull request #22010: [SPARK-21436][CORE] Take advantage of known parti...
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22010#discussion_r209230438 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -396,7 +396,16 @@ abstract class RDD[T: ClassTag]( * Return a new RDD containing the distinct elements in this RDD. */ def distinct(numPartitions: Int)(implicit ord: Ordering[T] = null): RDD[T] = withScope { -map(x => (x, null)).reduceByKey((x, y) => x, numPartitions).map(_._1) +// If the data is already approriately partioned with a known partioner we can work locally. --- End diff -- nit: partioned -> partitioned, partioner -> partitioner
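The idea behind the code comment in this diff is that when equal elements are guaranteed to sit in the same partition, deduplicating each partition independently already yields a globally distinct result, so no shuffle is needed. A minimal Python sketch of that reasoning follows; it models partitions as plain lists and is not Spark's implementation. The helper names `partition_by_hash` and `distinct_within_partitions` are hypothetical.

```python
def partition_by_hash(items, num_partitions):
    """Place each item in the partition chosen by its hash (a known partitioner)."""
    partitions = [[] for _ in range(num_partitions)]
    for item in items:
        partitions[hash(item) % num_partitions].append(item)
    return partitions


def distinct_within_partitions(partitions):
    """Deduplicate each partition on its own, keeping first-seen order."""
    result = []
    for part in partitions:
        seen = set()
        unique = []
        for item in part:
            if item not in seen:
                seen.add(item)
                unique.append(item)
        result.append(unique)
    return result


data = [3, 1, 3, 7, 1, 8, 7, 7]
parts = partition_by_hash(data, 4)
deduped = distinct_within_partitions(parts)
```

Since duplicates of a value always hash to the same partition, no value can survive in two different partitions of `deduped`.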
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22053 Merged build finished. Test FAILed.
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22053 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94548/ Test FAILed.
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22053 **[Test build #94548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94548/testReport)** for PR 22053 at commit [`d95d357`](https://github.com/apache/spark/commit/d95d35794528702a2de5523ca00334d479598c57). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22064: [MINOR][BUILD] Add ECCN notice required by http://www.ap...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/22064 LGTM FWIW
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94552/ Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94552/testReport)** for PR 22037 at commit [`24dbada`](https://github.com/apache/spark/commit/24dbada0823e47b50892a34d19e1b8e2a63af7c3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94546/ Test FAILed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94546/testReport)** for PR 22037 at commit [`9eefbe5`](https://github.com/apache/spark/commit/9eefbe5dc58bba272dedce7ae0174be89a0a9b28). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22067 **[Test build #94556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94556/testReport)** for PR 22067 at commit [`0a6bccc`](https://github.com/apache/spark/commit/0a6bccc9e6a308d0b064bc0f2f37f7b19294df20).
[GitHub] spark issue #22070: Fix typos detected by github.com/client9/misspell
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22070 Can one of the admins verify this patch?
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22067 @jerryshao Could you help trigger a test build, please?
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22067 ok to test.
[GitHub] spark pull request #22070: Fix typos detected by github.com/client9/misspell
GitHub user seratch opened a pull request: https://github.com/apache/spark/pull/22070 Fix typos detected by github.com/client9/misspell ## What changes were proposed in this pull request? Fixing typos is sometimes very hard. It's not so easy to visually review them. Recently, I discovered a very useful tool for it, [misspell](https://github.com/client9/misspell). This pull request fixes minor typos detected by [misspell](https://github.com/client9/misspell) except for the false positives. If you would like me to work on other files as well, let me know. ## How was this patch tested? ### before ``` $ misspell . | grep -v '.js' R/pkg/R/SQLContext.R:354:43: "definiton" is a misspelling of "definition" R/pkg/R/SQLContext.R:424:43: "definiton" is a misspelling of "definition" R/pkg/R/SQLContext.R:445:43: "definiton" is a misspelling of "definition" R/pkg/R/SQLContext.R:495:43: "definiton" is a misspelling of "definition" NOTICE-binary:454:16: "containd" is a misspelling of "contained" R/pkg/R/context.R:46:43: "definiton" is a misspelling of "definition" R/pkg/R/context.R:74:43: "definiton" is a misspelling of "definition" R/pkg/R/DataFrame.R:591:48: "persistance" is a misspelling of "persistence" R/pkg/R/streaming.R:166:44: "occured" is a misspelling of "occurred" R/pkg/inst/worker/worker.R:65:22: "ouput" is a misspelling of "output" R/pkg/tests/fulltests/test_utils.R:106:25: "environemnt" is a misspelling of "environment" common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java:38:39: "existant" is a misspelling of "existent" common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBSuite.java:83:39: "existant" is a misspelling of "existent" common/network-common/src/main/java/org/apache/spark/network/crypto/TransportCipher.java:243:46: "transfered" is a misspelling of "transferred" common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java:234:19: "transfered" is a misspelling of "transferred" 
common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java:238:63: "transfered" is a misspelling of "transferred" common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java:244:46: "transfered" is a misspelling of "transferred" common/network-common/src/main/java/org/apache/spark/network/sasl/SaslEncryption.java:276:39: "transfered" is a misspelling of "transferred" common/network-common/src/main/java/org/apache/spark/network/util/AbstractFileRegion.java:27:20: "transfered" is a misspelling of "transferred" common/unsafe/src/test/scala/org/apache/spark/unsafe/types/UTF8StringPropertyCheckSuite.scala:195:15: "orgin" is a misspelling of "origin" core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala:621:39: "gauranteed" is a misspelling of "guaranteed" core/src/main/scala/org/apache/spark/status/storeTypes.scala:113:29: "ect" is a misspelling of "etc" core/src/main/scala/org/apache/spark/storage/DiskStore.scala:282:18: "transfered" is a misspelling of "transferred" core/src/main/scala/org/apache/spark/util/ListenerBus.scala:64:17: "overriden" is a misspelling of "overridden" core/src/test/scala/org/apache/spark/ShuffleSuite.scala:211:7: "substracted" is a misspelling of "subtracted" core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:1922:49: "agriculteur" is a misspelling of "agriculture" core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala:2468:84: "truely" is a misspelling of "truly" core/src/test/scala/org/apache/spark/storage/FlatmapIteratorSuite.scala:25:18: "persistance" is a misspelling of "persistence" core/src/test/scala/org/apache/spark/storage/FlatmapIteratorSuite.scala:26:69: "persistance" is a misspelling of "persistence" data/streaming/AFINN-111.txt:1219:0: "humerous" is a misspelling of "humorous" dev/run-pip-tests:55:28: "enviroments" is a misspelling of "environments" dev/run-pip-tests:91:37: "virutal" is a misspelling of "virtual" 
dev/merge_spark_pr.py:377:72: "accross" is a misspelling of "across" dev/merge_spark_pr.py:378:66: "accross" is a misspelling of "across" dev/run-pip-tests:126:25: "enviroments" is a misspelling of "environments" docs/configuration.md:1830:82: "overriden" is a misspelling of "overridden" docs/structured-streaming-programming-guide.md:525:45: "processs" is a misspelling of "processes" docs/structured-streaming-programming-guide.md:1165:61: "BETWEN" is a misspelling of "BETWEEN" docs/sql-programming-guide.md:1891:810: "behaivor" is a misspelling of "behavior" examples/src/main/python/sql/arrow.py:98:8: "substract" is a misspelling of "subtract" examples/src/main/python/sql/arrow.py:103:27: "substract" is a misspelling of "subtract" licenses/LICENSE-h
[GitHub] spark issue #22069: [MINOR][DOC] Fix Java example code in Column's comments
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22069 **[Test build #94555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94555/testReport)** for PR 22069 at commit [`8520df8`](https://github.com/apache/spark/commit/8520df899a3364f2bb41d4155d2bed9e68772a07).
[GitHub] spark issue #22069: [MINOR][DOC] Fix Java example code in Column's comments
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22069 ok to test
[GitHub] spark issue #22008: [SPARK-24928][SQL] Optimize cross join according to stat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22008 cc @wzhfy
[GitHub] spark pull request #21977: SPARK-25004: Add spark.executor.pyspark.memory li...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21977#discussion_r209209021 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -60,14 +61,20 @@ private[spark] object PythonEvalType { */ private[spark] abstract class BasePythonRunner[IN, OUT]( funcs: Seq[ChainedPythonFunctions], -bufferSize: Int, -reuseWorker: Boolean, evalType: Int, -argOffsets: Array[Array[Int]]) +argOffsets: Array[Array[Int]], +conf: SparkConf) extends Logging { require(funcs.length == argOffsets.length, "argOffsets should have the same length as funcs") + private val bufferSize = conf.getInt("spark.buffer.size", 65536) + private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true) + // each python worker gets an equal part of the allocation. the worker pool will grow to the + // number of concurrent tasks, which is determined by the number of cores in this executor. + private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY) + .map(_ / conf.getInt("spark.executor.cores", 1)) --- End diff -- tiny nit: indentation
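The arithmetic in the quoted diff, where each Python worker gets an equal share of the executor's PySpark memory, can be illustrated with a small sketch. It assumes integer division and treats an unset limit as `None`; the function name `pyspark_memory_per_worker_mb` and its parameters are hypothetical and only mirror the diff's `memoryMb` expression.

```python
def pyspark_memory_per_worker_mb(executor_pyspark_memory_mb, executor_cores=1):
    """Split the executor-wide PySpark memory limit evenly across workers.

    The worker pool grows to the number of concurrent tasks, which the diff
    takes to be the number of executor cores (defaulting to 1).
    """
    if executor_pyspark_memory_mb is None:
        # No limit configured, so there is nothing to divide.
        return None
    # Assumption: integer division, discarding any remainder.
    return executor_pyspark_memory_mb // executor_cores


share = pyspark_memory_per_worker_mb(2048, executor_cores=4)
```

With 2048 MB configured and 4 cores, each worker would be limited to 512 MB under these assumptions.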
[GitHub] spark pull request #21977: SPARK-25004: Add spark.executor.pyspark.memory li...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21977#discussion_r209209726 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/AggregateInPandasExec.scala --- @@ -137,13 +135,12 @@ case class AggregateInPandasExec( val columnarBatchIter = new ArrowPythonRunner( pyFuncs, -bufferSize, -reuseWorker, PythonEvalType.SQL_GROUPED_AGG_PANDAS_UDF, argOffsets, aggInputSchema, sessionLocalTimeZone, -pythonRunnerConf).compute(projectedRowIter, context.partitionId(), context) +pythonRunnerConf, +sparkContext.conf).compute(projectedRowIter, context.partitionId(), context) --- End diff -- Yea, same question.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2039/ Test PASSed.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Merged build finished. Test PASSed.
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21732 **[Test build #94554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94554/testReport)** for PR 21732 at commit [`80506f4`](https://github.com/apache/spark/commit/80506f4e98184ccd66dbaac14ec52d69c358020d).
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21732 retest this please
[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22007 **[Test build #94553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94553/testReport)** for PR 22007 at commit [`618de1e`](https://github.com/apache/spark/commit/618de1e71e5ce38b6f9a640a538bdfbf95b3ae7e).
[GitHub] spark issue #21868: [SPARK-24906][SQL] Adaptively enlarge split / partition ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21868 ??? why does this still target branch-2.3? is this a backport?
[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22007 ok to test
[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16677
[GitHub] spark issue #22048: Fix the show method to display the wide character alignm...
Github user xuejianbest commented on the issue: https://github.com/apache/spark/pull/22048 After testing, I found that the regular expression should be changed to the following: `val regex = """[^\x00-\u2e39]""".r`
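To show how a character class like the one above could be used for alignment, here is a Python sketch that counts every character outside the range `\x00-\u2e39` as occupying two display columns. The two-column assumption and the helper name `display_width` are illustrative guesses based on the PR title; they are not taken from the patch itself.

```python
import re

# Same character class as in the comment: anything above U+2E39 is treated as wide.
WIDE_CHAR = re.compile(r"[^\x00-\u2e39]")


def display_width(text):
    """Estimate the column width of text, counting wide characters twice."""
    wide_count = len(WIDE_CHAR.findall(text))
    # Every character contributes one column; wide ones contribute one more.
    return len(text) + wide_count


ascii_width = display_width("abc")
cjk_width = display_width("\u4e2d\u6587")  # two CJK ideographs
mixed_width = display_width("a\u4e2d")
```

Padding a cell to a target width would then use `display_width(cell)` rather than `len(cell)` to decide how many spaces to add.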
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16677 Merging to master. Thanks!
[GitHub] spark issue #22069: [MINOR][DOC] Fix Java example code in Column's comments
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22069 Can one of the admins verify this patch?
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94552/testReport)** for PR 22037 at commit [`24dbada`](https://github.com/apache/spark/commit/24dbada0823e47b50892a34d19e1b8e2a63af7c3).
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test PASSed.
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2038/ Test PASSed.
[GitHub] spark pull request #22069: [MINOR][DOC] Fix Java example code in Column's co...
GitHub user sadhen opened a pull request: https://github.com/apache/spark/pull/22069 [MINOR][DOC] Fix Java example code in Column's comments ## What changes were proposed in this pull request? Fix scaladoc in Column ## How was this patch tested? None You can merge this pull request into a Git repository by running: $ git pull https://github.com/sadhen/spark fix_doc_minor Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22069.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22069 commit 8520df899a3364f2bb41d4155d2bed9e68772a07 Author: å¿å¬ Date: 2018-08-10T09:24:08Z Fix Java example code in Column's comments --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22017: [SPARK-23938][SQL] Add map_zip_with function
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/22017#discussion_r209188342 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala --- @@ -442,3 +442,186 @@ case class ArrayAggregate( override def prettyName: String = "aggregate" } + +/** + * Merges two given maps into a single map by applying function to the pair of values with + * the same key. + */ +@ExpressionDescription( + usage = +""" + _FUNC_(map1, map2, function) - Merges two given maps into a single map by applying + function to the pair of values with the same key. For keys only presented in one map, + NULL will be passed as the value for the missing key. If an input map contains duplicated + keys, only the first entry of the duplicated key is passed into the lambda function. +""", + examples = """ +Examples: + > SELECT _FUNC_(map(1, 'a', 2, 'b'), map(1, 'x', 2, 'y'), (k, v1, v2) -> concat(v1, v2)); + {1:"ax",2:"by"} + """, + since = "2.4.0") +case class MapZipWith(left: Expression, right: Expression, function: Expression) + extends HigherOrderFunction with CodegenFallback { + + @transient lazy val functionForEval: Expression = functionsForEval.head + + @transient lazy val (leftKeyType, leftValueType, leftValueContainsNull) = +HigherOrderFunction.mapKeyValueArgumentType(left.dataType) + + @transient lazy val (rightKeyType, rightValueType, rightValueContainsNull) = +HigherOrderFunction.mapKeyValueArgumentType(right.dataType) + + @transient lazy val keyType = +TypeCoercion.findTightestCommonType(leftKeyType, rightKeyType).getOrElse(NullType) --- End diff -- Even though there is a coercion rule for unification of key types, the key types may still differ in nullability flags if they are complex. In theory, we could use ```==``` and ```findTightestCommonType``` in the coercion rule since there is no codegen to be optimized for ```null``` checks. 
But unfortunately, ```bind``` gets called once before the coercion rules are executed, so ```findTightestCommonType``` is important for setting up a correct input type for the lambda function. Maybe we could play with the order of the analysis rules, but I'm not sure about all the consequences. @ueshin could you shed some light on analysis rules ordering? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
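For readers following the `map_zip_with` discussion, here is a minimal plain-Python sketch of the semantics described in the usage string quoted in the diff: values are paired by key, and a key present in only one map contributes NULL (`None` here) for the missing side. It is only an illustration of the documented behaviour, not Spark's implementation. The helper `concat` is a hypothetical stand-in that I assume returns NULL when any input is NULL.

```python
def map_zip_with(left, right, func):
    """Merge two dicts key by key; a key missing on one side contributes None."""
    merged = {}
    # Visit the keys of `left` first, then the keys that appear only in `right`.
    for key in list(left) + [k for k in right if k not in left]:
        merged[key] = func(key, left.get(key), right.get(key))
    return merged


def concat(v1, v2):
    # Hypothetical stand-in for a NULL-propagating string concat.
    return None if v1 is None or v2 is None else v1 + v2


result = map_zip_with({1: "a", 2: "b"}, {1: "x", 2: "y"},
                      lambda k, v1, v2: concat(v1, v2))
print(result)  # {1: 'ax', 2: 'by'}
```

The printed result mirrors the `{1:"ax",2:"by"}` example in the expression description; with disjoint keys, the lambda would receive `None` for whichever side lacks the key.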
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22065 **[Test build #94551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94551/testReport)** for PR 22065 at commit [`a99769d`](https://github.com/apache/spark/commit/a99769dd1aac779e972ed2e23aa7598e6d7c7105). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2037/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/22065 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22067 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22067 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94547/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22067 **[Test build #94547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94547/testReport)** for PR 22067 at commit [`9e6941c`](https://github.com/apache/spark/commit/9e6941cfc89b16980bd5d4470baf21550ffd0877). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2036/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22068 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22068: [MINOR][DOC]Add missing compression codec .
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22068 **[Test build #94550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94550/testReport)** for PR 22068 at commit [`74aa80c`](https://github.com/apache/spark/commit/74aa80cb63c6ea98f0b9106f0724748931317c05). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22068: [MINOR][DOC]Add missing compression codec .
GitHub user 10110346 opened a pull request: https://github.com/apache/spark/pull/22068 [MINOR][DOC]Add missing compression codec . ## What changes were proposed in this pull request? Parquet provides six codecs: "snappy", "gzip", "lzo", "lz4", "brotli", "zstd". This PR adds the missing compression codecs: "lz4", "brotli", "zstd". ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/10110346/spark nosupportlz4 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22068.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22068 commit 74aa80cb63c6ea98f0b9106f0724748931317c05 Author: liuxian Date: 2018-08-09T07:22:01Z fix --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
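As a small illustration of the codec list in the PR description, the sketch below validates a user-supplied codec name against the six names the PR mentions. The list is taken from this message only and is an assumption, not the authoritative set accepted by any particular Spark or Parquet version; `normalize_codec` is a hypothetical helper, not a Spark API.

```python
# Codec names as listed in the PR description; an assumption for illustration only.
PARQUET_CODECS = ("snappy", "gzip", "lzo", "lz4", "brotli", "zstd")


def normalize_codec(name):
    """Lower-case and trim a codec name, rejecting names outside the listed set."""
    codec = name.strip().lower()
    if codec not in PARQUET_CODECS:
        raise ValueError(
            f"Unknown codec {name!r}; expected one of {', '.join(PARQUET_CODECS)}"
        )
    return codec


print(normalize_codec(" ZSTD "))  # zstd
```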
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22011 **[Test build #94549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94549/testReport)** for PR 22011 at commit [`ea2330b`](https://github.com/apache/spark/commit/ea2330baa61e427665ba824c3c42d1e4ec1a7934). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22011: [SPARK-24822][PySpark] Python support for barrier execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22011 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2035/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20637#discussion_r209180525 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala --- @@ -43,25 +43,29 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro case _ => false } - // TODO: if the nullability of field is correct, we can use it to save null check. private def writeStructToBuffer( ctx: CodegenContext, input: String, index: String, - fieldTypes: Seq[DataType], + fieldTypeAndNullables: Seq[(DataType, Boolean)], --- End diff -- I think that it would be good since it is used at `JavaTypeInference` and `higherOrderFunctions`. cc @ueshin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
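The diff above replaces `fieldTypes: Seq[DataType]` with `fieldTypeAndNullables: Seq[(DataType, Boolean)]` and drops the TODO about using field nullability to save a null check. A rough Python sketch of that idea follows: a toy code generator that emits a null check only for fields declared nullable. The emitted Java-like text and the names `write_field_code` and `write_struct_code` are hypothetical and chosen for illustration; this is not the code that `GenerateUnsafeProjection` actually produces.

```python
def write_field_code(name, index, nullable):
    """Return Java-like source text that writes one struct field."""
    write = f"writer.write({index}, {name});"
    if not nullable:
        # Field is declared non-nullable: emit the write without a null check.
        return write
    # Nullable field: guard the write with a null check.
    return f"if ({name}IsNull) {{ writer.setNullAt({index}); }} else {{ {write} }}"


def write_struct_code(fields):
    # `fields` is a list of (name, nullable) pairs, loosely analogous to
    # the (DataType, Boolean) pairs in fieldTypeAndNullables.
    return "\n".join(
        write_field_code(name, i, nullable)
        for i, (name, nullable) in enumerate(fields)
    )


print(write_struct_code([("id", False), ("v", True)]))
```

Under these assumptions the first field yields a bare write statement, while the second is wrapped in an `if`/`else` on its null flag.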
[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20637#discussion_r209178573 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala --- @@ -170,6 +174,23 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro val element = CodeGenerator.getValue(tmpInput, et, index) +val primitiveTypeName = if (CodeGenerator.isPrimitiveType(jt)) { --- End diff -- good catch --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22066: [WIP][SPARK-25084][SQL] "distribute by" on multiple colu...
Github user yucai commented on the issue: https://github.com/apache/spark/pull/22066 @cloud-fan I am refining and adding tests. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21199 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22067 **[Test build #94547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94547/testReport)** for PR 22067 at commit [`9e6941c`](https://github.com/apache/spark/commit/9e6941cfc89b16980bd5d4470baf21550ffd0877). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21199: [SPARK-24127][SS] Continuous text socket source
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21199 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22053 **[Test build #94548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94548/testReport)** for PR 22053 at commit [`d95d357`](https://github.com/apache/spark/commit/d95d35794528702a2de5523ca00334d479598c57). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22053 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22053 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2034/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22053: [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/22053 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22067 ok to test. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22067 @cloud-fan @jerryshao --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22037 **[Test build #94546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94546/testReport)** for PR 22037 at commit [`9eefbe5`](https://github.com/apache/spark/commit/9eefbe5dc58bba272dedce7ae0174be89a0a9b28). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22037: [SPARK-24774][SQL] Avro: Support logical decimal type
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22037 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2033/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22036: [SPARK-25028][SQL] Avoid NPE when analyzing partition wi...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22036 cc @cloud-fan @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22067: [SPARK-25084][SQL] distribute by on multiple columns may...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22067 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22066: [WIP][SPARK-25084][SQL] "distribute by" on multiple colu...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/22066 I offer another way to fix this: #22067. It doesn't need "input" as a global variable (in the distribute-by-random case). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21439 @gatorsmile Could you look at the PR, please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22067: [SPARK-25084][SQL] distribute by on multiple colu...
GitHub user LantaoJin opened a pull request: https://github.com/apache/spark/pull/22067 [SPARK-25084][SQL] distribute by on multiple columns may lead to code… …gen issue ## What changes were proposed in this pull request? "distribute by" on multiple columns may lead to codegen issue ## How was this patch tested? manual test You can merge this pull request into a Git repository by running: $ git pull https://github.com/LantaoJin/spark SPARK-25084 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22067.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22067 commit 9e6941cfc89b16980bd5d4470baf21550ffd0877 Author: LantaoJin Date: 2018-08-10T07:12:32Z [SPARK-25084][SQL] distribute by on multiple columns may lead to codegen issue --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22038: [SPARK-25056][SQL] Unify the InConversion and Bin...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22038#discussion_r209163143 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala --- @@ -1378,8 +1378,8 @@ class TypeCoercionSuite extends AnalysisTest { ) ruleTest(inConversion, In(Literal("a"), Seq(Literal(1), Literal("b"))), - In(Cast(Literal("a"), StringType), -Seq(Cast(Literal(1), StringType), Cast(Literal("b"), StringType))) + In(Cast(Literal("a"), IntegerType), --- End diff -- mmmh...honestly in this case I'd rather say that string is a better type for the cast than int. I am not sure what the result of casting "a" and "b" to int would be... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
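To make the concern about the cast direction concrete, here is a small Python sketch comparing the two choices for the values in the test case, `"a"`, `1` and `"b"`. It assumes a lenient cast in which a string that does not parse as an integer becomes NULL (`None` here); that matches my understanding of Spark's default, non-ANSI cast behaviour, but it is an assumption and is not confirmed in this thread. `cast_to_int` and `cast_to_string` are hypothetical helpers.

```python
def cast_to_int(value):
    """Return an int for integer-looking input, else None (standing in for NULL)."""
    try:
        return int(str(value).strip())
    except ValueError:
        return None


def cast_to_string(value):
    # Casting to string keeps every value distinguishable.
    return str(value)


values = ["a", 1, "b"]
print([cast_to_int(v) for v in values])     # [None, 1, None]
print([cast_to_string(v) for v in values])  # ['a', '1', 'b']
```

Under this assumption, casting to int collapses both `"a"` and `"b"` to NULL, whereas casting to string preserves all three values, which is the intuition behind preferring string as the common type here.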
[GitHub] spark issue #22064: [MINOR][BUILD] Add ECCN notice required by http://www.ap...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22064 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22064: [MINOR][BUILD] Add ECCN notice required by http://www.ap...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94537/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94542/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22065: [SPARK-23992][CORE] ShuffleDependency does not need to b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22065 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94541/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org