[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88688/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88688/testReport)** for PR 20926 at commit [`851a5ef`](https://github.com/apache/spark/commit/851a5efa87a9f10843ec9d45437d6a5d94cc0816). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20812: [SPARK-23669] Executors fetch jars and name the jars wit...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20812 @jinxing64 , I think relying on same-name jars that contain different classes is not a good practice. Ideally, different UDFs should be packaged in different jars with distinct names/versions, which makes them easier for users to manage; same-name jars also tend to cause classpath issues. Since you already have a workaround for this outside of Spark, I would suggest not fixing it, as this is a fairly user-specific issue. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
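To illustrate the versioned-jar layout suggested above, here is a minimal, hypothetical sketch; the jar path, class name, and function name are made up for the example and are not from the PR:

```scala
import org.apache.spark.sql.SparkSession

object RegisterVersionedUdf {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()

    // Each UDF release ships under its own artifact name and version, so jars
    // containing different classes never share a file name on the executor classpath.
    spark.sql(
      """CREATE TEMPORARY FUNCTION my_udf AS 'com.example.udf.MyUdf'
        |USING JAR 'hdfs:///libs/my-udfs-1.2.0.jar'""".stripMargin)

    // A later release would be published as my-udfs-1.3.0.jar and registered the
    // same way, instead of overwriting a jar that keeps the old file name.
  }
}
```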
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1823/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20860 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20860 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1822/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #88693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88693/testReport)** for PR 19222 at commit [`b69cb64`](https://github.com/apache/spark/commit/b69cb6430d71fe6ce7a39f9d6a13bdcfa8704ccf). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20860 **[Test build #88692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88692/testReport)** for PR 20860 at commit [`2ea9b7a`](https://github.com/apache/spark/commit/2ea9b7a58279d0e5d7cdfad8d67ab9227983be1a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20860 LGTM. I'm also playing around with isolated hive classloader these days. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20860: [SPARK-23743][SQL] Changed a comparison logic from conta...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20860 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20876: [SPARK-23653][SQL] Capture sql statements user input and...
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20876 Hi @jerryshao @cloud-fan, could I get an update on this? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20920 LGTM. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20920 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r177958142 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/ByteArrayMemoryBlock.java --- @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.unsafe.memory; + +import com.google.common.primitives.Ints; + +import org.apache.spark.unsafe.Platform; + +/** + * A consecutive block of memory with a byte array on Java heap. + */ +public final class ByteArrayMemoryBlock extends MemoryBlock { + + private final byte[] array; + + public ByteArrayMemoryBlock(byte[] obj, long offset, long size) { +super(obj, offset, size); +this.array = obj; +assert(offset + size <= Platform.BYTE_ARRAY_OFFSET + obj.length) : --- End diff -- To add this assertion cause a new failure at [`UTF8StringSuite.writeToOutputStreamUnderflow()`](https://github.com/apache/spark/pull/19222/files#diff-321a62638d3ef7bbc9c35842967c868bR515). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20928: Fix small typo in configuration doc
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20928 It would be better to check the other docs as well, not only the configuration doc here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20931: [SPARK-23815][Core]Spark writer dynamic partition overwr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20931 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20931: [SPARK-23815][Core]Spark writer dynamic partition overwr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20931 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20931: [SPARK-23815][Core]Spark writer dynamic partition...
GitHub user fangshil opened a pull request: https://github.com/apache/spark/pull/20931 [SPARK-23815][Core]Spark writer dynamic partition overwrite mode may fail to write output on multi level partition

## What changes were proposed in this pull request?

SPARK-20236 introduced a new writer mode that overwrites only the affected partitions. While using this feature in our production cluster, we found a bug when writing multi-level partitions on HDFS.

A simple test case to reproduce the issue:

```scala
val df = Seq(("1", "2", "3")).toDF("col1", "col2", "col3")
df.write.partitionBy("col1", "col2").mode("overwrite").save("/my/hdfs/location")
```

If the HDFS location "/my/hdfs/location" does not exist, no output is written. This seems to be caused by the job-commit change of SPARK-20236 in HadoopMapReduceCommitProtocol: during job commit, the output has been written into the staging dir /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2, and the code then calls fs.rename to move /my/hdfs/location/.spark-staging.xxx/col1=1/col2=2 to /my/hdfs/location/col1=1/col2=2. In our case the rename fails on HDFS because /my/hdfs/location/col1=1 does not exist, and an HDFS rename cannot create more than one level of missing directories. This does not show up in the unit tests added with SPARK-20236, which run against the local file system.

We are proposing a fix: when cleaning the current partition dir /my/hdfs/location/col1=1/col2=2 before the rename, if the delete fails (because /my/hdfs/location/col1=1/col2=2 may not exist), we call mkdirs to create the parent dir /my/hdfs/location/col1=1 (if it does not exist) so the following rename can succeed.

## How was this patch tested?

We have tested this patch on our production cluster and it fixed the problem.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/fangshil/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20931.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20931 commit da63c17d7ae7fbf04cc474d946d61a098b3e1ade Author: Fangshi Li Date: 2018-03-28T04:25:54Z Spark writer dynamic partition overwrite mode may fail to write output on multi level partition --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
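A minimal sketch of the commit-time handling described in the pull request above. The method and variable names are illustrative, not the actual patch; it only shows the mkdirs-before-rename idea:

```scala
import java.io.IOException

import org.apache.hadoop.fs.{FileSystem, Path}

// Before renaming the staged partition directory into place, make sure its parent
// (e.g. .../col1=1) exists, since a single HDFS rename cannot create the missing
// intermediate directories.
def commitPartition(fs: FileSystem, stagedPart: Path, finalPart: Path): Unit = {
  if (fs.exists(finalPart) && !fs.delete(finalPart, true)) {
    throw new IOException(s"Failed to delete existing partition dir $finalPart")
  }
  // The parent may not exist yet when writing multi-level partitions to a fresh path.
  if (!fs.exists(finalPart.getParent)) {
    fs.mkdirs(finalPart.getParent)
  }
  if (!fs.rename(stagedPart, finalPart)) {
    throw new IOException(s"Failed to rename $stagedPart to $finalPart")
  }
}
```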
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1821/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20920 **[Test build #88691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88691/testReport)** for PR 20920 at commit [`35ecbf9`](https://github.com/apache/spark/commit/35ecbf983b675b7fa5643c4c395995e4dca2647e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20920 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20922: Roll forward "[SPARK-23096][SS] Migrate rate sour...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/20922#discussion_r177953871 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.streaming.sources + +import java.util.Optional + +import org.apache.spark.network.util.JavaUtils +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader +import org.apache.spark.sql.sources.DataSourceRegister +import org.apache.spark.sql.sources.v2._ +import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousReader, MicroBatchReader} +import org.apache.spark.sql.types._ + +/** + * A source that generates increment long values with timestamps. Each generated row has two + * columns: a timestamp column for the generated time and an auto increment long column starting + * with 0L. + * + * This source supports the following options: + * - `rowsPerSecond` (e.g. 100, default: 1): How many rows should be generated per second. + * - `rampUpTime` (e.g. 5s, default: 0s): How long to ramp up before the generating speed + *becomes `rowsPerSecond`. Using finer granularities than seconds will be truncated to integer + *seconds. + * - `numPartitions` (e.g. 10, default: Spark's default parallelism): The partition number for the + *generated rows. The source will try its best to reach `rowsPerSecond`, but the query may + *be resource constrained, and `numPartitions` can be tweaked to help reach the desired speed. + */ +class RateStreamProvider extends DataSourceV2 + with MicroBatchReadSupport with ContinuousReadSupport with DataSourceRegister { + import RateStreamProvider._ + + override def createMicroBatchReader( + schema: Optional[StructType], + checkpointLocation: String, + options: DataSourceOptions): MicroBatchReader = { --- End diff -- Thanks for the explanation @jose-torres . This seems like a quite common usage scenario, I also see that socket source and console sink require SparkSession, also in my customized hive streaming sink (https://github.com/jerryshao/spark-hive-streaming-sink/blob/7b3afcee280d2e70ffb12dde24184726b618829d/core/src/main/scala/com/hortonworks/spark/hive/HiveSourceProvider.scala#L46). If we add that parameter back, things might be much easier. What's your opinion @cloud-fan ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20930: [SPARK-23811][Core] Same tasks' FetchFailed event comes ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20930 **[Test build #88690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88690/testReport)** for PR 20930 at commit [`2907075`](https://github.com/apache/spark/commit/2907075b43eac26c7efbe4aca5f2c037bb5934c2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20930: [SPARK-23811][Core] Same tasks' FetchFailed event comes ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r177953620 --- Diff: common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java --- @@ -515,7 +518,8 @@ public void writeToOutputStreamUnderflow() throws IOException { final byte[] test = "01234567".getBytes(StandardCharsets.UTF_8); for (int i = 1; i <= Platform.BYTE_ARRAY_OFFSET; ++i) { - UTF8String.fromAddress(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i) + new UTF8String( +new ByteArrayMemoryBlock(test, Platform.BYTE_ARRAY_OFFSET - i, test.length + i)) --- End diff -- I thought this is what you said [here](https://github.com/apache/spark/pull/19222#discussion_r176986304). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20930: [SPARK-23811][Core] Same tasks' FetchFailed event comes ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1820/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20930: [SPARK-23811][Core] Same tasks' FetchFailed event comes ...
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 The scenario can be reproduced by the test case below, added in `DAGSchedulerSuite`:

```scala
/**
 * This tests the case where the origin task succeeds after a speculative task got FetchFailed
 * before.
 */
test("[SPARK-23811] Fetch failed task should kill other attempt") {
  // Create 3 RDDs with shuffle dependencies on each other: rddA <--- rddB <--- rddC
  val rddA = new MyRDD(sc, 2, Nil)
  val shuffleDepA = new ShuffleDependency(rddA, new HashPartitioner(2))
  val shuffleIdA = shuffleDepA.shuffleId

  val rddB = new MyRDD(sc, 2, List(shuffleDepA), tracker = mapOutputTracker)
  val shuffleDepB = new ShuffleDependency(rddB, new HashPartitioner(2))

  val rddC = new MyRDD(sc, 2, List(shuffleDepB), tracker = mapOutputTracker)

  submit(rddC, Array(0, 1))

  // Complete both tasks in rddA.
  assert(taskSets(0).stageId === 0 && taskSets(0).stageAttemptId === 0)
  complete(taskSets(0), Seq(
    (Success, makeMapStatus("hostA", 2)),
    (Success, makeMapStatus("hostB", 2))))

  // The first task succeeds.
  runEvent(makeCompletionEvent(
    taskSets(1).tasks(0), Success, makeMapStatus("hostB", 2)))

  // The second task's speculative attempt fails first, but the task itself is still running.
  // This may be caused by ExecutorLost.
  runEvent(makeCompletionEvent(
    taskSets(1).tasks(1),
    FetchFailed(makeBlockManagerId("hostA"), shuffleIdA, 0, 0, "ignored"),
    null))

  // Check the currently missing partition.
  assert(mapOutputTracker.findMissingPartitions(shuffleDepB.shuffleId).get.size === 1)
  val missingPartition = mapOutputTracker.findMissingPartitions(shuffleDepB.shuffleId).get(0)

  // The second result task itself succeeds soon after.
  runEvent(makeCompletionEvent(
    taskSets(1).tasks(1), Success, makeMapStatus("hostB", 2)))

  // No missing partitions are reported here, which causes the child stage to never succeed.
  assert(mapOutputTracker.findMissingPartitions(shuffleDepB.shuffleId).get.size === 0)
}
```
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20930: [SPARK-23811][Core] Same tasks' FetchFailed event...
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/20930 [SPARK-23811][Core] Same tasks' FetchFailed event comes before Success will cause child stage never succeed

## What changes were proposed in this pull request?

This is a bug caused by the abnormal scenario described below:

- ShuffleMapTask 1.0 is running; this task will fetch data from ExecutorA.
- ExecutorA is lost, which triggers `mapOutputTracker.removeOutputsOnExecutor(execId)`, so the shuffleStatus changes.
- The speculative attempt ShuffleMapTask 1.1 starts and gets a FetchFailed immediately.
- ShuffleMapTask 1 is the last task of its stage, so the stage will never succeed because the DAGScheduler cannot find any missing task to resubmit.

Detailed screenshots are attached in the JIRA comments.

## How was this patch tested?

Added a new UT in `TaskSetManagerSuite`.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/xuanyuanking/spark SPARK-23811 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20930.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20930 commit 2907075b43eac26c7efbe4aca5f2c037bb5934c2 Author: Yuanjian Li Date: 2018-03-29T04:50:16Z [SPARK-23811][Core] Same tasks' FetchFailed event comes before Success will cause child stage never succeed --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88686/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20797 **[Test build #88686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88686/testReport)** for PR 20797 at commit [`4493909`](https://github.com/apache/spark/commit/4493909f3b66c74e488e57ffb6e89fc048a81a8d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88687/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20753 **[Test build #88687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88687/testReport)** for PR 20753 at commit [`09cdf5e`](https://github.com/apache/spark/commit/09cdf5e9920a4f896fd34fc361cf6c4382fd09e5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1819/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88685/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20920 **[Test build #88685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88685/testReport)** for PR 20920 at commit [`35ecbf9`](https://github.com/apache/spark/commit/35ecbf983b675b7fa5643c4c395995e4dca2647e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #88689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88689/testReport)** for PR 19222 at commit [`59fd393`](https://github.com/apache/spark/commit/59fd393cb4e378550f90aaa5f5ceb2c9e3d85fef). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88688/testReport)** for PR 20926 at commit [`851a5ef`](https://github.com/apache/spark/commit/851a5efa87a9f10843ec9d45437d6a5d94cc0816). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88683/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88683/testReport)** for PR 20926 at commit [`d0988f7`](https://github.com/apache/spark/commit/d0988f7378152b576844c4ae11b1761fa9a3bde2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class TestSparkSessionSuite extends SparkFunSuite with SharedSparkSession ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20922: Roll forward "[SPARK-23096][SS] Migrate rate sour...
Github user jose-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/20922#discussion_r177943116 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.streaming.sources + +import java.util.Optional + +import org.apache.spark.network.util.JavaUtils +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader +import org.apache.spark.sql.sources.DataSourceRegister +import org.apache.spark.sql.sources.v2._ +import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousReader, MicroBatchReader} +import org.apache.spark.sql.types._ + +/** + * A source that generates increment long values with timestamps. Each generated row has two + * columns: a timestamp column for the generated time and an auto increment long column starting + * with 0L. + * + * This source supports the following options: + * - `rowsPerSecond` (e.g. 100, default: 1): How many rows should be generated per second. + * - `rampUpTime` (e.g. 5s, default: 0s): How long to ramp up before the generating speed + *becomes `rowsPerSecond`. Using finer granularities than seconds will be truncated to integer + *seconds. + * - `numPartitions` (e.g. 10, default: Spark's default parallelism): The partition number for the + *generated rows. The source will try its best to reach `rowsPerSecond`, but the query may + *be resource constrained, and `numPartitions` can be tweaked to help reach the desired speed. + */ +class RateStreamProvider extends DataSourceV2 + with MicroBatchReadSupport with ContinuousReadSupport with DataSourceRegister { + import RateStreamProvider._ + + override def createMicroBatchReader( + schema: Optional[StructType], + checkpointLocation: String, + options: DataSourceOptions): MicroBatchReader = { --- End diff -- I agree that there's a mismatch here. The reason it doesn't currently have this parameter is that one of the DataSourceV2 design goals (https://docs.google.com/document/d/1n_vUVbF4KD3gxTmkNEon5qdQ-Z8qU5Frf6WMQZ6jJVM/edit#heading=h.mi1fbff5f8f9) was to avoid API dependencies on upper level APIs like SparkSession. (IIRC Wenchen and I discussed SparkSession specifically in the design stage.) In this story, SparkSession.get{Active/Default}Session is just a way to keep our existing sources working rather than an encouraged development practice. I agree that there's a mismatch which could be worth some discussion, but I think it's out of scope for this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
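For context, a minimal sketch of the workaround pattern discussed above, where a source resolves the session through SparkSession instead of receiving it as a parameter. The helper names are illustrative and not part of any Spark API:

```scala
import org.apache.spark.sql.SparkSession

// Since createMicroBatchReader does not receive a SparkSession, a source can fall
// back to the thread-local active session (or the default session) to read
// configuration such as the default parallelism.
def resolveSession(): SparkSession = {
  SparkSession.getActiveSession
    .orElse(SparkSession.getDefaultSession)
    .getOrElse(throw new IllegalStateException("No active or default SparkSession"))
}

// For example, a rate-source-like reader deriving its default partition count:
def defaultNumPartitions(): Int = resolveSession().sparkContext.defaultParallelism
```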
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177939445 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java --- @@ -32,141 +30,133 @@ */ public final class UnsafeArrayWriter extends UnsafeWriter { - private BufferHolder holder; - - // The offset of the global buffer where we start to write this array. - private int startingOffset; - // The number of elements in this array private int numElements; + // The element size in this array + private int elementSize; + private int headerInBytes; private void assertIndexIsValid(int index) { assert index >= 0 : "index (" + index + ") should >= 0"; assert index < numElements : "index (" + index + ") should < " + numElements; } - public void initialize(BufferHolder holder, int numElements, int elementSize) { + public UnsafeArrayWriter(UnsafeWriter writer, int elementSize) { +super(writer.getBufferHolder()); +this.elementSize = elementSize; + } + + public void initialize(int numElements) { // We need 8 bytes to store numElements in header this.numElements = numElements; this.headerInBytes = calculateHeaderPortionInBytes(numElements); -this.holder = holder; -this.startingOffset = holder.cursor; +this.startingOffset = cursor(); // Grows the global buffer ahead for header and fixed size data. int fixedPartInBytes = ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements); holder.grow(headerInBytes + fixedPartInBytes); // Write numElements and clear out null bits to header -Platform.putLong(holder.buffer, startingOffset, numElements); +Platform.putLong(buffer(), startingOffset, numElements); for (int i = 8; i < headerInBytes; i += 8) { - Platform.putLong(holder.buffer, startingOffset + i, 0L); + Platform.putLong(buffer(), startingOffset + i, 0L); } // fill 0 into reminder part of 8-bytes alignment in unsafe array for (int i = elementSize * numElements; i < fixedPartInBytes; i++) { - Platform.putByte(holder.buffer, startingOffset + headerInBytes + i, (byte) 0); + Platform.putByte(buffer(), startingOffset + headerInBytes + i, (byte) 0); } -holder.cursor += (headerInBytes + fixedPartInBytes); +incrementCursor(headerInBytes + fixedPartInBytes); } - private void zeroOutPaddingBytes(int numBytes) { -if ((numBytes & 0x07) > 0) { - Platform.putLong(holder.buffer, holder.cursor + ((numBytes >> 3) << 3), 0L); -} + protected long getOffset(int ordinal, int elementSize) { +return getElementOffset(ordinal, elementSize); } private long getElementOffset(int ordinal, int elementSize) { return startingOffset + headerInBytes + ordinal * elementSize; } - public void setOffsetAndSize(int ordinal, int currentCursor, int size) { + @Override + public void setOffsetAndSizeFromMark(int ordinal, int mark) { assertIndexIsValid(ordinal); -final long relativeOffset = currentCursor - startingOffset; -final long offsetAndSize = (relativeOffset << 32) | (long)size; - -write(ordinal, offsetAndSize); +_setOffsetAndSizeFromMark(ordinal, mark); } private void setNullBit(int ordinal) { assertIndexIsValid(ordinal); -BitSetMethods.set(holder.buffer, startingOffset + 8, ordinal); +BitSetMethods.set(buffer(), startingOffset + 8, ordinal); } public void setNull1Bytes(int ordinal) { setNullBit(ordinal); // put zero into the corresponding field when set null -Platform.putByte(holder.buffer, getElementOffset(ordinal, 1), (byte)0); +Platform.putByte(buffer(), getElementOffset(ordinal, 1), (byte)0); } public void setNull2Bytes(int ordinal) { 
setNullBit(ordinal); // put zero into the corresponding field when set null -Platform.putShort(holder.buffer, getElementOffset(ordinal, 2), (short)0); +Platform.putShort(buffer(), getElementOffset(ordinal, 2), (short)0); } public void setNull4Bytes(int ordinal) { setNullBit(ordinal); // put zero into the corresponding field when set null -Platform.putInt(holder.buffer, getElementOffset(ordinal, 4), 0); +Platform.putInt(buffer(), getElementOffset(ordinal, 4), 0); } public void setNull8Bytes(int ordinal) { setNullBit(ordinal); // put zero into the corresponding field when set null -Platform.putLong(holder.buffer, getEleme
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177940602 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/BufferHolder.java --- @@ -31,24 +31,24 @@ * for each incoming record, we should call `reset` of BufferHolder instance before write the record * and reuse the data buffer. * - * Generally we should call `UnsafeRow.setTotalSize` and pass in `BufferHolder.totalSize` to update + * Generally we should call `UnsafeRowWriter.setTotalSize` using `BufferHolder.totalSize` to update --- End diff -- Not sure if this description is still here or better to move to `UnsafeRowWriter`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177941093 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java --- @@ -17,17 +17,86 @@ package org.apache.spark.sql.catalyst.expressions.codegen; import org.apache.spark.sql.types.Decimal; +import org.apache.spark.unsafe.Platform; +import org.apache.spark.unsafe.array.ByteArrayMethods; import org.apache.spark.unsafe.types.CalendarInterval; import org.apache.spark.unsafe.types.UTF8String; /** * Base class for writing Unsafe* structures. */ public abstract class UnsafeWriter { + // Keep internal buffer holder + protected final BufferHolder holder; + + // The offset of the global buffer where we start to write this structure. + protected int startingOffset; + + protected UnsafeWriter(BufferHolder holder) { +this.holder = holder; + } + + /** + * Accessor methods are delegated from BufferHolder class + */ + public final BufferHolder getBufferHolder() { +return holder; + } + + public final byte[] buffer() { +return holder.buffer(); + } + + public final void reset() { +holder.reset(); + } + + public final int totalSize() { +return holder.totalSize(); + } + + public final void grow(int neededSize) { +holder.grow(neededSize); + } + + public final int cursor() { +return holder.getCursor(); + } + + public final void incrementCursor(int val) { +holder.incrementCursor(val); + } + + public abstract void setOffsetAndSizeFromMark(int ordinal, int mark); --- End diff -- `Mark` is an ambiguous term. It is not clear what it means here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177941414 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java --- @@ -17,17 +17,86 @@ package org.apache.spark.sql.catalyst.expressions.codegen; import org.apache.spark.sql.types.Decimal; +import org.apache.spark.unsafe.Platform; +import org.apache.spark.unsafe.array.ByteArrayMethods; import org.apache.spark.unsafe.types.CalendarInterval; import org.apache.spark.unsafe.types.UTF8String; /** * Base class for writing Unsafe* structures. */ public abstract class UnsafeWriter { + // Keep internal buffer holder + protected final BufferHolder holder; + + // The offset of the global buffer where we start to write this structure. + protected int startingOffset; + + protected UnsafeWriter(BufferHolder holder) { +this.holder = holder; + } + + /** + * Accessor methods are delegated from BufferHolder class + */ + public final BufferHolder getBufferHolder() { +return holder; + } + + public final byte[] buffer() { +return holder.buffer(); + } + + public final void reset() { +holder.reset(); + } + + public final int totalSize() { +return holder.totalSize(); + } + + public final void grow(int neededSize) { +holder.grow(neededSize); + } + + public final int cursor() { +return holder.getCursor(); + } + + public final void incrementCursor(int val) { +holder.incrementCursor(val); + } + + public abstract void setOffsetAndSizeFromMark(int ordinal, int mark); --- End diff -- Btw, why we have `_setOffsetAndSizeFromMark` and `setOffsetAndSizeFromMark`? Seems `setOffsetAndSizeFromMark` just call `_setOffsetAndSizeFromMark`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177939830 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java --- @@ -17,17 +17,86 @@ package org.apache.spark.sql.catalyst.expressions.codegen; import org.apache.spark.sql.types.Decimal; +import org.apache.spark.unsafe.Platform; +import org.apache.spark.unsafe.array.ByteArrayMethods; import org.apache.spark.unsafe.types.CalendarInterval; import org.apache.spark.unsafe.types.UTF8String; /** * Base class for writing Unsafe* structures. */ public abstract class UnsafeWriter { + // Keep internal buffer holder + protected final BufferHolder holder; + + // The offset of the global buffer where we start to write this structure. + protected int startingOffset; + + protected UnsafeWriter(BufferHolder holder) { +this.holder = holder; + } + + /** + * Accessor methods are delegated from BufferHolder class + */ + public final BufferHolder getBufferHolder() { +return holder; + } + + public final byte[] buffer() { +return holder.buffer(); + } + + public final void reset() { +holder.reset(); + } + + public final int totalSize() { +return holder.totalSize(); + } + + public final void grow(int neededSize) { +holder.grow(neededSize); + } + + public final int cursor() { +return holder.getCursor(); + } + + public final void incrementCursor(int val) { +holder.incrementCursor(val); + } + + public abstract void setOffsetAndSizeFromMark(int ordinal, int mark); + + protected void _setOffsetAndSizeFromMark(int ordinal, int mark) { +setOffsetAndSize(ordinal, mark, cursor() - mark); + } + + protected void setOffsetAndSize(int ordinal, int size) { +setOffsetAndSize(ordinal, cursor(), size); + } + + protected void setOffsetAndSize(int ordinal, int currentCursor, int size) { +final long relativeOffset = currentCursor - startingOffset; +final long offsetAndSize = (relativeOffset << 32) | (long)size; + +write(ordinal, offsetAndSize); + } + + protected final void zeroOutPaddingBytes(int numBytes) { +if ((numBytes & 0x07) > 0) { + Platform.putLong(buffer(), cursor() + ((numBytes >> 3) << 3), 0L); +} + } + + protected abstract long getOffset(int ordinal, int elementSize); --- End diff -- Can this just be `getOffset(int ordinal)`? One reason is only `UnsafeArrayWriter` has `elementSize`, another reason is `elementSize` is given at constructing `UnsafeArrayWriter`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88682/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20929 **[Test build #88682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88682/testReport)** for PR 20929 at commit [`876da84`](https://github.com/apache/spark/commit/876da84a7da9dbdc408e153b9e3dc17776a0c9db). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20926 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88684/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88684/testReport)** for PR 20926 at commit [`7be16a9`](https://github.com/apache/spark/commit/7be16a93da1efb86c69aa74f2c352ccbb66e5d4a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20850: [SPARK-23713][SQL] Cleanup UnsafeWriter and Buffe...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20850#discussion_r177939433 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeArrayWriter.java --- @@ -32,141 +30,133 @@ */ public final class UnsafeArrayWriter extends UnsafeWriter { - private BufferHolder holder; - - // The offset of the global buffer where we start to write this array. - private int startingOffset; - // The number of elements in this array private int numElements; + // The element size in this array + private int elementSize; + private int headerInBytes; private void assertIndexIsValid(int index) { assert index >= 0 : "index (" + index + ") should >= 0"; assert index < numElements : "index (" + index + ") should < " + numElements; } - public void initialize(BufferHolder holder, int numElements, int elementSize) { + public UnsafeArrayWriter(UnsafeWriter writer, int elementSize) { +super(writer.getBufferHolder()); +this.elementSize = elementSize; + } + + public void initialize(int numElements) { // We need 8 bytes to store numElements in header this.numElements = numElements; this.headerInBytes = calculateHeaderPortionInBytes(numElements); -this.holder = holder; -this.startingOffset = holder.cursor; +this.startingOffset = cursor(); // Grows the global buffer ahead for header and fixed size data. int fixedPartInBytes = ByteArrayMethods.roundNumberOfBytesToNearestWord(elementSize * numElements); holder.grow(headerInBytes + fixedPartInBytes); // Write numElements and clear out null bits to header -Platform.putLong(holder.buffer, startingOffset, numElements); +Platform.putLong(buffer(), startingOffset, numElements); for (int i = 8; i < headerInBytes; i += 8) { - Platform.putLong(holder.buffer, startingOffset + i, 0L); + Platform.putLong(buffer(), startingOffset + i, 0L); } // fill 0 into reminder part of 8-bytes alignment in unsafe array for (int i = elementSize * numElements; i < fixedPartInBytes; i++) { - Platform.putByte(holder.buffer, startingOffset + headerInBytes + i, (byte) 0); + Platform.putByte(buffer(), startingOffset + headerInBytes + i, (byte) 0); } -holder.cursor += (headerInBytes + fixedPartInBytes); +incrementCursor(headerInBytes + fixedPartInBytes); } - private void zeroOutPaddingBytes(int numBytes) { -if ((numBytes & 0x07) > 0) { - Platform.putLong(holder.buffer, holder.cursor + ((numBytes >> 3) << 3), 0L); -} + protected long getOffset(int ordinal, int elementSize) { +return getElementOffset(ordinal, elementSize); } private long getElementOffset(int ordinal, int elementSize) { --- End diff -- Isn't `elementSize` a given parameter when constructing `UnsafeArrayWriter` now? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't check st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20893 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't check st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20893 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88679/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20893: [SPARK-23785][LAUNCHER] LauncherBackend doesn't check st...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20893 **[Test build #88679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88679/testReport)** for PR 20893 at commit [`4ca8a32`](https://github.com/apache/spark/commit/4ca8a32e2a518f3c7ccecd406a8b03eac06f860b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20784: [SPARK-23639][SQL]Obtain token before init metastore cli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20784 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20784: [SPARK-23639][SQL]Obtain token before init metastore cli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88680/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20784: [SPARK-23639][SQL]Obtain token before init metastore cli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20784 **[Test build #88680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88680/testReport)** for PR 20784 at commit [`cd8056c`](https://github.com/apache/spark/commit/cd8056c3ad40afc08ac251a7ce502626fb9dd3c4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20924: [SPARK-23806] Broadcast.unpersist can cause fatal except...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20924 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20924: [SPARK-23806] Broadcast.unpersist can cause fatal except...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88678/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20924: [SPARK-23806] Broadcast.unpersist can cause fatal except...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20924 **[Test build #88678 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88678/testReport)** for PR 20924 at commit [`54cab78`](https://github.com/apache/spark/commit/54cab78296c7e09777ba9989e9be620928801a51). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20891: [SPARK-23782][CORE][UI] SHS should list only application...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20891 @mgaido91 what is the behavior on the Hadoop side, for example in the YARN RM UI: will it list apps run by other users that the current user doesn't have permission to view? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1818/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20753 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20797 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1817/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20922: Roll forward "[SPARK-23096][SS] Migrate rate sour...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/20922#discussion_r177933081

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.sources
+
+import java.util.Optional
+
+import org.apache.spark.network.util.JavaUtils
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader
+import org.apache.spark.sql.sources.DataSourceRegister
+import org.apache.spark.sql.sources.v2._
+import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousReader, MicroBatchReader}
+import org.apache.spark.sql.types._
+
+/**
+ * A source that generates increment long values with timestamps. Each generated row has two
+ * columns: a timestamp column for the generated time and an auto increment long column starting
+ * with 0L.
+ *
+ * This source supports the following options:
+ * - `rowsPerSecond` (e.g. 100, default: 1): How many rows should be generated per second.
+ * - `rampUpTime` (e.g. 5s, default: 0s): How long to ramp up before the generating speed
+ *   becomes `rowsPerSecond`. Using finer granularities than seconds will be truncated to integer
+ *   seconds.
+ * - `numPartitions` (e.g. 10, default: Spark's default parallelism): The partition number for the
+ *   generated rows. The source will try its best to reach `rowsPerSecond`, but the query may
+ *   be resource constrained, and `numPartitions` can be tweaked to help reach the desired speed.
+ */
+class RateStreamProvider extends DataSourceV2
+  with MicroBatchReadSupport with ContinuousReadSupport with DataSourceRegister {
+  import RateStreamProvider._
+
+  override def createMicroBatchReader(
+      schema: Optional[StructType],
+      checkpointLocation: String,
+      options: DataSourceOptions): MicroBatchReader = {
--- End diff --

What do you think @jose-torres @tdas @gatorsmile ?

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
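For context, the rate source options documented in the scaladoc above are supplied through the regular streaming reader options. A minimal sketch, assuming an existing SparkSession named `spark` and the source's registered short name "rate" (the short name is not shown in the excerpt above):

```scala
// Minimal sketch, assuming an existing SparkSession named `spark`.
// The resulting streaming DataFrame has two columns: `timestamp` and `value`.
val rateStream = spark.readStream
  .format("rate")
  .option("rowsPerSecond", "100")  // target number of generated rows per second
  .option("rampUpTime", "5s")      // ramp up to rowsPerSecond over ~5 seconds
  .option("numPartitions", "10")   // how the generated rows are partitioned
  .load()
```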
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1816/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20922: Roll forward "[SPARK-23096][SS] Migrate rate sour...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/20922#discussion_r177932994

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.sources
+
+import java.util.Optional
+
+import org.apache.spark.network.util.JavaUtils
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader
+import org.apache.spark.sql.sources.DataSourceRegister
+import org.apache.spark.sql.sources.v2._
+import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousReader, MicroBatchReader}
+import org.apache.spark.sql.types._
+
+/**
+ * A source that generates increment long values with timestamps. Each generated row has two
+ * columns: a timestamp column for the generated time and an auto increment long column starting
+ * with 0L.
+ *
+ * This source supports the following options:
+ * - `rowsPerSecond` (e.g. 100, default: 1): How many rows should be generated per second.
+ * - `rampUpTime` (e.g. 5s, default: 0s): How long to ramp up before the generating speed
+ *   becomes `rowsPerSecond`. Using finer granularities than seconds will be truncated to integer
+ *   seconds.
+ * - `numPartitions` (e.g. 10, default: Spark's default parallelism): The partition number for the
+ *   generated rows. The source will try its best to reach `rowsPerSecond`, but the query may
+ *   be resource constrained, and `numPartitions` can be tweaked to help reach the desired speed.
+ */
+class RateStreamProvider extends DataSourceV2
+  with MicroBatchReadSupport with ContinuousReadSupport with DataSourceRegister {
+  import RateStreamProvider._
+
+  override def createMicroBatchReader(
+      schema: Optional[StructType],
+      checkpointLocation: String,
+      options: DataSourceOptions): MicroBatchReader = {
--- End diff --

If `MicroBatchReadSupport` could pass in a `SparkSession` parameter, as `StreamSourceProvider#createSource` does with `sqlContext`, then it would not be necessary to get the session from a thread-local or default variable, and the UT would not need to call `setDefaultSession`. That's what I had in mind when I did this refactoring work.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
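To illustrate the suggestion above, here is a rough, hypothetical sketch of what an explicitly passed session could look like; this trait does not exist in Spark, and the name `MicroBatchReadSupportWithSession` and parameter `session` are only illustrative:

```scala
import java.util.Optional

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.sources.v2.DataSourceOptions
import org.apache.spark.sql.sources.v2.reader.streaming.MicroBatchReader
import org.apache.spark.sql.types.StructType

// Hypothetical variant of MicroBatchReadSupport: the session is handed to the source
// explicitly, the way StreamSourceProvider#createSource receives a SQLContext, so an
// implementation would not need to read SparkSession.getActiveSession or the default session.
trait MicroBatchReadSupportWithSession {
  def createMicroBatchReader(
      session: SparkSession,
      schema: Optional[StructType],
      checkpointLocation: String,
      options: DataSourceOptions): MicroBatchReader
}
```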
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20920 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20920 **[Test build #88685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88685/testReport)** for PR 20920 at commit [`35ecbf9`](https://github.com/apache/spark/commit/35ecbf983b675b7fa5643c4c395995e4dca2647e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20797 **[Test build #88686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88686/testReport)** for PR 20797 at commit [`4493909`](https://github.com/apache/spark/commit/4493909f3b66c74e488e57ffb6e89fc048a81a8d). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20753 **[Test build #88687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88687/testReport)** for PR 20753 at commit [`09cdf5e`](https://github.com/apache/spark/commit/09cdf5e9920a4f896fd34fc361cf6c4382fd09e5). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20920: [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result ...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20920 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20797: [SPARK-23583][SQL] Invoke should support interpreted exe...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20797 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20753 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20922: Roll forward "[SPARK-23096][SS] Migrate rate source to V...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20922 Thanks for the help @jose-torres . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20928: Fix small typo in configuration doc
Github user dsakuma commented on the issue: https://github.com/apache/spark/pull/20928 @HyukjinKwon Great idea! I've found and fixed some other issues using a spell checker. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20818: [SPARK-23675][WEB-UI]Title add spark logo, use sp...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20818 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20925 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20925 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88675/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20925 **[Test build #88675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88675/testReport)** for PR 20925 at commit [`466f84a`](https://github.com/apache/spark/commit/466f84a558dcfe9b6944dcc3a62a8cdadf871d02). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` logInfo(s\"Failed to load main class $childMainClass.\")` * ` error(s\"Cannot load main class from JAR $primaryResource\")` * ` error(\"No main class set in JAR; please specify one with --class\")` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20818: [SPARK-23675][WEB-UI]Title add spark logo, use spark log...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20818 Merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88684/testReport)** for PR 20926 at commit [`7be16a9`](https://github.com/apache/spark/commit/7be16a93da1efb86c69aa74f2c352ccbb66e5d4a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20927: [SPARK-23809][SQL] Active SparkSession should be set by ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20927 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20927: [SPARK-23809][SQL] Active SparkSession should be set by ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20927 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88681/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20928: Fix small typo in configuration doc
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20928 That's fine, but would you mind taking a look for other typos while we are here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20927: [SPARK-23809][SQL] Active SparkSession should be set by ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20927 **[Test build #88681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88681/testReport)** for PR 20927 at commit [`8f3cbf3`](https://github.com/apache/spark/commit/8f3cbf3399420a14f5ebe74b99b2739437fe3647). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20926: [SPARK-23808][SQL] Set default Spark session in test-onl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20926 **[Test build #88683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88683/testReport)** for PR 20926 at commit [`d0988f7`](https://github.com/apache/spark/commit/d0988f7378152b576844c4ae11b1761fa9a3bde2). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1815/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20929: [SPARK-23772][SQL][WIP] Provide an option to igno...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/20929#discussion_r177925982

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala ---
@@ -624,6 +624,42 @@ class FileStreamSourceSuite extends FileStreamSourceTest {
     }
   }
 
+  test("SPARK-23772 Ignore column of all null values or empty array during JSON schema inference") {
--- End diff --

@mengxr Does this test match the intention you described in the JIRA? (I just want to confirm this before I brush up the code.)

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
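For reference, a usage sketch of how such a schema-inference option might be exposed on the JSON reader; the option name `dropFieldIfAllNull` and the input path are assumptions for illustration, not a confirmed API of this WIP change:

```scala
// Hypothetical sketch -- the option name and path are assumptions, not confirmed API.
val people = spark.read
  .option("dropFieldIfAllNull", "true")  // ignore fields whose values are always null or empty arrays
  .json("/path/to/people.json")
people.printSchema()  // the inferred schema would omit the dropped fields
```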