Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23275#discussion_r240234583
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
---
@@ -88,68 +88,49 @@ sealed trait UserDefinedFunction
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23228
thanks, merging to master!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23275#discussion_r240231883
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -4255,11 +4255,11 @@ object functions {
*
* @group udf_funcs
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23275
cc @maryannxue @gatorsmile @srowen
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23275#discussion_r240230970
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUDF.scala
---
@@ -47,25 +47,13 @@ case class ScalaUDF
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23275
[SPARK-26323][SQL] Scala UDF should still check input types even if some
inputs are of type Any
## What changes were proposed in this pull request?
For Scala UDF, when checking input
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240189245
--- Diff:
core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java
---
@@ -667,4 +669,54 @@ public void testPeakMemoryUsed
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23251
cc @zsxwing
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23262
LGTM, can you update the PR title and description?
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23262#discussion_r240180713
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
---
@@ -416,7 +416,12 @@ case class
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23272
have you seen any bug reports caused by this deadlock?
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23272#discussion_r240178993
--- Diff:
core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java ---
@@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23204#discussion_r240104812
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ---
@@ -213,10 +213,6 @@ trait HashJoin {
s
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23255
thanks, merging to master/2.4/2.3!
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23211
to make the PR smaller, can we add an individual rule
`PushdownLeftSemiOrAntiJoin` first?
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23211#discussion_r240097479
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -984,6 +1002,28 @@ object PushDownPredicate extends
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23211#discussion_r240097255
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -649,13 +664,16 @@ object CollapseProject extends
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23204
can we follow
https://github.com/apache/spark/pull/23204#issuecomment-445510026 and create a
new ticket
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23211#discussion_r240092936
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
---
@@ -267,6 +267,17 @@ object ScalarSubquery
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23248
If it's fine for 2.4, I think it's also fine for master as a temporary fix?
We can create another ticket to clean up the subquery optimization hack. IIUC
https://github.com/apache/spark/pull
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23258#discussion_r240090371
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
---
@@ -182,10 +182,13 @@ class SQLMetricsSuite extends
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r240090192
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan closed the pull request at:
https://github.com/apache/spark/pull/23265
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23228
retest this please
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23204
If we can quickly finish #23214 (within several days), let's go for it. But
if we can't, I'd suggest we do the partial revert first to fix the perf
regression, and add back the metrics later
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23228
LGTM, cc @jiangxb1987
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23228#discussion_r240036698
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala ---
@@ -33,10 +33,10 @@ import org.apache.spark.shuffle
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23253
LGTM except for a code style comment
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23253#discussion_r240036498
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala
---
@@ -347,17 +347,28 @@ class JacksonParser
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23253#discussion_r240036489
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala
---
@@ -347,17 +347,28 @@ class JacksonParser
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r240036225
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23208
Let's move the high level discussion to the doc:
https://docs.google.com/document/d/1vI26UEuDpVuOjWw4WPoH2T6y8WAekwtI7qoowhOFnI4/edit?usp=sharing
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23266#discussion_r240029373
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchRead.java
---
@@ -20,14 +20,27 @@
import
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r240028574
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,52 +17,49 @@
package
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r240028515
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23266
[SPARK-26313][SQL] move read related methods from Table to read related
mix-in traits
## What changes were proposed in this pull request?
As discussed in https://github.com/apache/spark
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23266
cc @rdblue @HyukjinKwon @gatorsmile
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23265
retest this please
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23259
thanks @maropu for starting it!
> Which SQL standard does Spark SQL follow (e.g., 2011 or 2016)?
I think SQL 2011 is good, but if we can't find a public version, maybe it's
also
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23258#discussion_r240026727
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
---
@@ -182,10 +182,13 @@ class SQLMetricsSuite extends
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23249#discussion_r240026485
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -118,10 +115,12 @@ case class
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23255
LGTM
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23255#discussion_r240026441
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala ---
@@ -752,6 +752,17 @@ class InsertSuite extends QueryTest
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23262#discussion_r240026394
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -53,7 +53,7 @@ object RDDConversions
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23262#discussion_r240026388
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -33,7 +33,7 @@ object RDDConversions
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23248#discussion_r240026330
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -131,8 +131,20 @@ object ExtractPythonUDFs
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23253#discussion_r240026245
--- Diff: docs/sql-migration-guide-upgrade.md ---
@@ -35,7 +35,9 @@ displayTitle: Spark SQL Upgrading Guide
- Since Spark 3.0, CSV datasource
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23253#discussion_r240026237
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/TestJsonData.scala
---
@@ -229,6 +229,11 @@ private[json] trait
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23204
+1
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23265
[2.4][SPARK-26021][SQL][FOLLOWUP] only deal with NaN and -0.0 in
UnsafeWriter
backport https://github.com/apache/spark/pull/23239 to 2.4
-
## What changes were proposed
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23265
cc @dongjoon-hyun
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r240022552
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23207
thanks, merging to master!
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23204
according to
https://github.com/apache/spark/pull/23214#issuecomment-443999282 , the hash
join metrics are wrongly implemented. I think it's fine to revert them and
re-implement them later
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23249
retest this please
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22104#discussion_r239738437
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala ---
@@ -31,7 +31,8 @@ class SparkOptimizer(
override
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239736660
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
---
@@ -78,6 +80,7 @@ object SQLMetrics {
private val
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239735814
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
---
@@ -78,6 +80,7 @@ object SQLMetrics {
private val
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239735425
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
---
@@ -78,6 +80,7 @@ object SQLMetrics {
private val
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239735015
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239734920
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22514#discussion_r239733875
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala
---
@@ -95,9 +77,116 @@ case class
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23239
I checked the original PR that handles NaN:
https://github.com/apache/spark/commit/c032b0bf92130dc4facb003f0deaeb1228aefded
It didn't add end-to-end tests, so I added 2 new tests
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23239
Yes it is. `UnsafeProjection` always normalizes NaN and -0.0, and Spark uses
`UnsafeProjection` to produce output. So users can't distinguish them
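To illustrate what "normalize NaN and -0.0" means in the comment above, here is a minimal sketch in plain Java. It is not Spark's code: the class `NormalizeDoubleSketch` and the helper `normalize` are hypothetical names made up for this example, and the sketch only assumes that normalization maps every NaN bit pattern to the canonical `Double.NaN` and maps -0.0 to 0.0.

```java
class NormalizeDoubleSketch {
    // Hypothetical helper (an assumption for illustration, not Spark's API):
    // collapse all NaN bit patterns into the canonical NaN and turn -0.0 into 0.0.
    static double normalize(double v) {
        if (Double.isNaN(v)) {
            return Double.NaN;
        }
        // -0.0 == 0.0 is true under IEEE 754 comparison, so this branch
        // catches both zeros and always returns positive zero.
        if (v == -0.0d) {
            return 0.0d;
        }
        return v;
    }

    public static void main(String[] args) {
        double negZero = -0.0d;
        // A NaN carrying a non-canonical payload.
        double oddNaN = Double.longBitsToDouble(0x7ff8000000000001L);

        // Print the raw bit patterns after normalization.
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(normalize(negZero))));
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(normalize(oddNaN))));
    }
}
```

After normalization the two zeros share one bit pattern and so do all NaNs, which is why a consumer that only sees normalized output cannot tell the original values apart.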
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23249#discussion_r239690226
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -22,13 +22,12 @@ import
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r239687264
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r239687213
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23248#discussion_r239686156
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
---
@@ -131,8 +131,20 @@ object ExtractPythonUDFs
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23215
thanks, merging to master!
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23239
retest this please
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23249#discussion_r239684697
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -22,13 +22,12 @@ import
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239684490
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java
---
@@ -25,14 +25,14 @@
import
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23208
@rdblue I tried to add `WriteBuilder`, but there is a difference between
read and write:
1. for read, the `ScanBuilder` can collect a lot of information, like column
pruning, filter pushdown, etc
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239683592
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,52 +17,49 @@
package
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239682984
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -241,32 +241,28 @@ final class DataFrameWriter[T] private[sql](ds
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239682239
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java ---
@@ -25,7 +25,10 @@
* The base interface for v2 data sources
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23207
the code looks much cleaner now!
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239677846
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala
---
@@ -78,6 +80,7 @@ object SQLMetrics {
private val
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239677653
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
---
@@ -333,8 +343,19 @@ object ShuffleExchangeExec
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239677477
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/ShuffleWriterProcessor.scala ---
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23207#discussion_r239677325
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/ShuffleWriterProcessor.scala ---
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23244
thanks, merging to master!
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23244#discussion_r239675382
--- Diff:
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -209,23 +205,14 @@ public BytesToBytesMap
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r239539848
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23201#discussion_r239534668
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
---
@@ -121,7 +122,26 @@ private[sql] class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23249#discussion_r239508488
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -118,10 +116,13 @@ case class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23249#discussion_r239508437
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
---
@@ -118,10 +116,13 @@ case class
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23239#discussion_r239507673
--- Diff:
sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java
---
@@ -198,11 +198,45 @@ protected final void
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23249
cc @maryannxue @hvanhovell @gatorsmile @viirya
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23249
[SPARK-26297][SQL] improve the doc of Distribution/Partitioning
## What changes were proposed in this pull request?
Some documents of `Distribution/Partitioning` are stale and misleading
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23248
retest this please
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23208#discussion_r239469368
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SupportsBatchWrite.java
---
@@ -25,14 +25,14 @@
import
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23215#discussion_r239453312
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1396,6 +1396,16 @@ object SQLConf {
.booleanConf
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23215#discussion_r239453026
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala
---
@@ -95,6 +95,31 @@ class FileIndexSuite extends
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23213
these 3 combinations LGTM.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23244#discussion_r239451274
--- Diff:
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java ---
@@ -209,23 +205,14 @@ public BytesToBytesMap
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23248
cc @icexelloss @HyukjinKwon @ueshin @viirya @gatorsmile
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23248#discussion_r239430315
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala
---
@@ -60,8 +60,12 @@ private class BatchIterator[T
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/23248#discussion_r239430084
--- Diff: python/pyspark/sql/tests/test_udf.py ---
@@ -23,7 +23,7 @@
from pyspark import SparkContext
from pyspark.sql import SparkSession
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/23248
[SPARK-26293][SQL] Cast exception when having python udf in subquery
## What changes were proposed in this pull request?
This is a regression introduced by
https://github.com/apache