Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166394747
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166386661
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, this is a single commit on purpose because predicate push-down
makes plan changes. I think it's best to do these at once to avoid unnecessary
work. That's why I started looking more
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20490#discussion_r166381800
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala
---
@@ -117,20 +118,43 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20490#discussion_r166360278
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala
---
@@ -117,20 +118,43 @@ object
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20495#discussion_r166358420
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
---
@@ -1655,15 +1655,17 @@ case class Left(str
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166165255
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166165005
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20490
I've updated this to no longer require #20387. It wasn't relying on those
changes at all. @gatorsmile, @cloud-fan, what do you think about getting this
into 2.3.0
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166081367
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> For safety, I wanna keep it unchanged, and start something new for data
source v2 only.
I disagree.
* **#20476 addresses a bug caused by the new implementat
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r166030958
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,17 +17,151 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> Why pushdown is happening in logical optimization and not during query
planning. My first instinct would be to have the optimizer get operators as
close to the leaves as possible and then f
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20490
@dongjoon-hyun, @cloud-fan, @gatorsmile. Once the immutable plan PR is in,
this can be reviewed.
@steveloughran, I think this is what you were asking
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/20490
[SPARK-23323][SQL]: Add support for commit coordinator for DataSourceV2
writes
## What changes were proposed in this pull request?
DataSourceV2 batch writes should use the output commit
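A minimal sketch of the mechanism, not the PR's exact code: before a task's `DataWriter` commits, the driver-side coordinator is asked whether this attempt may commit, so exactly one attempt per partition wins. The helper name and error text are illustrative, and the three-argument `canCommit` matches the 2.3-era coordinator, which is `private[spark]` and only reachable from Spark-internal code.
```scala
import org.apache.spark.{SparkEnv, TaskContext}
import org.apache.spark.sql.sources.v2.writer.{DataWriter, WriterCommitMessage}

def commitWithCoordinator(
    context: TaskContext,
    dataWriter: DataWriter[_]): WriterCommitMessage = {
  val coordinator = SparkEnv.get.outputCommitCoordinator
  val authorized = coordinator.canCommit(
    context.stageId(), context.partitionId(), context.attemptNumber())
  if (authorized) {
    dataWriter.commit()  // this attempt wins; return its commit message
  } else {
    dataWriter.abort()   // another attempt was authorized; clean up
    throw new RuntimeException("commit denied for this task attempt")
  }
}
```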
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/20488
[SPARK-23321][SQL]: Validate datasource v2 writes
## What changes were proposed in this pull request?
DataSourceV2 does not currently apply any validation rules when writing.
Other write
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, @dongjoon-hyun, @gatorsmile, I've rebased this on master and
removed the support for SPARK-23204 that parses table identifiers. If you need
other changes to get this in, let me know
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20477#discussion_r165728696
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala
---
@@ -36,11 +38,14 @@ import
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20485
Sounds fine to me, then.
My focus is on the long-term design issues. I still think that the changes
to make plans immutable and to use the existing push-down code as much as
possible
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, I'll update this PR and we can talk about passing configuration
on the dev list.
And as a reminder, please close #20445
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> I tried and can't figure out how to do it with PhysicalOperation, that's
why I built something new for data source v2 pushdown.
The problem is that we should get DSv2 working independen
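For context, `PhysicalOperation` is the existing catalyst extractor that collapses the projections and filters sitting above a leaf node. A minimal sketch of reusing it for a v2 relation, assuming only the extractor's documented behavior (the function name is illustrative):
```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression}
import org.apache.spark.sql.catalyst.planning.PhysicalOperation
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation

// Collect everything push-down needs in one pattern match, rather than
// with a new v2-only plan traversal.
def collectPushdown(plan: LogicalPlan)
    : Option[(Seq[NamedExpression], Seq[Expression], DataSourceV2Relation)] =
  plan match {
    case PhysicalOperation(projects, filters, relation: DataSourceV2Relation) =>
      Some((projects, filters, relation))  // projection list, predicates, leaf
    case _ => None
  }
```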
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20485
To be clear, the purpose of this commit, like #20476, is just to get
something working for the 2.3.0 release?
I just want to make sure since I think we should be approaching these
problems
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
Yeah, I did review it, but at the time I wasn't familiar with how the other
code paths worked and assumed that it was necessary to introduce this. I wasn't
very familiar with how it *should* work
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
@gatorsmile, do you mean this?
> Extensibility is not good and operator push-down capabilities are limited.
If so, that's very open to interpretation. I would assume it me
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
@gatorsmile, thanks for the context. If we need to redesign push-down, then
I think we should do that separately and with a design plan.
**I don't think it's a good idea to bundle
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20466
+1
Good to get this in before changes to the relation.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
@cloud-fan, @gatorsmile, this PR demonstrates why we should use
PhysicalOperation. I ported the tests from this PR over to our branch and they
pass without modifying the push-down code. That's
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@dongjoon-hyun, @gatorsmile, could you guys weigh in on some of this
discussion? I'd like to get additional perspectives on the changes I'm
proposing
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> Let's keep it general and let the data source interpret it.
I think this is the wrong approach. The reason why we are using a special
`DataSourceOptions` object is to ensure that d
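The guarantee is easy to show: `DataSourceOptions` is a case-insensitive string map, so every implementation sees option keys the same way. A small sketch with illustrative values:
```scala
import scala.collection.JavaConverters._
import org.apache.spark.sql.sources.v2.DataSourceOptions

val options = new DataSourceOptions(
  Map("Table" -> "db.t", "PATH" -> "/data/t").asJava)

options.get("table").get()  // "db.t"    -- key lookup is case-insensitive
options.get("path").get()   // "/data/t"
```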
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, to your point about push-down order, I'm not saying that order
doesn't matter at all, I'm saying that the push-down can run more than once and
it should push the closest operators
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> `spark.read.format("iceberg").table("db.table").load()`
I'm fine with this if you think it is confusing to parse the path as a
table name in load. I think it is r
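For contrast, the two spellings under discussion (the quoted `table(...).load()` chain is the proposal, not an existing API; the `option("table", ...)` form uses plain options plumbing):
```scala
// explicit table option, so load() keeps meaning "path"
spark.read.format("iceberg").option("table", "db.table").load()

// alternative: the source interprets the load() path as a table name
spark.read.format("iceberg").load("db.table")
```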
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@felixcheung, yes, we do already have a `table` option. That creates an
`UnresolvedRelation` with the parsed table name as a `TableIdentifier`, which
is not currently compatible with `DataSourceV2
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20454
+1
I'd rather not add features without a known use case, but this
implementation looks good to me.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20454#discussion_r165141464
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java
---
@@ -62,6 +62,15 @@
*/
DataWriterFactory
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165138574
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20448
@cloud-fan, **please close this PR**. There is already a pull request for
these changes, #20387, and ongoing discussion there.
If you want the proposed implementation to change, please ask
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20448#discussion_r165132718
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,36 +17,84 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165131292
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165121965
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165119560
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java
---
@@ -63,32 +68,42 @@
DataWriterFactory
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165119427
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r165117779
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20386
> I assume this API is necessary ... it sounds reasonable to provide a
callback for task commit.
I agree it sounds reasonable, but we shouldn't add methods to a new API
blin
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20386
@gengliangwang, what is the use case supported by this? In other words, how
is `onTaskCommit(taskCommit: TaskCommitMessage)` currently used that requires
this change?
In general, I'm more
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164909970
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java
---
@@ -63,32 +68,42 @@
DataWriterFactory
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164909653
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164909225
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164908529
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceWriter.java
---
@@ -63,32 +68,42 @@
DataWriterFactory
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164907626
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20386#discussion_r164907397
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/streaming/writer/StreamWriter.java
---
@@ -32,40 +32,44
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20386
@cloud-fan, is the intent to get this into 2.3.0? If so, I'll make time to
review it today.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20427#discussion_r164807449
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
---
@@ -25,7 +25,7 @@
* session
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20427#discussion_r164806676
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
---
@@ -25,7 +25,7 @@
* session
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20427#discussion_r164621094
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
---
@@ -25,7 +25,7 @@
* session
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> It's hard to improve PhysicalOperation to support more operators and
specific push down orders, so I created the new one
I'm concerned about the new one. The projection support se
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> [The push-down rule may be run more than once if filters are not pushed
through projections] looks weird, do you have a query to reproduce this issue?
One of the DataSourceV2 tests
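A sketch of the plan shape in question, assuming a hypothetical v2 source registered as "src": the filter starts out above the projection, so a single application of the push-down rule is not guaranteed to push both.
```scala
import org.apache.spark.sql.functions.col

val df = spark.read.format("src").load()
  .select(col("i"))
  .filter(col("i") > 6)
// logical plan: Filter(i > 6, Project(i, DataSourceV2Relation)) -- the filter
// reaches the relation only after it moves below the project, which can take
// a second run of the push-down rule.
```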
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> I'd suggest that we just propagate the paths parameter to options, and
data source implementations are free to interpret the path option to whatever
they want, e.g. table and database na
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
> I'm ok to make it immutable if there is a significant benefit.
Mutable nodes violate a basic assumption of catalyst, that trees are
immutable. Here's a good quote from the SIGMOD pa
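The invariant is simple to state in code: catalyst rules rewrite by returning new nodes, never by mutating in place. A simplified version of predicate push-through (the real rule also rewrites the condition through aliases):
```scala
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}

def pushFilterThroughProject(plan: LogicalPlan): LogicalPlan = plan transform {
  case Filter(condition, Project(fields, child)) =>
    Project(fields, Filter(condition, child))  // new nodes; input untouched
}
```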
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20427#discussion_r164544288
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
---
@@ -25,7 +25,7 @@
* session
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20427#discussion_r164543216
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java ---
@@ -18,23 +18,23 @@
package org.apache.spark.sql.sources.v2
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20397
This is more confusing, not less. Look at @jiangxb1987's comment above: "We
shall create only one DataReaderFactory, and have that create multiple data
readers." It is not clear why the AP
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20397
> I think the renaming is worth it to remove future confusion.
What future confusion?
I understand that the difference isn't obvious, but making the names less
accurate isn't a g
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20397
One last point: should significant changes to public APIs like this go in
just before or just after a release? 2.3.0 candidates have used ReadTask up to
now
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20397
@cloud-fan, thanks for pinging me on this.
-1: I don't think there's a compelling benefit to justify this change, and
I think it makes the API more confusing. I think we should revert
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r164169060
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,15 +17,149 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r163913161
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,15 +17,149 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r163909751
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,15 +17,149 @@
package
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20387#discussion_r163908263
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -17,15 +17,149 @@
package
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, please have a look at these changes. This will require
follow-up for the Streaming side. I have yet to review the streaming interfaces
for `DataSourceV2`, so I haven't made any changes
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/20387
SPARK-22386: DataSourceV2: Use immutable logical plans.
## What changes were proposed in this pull request?
DataSourceV2 should use immutable catalyst trees instead of wrapping a
mutable
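Roughly the shape the PR argues for, with approximate field names: the relation is a plain case class, and push-down produces a new node via `copy` instead of configuring a reader stored inside the plan.
```scala
import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, Expression}
import org.apache.spark.sql.catalyst.plans.logical.LeafNode
import org.apache.spark.sql.sources.v2.DataSourceV2

case class DataSourceV2Relation(
    source: DataSourceV2,
    options: Map[String, String],
    projection: Option[Seq[AttributeReference]] = None,
    filters: Option[Seq[Expression]] = None) extends LeafNode {
  // output derives from immutable state; a reader is configured on demand
  override def output: Seq[Attribute] = projection.getOrElse(Nil)
}

// push-down rewrites the plan instead of mutating it, e.g.
// relation.copy(filters = Some(pushedFilters))
```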
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19861
Thanks for the example, @cloud-fan.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19861
@jiangxb1987, I understand what this does. I just wanted an example use
case where it was necessary. What was the motivating *use case
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19861
@jiangxb1987, @cloud-fan, what was the use case you needed to add this for?
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20201
@cloud-fan, please ping me to review PRs for DataSourceV2. Our new table
format uses it and we're preparing some changes, so I want to make sure we're
heading in the same direction
Github user rdblue closed the pull request at:
https://github.com/apache/spark/pull/19568
GitHub user rdblue reopened a pull request:
https://github.com/apache/spark/pull/19568
SPARK-22345: Fix sort-merge joins with conditions and codegen.
## What changes were proposed in this pull request?
This adds a joined row to sort-merge join codegen. That joined row
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19568
@gatorsmile, I think it would be better to fix codegen than to prevent it
from happening with an assertion. If `CodegenFallback` can produce fallback
code, why not allow it to when necessary
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/incubator-toree/pull/147#discussion_r152119598
--- Diff:
kernel/src/main/scala/org/apache/toree/boot/CommandLineOptions.scala ---
@@ -193,6 +199,19 @@ class CommandLineOptions(args: Seq[String
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19623#discussion_r148875607
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
---
@@ -50,28 +53,34 @@
/**
* Creates
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19623#discussion_r148849054
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/ReadTask.java ---
@@ -36,14 +36,24 @@
/**
* The preferred locations
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19623#discussion_r148848790
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/DataReader.java
---
@@ -34,11 +35,17 @@
/**
* Proceed to next
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19623#discussion_r148848619
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java ---
@@ -30,6 +30,9 @@
/**
* Creates a {@link
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19568
@DonnyZone, I don't know of any cases that use codegen after the fix for
`CodegenFallback`, but I think this is still a good idea.
If Spark is going to generate code, it should generate
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/13206
We still maintain a version of this for our Spark builds to avoid an extra
sort in Hive. If someone is willing to review it, I can probably find the time
to rebase it on master. I think the year
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r147471530
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
---
@@ -124,7 +125,8 @@ class InnerJoinSuite extends
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r147471054
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
---
@@ -228,6 +230,27 @@ class InnerJoinSuite extends
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r14733
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -585,21 +585,26 @@ case class SortMergeJoinExec
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146975643
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -585,21 +585,26 @@ case class SortMergeJoinExec
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146974005
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -585,21 +585,26 @@ case class SortMergeJoinExec
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146970055
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -615,6 +620,7 @@ case class SortMergeJoinExec
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/17813#discussion_r146939350
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -589,8 +605,18 @@ class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146928894
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -585,21 +585,26 @@ case class SortMergeJoinExec
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19568#discussion_r146928318
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala
---
@@ -615,6 +620,7 @@ case class SortMergeJoinExec
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19568
@dongjoon-hyun, yes, I'm currently working on it. I just wanted to get the
rest up.
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/19568
SPARK-22345: Fix sort-merge joins with conditions and codegen.
## What changes were proposed in this pull request?
This adds a joined row to sort-merge join codegen. That joined row is used
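A sketch of the mechanism with illustrative variable names: `JoinedRow` presents a left and a right row as a single row, so a join condition that falls back to interpreted evaluation can still run inside the generated merge loop.
```scala
import org.apache.spark.sql.catalyst.expressions.JoinedRow

val joined = new JoinedRow()
// inside the merge loop, for each candidate left/right pair:
joined(leftRow, rightRow)                  // re-points the wrapper; no copying
val keep = condition.eval(joined) == true  // interpreted fallback sees one row
```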
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19448
I have a lot of sympathy for the argument that infrastructure software
shouldn't have too many backports and that those should generally be bug fixes.
But, if I were working on a Spark distribution
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144388430
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/19448
Still +1 from me as well.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19448#discussion_r144331909
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
---
@@ -138,6 +138,10 @@ class
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/19269#discussion_r144097061
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/DataSourceV2Writer.java
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache