Yeah, it is for the read side only. I think for the write side, implementations
can provide options to let users set partitioning/ordering, or
the data source has a natural partitioning/ordering that doesn't require
any interface.
On Mon, Mar 26, 2018 at 7:59 PM, Patrick Woody wrote:
Hey Ryan, Ted, Wenchen
Thanks for the quick replies.
@Ryan - the sorting portion makes sense, but I think we'd have to ensure
something similar to requiredChildDistribution in SparkPlan, where we have
the number of partitions as well, if we'd want to further report to
SupportsReportPartitioning, yeah?
Hmm. Ryan seems to be right.
Looking at
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/SupportsReportPartitioning.java:
import org.apache.spark.sql.sources.v2.reader.partitioning.Partitioning;
...
Partitioning outputPartitioning();
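For context, a minimal, self-contained sketch of how a read-side source might report its layout. The Distribution/Partitioning types below are simplified stand-ins for the ones under org.apache.spark.sql.sources.v2.reader.partitioning, and the "user_id" hash layout is hypothetical:

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for Spark's Distribution marker interface.
interface Distribution { }

// Stand-in for ClusteredDistribution: rows sharing these columns are co-located.
class ClusteredDistribution implements Distribution {
    final List<String> clusteredColumns;
    ClusteredDistribution(List<String> cols) { this.clusteredColumns = cols; }
}

// Mirrors the shape of the real Partitioning interface.
interface Partitioning {
    int numPartitions();
    boolean satisfy(Distribution distribution);
}

// A source hash-partitioned on "user_id" into 8 splits could report:
class UserIdHashPartitioning implements Partitioning {
    @Override public int numPartitions() { return 8; }

    @Override public boolean satisfy(Distribution d) {
        // Satisfied when Spark only needs co-location by user_id,
        // letting the planner skip a shuffle.
        return d instanceof ClusteredDistribution
            && ((ClusteredDistribution) d).clusteredColumns.contains("user_id");
    }
}

public class Main {
    public static void main(String[] args) {
        Partitioning p = new UserIdHashPartitioning();
        System.out.println(p.numPartitions()); // prints 8
        System.out.println(p.satisfy(
            new ClusteredDistribution(Arrays.asList("user_id")))); // prints true
    }
}
```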
On Mon, Mar 26, 2018 at 6:58 PM, Wenchen Fan wrote:
Wenchen, I thought SupportsReportPartitioning was for the read side. It
works with the write side as well?
On Mon, Mar 26, 2018 at 6:58 PM, Wenchen Fan wrote:
Actually, clustering is already supported; please take a look at
SupportsReportPartitioning.
Ordering is not proposed yet; it might be similar to what Ryan proposed.
On Mon, Mar 26, 2018 at 6:11 PM, Ted Yu wrote:
Interesting.
Should requiredClustering return a Set of Expressions?
This way, we can determine the order of the Expressions by looking at
what requiredOrdering() returns.
On Mon, Mar 26, 2018 at 5:45 PM, Ryan Blue wrote:
Hi Pat,
Thanks for starting the discussion on this, we’re really interested in it
as well. I don’t think there is a proposed API yet, but I was thinking
something like this:
interface RequiresClustering {
  List<Expression> requiredClustering();
}

interface RequiresSort {
  List<SortOrder> requiredOrdering();
}
The r
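To make the shape of the proposal concrete, here is a self-contained sketch of how a sink might implement those interfaces. Plain strings stand in for Spark's Expression and SortOrder, and ExampleSink is a hypothetical writer that wants its input clustered by date and sorted by (date, id):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed contract, with strings standing in for
// Spark's Expression and SortOrder classes.
interface RequiresClustering {
    List<String> requiredClustering();
}

interface RequiresSort {
    List<String> requiredOrdering();
}

// Hypothetical sink: wants rows co-located by "date" and each
// partition sorted by ("date", "id") before writing.
class ExampleSink implements RequiresClustering, RequiresSort {
    @Override
    public List<String> requiredClustering() {
        return Arrays.asList("date");
    }

    @Override
    public List<String> requiredOrdering() {
        return Arrays.asList("date", "id");
    }
}

public class Main {
    public static void main(String[] args) {
        ExampleSink sink = new ExampleSink();
        // Spark's planner could read these and insert the needed
        // repartition + sort before handing rows to the writer.
        System.out.println(sink.requiredClustering()); // prints [date]
        System.out.println(sink.requiredOrdering());   // prints [date, id]
    }
}
```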
Hey all,
I saw in some of the discussions around DataSourceV2 writes that we might
have the data source inform Spark of requirements for the input data's
ordering and partitioning. Has there been a proposed API for that yet?
Even one level up, it would be helpful to understand how I should be
thinking about this.
Hi,
Using concat is one way,
but + is more intuitive and easier to understand.
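For readers following along, the difference in the DataFrame API looks roughly like this. A sketch assuming Spark's Java API and hypothetical column names; not runnable without a SparkSession:

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.concat;

// concat() joins string columns:
Dataset<Row> out = df.select(
    concat(col("first_name"), col("last_name")).as("full_name"));

// By contrast, col("first_name").plus(col("last_name")) resolves to an
// arithmetic Add, which casts the strings to numbers and produces null
// for non-numeric input rather than concatenating.
```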
1427357...@qq.com
From: Shmuel Blitz
Date: 2018-03-26 15:31
To: 1427357...@qq.com
CC: spark users; dev
Subject: Re: the issue about the + in column,can we support the string please?
Hi,
you can get the same result using concat.