I just looked at the PR. I think there is some follow-up work that needs to be
done, e.g. we shouldn't create a top-level package
org.apache.spark.sql.dynamicpruning.
On Wed, Oct 02, 2019 at 1:52 PM, Maryann Xue < maryann@databricks.com >
wrote:
>
> There is no internal write up, but I
There is no internal write up, but I think we should at least give some
up-to-date description on that JIRA entry.
On Wed, Oct 2, 2019 at 3:13 PM Reynold Xin wrote:
> No there is no separate write up internally.
>
> On Wed, Oct 2, 2019 at 12:29 PM Ryan Blue wrote:
>
>> Thanks for the pointers,
No there is no separate write up internally.
On Wed, Oct 2, 2019 at 12:29 PM Ryan Blue wrote:
> Thanks for the pointers, but what I'm looking for is information about the
> design of this implementation, like what requires this to be in spark-sql
> instead of spark-catalyst.
>
> Even a
The reason why it's in spark-sql is simply because HadoopFsRelation which
the rule tries to match is in spark-sql.
We should probably update the high-level description in the JIRA. I'll work
on that shortly.
On Wed, Oct 2, 2019 at 2:29 PM Ryan Blue wrote:
> Thanks for the pointers, but what
Thanks for the pointers, but what I'm looking for is information about the
design of this implementation, like what requires this to be in spark-sql
instead of spark-catalyst.
Even a high-level description, like what the optimizer rules are and what
they do would be great. Was there one written
> It lists 3 cases for how a filter is built, but nothing about the overall
> approach or design that helps when trying to find out where it should be
> placed in the optimizer rules.
The overall idea/design of DPP can be simply put as using the result of one
side of the join to prune partitions of a
Whoever created the JIRA years ago didn't describe DPP correctly, but the
linked JIRA in Hive was correct (which unfortunately is much more terse than
any of the patches we have in Spark:
https://issues.apache.org/jira/browse/HIVE-9152). Henry R's description was
also correct.
On Wed, Oct 02,
Where can I find a design doc for dynamic partition pruning that explains
how it works?
The JIRA issue, SPARK-11150, doesn't seem to describe dynamic partition
pruning (as pointed out by Henry R.) and doesn't have any comments about
the implementation's approach. And the PR description also
dynamic partition pruning rule generates "hidden" filters that will be
converted to real predicates at runtime, so it doesn't matter where we run
the rule.
For PruneFileSourcePartitions, I'm not quite sure. Seems to me it's better
to run it before join reorder.
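
For readers following the thread, here is a minimal sketch of the idea as described above: a filter on the partitioned side starts out as a placeholder and only becomes a real predicate at runtime, once the other side of the join has been evaluated. This is plain Python for illustration, not Spark code; the names `Partition`, `prune_partitions`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Partition:
    key: str                  # value of the partition column (hypothetical: a date string)
    rows: Tuple[Dict, ...]    # rows stored under that partition


def prune_partitions(
    fact_partitions: List[Partition],
    dim_rows: List[Dict],
    dim_predicate: Callable[[Dict], bool],
    join_key: str,
) -> List[Partition]:
    """Keep only the partitions that can match the filtered side of the join."""
    # Step 1: evaluate the filtered side of the join first.
    selected = [row for row in dim_rows if dim_predicate(row)]
    # Step 2: the "hidden" filter becomes a concrete set of join-key values.
    allowed_keys = {row[join_key] for row in selected}
    # Step 3: partitions whose key cannot match are never read.
    return [p for p in fact_partitions if p.key in allowed_keys]


# Hypothetical data: a fact table partitioned by date, and a small date dimension.
fact = [
    Partition("2019-10-01", ({"date": "2019-10-01", "amount": 10},)),
    Partition("2019-10-02", ({"date": "2019-10-02", "amount": 20},)),
    Partition("2019-10-03", ({"date": "2019-10-03", "amount": 30},)),
]
dim = [
    {"date": "2019-10-01", "holiday": False},
    {"date": "2019-10-02", "holiday": True},
    {"date": "2019-10-03", "holiday": False},
]

kept = prune_partitions(fact, dim, lambda row: row["holiday"], "date")
kept_keys = [p.key for p in kept]
```

Because the set of allowed keys is only known after the filtered side has run, the sketch is consistent with the point that the placement of the rule among other optimizer rules matters little: the placeholder carries no concrete values until execution.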
On Sun, Sep 29, 2019 at 5:51 AM
Hi everyone,
I have been working on a PR that moves filter and projection pushdown into
the optimizer for DSv2, instead of when converting to physical plan. This
will make DSv2 work with optimizer rules that depend on stats, like join
reordering.
While adding the optimizer rule, I found that