Re: Row Lineage - implementation advice

2017-02-18 Thread jordan.halter...@gmail.com
two columns in one table. > On Feb 18, 2017, at 7:49 PM, Sarnath K wrote: > > Just curious...How can a column have multiple origins? Join key type > scenarios where they have the same value regardless of where they originate > from? > > On Feb 19, 2017 09:18, "

Re: Row Lineage - implementation advice

2017-02-18 Thread jordan.halter...@gmail.com
You can often get the origin of a column via RelMetadataQuery.getColumnOrigin(), but keep in mind columns can have multiple origins or no origin at all. > On Feb 18, 2017, at 5:47 PM, barry squire wrote: > > Hi everyone, > > Calcite's SQL parsing, planning and execution using the enumerator

Re: Calcite vs Catalyst

2017-02-16 Thread jordan.halter...@gmail.com
Calcite differs from Catalyst in many ways. First of all, Catalyst is essentially a heuristic optimizer, while Calcite optimizers often combine heuristics and cost-based optimization. Catalyst pushes down predicates and projections to most data sources, while Calcite can often push down full qu

Re: Spark and Calcite

2017-01-21 Thread jordan.halter...@gmail.com
Fri, Jan 20, 2017 at 4:24 PM, Jacques Nadeau wrote: >> Jordan, super interesting work you've shared. It would be very cool to get >> this incorporated back into Spark mainline. That would continue to broaden >> Calcite's reach :) >> >> On Fri, Jan 20, 2017

Re: Spark and Calcite

2017-01-20 Thread jordan.halter...@gmail.com
So, AFAIK the Spark adapter that's inside Calcite is in an unusable state right now. It's still using Spark 1.x and last time I tried it I couldn't get it to run. It probably needs to either be removed or completely rewritten. But I can certainly offer some guidance on working with Spark and Cal

Re: ApacheCon CFP closing soon (11 February)

2017-01-20 Thread jordan.halter...@gmail.com
Riccardo Tommasini > Master Degree Computer Science > PhD Student at Politecnico di Milano (Italy) > streamreasoning.org<http://streamreasoning.org/> > > Submitted from an iPhone, I apologise for typos. > > From: jordan.halter...@gmail.com > <mailto:jordan.

Re: ApacheCon CFP closing soon (11 February)

2017-01-20 Thread jordan.halter...@gmail.com
k listed at > http://calcite.apache.org/community/#upcoming-talks. > > On Thu, Jan 19, 2017 at 12:18 PM, jordan.halter...@gmail.com > wrote: >> I will be submitting a proposal for a Calcite talk. Also doing one other >> Calcite related talk at Strata and submitting anothe

Re: ApacheCon CFP closing soon (11 February)

2017-01-19 Thread jordan.halter...@gmail.com
I will be submitting a proposal for a Calcite talk. Also doing one other Calcite related talk at Strata and submitting another for Spark Summit on optimizing Spark SQL queries with Calcite. > On Jan 18, 2017, at 11:46 AM, Julian Hyde wrote: > > There was 1 Calcite talk at ApacheCon: Big Data N

Re: Rule to expand the JdbcTableScan expression

2016-11-30 Thread jordan.halter...@gmail.com
Which planner are you using? If the rule is being fired, what you may be missing is that the cost of the converted expression is more than the cost of the input expression, resulting in the VolcanoPlanner throwing out the converted expression. You should use the HepPlanner for this. > On Nov 2
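The difference described above can be sketched in plain Java. This is only an illustration of the idea, not Calcite code: the Plan class, the method names and the cost numbers are all made up for the example. A cost-based planner keeps the original and the converted expression side by side and picks the cheaper one, while a heuristic planner simply keeps whatever the rule produced.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PlannerChoiceSketch {
    // Hypothetical stand-in for one plan alternative; not a Calcite class.
    static final class Plan {
        final String name;
        final double cost;

        Plan(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    // Cost-based choice: all equivalent alternatives compete, the cheapest wins.
    static Plan costBased(List<Plan> equivalents) {
        return equivalents.stream()
            .min(Comparator.comparingDouble((Plan p) -> p.cost))
            .orElseThrow(IllegalStateException::new);
    }

    // Heuristic choice: once a rule fires, its output replaces the original.
    static Plan heuristic(Plan original, Plan rewritten) {
        return rewritten;
    }

    public static void main(String[] args) {
        // Assumed costs, chosen so that the converted expression is more expensive.
        Plan original = new Plan("original expression", 100.0);
        Plan converted = new Plan("converted expression", 150.0);

        System.out.println(costBased(Arrays.asList(original, converted)).name);
        System.out.println(heuristic(original, converted).name);
    }
}
```

With these assumed costs the cost-based choice keeps the original expression even though the rule fired, which matches the symptom described in the message.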

Re: RFC: Adding Double Colon Cast Syntax

2016-09-27 Thread jordan.halter...@gmail.com
We added the double-colon syntax to our own fork of the Calcite grammar to placate our analysts and their addiction to Redshift. TBH it was not easy, and our implementation still doesn't support things like casting from a scalar subquery. Essentially, you can cast identifiers and function result

Re: Using Metadata in Query Optimization

2016-09-26 Thread jordan.halter...@gmail.com
I think you should just be able to override getStatistic() in your table implementations and return a Statistic object that has an accurate row count. The table scan should compute its cost from that; it uses 100d as the default row count, IIRC. > On Sep 25, 2016, at 1:56 PM, Γιώργος Θεοδωράκης > wrote:

Re: How to create a cost-optimized plan from the Framework

2016-09-20 Thread jordan.halter...@gmail.com
You got most of the way there, but to optimize the plan you need to add programs to your framework configuration. See the programs() method of the framework config. A Program is essentially a RelOptPlanner and a set of rules to apply. You can add several Programs to your Planner by using the va

Re: is there any tool class to generate RexNodes from a SQL command?

2016-09-14 Thread jordan.halter...@gmail.com
>> On 9/15/16, 1:07 AM, "jordan.halter...@gmail.com" wrote: >> >> You have to run planner.validate after parse, otherwise the state

Re: is there any tool class to generate RexNodes from a SQL command?

2016-09-14 Thread jordan.halter...@gmail.com
You have to run planner.validate after parse, otherwise the state in PlannerImpl will be incorrect. You can also go into the PlannerImpl and steal some code if you need to circumvent those states, but I agree this is probably the easiest way to go about it. The alternative is just creating a par

Re: Difference between sqlnode and relnode and rexnode

2016-08-16 Thread jordan.halter...@gmail.com
SqlNode is the abstract syntax tree that represents the actual structure of the query a user input. When a query is first parsed, it's parsed into a SqlNode. For example, a SELECT query will be parsed into a SqlSelect with a list of fields, a table, a join, etc. Calcite is also capable of genera

Re: Question on partial filter push down

2016-08-03 Thread jordan.halter...@gmail.com
ge predicates if the partition > key is already restricted by an equality predicate and the range predicate > is part of the clustering key. > > Cheers, > -- > Michael Mior > michael.m...@gmail.com > > 2016-08-03 14:21 GMT-04:00 jordan.halter...@gmail.com < > jordan.

Re: Question on partial filter push down

2016-08-03 Thread jordan.halter...@gmail.com
It's just a matter of splitting out equality predicates. What I would do is create a Filter rule that splits the filter based on whether a predicate is an equality. If the Filter is split, the rule returns a Filter with the equality predicates as inputs to the non-equality predicates. That shoul
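The splitting step suggested above can be sketched in plain Java. This is a rough illustration under stated assumptions rather than a Calcite rule: Predicate and Split are hypothetical stand-ins for the conjuncts of a filter condition, and the column names are invented. In a real rule, the equality group would become the condition of the inner Filter and the remaining group the condition of the outer Filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterSplitSketch {
    // Hypothetical stand-in for one conjunct of a filter condition; not a Calcite class.
    static final class Predicate {
        final String column;
        final String op;
        final String value;

        Predicate(String column, String op, String value) {
            this.column = column;
            this.op = op;
            this.value = value;
        }

        boolean isEquality() {
            return "=".equals(op);
        }

        @Override
        public String toString() {
            return column + " " + op + " " + value;
        }
    }

    // Result of the split: equality conjuncts and everything else.
    static final class Split {
        final List<Predicate> equalities = new ArrayList<>();
        final List<Predicate> others = new ArrayList<>();
    }

    // Partition the conjuncts of an AND condition, preserving their order.
    static Split split(List<Predicate> conjuncts) {
        Split result = new Split();
        for (Predicate p : conjuncts) {
            if (p.isEquality()) {
                result.equalities.add(p);
            } else {
                result.others.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented condition: pk = 1 AND ck > 5 AND ck < 9
        Split s = split(Arrays.asList(
            new Predicate("pk", "=", "1"),
            new Predicate("ck", ">", "5"),
            new Predicate("ck", "<", "9")));

        System.out.println("inner filter (equalities): " + s.equalities);
        System.out.println("outer filter (the rest):   " + s.others);
    }
}
```

If either group comes back empty, there is nothing to split and such a rule would leave the original Filter untouched.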

Re: Computing Partial Aggregates for UNION-ALL

2016-06-29 Thread jordan.halter...@gmail.com
I'm no Calcite expert (yet) but I have a few suggestions based on my own experience with using Planner and digging through its code. Keep in mind that there are surely better people to explain this around here, but I'll do my best based on what I've learned... When using Planner, you shouldn't