Column Ordinal Support in GROUP BY

2016-12-14 Thread Kapil Ghodawat
Hi Folks,

I am using Apache Calcite to add SQL support for my custom data source. I
receive SQLs from external SQL clients like Tableau and I have written an
ODBC client for connecting to my data source, which basically delegates
these SQLs (generated by Tableau) to my Java program where I use Calcite to
execute them.

I am stuck at a place where I receive SQL which has GROUP BY clause that
uses column ordinals instead of columns names and when I try to validate
(parsing works fine) such SQLs through Calcite it fails.

E.g. SELECT prod_name, store_name, sum(sales) from sales_table GROUP BY 1, 2

(The SQL says GROUP BY on prod_name and store_name fields which are column
ordinal 1 and 2 respectively in the SELECT clause)


I tried delving into the code of calcite and per my understanding I believe
that usage of column ordinals is supported in ORDER BY clause but not in
the GROUP BY.

Primarily, I want to know Is my understanding about no support of column
ordinals in GROUP BY correct? Or is there something I am missing?

Secondarily, if anyone knows about ODBC, is there a way I can tell/force
the Tableau to send me column names instead of column ordinals in the SQL?


-- 
Best Regards,
Kapil Ghodawat
contact: +91 94254 86638


Re: Column Ordinal Support in GROUP BY

2016-12-14 Thread Gian Merlino
With Tableau you can use a TDC file to tweak the SQL generation. If you set
CAP_QUERY_GROUP_BY_ALIAS and CAP_QUERY_GROUP_BY_DEGREE to "no" then Tableau
will avoid using aliases and ordinals for GROUP BY. See
http://kb.tableau.com/articles/knowledgebase/customizing-odbc-connections
for more info.

The relevant Calcite issue is
https://issues.apache.org/jira/browse/CALCITE-1306, but if all you care
about is getting Tableau to work then the TDC approach should do it.

Gian

On Wed, Dec 14, 2016 at 4:48 AM, Kapil Ghodawat 
wrote:

> Hi Folks,
>
> I am using Apache Calcite to add SQL support for my custom data source. I
> receive SQLs from external SQL clients like Tableau and I have written an
> ODBC client for connecting to my data source, which basically delegates
> these SQLs (generated by Tableau) to my Java program where I use Calcite to
> execute them.
>
> I am stuck at a place where I receive SQL which has GROUP BY clause that
> uses column ordinals instead of columns names and when I try to validate
> (parsing works fine) such SQLs through Calcite it fails.
>
> E.g. SELECT prod_name, store_name, sum(sales) from sales_table GROUP BY 1,
> 2
>
> (The SQL says GROUP BY on prod_name and store_name fields which are column
> ordinal 1 and 2 respectively in the SELECT clause)
>
>
> I tried delving into the code of calcite and per my understanding I believe
> that usage of column ordinals is supported in ORDER BY clause but not in
> the GROUP BY.
>
> Primarily, I want to know Is my understanding about no support of column
> ordinals in GROUP BY correct? Or is there something I am missing?
>
> Secondarily, if anyone knows about ODBC, is there a way I can tell/force
> the Tableau to send me column names instead of column ordinals in the SQL?
>
>
> --
> Best Regards,
> Kapil Ghodawat
> contact: +91 94254 86638
>


Accessing database metadata

2016-12-14 Thread Riccardo Tommasini
Hi all,

I am trying to access db metadata through calcite. Playing around with the 
sqline interface and checking the code, I understood that calcite accesses 
those some metadata using the dedicate jdbc driver. However, some of them are 
non
available yet.

Is there a way to access primary keys for a table for instance?

thanks in advance!

Riccardo Tommasini
Master Degree Computer Science
PhD Student at Politecnico di Milano (Italy)
streamreasoning.org

Submitted from an iPhone, I apologise for typos.


Re: Accessing database metadata

2016-12-14 Thread Julian Hyde
There isn’t a way of doing this currently. But I can see that federating 
metadata is useful.

The only information that Calcite accesses via its schema SPI is what it needs 
to prepare queries. I could see us adding an API so that 
Schema.unwrap(DatabaseMetaData.class) yields a DatabaseMetadata. Then Calcite’s 
JDBC driver could delegate to it. Log a JIRA case if this is of interest to you.


> On Dec 14, 2016, at 7:54 AM, Riccardo Tommasini 
>  wrote:
> 
> Hi all,
> 
> I am trying to access db metadata through calcite. Playing around with the 
> sqline interface and checking the code, I understood that calcite accesses 
> those some metadata using the dedicate jdbc driver. However, some of them are 
> non
> available yet.
> 
> Is there a way to access primary keys for a table for instance?
> 
> thanks in advance!
> 
> Riccardo Tommasini
> Master Degree Computer Science
> PhD Student at Politecnico di Milano (Italy)
> streamreasoning.org
> 
> Submitted from an iPhone, I apologise for typos.



Re: Accessing database metadata

2016-12-14 Thread Riccardo Tommasini
Hi Julian,
thanks for the quick answer.

I will do it.



Riccardo Tommasini
Master Degree Computer Science
PhD Student at Politecnico di Milano (Italy)
streamreasoning.org

Submitted from an iPhone, I apologise for typos.

From: Julian Hyde 
Reply: dev@calcite.apache.org 

Date: 14 December 2016 at 17:36:27
To: dev@calcite.apache.org 

Subject:  Re: Accessing database metadata

There isn’t a way of doing this currently. But I can see that federating 
metadata is useful.

The only information that Calcite accesses via its schema SPI is what it needs 
to prepare queries. I could see us adding an API so that 
Schema.unwrap(DatabaseMetaData.class) yields a DatabaseMetadata. Then Calcite’s 
JDBC driver could delegate to it. Log a JIRA case if this is of interest to you.


> On Dec 14, 2016, at 7:54 AM, Riccardo Tommasini 
>  wrote:
>
> Hi all,
>
> I am trying to access db metadata through calcite. Playing around with the 
> sqline interface and checking the code, I understood that calcite accesses 
> those some metadata using the dedicate jdbc driver. However, some of them are 
> non
> available yet.
>
> Is there a way to access primary keys for a table for instance?
>
> thanks in advance!
>
> Riccardo Tommasini
> Master Degree Computer Science
> PhD Student at Politecnico di Milano (Italy)
> streamreasoning.org
>
> Submitted from an iPhone, I apologise for typos.



Need help regarding ordinal expressions in GROUP BY clause

2016-12-14 Thread Kushal Chavada
Hi,


I have written simple query planner class in java which takes string SQL as an 
input & generates logical plan using Calcite.

Input SQLs contain queries having ordinal expressions in GROUP BY clause. e.g.

"SELECT x, sum(y) FROM t GROUP BY 1"
On validation, I get this validation exception:
SEVERE: org.apache.calcite.runtime.CalciteContextException:  
Expression 't.x' is not being grouped
When I replace "GROUP BY 1" with "GROUP BY x", it works fine. Whereas ordinal 
expressions works fine with ORDER BY clause.
So, am I missing anything like configuration properties, etc.
I am using ORACLE as lexical policy.


With Regards,
Kushal









NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.


Re: Need help regarding ordinal expressions in GROUP BY clause

2016-12-14 Thread Julian Hyde
I’ll refer you to the “Column Ordinal Support in GROUP BY” thread, and Gian’s 
excellent answer[1].

Julian

[1] 
https://mail-archives.apache.org/mod_mbox/calcite-dev/201612.mbox/%3CCACZNdYAupYPCLLCV8aDWydc8p3gch4fgt%2B8-VvCLrZQxp1BpXg%40mail.gmail.com%3E
 


> On Dec 13, 2016, at 2:15 AM, Kushal Chavada  
> wrote:
> 
> Hi,
> 
> 
> I have written simple query planner class in java which takes string SQL as 
> an input & generates logical plan using Calcite.
> 
> Input SQLs contain queries having ordinal expressions in GROUP BY clause. e.g.
> 
>"SELECT x, sum(y) FROM t GROUP BY 1"
> On validation, I get this validation exception:
>SEVERE: org.apache.calcite.runtime.CalciteContextException:  
> Expression 't.x' is not being grouped
> When I replace "GROUP BY 1" with "GROUP BY x", it works fine. Whereas ordinal 
> expressions works fine with ORDER BY clause.
> So, am I missing anything like configuration properties, etc.
> I am using ORACLE as lexical policy.
> 
> 
> With Regards,
> Kushal
> 
> 
> 
> 
> 
> 
> 
> 
> 
> NOTE: This message may contain information that is confidential, proprietary, 
> privileged or otherwise protected by law. The message is intended solely for 
> the named addressee. If received in error, please destroy and notify the 
> sender. Any use of this email is prohibited when received in error. Impetus 
> does not represent, warrant and/or guarantee, that the integrity of this 
> communication has been maintained nor that the communication is free of 
> errors, virus, interception or interference.



Re: User defined functions with non scalar return type

2016-12-14 Thread Julian Hyde
Calcite doesn’t support that currently. Here’s what it would take: First, 
complete the support for user-defined types. We already allow user-defined 
types internally (see how MockCatalogReader creates a structured type “Address” 
and how SqlValidatorTest.testStructuredTypes uses it) but they’re not fully 
supported. Second, define a mapping between Java types and SQL types so that 
you can declare a user-defined function or user-defined aggregate function, the 
validator will see that function’s Java return type and deduce the SQL type. 
Maybe that mapping could be done using Java annotations.

If that’s interesting to you please log a JIRA case.

Julian

> On Dec 9, 2016, at 1:22 AM, Julian Feinauer  
> wrote:
> 
> Hey Guys,
> 
> 
> 
> first of all, great work!
> 
> I’m following the Calcite project since I worked a lot with Drill about 8 
> months ago and I’m very pleased about how everything has developed.
> 
> We are currently trying to replace our self-made SQL Parser and Engine with 
> Calcite.
> 
> But I have one question as we need to create UDFs which do not only return a 
> single scalar value.
> 
> 
> 
> In fact, we do some kind of mapping from one table (that is processed in a 
> streaming fashion like the AggregateFunctions do) to another Table (or even 
> more complex nested structure).
> 
> Currently I do not see a way to do this as the AggregateFunctions do only 
> allow to return scalar primitive types, or?
> 
> All my trials in returning different types resulted in Compile Errors from 
> the Code generation.
> 
> 
> 
> On the other hand, the TableFunctions do not seem appropriate to me as they 
> can only use parameters as input and not Columns that are passed row by row, 
> or?
> 
> 
> 
> I would really appreciate any suggestions or help for this problem.
> 
> I’m also open for a discussion about how to implement such a feature in 
> Calcite.
> 
> 
> 
> Thank you already!
> 
> Julian
> 



Re: Embedding Calcite, adjusting convertlets

2016-12-14 Thread Gian Merlino
I spent some more time looking into (3) and found that when I had things
going through the Planner rather than the JDBC driver, SortRemoveRule was
removing sorts when it shouldn't have been. This happens even for simple
queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
DESC". Removing SortRemoveRule from the planner fixed the broken tests on
my end.

I dug into that a bit and saw that the call to "convert(sort.getInput(),
traits)" in SortRemoveRule was returning a RelSubset that looked a bit
funny in the debugger:

- The RelSubset's "traitSet" _does_ have the proper collation trait.
- But its "set" field points to a RelSet with "rels" that _don't_ have
_any_ collation traits.

>From what I understand that causes Calcite to treat the unsorted and sorted
rels as equivalent when they in fact aren't. I'm still not sure if this is
a Calcite bug or user error on my part… I'll keep looking into it unless
someone has any bright ideas.

fwiw, my Planner construction looks like this:

final FrameworkConfig frameworkConfig = Frameworks
.newConfigBuilder()
.parserConfig(
SqlParser.configBuilder()
 .setCaseSensitive(true)
 .setUnquotedCasing(Casing.UNCHANGED)
 .build()
)
.defaultSchema(rootSchema)
.traitDefs(ConventionTraitDef.INSTANCE,
RelCollationTraitDef.INSTANCE)
.programs(Programs.ofRules(myRules))
.executor(new RexExecutorImpl(Schemas.createDataContext(null)))
.context(Contexts.EMPTY_CONTEXT)
.build();

return Frameworks.getPlanner(frameworkConfig);

Gian

On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino  wrote:

> Sure, I added those first two to the ticket.
>
> I don't think those are happening with (3) but I'll double check next time
> I take a look at using the Planner.
>
> Gian
>
> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde  wrote:
>
>> Can you please add (1) and (2) to https://issues.apache.org/jira
>> /browse/CALCITE-1525 ,
>> which deals with the whole issue of using “Planner” within the JDBC driver,
>> so we can be consistent.
>>
>> (3) doesn’t look likely to be related. Do your queries have UNION or
>> other set-ops? Are you sorting on columns that do not appear in the final
>> result?
>>
>> Julian
>>
>>
>> > On Nov 28, 2016, at 10:45 AM, Gian Merlino  wrote:
>> >
>> > I traveled a bit down the Frameworks/Planner road and got most of my
>> tests
>> > passing, but ran into some problems getting them all to work:
>> >
>> > (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>> Planner.validate.
>> > It looks like CalcitePrepareImpl has some special code to handle
>> validation
>> > of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is something I
>> > should be doing on my end, or if it's a bug in PlannerImpl.
>> > (2) I don't see a way to do ?-style prepared statements with bound
>> > variables, which _is_ possible with the JDBC driver route.
>> > (3) Not sure why this is happening, but for some reason ORDER BY / LIMIT
>> > clauses are getting ignored sometimes, even when they work with the JDBC
>> > driver route. This may be something messed up with my rules though and
>> may
>> > not be Calcite's fault.
>> >
>> > Julian, do any of these look like bugs that should be raised in jira, or
>> > are they just stuff I should be dealing with on my side?
>> >
>> > Btw, I do like that the Frameworks/Planner route gives me back the
>> RelNode
>> > itself, since that means I can make the Druid queries directly without
>> > needing to go through the extra layers of the JDBC driver. That part is
>> > nice.
>> >
>> > Gian
>> >
>> > On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde  wrote:
>> >
>> >> I don’t know how it’s used outside Calcite. Maybe some others can
>> chime in.
>> >>
>> >> Thanks for the PR. I logged https://issues.apache.org/jira
>> >> /browse/CALCITE-1509 > a/browse/CALCITE-1509>
>> >> for it, and will commit shortly.
>> >>
>> >> Julian
>> >>
>> >>> On Nov 23, 2016, at 12:32 PM, Gian Merlino  wrote:
>> >>>
>> >>> Do you know examples of projects that use Planner or PlannerImpl
>> >> currently
>> >>> (from "outside")? As far as I can tell, within Calcite itself it's
>> only
>> >>> used in test code. Maybe that'd be a better entry point.
>> >>>
>> >>> In the meantime I raised a PR here for allowing a convertlet table
>> >> override
>> >>> in a CalcitePrepareImpl: https://github.com/apache/calcite/pull/330.
>> >> That
>> >>> was enough to get the JDBC driver on my end to behave how I want it
>> to.
>> >>>
>> >>> Gian
>> >>>
>> >>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde 
>> wrote:
>> >>>
>>  I was wrong earlier… FrameworkConfig already has a getConvertletTable
>>  method. But regarding using FrameworkConfig from within the JDBC
>> driver,
>>  It’s complicated. FrameworkConfig only works if you are “outside”
>> >> Calcite,
>>  wh

Re: Embedding Calcite, adjusting convertlets

2016-12-14 Thread Julian Hyde
> - But its "set" field points to a RelSet with "rels" that _don't_ have
> _any_ collation traits.

That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are 
logically and physically equivalent (same results, same physical properties) 
whereas a “set” (RelSet) is a collection of RelNodes that are logically 
equivalent.

A set can therefore be considered to be a collection of subsets, each of which 
contains RelNodes. And it used to be implemented that way, but in 
https://issues.apache.org/jira/browse/CALCITE-88 
 we introduced collation as a 
trait, and that made subsets non-disjoint (a RelNode can be sorted on (x, y), 
and also on (x), and also on (), and also on (z)) so we made RelSubset just a 
view onto a RelSet, filtering the list of RelNodes according to the ones that 
have (“subsume”) the desired traits.

Julian


> On Dec 14, 2016, at 10:45 AM, Gian Merlino  wrote:
> 
> I spent some more time looking into (3) and found that when I had things
> going through the Planner rather than the JDBC driver, SortRemoveRule was
> removing sorts when it shouldn't have been. This happens even for simple
> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
> DESC". Removing SortRemoveRule from the planner fixed the broken tests on
> my end.
> 
> I dug into that a bit and saw that the call to "convert(sort.getInput(),
> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
> funny in the debugger:
> 
> - The RelSubset's "traitSet" _does_ have the proper collation trait.
> - But its "set" field points to a RelSet with "rels" that _don't_ have
> _any_ collation traits.
> 
> From what I understand that causes Calcite to treat the unsorted and sorted
> rels as equivalent when they in fact aren't. I'm still not sure if this is
> a Calcite bug or user error on my part… I'll keep looking into it unless
> someone has any bright ideas.
> 
> fwiw, my Planner construction looks like this:
> 
>final FrameworkConfig frameworkConfig = Frameworks
>.newConfigBuilder()
>.parserConfig(
>SqlParser.configBuilder()
> .setCaseSensitive(true)
> .setUnquotedCasing(Casing.UNCHANGED)
> .build()
>)
>.defaultSchema(rootSchema)
>.traitDefs(ConventionTraitDef.INSTANCE,
> RelCollationTraitDef.INSTANCE)
>.programs(Programs.ofRules(myRules))
>.executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>.context(Contexts.EMPTY_CONTEXT)
>.build();
> 
>return Frameworks.getPlanner(frameworkConfig);
> 
> Gian
> 
> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino  wrote:
> 
>> Sure, I added those first two to the ticket.
>> 
>> I don't think those are happening with (3) but I'll double check next time
>> I take a look at using the Planner.
>> 
>> Gian
>> 
>> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde  wrote:
>> 
>>> Can you please add (1) and (2) to https://issues.apache.org/jira
>>> /browse/CALCITE-1525 ,
>>> which deals with the whole issue of using “Planner” within the JDBC driver,
>>> so we can be consistent.
>>> 
>>> (3) doesn’t look likely to be related. Do your queries have UNION or
>>> other set-ops? Are you sorting on columns that do not appear in the final
>>> result?
>>> 
>>> Julian
>>> 
>>> 
 On Nov 28, 2016, at 10:45 AM, Gian Merlino  wrote:
 
 I traveled a bit down the Frameworks/Planner road and got most of my
>>> tests
 passing, but ran into some problems getting them all to work:
 
 (1) "EXPLAIN PLAN FOR" throws NullPointerException during
>>> Planner.validate.
 It looks like CalcitePrepareImpl has some special code to handle
>>> validation
 of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is something I
 should be doing on my end, or if it's a bug in PlannerImpl.
 (2) I don't see a way to do ?-style prepared statements with bound
 variables, which _is_ possible with the JDBC driver route.
 (3) Not sure why this is happening, but for some reason ORDER BY / LIMIT
 clauses are getting ignored sometimes, even when they work with the JDBC
 driver route. This may be something messed up with my rules though and
>>> may
 not be Calcite's fault.
 
 Julian, do any of these look like bugs that should be raised in jira, or
 are they just stuff I should be dealing with on my side?
 
 Btw, I do like that the Frameworks/Planner route gives me back the
>>> RelNode
 itself, since that means I can make the Druid queries directly without
 needing to go through the extra layers of the JDBC driver. That part is
 nice.
 
 Gian
 
 On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde  wrote:
 
> I don’t know how it’s used outside Calcite. Maybe some others can
>>> chime in.
> 
> Thanks for the PR. I logged https://issues.apache.org/jira

Re: Embedding Calcite, adjusting convertlets

2016-12-14 Thread Gian Merlino
Ah, thanks. So if that sort of thing is not a smoking gun, do you have an
idea about where I should look next? If not I'll keep poking around.

Gian

On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde  wrote:

> > - But its "set" field points to a RelSet with "rels" that _don't_ have
> > _any_ collation traits.
>
> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
> logically and physically equivalent (same results, same physical
> properties) whereas a “set” (RelSet) is a collection of RelNodes that are
> logically equivalent.
>
> A set can therefore be considered to be a collection of subsets, each of
> which contains RelNodes. And it used to be implemented that way, but in
> https://issues.apache.org/jira/browse/CALCITE-88 <
> https://issues.apache.org/jira/browse/CALCITE-88> we introduced collation
> as a trait, and that made subsets non-disjoint (a RelNode can be sorted on
> (x, y), and also on (x), and also on (), and also on (z)) so we made
> RelSubset just a view onto a RelSet, filtering the list of RelNodes
> according to the ones that have (“subsume”) the desired traits.
>
> Julian
>
>
> > On Dec 14, 2016, at 10:45 AM, Gian Merlino  wrote:
> >
> > I spent some more time looking into (3) and found that when I had things
> > going through the Planner rather than the JDBC driver, SortRemoveRule was
> > removing sorts when it shouldn't have been. This happens even for simple
> > queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
> > DESC". Removing SortRemoveRule from the planner fixed the broken tests on
> > my end.
> >
> > I dug into that a bit and saw that the call to "convert(sort.getInput(),
> > traits)" in SortRemoveRule was returning a RelSubset that looked a bit
> > funny in the debugger:
> >
> > - The RelSubset's "traitSet" _does_ have the proper collation trait.
> > - But its "set" field points to a RelSet with "rels" that _don't_ have
> > _any_ collation traits.
> >
> > From what I understand that causes Calcite to treat the unsorted and
> sorted
> > rels as equivalent when they in fact aren't. I'm still not sure if this
> is
> > a Calcite bug or user error on my part… I'll keep looking into it unless
> > someone has any bright ideas.
> >
> > fwiw, my Planner construction looks like this:
> >
> >final FrameworkConfig frameworkConfig = Frameworks
> >.newConfigBuilder()
> >.parserConfig(
> >SqlParser.configBuilder()
> > .setCaseSensitive(true)
> > .setUnquotedCasing(Casing.UNCHANGED)
> > .build()
> >)
> >.defaultSchema(rootSchema)
> >.traitDefs(ConventionTraitDef.INSTANCE,
> > RelCollationTraitDef.INSTANCE)
> >.programs(Programs.ofRules(myRules))
> >.executor(new RexExecutorImpl(Schemas.createDataContext(null)))
> >.context(Contexts.EMPTY_CONTEXT)
> >.build();
> >
> >return Frameworks.getPlanner(frameworkConfig);
> >
> > Gian
> >
> > On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino  wrote:
> >
> >> Sure, I added those first two to the ticket.
> >>
> >> I don't think those are happening with (3) but I'll double check next
> time
> >> I take a look at using the Planner.
> >>
> >> Gian
> >>
> >> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde  wrote:
> >>
> >>> Can you please add (1) and (2) to https://issues.apache.org/jira
> >>> /browse/CALCITE-1525  a/browse/CALCITE-1525>,
> >>> which deals with the whole issue of using “Planner” within the JDBC
> driver,
> >>> so we can be consistent.
> >>>
> >>> (3) doesn’t look likely to be related. Do your queries have UNION or
> >>> other set-ops? Are you sorting on columns that do not appear in the
> final
> >>> result?
> >>>
> >>> Julian
> >>>
> >>>
>  On Nov 28, 2016, at 10:45 AM, Gian Merlino  wrote:
> 
>  I traveled a bit down the Frameworks/Planner road and got most of my
> >>> tests
>  passing, but ran into some problems getting them all to work:
> 
>  (1) "EXPLAIN PLAN FOR" throws NullPointerException during
> >>> Planner.validate.
>  It looks like CalcitePrepareImpl has some special code to handle
> >>> validation
>  of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is
> something I
>  should be doing on my end, or if it's a bug in PlannerImpl.
>  (2) I don't see a way to do ?-style prepared statements with bound
>  variables, which _is_ possible with the JDBC driver route.
>  (3) Not sure why this is happening, but for some reason ORDER BY /
> LIMIT
>  clauses are getting ignored sometimes, even when they work with the
> JDBC
>  driver route. This may be something messed up with my rules though and
> >>> may
>  not be Calcite's fault.
> 
>  Julian, do any of these look like bugs that should be raised in jira,
> or
>  are they just stuff I should be dealing with on my side?
> 
>  Btw, I do like that the Frameworks/Planner route gives me back the
> >>> Rel

Query preparation lifecycle

2016-12-14 Thread Julian Hyde
I’m doing some thinking (and prototyping) about the query preparation 
lifecycle. Preparing a query requires lots of pieces of information (type 
factory, cost factory, a list of planner rules, metadata providers, executor 
for reducing scalar expressions, configuration, and many more) and depending on 
how you use the Calcite framework (through a JDBC driver or using Frameworks or 
otherwise) you provide those pieces of information in different ways and 
different times.

Can people chime in on https://issues.apache.org/jira/browse/CALCITE-1536 
.

Issue https://issues.apache.org/jira/browse/CALCITE-1499 
 precedes 1536 (i.e. is 
less complicated) and https://issues.apache.org/jira/browse/CALCITE-1525 
 depends on it (i.e. is 
more ambitious). But answering questions such as “what is the latest point in 
the planning process that I can change the list of active traits?” will help.

Julian



Re: Embedding Calcite, adjusting convertlets

2016-12-14 Thread Julian Hyde
Are you running into some variant of the problems that inspired 
https://issues.apache.org/jira/browse/CALCITE-819: 
 at the root of the tree, 
columns that are not projected are removed, and if the desired sort order 
involves non-projected columns, the desired sort order is forgotten).
 
> On Dec 14, 2016, at 11:19 AM, Gian Merlino  wrote:
> 
> Ah, thanks. So if that sort of thing is not a smoking gun, do you have an
> idea about where I should look next? If not I'll keep poking around.
> 
> Gian
> 
> On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde  wrote:
> 
>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>> _any_ collation traits.
>> 
>> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
>> logically and physically equivalent (same results, same physical
>> properties) whereas a “set” (RelSet) is a collection of RelNodes that are
>> logically equivalent.
>> 
>> A set can therefore be considered to be a collection of subsets, each of
>> which contains RelNodes. And it used to be implemented that way, but in
>> https://issues.apache.org/jira/browse/CALCITE-88 <
>> https://issues.apache.org/jira/browse/CALCITE-88> we introduced collation
>> as a trait, and that made subsets non-disjoint (a RelNode can be sorted on
>> (x, y), and also on (x), and also on (), and also on (z)) so we made
>> RelSubset just a view onto a RelSet, filtering the list of RelNodes
>> according to the ones that have (“subsume”) the desired traits.
>> 
>> Julian
>> 
>> 
>>> On Dec 14, 2016, at 10:45 AM, Gian Merlino  wrote:
>>> 
>>> I spent some more time looking into (3) and found that when I had things
>>> going through the Planner rather than the JDBC driver, SortRemoveRule was
>>> removing sorts when it shouldn't have been. This happens even for simple
>>> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
>>> DESC". Removing SortRemoveRule from the planner fixed the broken tests on
>>> my end.
>>> 
>>> I dug into that a bit and saw that the call to "convert(sort.getInput(),
>>> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
>>> funny in the debugger:
>>> 
>>> - The RelSubset's "traitSet" _does_ have the proper collation trait.
>>> - But its "set" field points to a RelSet with "rels" that _don't_ have
>>> _any_ collation traits.
>>> 
>>> From what I understand that causes Calcite to treat the unsorted and
>> sorted
>>> rels as equivalent when they in fact aren't. I'm still not sure if this
>> is
>>> a Calcite bug or user error on my part… I'll keep looking into it unless
>>> someone has any bright ideas.
>>> 
>>> fwiw, my Planner construction looks like this:
>>> 
>>>   final FrameworkConfig frameworkConfig = Frameworks
>>>   .newConfigBuilder()
>>>   .parserConfig(
>>>   SqlParser.configBuilder()
>>>.setCaseSensitive(true)
>>>.setUnquotedCasing(Casing.UNCHANGED)
>>>.build()
>>>   )
>>>   .defaultSchema(rootSchema)
>>>   .traitDefs(ConventionTraitDef.INSTANCE,
>>> RelCollationTraitDef.INSTANCE)
>>>   .programs(Programs.ofRules(myRules))
>>>   .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
>>>   .context(Contexts.EMPTY_CONTEXT)
>>>   .build();
>>> 
>>>   return Frameworks.getPlanner(frameworkConfig);
>>> 
>>> Gian
>>> 
>>> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino  wrote:
>>> 
 Sure, I added those first two to the ticket.
 
 I don't think those are happening with (3) but I'll double check next
>> time
 I take a look at using the Planner.
 
 Gian
 
 On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde  wrote:
 
> Can you please add (1) and (2) to https://issues.apache.org/jira
> /browse/CALCITE-1525 > a/browse/CALCITE-1525>,
> which deals with the whole issue of using “Planner” within the JDBC
>> driver,
> so we can be consistent.
> 
> (3) doesn’t look likely to be related. Do your queries have UNION or
> other set-ops? Are you sorting on columns that do not appear in the
>> final
> result?
> 
> Julian
> 
> 
>> On Nov 28, 2016, at 10:45 AM, Gian Merlino  wrote:
>> 
>> I traveled a bit down the Frameworks/Planner road and got most of my
> tests
>> passing, but ran into some problems getting them all to work:
>> 
>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
> Planner.validate.
>> It looks like CalcitePrepareImpl has some special code to handle
> validation
>> of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is
>> something I
>> should be doing on my end, or if it's a bug in PlannerImpl.
>> (2) I don't see a way to do ?-style prepared statements with bound
>> variables, which _is_ possible with the JDBC driver route.
>> (3) Not sure why this is happening, but for some reason ORDER BY /
>> L

Towards Calcite release 1.11

2016-12-14 Thread Julian Hyde
When should we be targeting to release Calcite version 1.11?

Last release (1.10) was 12th October. Usually we release every 2 months, so 
we’re already overdue. Since 1.10 we’ve had 76 commits, and about 60 bug fixes 
(not counting Avatica work).

What bug fixes/features would people like to contribute to the release? What 
are your time constraints?

Julian



Re: Embedding Calcite, adjusting convertlets

2016-12-14 Thread Gian Merlino
I don't think so, since my test query (SELECT dim1 FROM s.foo GROUP BY dim1
ORDER BY dim1 DESC) is sorting on a column that is also projected.

Gian

On Wed, Dec 14, 2016 at 11:58 AM, Julian Hyde  wrote:

> Are you running into some variant of the problems that inspired
> https://issues.apache.org/jira/browse/CALCITE-819: <
> https://issues.apache.org/jira/browse/CALCITE-819:> at the root of the
> tree, columns that are not projected are removed, and if the desired sort
> order involves non-projected columns, the desired sort order is forgotten).
>
> > On Dec 14, 2016, at 11:19 AM, Gian Merlino  wrote:
> >
> > Ah, thanks. So if that sort of thing is not a smoking gun, do you have an
> > idea about where I should look next? If not I'll keep poking around.
> >
> > Gian
> >
> > On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde  wrote:
> >
> >>> - But its "set" field points to a RelSet with "rels" that _don't_ have
> >>> _any_ collation traits.
> >>
> >> That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are
> >> logically and physically equivalent (same results, same physical
> >> properties) whereas a “set” (RelSet) is a collection of RelNodes that
> are
> >> logically equivalent.
> >>
> >> A set can therefore be considered to be a collection of subsets, each of
> >> which contains RelNodes. And it used to be implemented that way, but in
> >> https://issues.apache.org/jira/browse/CALCITE-88 <
> >> https://issues.apache.org/jira/browse/CALCITE-88> we introduced
> collation
> >> as a trait, and that made subsets non-disjoint (a RelNode can be sorted
> on
> >> (x, y), and also on (x), and also on (), and also on (z)) so we made
> >> RelSubset just a view onto a RelSet, filtering the list of RelNodes
> >> according to the ones that have (“subsume”) the desired traits.
> >>
> >> Julian
> >>
> >>
> >>> On Dec 14, 2016, at 10:45 AM, Gian Merlino  wrote:
> >>>
> >>> I spent some more time looking into (3) and found that when I had
> things
> >>> going through the Planner rather than the JDBC driver, SortRemoveRule
> was
> >>> removing sorts when it shouldn't have been. This happens even for
> simple
> >>> queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1
> >>> DESC". Removing SortRemoveRule from the planner fixed the broken tests
> on
> >>> my end.
> >>>
> >>> I dug into that a bit and saw that the call to
> "convert(sort.getInput(),
> >>> traits)" in SortRemoveRule was returning a RelSubset that looked a bit
> >>> funny in the debugger:
> >>>
> >>> - The RelSubset's "traitSet" _does_ have the proper collation trait.
> >>> - But its "set" field points to a RelSet with "rels" that _don't_ have
> >>> _any_ collation traits.
> >>>
> >>> From what I understand that causes Calcite to treat the unsorted and
> >> sorted
> >>> rels as equivalent when they in fact aren't. I'm still not sure if this
> >> is
> >>> a Calcite bug or user error on my part… I'll keep looking into it
> unless
> >>> someone has any bright ideas.
> >>>
> >>> fwiw, my Planner construction looks like this:
> >>>
> >>>   final FrameworkConfig frameworkConfig = Frameworks
> >>>   .newConfigBuilder()
> >>>   .parserConfig(
> >>>   SqlParser.configBuilder()
> >>>.setCaseSensitive(true)
> >>>.setUnquotedCasing(Casing.UNCHANGED)
> >>>.build()
> >>>   )
> >>>   .defaultSchema(rootSchema)
> >>>   .traitDefs(ConventionTraitDef.INSTANCE,
> >>> RelCollationTraitDef.INSTANCE)
> >>>   .programs(Programs.ofRules(myRules))
> >>>   .executor(new RexExecutorImpl(Schemas.createDataContext(null)))
> >>>   .context(Contexts.EMPTY_CONTEXT)
> >>>   .build();
> >>>
> >>>   return Frameworks.getPlanner(frameworkConfig);
> >>>
> >>> Gian
> >>>
> >>> On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino  wrote:
> >>>
>  Sure, I added those first two to the ticket.
> 
>  I don't think those are happening with (3) but I'll double check next
> >> time
>  I take a look at using the Planner.
> 
>  Gian
> 
>  On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde 
> wrote:
> 
> > Can you please add (1) and (2) to https://issues.apache.org/jira
> > /browse/CALCITE-1525  >> a/browse/CALCITE-1525>,
> > which deals with the whole issue of using “Planner” within the JDBC
> >> driver,
> > so we can be consistent.
> >
> > (3) doesn’t look likely to be related. Do your queries have UNION or
> > other set-ops? Are you sorting on columns that do not appear in the
> >> final
> > result?
> >
> > Julian
> >
> >
> >> On Nov 28, 2016, at 10:45 AM, Gian Merlino  wrote:
> >>
> >> I traveled a bit down the Frameworks/Planner road and got most of my
> > tests
> >> passing, but ran into some problems getting them all to work:
> >>
> >> (1) "EXPLAIN PLAN FOR" throws NullPointerException during
> > Planner.validate.
> >> It looks like CalcitePrep

[jira] [Created] (CALCITE-1539) Enable proxy access to Avatica server for third party on behalf of end users

2016-12-14 Thread Jerry He (JIRA)
Jerry He created CALCITE-1539:
-

 Summary: Enable proxy access to Avatica server for third party on 
behalf of end users
 Key: CALCITE-1539
 URL: https://issues.apache.org/jira/browse/CALCITE-1539
 Project: Calcite
  Issue Type: Improvement
Reporter: Jerry He
Assignee: Julian Hyde


We want to enable proxy access to Avatica server from an end user, but the end 
user comes in via a third party impersonation.  For example, Knox and Hue.
The Knox server user conveys the end user to Avatica.

Similar things have been done for HBase Rest Sever HBASE-9866 and Hive Server 
HIVE-5155




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CALCITE-1540) window function does not support multiple partition columns

2016-12-14 Thread hongbin ma (JIRA)
hongbin ma created CALCITE-1540:
---

 Summary: window function does not support multiple partition 
columns
 Key: CALCITE-1540
 URL: https://issues.apache.org/jira/browse/CALCITE-1540
 Project: Calcite
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: hongbin ma
Assignee: Julian Hyde


When I add multiple partition columns to a window function:

{code:xml}

  @Test public void testSelect() throws SQLException {
checkSql("model", "select NAME,DEPTNO,count(*) over (partition by 
NAME,DEPTNO) from EMPS");
  }

{code}

Following exception is thrown:

java.sql.SQLException: Error while executing SQL "select NAME,DEPTNO,count(*) 
over (partition by NAME,DEPTNO) from EMPS": Error while compiling generated 
Java code:

However the document from calcite.apache.org/docs/reference.html states 
multiple partition columns should be allowed:

{code}
windowSpec:
  [ windowName ]
  '('
  [ ORDER BY orderItem [, orderItem ]* ]
  [ PARTITION BY expression [, expression ]* ]
  [
  RANGE numericOrIntervalExpression { PRECEDING | FOLLOWING }
  |   ROWS numericExpression { PRECEDING | FOLLOWING }
  ]
  ')'

{code}

After searching it seems no one is reporting the same issue. I'm opening this 
JIAR as a placeholder, will try to fix this issue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Column Ordinal Support in GROUP BY

2016-12-14 Thread Kapil Ghodawat
Thanks a lot Gian. I think this would make Tableau work.

But I am worried about connectivity from other SQL clients as i am planning
in near future connectivity from Microsoft Power BI

On Wed, Dec 14, 2016 at 9:00 PM, Gian Merlino  wrote:

> With Tableau you can use a TDC file to tweak the SQL generation. If you set
> CAP_QUERY_GROUP_BY_ALIAS and CAP_QUERY_GROUP_BY_DEGREE to "no" then Tableau
> will avoid using aliases and ordinals for GROUP BY. See
> http://kb.tableau.com/articles/knowledgebase/customizing-odbc-connections
> for more info.
>
> The relevant Calcite issue is
> https://issues.apache.org/jira/browse/CALCITE-1306, but if all you care
> about is getting Tableau to work then the TDC approach should do it.
>
> Gian
>
> On Wed, Dec 14, 2016 at 4:48 AM, Kapil Ghodawat 
> wrote:
>
> > Hi Folks,
> >
> > I am using Apache Calcite to add SQL support for my custom data source. I
> > receive SQLs from external SQL clients like Tableau and I have written an
> > ODBC client for connecting to my data source, which basically delegates
> > these SQLs (generated by Tableau) to my Java program where I use Calcite
> to
> > execute them.
> >
> > I am stuck at a place where I receive SQL which has GROUP BY clause that
> > uses column ordinals instead of columns names and when I try to validate
> > (parsing works fine) such SQLs through Calcite it fails.
> >
> > E.g. SELECT prod_name, store_name, sum(sales) from sales_table GROUP BY
> 1,
> > 2
> >
> > (The SQL says GROUP BY on prod_name and store_name fields which are
> column
> > ordinal 1 and 2 respectively in the SELECT clause)
> >
> >
> > I tried delving into the code of calcite and per my understanding I
> believe
> > that usage of column ordinals is supported in ORDER BY clause but not in
> > the GROUP BY.
> >
> > Primarily, I want to know Is my understanding about no support of column
> > ordinals in GROUP BY correct? Or is there something I am missing?
> >
> > Secondarily, if anyone knows about ODBC, is there a way I can tell/force
> > the Tableau to send me column names instead of column ordinals in the
> SQL?
> >
> >
> > --
> > Best Regards,
> > Kapil Ghodawat
> > contact: +91 94254 86638
> >
>



-- 
Best Regards,
Kapil Ghodawat
contact: +91 94254 86638


Re: Towards Calcite release 1.11

2016-12-14 Thread Arun Mahadevan
+1 for releasing 1.11 ASAP. It has some fixes that Apache Storm (storm-sql) is 
waiting on.

Thanks,
Arun

On 12/15/16, 3:21 AM, "Julian Hyde"  wrote:

>When should we be targeting to release Calcite version 1.11?
>
>Last release (1.10) was 12th October. Usually we release every 2 months, so 
>we’re already overdue. Since 1.10 we’ve had 76 commits, and about 60 bug fixes 
>(not counting Avatica work).
>
>What bug fixes/features would people like to contribute to the release? What 
>are your time constraints?
>
>Julian
>
>