Ah, thanks. So if that sort of thing is not a smoking gun, do you have an idea about where I should look next? If not I'll keep poking around.
Gian On Wed, Dec 14, 2016 at 11:06 AM, Julian Hyde <jh...@apache.org> wrote: > > - But its "set" field points to a RelSet with "rels" that _don't_ have > > _any_ collation traits. > > That’s OK. A “subset” (RelSubset) is a collection of RelNodes that are > logically and physically equivalent (same results, same physical > properties) whereas a “set” (RelSet) is a collection of RelNodes that are > logically equivalent. > > A set can therefore be considered to be a collection of subsets, each of > which contains RelNodes. And it used to be implemented that way, but in > https://issues.apache.org/jira/browse/CALCITE-88 < > https://issues.apache.org/jira/browse/CALCITE-88> we introduced collation > as a trait, and that made subsets non-disjoint (a RelNode can be sorted on > (x, y), and also on (x), and also on (), and also on (z)) so we made > RelSubset just a view onto a RelSet, filtering the list of RelNodes > according to the ones that have (“subsume”) the desired traits. > > Julian > > > > On Dec 14, 2016, at 10:45 AM, Gian Merlino <g...@imply.io> wrote: > > > > I spent some more time looking into (3) and found that when I had things > > going through the Planner rather than the JDBC driver, SortRemoveRule was > > removing sorts when it shouldn't have been. This happens even for simple > > queries like "SELECT dim1 FROM s.foo GROUP BY dim1 ORDER BY dim1 > > DESC". Removing SortRemoveRule from the planner fixed the broken tests on > > my end. > > > > I dug into that a bit and saw that the call to "convert(sort.getInput(), > > traits)" in SortRemoveRule was returning a RelSubset that looked a bit > > funny in the debugger: > > > > - The RelSubset's "traitSet" _does_ have the proper collation trait. > > - But its "set" field points to a RelSet with "rels" that _don't_ have > > _any_ collation traits. > > > > From what I understand that causes Calcite to treat the unsorted and > sorted > > rels as equivalent when they in fact aren't. I'm still not sure if this > is > > a Calcite bug or user error on my part… I'll keep looking into it unless > > someone has any bright ideas. > > > > fwiw, my Planner construction looks like this: > > > > final FrameworkConfig frameworkConfig = Frameworks > > .newConfigBuilder() > > .parserConfig( > > SqlParser.configBuilder() > > .setCaseSensitive(true) > > .setUnquotedCasing(Casing.UNCHANGED) > > .build() > > ) > > .defaultSchema(rootSchema) > > .traitDefs(ConventionTraitDef.INSTANCE, > > RelCollationTraitDef.INSTANCE) > > .programs(Programs.ofRules(myRules)) > > .executor(new RexExecutorImpl(Schemas.createDataContext(null))) > > .context(Contexts.EMPTY_CONTEXT) > > .build(); > > > > return Frameworks.getPlanner(frameworkConfig); > > > > Gian > > > > On Sat, Dec 3, 2016 at 5:53 PM, Gian Merlino <g...@imply.io> wrote: > > > >> Sure, I added those first two to the ticket. > >> > >> I don't think those are happening with (3) but I'll double check next > time > >> I take a look at using the Planner. > >> > >> Gian > >> > >> On Fri, Dec 2, 2016 at 12:20 PM, Julian Hyde <jh...@apache.org> wrote: > >> > >>> Can you please add (1) and (2) to https://issues.apache.org/jira > >>> /browse/CALCITE-1525 <https://issues.apache.org/jir > a/browse/CALCITE-1525>, > >>> which deals with the whole issue of using “Planner” within the JDBC > driver, > >>> so we can be consistent. > >>> > >>> (3) doesn’t look likely to be related. Do your queries have UNION or > >>> other set-ops? Are you sorting on columns that do not appear in the > final > >>> result? > >>> > >>> Julian > >>> > >>> > >>>> On Nov 28, 2016, at 10:45 AM, Gian Merlino <g...@imply.io> wrote: > >>>> > >>>> I traveled a bit down the Frameworks/Planner road and got most of my > >>> tests > >>>> passing, but ran into some problems getting them all to work: > >>>> > >>>> (1) "EXPLAIN PLAN FOR" throws NullPointerException during > >>> Planner.validate. > >>>> It looks like CalcitePrepareImpl has some special code to handle > >>> validation > >>>> of EXPLAIN, but PlannerImpl doesn't. I'm not sure if this is > something I > >>>> should be doing on my end, or if it's a bug in PlannerImpl. > >>>> (2) I don't see a way to do ?-style prepared statements with bound > >>>> variables, which _is_ possible with the JDBC driver route. > >>>> (3) Not sure why this is happening, but for some reason ORDER BY / > LIMIT > >>>> clauses are getting ignored sometimes, even when they work with the > JDBC > >>>> driver route. This may be something messed up with my rules though and > >>> may > >>>> not be Calcite's fault. > >>>> > >>>> Julian, do any of these look like bugs that should be raised in jira, > or > >>>> are they just stuff I should be dealing with on my side? > >>>> > >>>> Btw, I do like that the Frameworks/Planner route gives me back the > >>> RelNode > >>>> itself, since that means I can make the Druid queries directly without > >>>> needing to go through the extra layers of the JDBC driver. That part > is > >>>> nice. > >>>> > >>>> Gian > >>>> > >>>> On Wed, Nov 23, 2016 at 10:11 PM, Julian Hyde <jh...@apache.org> > wrote: > >>>> > >>>>> I don’t know how it’s used outside Calcite. Maybe some others can > >>> chime in. > >>>>> > >>>>> Thanks for the PR. I logged https://issues.apache.org/jira > >>>>> /browse/CALCITE-1509 <https://issues.apache.org/jir > >>> a/browse/CALCITE-1509> > >>>>> for it, and will commit shortly. > >>>>> > >>>>> Julian > >>>>> > >>>>>> On Nov 23, 2016, at 12:32 PM, Gian Merlino <g...@imply.io> wrote: > >>>>>> > >>>>>> Do you know examples of projects that use Planner or PlannerImpl > >>>>> currently > >>>>>> (from "outside")? As far as I can tell, within Calcite itself it's > >>> only > >>>>>> used in test code. Maybe that'd be a better entry point. > >>>>>> > >>>>>> In the meantime I raised a PR here for allowing a convertlet table > >>>>> override > >>>>>> in a CalcitePrepareImpl: https://github.com/apache/calcite/pull/330 > . > >>>>> That > >>>>>> was enough to get the JDBC driver on my end to behave how I want it > >>> to. > >>>>>> > >>>>>> Gian > >>>>>> > >>>>>> On Thu, Nov 17, 2016 at 5:23 PM, Julian Hyde <jh...@apache.org> > >>> wrote: > >>>>>> > >>>>>>> I was wrong earlier… FrameworkConfig already has a > getConvertletTable > >>>>>>> method. But regarding using FrameworkConfig from within the JDBC > >>> driver, > >>>>>>> It’s complicated. FrameworkConfig only works if you are “outside” > >>>>> Calcite, > >>>>>>> whereas CalcitePrepare is when you are customizing from the inside, > >>> and > >>>>>>> sadly CalcitePrepare does not use a FrameworkConfig. > >>>>>>> > >>>>>>> Compare and contrast: > >>>>>>> * CalcitePrepareImpl.getSqlToRelConverter [ > >>> https://github.com/apache/ > >>>>>>> calcite/blob/3f92157d5742dd10f3b828d22d7a753e0a2899cc/core/ > >>>>> src/main/java/ > >>>>>>> org/apache/calcite/prepare/CalcitePrepareImpl.java#L1114 < > >>>>>>> https://github.com/apache/calcite/blob/3f92157d5742dd10f3b82 > >>> 8d22d7a75 > >>>>>>> 3e0a2899cc/core/src/main/java/org/apache/calcite/prepare/ > >>>>>>> CalcitePrepareImpl.java#L1114> ] > >>>>>>> * PlannerImpl.rel [ https://github.com/apache/calcite/blob/ > >>>>>>> 105bba1f83cd9631e8e1211d262e4886a4a863b7/core/src/main/java/ > >>>>>>> org/apache/calcite/prepare/PlannerImpl.java#L225 < > >>>>>>> https://github.com/apache/calcite/blob/105bba1f83cd9631e8e12 > >>> 11d262e48 > >>>>>>> 86a4a863b7/core/src/main/java/org/apache/calcite/prepare/ > >>>>>>> PlannerImpl.java#L225> ] > >>>>>>> > >>>>>>> The latter uses a convertletTable sourced from a FrameworkConfig. > >>>>>>> > >>>>>>> The ideal thing would be to get CalcitePrepareImpl to use a > >>> PlannerImpl > >>>>> to > >>>>>>> do its dirty work. Then “inside” and “outside” would work the same. > >>>>> Would > >>>>>>> definitely appreciate that as a patch. > >>>>>>> > >>>>>>> If you choose to go the JDBC driver route, you could override > >>>>>>> Driver.createPrepareFactory to produce a sub-class of > CalcitePrepare > >>>>> that > >>>>>>> works for your environment, one with an explicit convertletTable > >>> rather > >>>>>>> than just using the default. > >>>>>>> > >>>>>>> Julian > >>>>>>> > >>>>>>> > >>>>>>>> On Nov 17, 2016, at 5:01 PM, Gian Merlino <g...@imply.io> wrote: > >>>>>>>> > >>>>>>>> Hey Julian, > >>>>>>>> > >>>>>>>> If the convertlets were customizable with a FrameworkConfig, how > >>> would > >>>>> I > >>>>>>>> use that configure the JDBC driver (given that I'm doing it with > the > >>>>> code > >>>>>>>> upthread)? Or would that suggest using a different approach to > >>>>> embedding > >>>>>>>> Calcite? > >>>>>>>> > >>>>>>>> Gian > >>>>>>>> > >>>>>>>> On Thu, Nov 17, 2016 at 4:02 PM, Julian Hyde <jh...@apache.org> > >>> wrote: > >>>>>>>> > >>>>>>>>> Convertlets have a similar effect to planner rules (albeit they > >>> act on > >>>>>>>>> scalar expressions, not relational expressions) so people should > be > >>>>>>> able to > >>>>>>>>> change the set of active convertlets. > >>>>>>>>> > >>>>>>>>> Would you like to propose a change that makes the convertlet > table > >>>>>>>>> pluggable? Maybe as part of FrameworkConfig? Regardless, please > >>> log a > >>>>>>> JIRA > >>>>>>>>> to track this. > >>>>>>>>> > >>>>>>>>> And by the way, RexImpTable, which defines how operators are > >>>>> implemented > >>>>>>>>> by generating java code, should also be pluggable. It’s been on > my > >>>>> mind > >>>>>>> for > >>>>>>>>> a long time to allow the “engine” — related to the data format, > and > >>>>> how > >>>>>>>>> code is generated to access fields and evaluate expressions and > >>>>>>> operators — > >>>>>>>>> to be pluggable. > >>>>>>>>> > >>>>>>>>> Regarding whether the JDBC driver is the right way to embed > >>> Calcite. > >>>>>>>>> There’s no easy answer. You might want to embed Calcite as a > >>> library > >>>>> in > >>>>>>>>> your own server (as Drill and Hive do). Or you might want to make > >>>>>>> yourself > >>>>>>>>> just an adapter that runs inside a Calcite JDBC server (as the > CSV > >>>>>>> adapter > >>>>>>>>> does). Or something in the middle, like what Phoenix does: using > >>>>> Calcite > >>>>>>>>> for JDBC, SQL, planning, but with your own metadata and runtime > >>>>> engine. > >>>>>>>>> > >>>>>>>>> As long as you build the valuable stuff into planner rules, new > >>>>>>> relational > >>>>>>>>> operators (if necessary) and use the schema SPI, you should be > >>> able to > >>>>>>>>> change packaging in the future. > >>>>>>>>> > >>>>>>>>> Julian > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On Nov 17, 2016, at 1:59 PM, Gian Merlino <g...@imply.io> > wrote: > >>>>>>>>>> > >>>>>>>>>> Hey Calcites, > >>>>>>>>>> > >>>>>>>>>> I'm working on embedding Calcite into Druid (http://druid.io/, > >>>>>>>>>> https://github.com/druid-io/druid/pull/3682) and am running > into > >>> a > >>>>>>>>> problem > >>>>>>>>>> that is making me wonder if the approach I'm using makes sense. > >>>>>>>>>> > >>>>>>>>>> Consider the expression EXTRACT(YEAR FROM __time). Calcite has a > >>>>>>> standard > >>>>>>>>>> convertlet rule "convertExtract" that changes this into some > >>>>> arithmetic > >>>>>>>>> on > >>>>>>>>>> __time casted to an int type. But Druid has some builtin > >>> functions to > >>>>>>> do > >>>>>>>>>> this, and I'd rather use those than arithmetic (for a bunch of > >>>>>>> reasons). > >>>>>>>>>> Ideally, in my RelOptRules that convert Calcite rels to Druid > >>>>> queries, > >>>>>>>>> I'd > >>>>>>>>>> see the EXTRACT as a normal RexCall with the time flag and an > >>>>>>> expression > >>>>>>>>> to > >>>>>>>>>> apply it to. That's a lot easier to translate than the > arithmetic > >>>>>>> stuff, > >>>>>>>>>> which I'd have to pattern match and undo first before > translating. > >>>>>>>>>> > >>>>>>>>>> So the problem I have is that I want to disable convertExtract, > >>> but I > >>>>>>>>> don't > >>>>>>>>>> see a way to do that or to swap out the convertlet table. > >>>>>>>>>> > >>>>>>>>>> The code I'm using to set up a connection is: > >>>>>>>>>> > >>>>>>>>>> public CalciteConnection createCalciteConnection( > >>>>>>>>>> final DruidSchema druidSchema > >>>>>>>>>> ) throws SQLException > >>>>>>>>>> { > >>>>>>>>>> final Properties props = new Properties(); > >>>>>>>>>> props.setProperty("caseSensitive", "true"); > >>>>>>>>>> props.setProperty("unquotedCasing", "UNCHANGED"); > >>>>>>>>>> final Connection connection = > >>>>>>>>>> DriverManager.getConnection("jdbc:calcite:", props); > >>>>>>>>>> final CalciteConnection calciteConnection = > >>>>>>>>>> connection.unwrap(CalciteConnection.class); > >>>>>>>>>> calciteConnection.getRootSchema().setCacheEnabled(false); > >>>>>>>>>> calciteConnection.getRootSchema().add(DRUID_SCHEMA_NAME, > >>>>>>>>> druidSchema); > >>>>>>>>>> return calciteConnection; > >>>>>>>>>> } > >>>>>>>>>> > >>>>>>>>>> This CalciteConnection is then used by the Druid HTTP server to > >>>>> offer a > >>>>>>>>> SQL > >>>>>>>>>> API. > >>>>>>>>>> > >>>>>>>>>> Is there some way to swap out the convertlet table that I'm > >>> missing? > >>>>>>>>>> > >>>>>>>>>> Also, just in general, am I going about this the right way? Is > >>> using > >>>>>>> the > >>>>>>>>>> JDBC driver the right way to embed Calcite? Or should I be > calling > >>>>> into > >>>>>>>>> it > >>>>>>>>>> at some lower level? > >>>>>>>>>> > >>>>>>>>>> Thanks! > >>>>>>>>>> > >>>>>>>>>> Gian > >>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>>> > >>>>> > >>>>> > >>> > >>> > >> > >