Re: PR Review Request

2022-06-22 Thread Timo Walther

Hi everyone,

This is a really great discussion. Thanks for starting it Martijn and 
your input Jacques! I have been fighting against forking Calcite in 
Flink for years already. Even when merging forks of Flink that 
transitively forked Calcite, in the end we were able to resolve 
conflicts / contribute blockers back into Calcite. And I strongly 
believe that this is the better approach for long-term success for both 
projects.


I would like to get more involved in the Calcite community. I have been 
implementing and managing Flink SQL based on Calcite since 2016. Thus, I 
feel confident to say that I know the code base and some quirks in the 
stack very well.


Capacity-wise I will try to reserve some time for helping the Calcite 
community. Happy to get some pointers where and how I can help.


I will take a look at https://github.com/apache/calcite/pull/2606 this 
week to get the ball rolling, as it is an important addition that 
prepares for "custom SQL operators" in Flink SQL.


Regards,
Timo

On 21.06.22 22:18, Charles Givre wrote:

As the PMC for Apache Drill, I'd echo everyone's comments here: Don't fork.
Don't do it.

Apache Drill forked Calcite several years ago, when Calcite was on version 1.20 or 1.21.
While this meant that some bugs were easily fixed, it also meant that as our fork
diverged from "regular" Calcite, it became harder and harder to maintain. It
also meant that we were chasing bugs that had since been fixed.

Drill is in the process of "de-forking" Calcite, meaning that we're ditching 
our fork and re-integrating with standard Calcite.  It has been A TON of work and we have 
contributed (and will continue to contribute) bug fixes and PRs to Calcite. In the long 
run, I think this will be beneficial for both communities.

Best,
-- C



On Jun 21, 2022, at 1:57 PM, Julian Hyde  wrote:

Please don’t fork Calcite.

Calcite suffers from the tragedy of the commons. Unlike many open source data 
projects, there is no commercial project that directly maps to Calcite (even 
though Calcite is an essential part of many projects). As a result no engineers 
work full-time on Calcite.

It takes more than pull requests to keep a project going. We need reviewers, 
people to work on releases, people to fix bugs (such as security bugs) that are 
important to everyone but urgent to no one.

We have plenty of committers in Calcite, and add several more per year. We rely 
on those committers taking on their share of the housework, but the burden 
falls on too few people.

Engineering managers need to start paying a little more for the “free lunch” 
that they enjoy when Calcite “just works” in their project. Sadly, most 
engineering managers are not subscribed to this list.

Julian



On Jun 21, 2022, at 9:49 AM, Jacques Nadeau  wrote:

Martijn, thanks for sharing that thread in the Flink community.

I'm someone who has forked Calcite twice: once in Apache Drill and again in
Dremio. In both cases, it was all about trading short term benefits against
long term costs. In both cases, I think the net amount of work was probably
5x as much as what it would have been if we had just done a better job
engaging the community. If I were to state the curve of behavior over six
years, I'd guess that in both cases the numbers of effort looked like this:

estimated effort doing high intensity integration with calcite (years 1-6)
fork: 1, 5, 10, 50, 100, 200, total = 366
non-fork: 10, 10, 10, 10, 10, total = 50

So yes, the first couple years you're ahead. But you pay a massive
technical debt premium long term. Early in a project (Drill) or company's
life (Dremio), it can make sense to sacrifice long term for short term but
it's important people do it with their eyes open.
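
To make the crossover concrete, the per-year figures quoted above can be accumulated year by year. This is only a sketch of the arithmetic: the lists are copied from the estimate as written (the non-fork line lists five values, so the comparison stops where that list ends), and the names `fork`, `non_fork`, and `first_year_fork_costs_more` are made up for illustration.

```python
from itertools import accumulate

# Per-year effort estimates copied from the message above (arbitrary units).
fork = [1, 5, 10, 50, 100, 200]
non_fork = [10, 10, 10, 10, 10]  # five values, as written in the message

# Running totals: effort spent by the end of each year.
fork_total = list(accumulate(fork))
non_fork_total = list(accumulate(non_fork))


def first_year_fork_costs_more(fork_cum, non_fork_cum):
    """Return the first 1-based year in which the fork's cumulative effort
    exceeds the non-fork's, or None if that never happens in the data."""
    for year, (f, n) in enumerate(zip(fork_cum, non_fork_cum), start=1):
        if f > n:
            return year
    return None


crossover = first_year_fork_costs_more(fork_total, non_fork_total)
print(fork_total[-1], non_fork_total[-1], crossover)
```

With these numbers the fork is cheaper in cumulative terms for the first three years and becomes the more expensive option from the fourth year on.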

The reason that this pain is so high is that as your codebases diverge, you
start having to do everything the Calcite community does by yourself.
Backports become harder and things that you need (e.g. new SQL syntax)
have to be reimplemented (even if someone else already implemented them in
some post-fork Calcite version). Ultimately, at some point you realize that
your path is untenable and you unfork. This becomes the biggest expense of
them all and I believe both of those teams are still trying to un-fork. The
additional thing that becomes an even bigger problem is your absence from
the Calcite community means that people may take the project or APIs in
ways that are in direct conflict to how you use the library. Since you're
not active in the project, you fail to provide a counterpoint and then
you're basically just in a miserable place. The Hive project did this best
by ensuring that releases of Calcite were also run pre-release against Hive
to make sure no major regressions occurred. Being active in the
community is the best state from my pov. (It makes your project better
and Calcite better.)

Two last notes:
- I'm not sure the rocks fork is comparable to forking Calcite. The api
surface area and 

Re: [DISCUSS] Deprecate grouped window functions

2020-04-30 Thread Timo Walther

Thanks for considering our needs.

I'm pretty sure that windows are in almost every streaming pipeline with 
aggregations. Unlike regular Java API, SQL syntax is very difficult to 
deprecate.


We usually give Flink users 1-2 releases to update their code. Once 
Calcite supports polymorphic table functions, I think 6 months would be 
helpful; otherwise we would need to maintain our own fork, which we have 
mostly been able to avoid so far.


Regards,
Timo

On 29.04.20 00:49, Rui Wang wrote:

Agreed. I would like to get more feedback to have a
reasonable accommodation for users.


-Rui

On Mon, Apr 27, 2020 at 11:50 AM Julian Hyde  wrote:


Changing my +1 to +0. We have to make reasonable accommodations for our
users. Glad we had this discussion.


On Apr 24, 2020, at 11:10 AM, Rui Wang  wrote:

Hi Timo,

My intention is to fully drop concepts such as SqlGroupedWindowFunction and
auxiliary group functions, which includes the relevant code in the parser/syntax,
operator, planner, etc.

You mentioned the need for more time to migrate. How many Calcite
releases do you think would leave enough buffer time? (Calcite
schedules 4 releases a year, so 2 releases would give 6 months.)


-Rui

On Fri, Apr 24, 2020 at 1:50 AM Timo Walther  wrote:


Hi everyone,

so far Apache Flink depends on this feature. We are fine with improving
the SQL compliance and eventually dropping GROUP BY TUMBLE/HOP/SESSION
in the future. However, we would like to give our users some time to
migrate their existing pipelines.

What does dropping mean for Calcite? Will users of Calcite be able to
still support this syntax? In particular, are you intending to also drop
concepts such as SqlGroupedWindowFunction and auxiliary group functions?
Or are you intending to just remove entries from Calcite's default
operator table?

Regards,
Timo


On 24.04.20 10:30, Julian Hyde wrote:

+1

Let’s remove TUMBLE etc. from the GROUP BY clause. Since this is a SQL
change, not an API change, I don’t think we need to give notice. Let’s just
do it.


Julian


On Apr 22, 2020, at 4:05 PM, Rui Wang  wrote:

I made a mistake in the example above; here is the updated version:

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
GROUP BY product_id, window_start
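
As a rough illustration of what the table function form does, the sketch below mimics it in Python: every input row gets `window_start` and `window_end` appended, and the aggregation is then an ordinary GROUP BY over `product_id` and `window_start`. The `tumble` helper, the sample rows, and the use of epoch seconds for `rowtime` are assumptions made for illustration; this is not Calcite's implementation.

```python
from collections import Counter

HOUR = 3600  # window size in seconds, standing in for INTERVAL '1' hour


def tumble(rows, time_col, size):
    """Append window_start/window_end to each row, which is conceptually
    what TABLE(TUMBLE(t, DESCRIPTOR(time_col), size)) produces."""
    for row in rows:
        start = row[time_col] - row[time_col] % size
        yield {**row, "window_start": start, "window_end": start + size}


# Hypothetical sample of the "order" table; rowtime is in epoch seconds.
orders = [
    {"product_id": "a", "rowtime": 10},
    {"product_id": "a", "rowtime": 3500},
    {"product_id": "a", "rowtime": 3700},
    {"product_id": "b", "rowtime": 20},
]

# Equivalent of: SELECT product_id, count(*), window_start ...
#                GROUP BY product_id, window_start
counts = Counter(
    (r["product_id"], r["window_start"]) for r in tumble(orders, "rowtime", HOUR)
)
print(sorted(counts.items()))
```

Because the window columns are plain columns on the output rows, the same rows could also feed a JOIN without any GROUP BY.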


On Wed, Apr 22, 2020 at 2:31 PM Rui Wang wrote:


Hi community,

I want to kick off a discussion about deprecating grouped window functions
(GROUP BY TUMBLE/HOP/SESSION) as the table function windowing support
becomes a thing [1] (FROM TABLE(TUMBLE/HOP/SESSION)). The current state of
table function windowing is that TUMBLE support is checked in. HOP and SESSION
support is likely to be merged in 1.23.0.

A brief example of the two windowing syntaxes:

// Grouped window functions.
SELECT
   product_id, count(*), TUMBLE_START() as window_start
FROM order
GROUP BY product_id, TUMBLE(rowtime, INTERVAL '1' hour); // an hour-long fixed window size

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(.rowtime), INTERVAL '1' hour)
GROUP BY product_id

Here is a short, selective comparison:

Where table function windowing behaves better:
1) No GROUPING/GROUP BY is enforced. Enforced grouping becomes a problem in
streaming JOINs. For example, one use case is to apply a JOIN on two streams
for each hour. In this case, no GROUP BY is needed.
2) Grouped window functions allow multiple calls in GROUP BY. For example,
from a SQL syntax perspective, GROUP BY TUMBLE(...), HOP(...), SESSION(...)
is not wrong, but it is an illegal query.
3) Calcite includes an Enumerable implementation of table function
windowing, while grouped window functions do not have that.


Where table function windowing behaves worse:
1) Table function windowing adds "window_start" and "window_end" to the table
directly, which increases the volume of data (number of rows *
sizeof(timestamp) * 2).


I want to focus on two questions in this thread:
1) Do people support deprecating grouped window functions?
2) If the answer to 1) is yes, by which version would people prefer grouped
window functions to be completely removed?



[1]: https://jira.apache.org/jira/browse/CALCITE-3271


-Rui













Re: [DISCUSS] Deprecate grouped window functions

2020-04-24 Thread Timo Walther

Hi everyone,

so far Apache Flink depends on this feature. We are fine with improving 
the SQL compliance and eventually dropping GROUP BY TUMBLE/HOP/SESSION 
in the future. However, we would like to give our users some time to 
migrate their existing pipelines.


What does dropping mean for Calcite? Will users of Calcite be able to 
still support this syntax? In particular, are you intending to also drop 
concepts such as SqlGroupedWindowFunction and auxiliary group functions? 
Or are you intending to just remove entries from Calcite's default 
operator table?


Regards,
Timo


On 24.04.20 10:30, Julian Hyde wrote:

+1

Let’s remove TUMBLE etc. from the GROUP BY clause. Since this is a SQL change, 
not an API change, I don’t think we need to give notice. Let’s just do it.

Julian


On Apr 22, 2020, at 4:05 PM, Rui Wang  wrote:

I made a mistake in the example above; here is the updated version:

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
GROUP BY product_id, window_start


On Wed, Apr 22, 2020 at 2:31 PM Rui Wang  wrote:

Hi community,

I want to kick off a discussion about deprecating grouped window functions
(GROUP BY TUMBLE/HOP/SESSION) as the table function windowing support
becomes a thing [1] (FROM TABLE(TUMBLE/HOP/SESSION)). The current state of
table function windowing is that TUMBLE support is checked in. HOP and SESSION
support is likely to be merged in 1.23.0.

A brief example of the two windowing syntaxes:

// Grouped window functions.
SELECT
   product_id, count(*), TUMBLE_START() as window_start
FROM order
GROUP BY product_id, TUMBLE(rowtime, INTERVAL '1' hour); // an hour-long fixed window size

// Table function windowing syntax.
SELECT
product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(.rowtime), INTERVAL '1' hour)
GROUP BY product_id

Here is a short, selective comparison:

Where table function windowing behaves better:
1) No GROUPING/GROUP BY is enforced. Enforced grouping becomes a problem in
streaming JOINs. For example, one use case is to apply a JOIN on two streams
for each hour. In this case, no GROUP BY is needed.
2) Grouped window functions allow multiple calls in GROUP BY. For example,
from a SQL syntax perspective, GROUP BY TUMBLE(...), HOP(...), SESSION(...)
is not wrong, but it is an illegal query.
3) Calcite includes an Enumerable implementation of table function
windowing, while grouped window functions do not have that.


Where table function windowing behaves worse:
1) Table function windowing adds "window_start" and "window_end" to the table
directly, which increases the volume of data (number of rows *
sizeof(timestamp) * 2).
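
The overhead in 1) can be estimated directly from that formula. This is a back-of-the-envelope sketch only: the 8-byte timestamp width is an assumption (the actual width depends on the engine), and the row count and the name `extra_window_bytes` are hypothetical.

```python
TIMESTAMP_BYTES = 8  # assumed size of one timestamp value


def extra_window_bytes(num_rows, timestamp_bytes=TIMESTAMP_BYTES):
    """Extra data from adding window_start and window_end to every row:
    number of rows * sizeof(timestamp) * 2."""
    return num_rows * timestamp_bytes * 2


# Hypothetical example: one million input rows.
print(extra_window_bytes(1_000_000))
```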


I want to focus on two questions in this thread:
1) Do people support deprecating grouped window functions?
2) If the answer to 1) is yes, by which version would people prefer grouped
window functions to be completely removed?



[1]: https://jira.apache.org/jira/browse/CALCITE-3271


-Rui





Re: [ANNOUNCE] New committer: Sergey Nuyanzin

2018-07-23 Thread Timo Walther

Congratulations Sergey!


On 23.07.18 at 13:33, Sergey Nuyanzin wrote:

Thanks everyone for kind words!
It is a great honor for me to become a Calcite committer.

A little bit about myself:
For the past few months I have been focusing on Calcite and Flink. I am still
learning Calcite's and Avatica's internals and having fun with their
different parts.
I look forward to continuing to contribute to the project, and I hope to
expand my contributions as I become more familiar with
Avatica and Calcite.


On Sat, Jul 21, 2018 at 4:51 AM, Francis Chuang wrote:


Congrats, Sergey!


On 21/07/2018 10:45 AM, Michael Mior wrote:


Congratulations Sergey and thank you for your continued contributions!
--
Michael Mior
mm...@apache.org
On Fri, Jul 20, 2018 at 19:19, Julian Hyde wrote:


The PMC has just invited Sergey Nuyanzin to be a committer, and he has
accepted. Over the past few months, Sergey has made several
contributions to Avatica, to Calcite's built-in functions, and to the
JDBC adapter.

Sergey, welcome! Feel free to tell us a little about yourself.

Julian









Re: Confusion about the GeoFunctions

2018-05-24 Thread Timo Walther

Hi Julian,

the Flink community is very thankful for the OpenGIS efforts done by the 
Calcite community and I think both projects can benefit from it. As 
Xingcan mentioned, we are thinking about contributing a GeoOperatorTable 
similar to SqlStdOperatorTable. We don't want to reimplement the 
functions. The only concern that I raised was about exposing the data 
type through Flink. Instead of exposing 
`org.apache.calcite.runtime.GeoFunctions.Geom` as a type to our users, I 
was thinking about a Flink specific type that implements the interface. 
Flink also needs to implement serializers for this type.


We saw that GeoFunctions has a `protected bind()` method and were 
wondering if one could open the binding/type creation and make it more 
flexible.


Regards,
Timo


On 23.05.18 at 19:19, Julian Hyde wrote:

I am wondering why it is necessary to have a different geometry type in Flink 
than in Calcite.

If you think that the implementation in Calcite sucks, then rather than making 
a different one for Flink, how about making a different one for Calcite, and 
use that in Flink? I’m not super-proud of the implementation in Calcite; it was 
the best I could do given the time available.

In Calcite we are extremely short of development resources. All of the spatial 
work in Calcite has been done by me, on my own time. It is depressing to see 
someone use it and immediately decide they are going to re-implement it all in 
their own project.

Julian




On May 22, 2018, at 8:28 PM, Xingcan Cui  wrote:

Hi all,

Recently, the Flink community has been aiming to add some OpenGIS functions (see
FLINK-9219) provided in CALCITE-1968. For some
reasons, we plan to implement some Flink Geom types (as illustrated in this
comment), but the private constructor in `o.a.c.r.GeoFunctions` makes that impossible.

We are confused about the protected method `bind()`, as well as other
private modifiers in `o.a.c.r.GeoFunctions`. I wonder if someone could help
by explaining them. Are we heading in the right direction?

Besides, maybe there should be a new Geom operand type in
`o.a.c.s.t.OperandTypes`?

Thanks,
Xingcan





[jira] [Created] (CALCITE-1867) Allow creating additional SqlGroupFunctions

2017-06-30 Thread Timo Walther (JIRA)
Timo Walther created CALCITE-1867:
-

 Summary: Allow creating additional SqlGroupFunctions
 Key: CALCITE-1867
 URL: https://issues.apache.org/jira/browse/CALCITE-1867
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Timo Walther
Assignee: Julian Hyde


In Flink we want to create additional group auxiliary functions (such as 
{{TUMBLE_ROWTIME(), TUMBLE_PROCTIME()}}). 

Unfortunately, {{SqlGroupFunction}} and its methods are package-private, which 
prevents us from adding custom functions. {{AuxiliaryConverter}} is also 
limiting because it is statically defined and not pluggable.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Nested TUMBLE/HOP/SESSION windows

2017-05-17 Thread Timo Walther

Hi everyone,

we are very happy to support TUMBLE/HOP/SESSION in our upcoming Flink 
1.3 release. However, there are some problems regarding nested window 
queries that we would like to discuss with the Calcite community.


Take the following query:

SELECT
  rowtime, SUM(x)
FROM (
  SELECT
 TUMBLE_END(rowtime, INTERVAL '2' MINUTE) AS rowtime,
 MIN(x) AS x
  FROM MyTable
  GROUP BY TUMBLE(rowtime, INTERVAL '2' MINUTE)
)
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR)


Initially, we thought that we could use the xxx_END() group auxiliary 
functions to define the rowtime for the upper query. However, according 
to http://calcite.apache.org/docs/stream.html, TUMBLE_END should return 
the timestamp of the exclusive window end, i.e., for a window of 1 hour 
that contains all elements from 12:00:00.000 until 12:59:59.999 
(inclusive), TUMBLE_END would return 13:00:00.000. The problem is that 
Flink uses the inclusive window end as the new timestamp. The reason is 
that if you pre-aggregate with a window, say 5-minute windows that are 
later aggregated into 1-hour windows, then with an exclusive end the last 
5-minute window (from 12:55:00.000 until 12:59:59.999 incl.) would have a 
timestamp of 13:00:00.000 and fall into the next window starting at 
13:00:00.000.
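
The effect described above can be reproduced with a few lines of timestamp arithmetic. The following is a minimal sketch, assuming millisecond precision and windows aligned to the Unix epoch; the helper names and the sample date are made up for illustration and do not correspond to Calcite or Flink APIs.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def tumble_start(ts, size):
    """Start of the tumbling window of length `size` that contains `ts`."""
    return ts - (ts - EPOCH) % size


def exclusive_end(ts, size):
    """Exclusive end of the window containing `ts` (what TUMBLE_END returns)."""
    return tumble_start(ts, size) + size


def inclusive_end(ts, size):
    """Last timestamp still inside the window (millisecond precision assumed)."""
    return exclusive_end(ts, size) - timedelta(milliseconds=1)


five_minutes = timedelta(minutes=5)
one_hour = timedelta(hours=1)

# Hypothetical event inside the last 5-minute window of the 12:00 hour.
event = datetime(2017, 5, 17, 12, 57, 30)

# Which 1-hour window does the pre-aggregated row land in?
outer_from_exclusive = tumble_start(exclusive_end(event, five_minutes), one_hour)
outer_from_inclusive = tumble_start(inclusive_end(event, five_minutes), one_hour)

print(outer_from_exclusive.time(), outer_from_inclusive.time())
```

With the exclusive end as the new rowtime, the 12:55 window is assigned to the hour starting at 13:00; with the inclusive end it stays in the hour starting at 12:00.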



The question is how Calcite is planning to support nested windows. Right 
now we see the following options:


- TUMBLE_END returns the inclusive window end

- we introduce an additional group auxiliary function for the inclusive 
window end like: SELECT TUMBLE_TIME(rowtime, INTERVAL '2' MINUTE) AS 
rowtime ...


- we allow references to the window in the select: SELECT 
TUMBLE(rowtime, INTERVAL '1' HOUR) AS rowtime ...


What do you think?


Regards,

Timo






Re: TUMBLE/HOP/SESSION_START/END do not resolve time field correctly

2017-04-25 Thread Timo Walther
I created CALCITE-1761 for this issue. If you wanna take a look at the 
exception, this is my branch:


https://github.com/twalthr/flink/tree/FLINK-5884
The following test fails right now: 
org.apache.flink.table.api.scala.stream.sql.WindowAggregateTest#testTumbleFunction


Once we have a fix for that we can copy the class to Flink until the 
next Calcite release. We did that with other issues in the past, too.



On 25/04/17 at 20:12, Timo Walther wrote:
Thanks for your quick response. Flink does not use the monotonicity 
property yet and we are also not using the STREAM keyword. Could 
this be a problem?



On 25/04/17 at 19:39, Julian Hyde wrote:

I've added a test case (and the test missed in CALCITE-1615) in
https://github.com/julianhyde/calcite/tree/-hop. I cannot
reproduce your problem. Please still log a jira case.

On Tue, Apr 25, 2017 at 10:13 AM, Julian Hyde <jh...@apache.org> wrote:

I just noticed that in
https://issues.apache.org/jira/browse/CALCITE-1615 tests were added to
SqlToRelConverterTest.xml but not to SqlToRelConverterTest.java. We've
been running without tests.

On Tue, Apr 25, 2017 at 10:01 AM, Julian Hyde <jh...@apache.org> wrote:

Can you log a bug please? I will help out if I can.

When this is fixed, I presume you will need a Calcite release at the
appropriate time so that you can release Flink. Can you start a
separate email thread when you know that timing?

On Tue, Apr 25, 2017 at 7:13 AM, Timo Walther <twal...@apache.org> wrote:

Hi all,


I'm working on integrating START and END for TUMBLE/HOP/SESSION in Flink SQL
with logical time indicator columns (e.g. rowtime, proctime). It seems there
is a bug in the resolution logic of SqlToRelConverter. Since our feature
freeze is next week and this feature should be part of Flink 1.3, it would
be great if you can help me with at least a hint for a quick fix.


The problem is as follows:


Input: MyTable(INTEGER a, VARCHAR b, BIGINT c, TIMESTAMP proctime, TIMESTAMP
rowtime)

SQL: SELECT COUNT(*), TUMBLE_START(rowtime, INTERVAL '15' MINUTE),
TUMBLE_END(rowtime, INTERVAL '15' MINUTE) FROM MyTable GROUP BY
TUMBLE(rowtime, INTERVAL '15' MINUTE)

Exception:

java.lang.RuntimeException: while converting TUMBLE_START(`MyTable`.`rowtime`, INTERVAL '15' MINUTE)
 at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:134)
 at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:61)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4415)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3783)
 at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:137)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4317)
 at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2723)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2541)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:654)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:616)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2951)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552)
 .
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:131)
 ... 42 more
Caused by: java.lang.AssertionError
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:4132)
 at org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3446)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3421)
 at org.apache.calcite.sql2rel.SqlToRelConverter.access$1800(SqlToRelConverter.java:207)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4424)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java




The tests in Calcite only cover the case where rowtime is at the beginning
of a row. Once rowtime is somewhere else, the indices are messed up. I tried
to debug it, but I'm stuck at SqlToRelConverter#convertIdentifier().


Any help is very welcome.


Regards,

Timo





[jira] [Created] (CALCITE-1761) TUMBLE/HOP/SESSION_START/END do not resolve time field correctly

2017-04-25 Thread Timo Walther (JIRA)
Timo Walther created CALCITE-1761:
-

 Summary: TUMBLE/HOP/SESSION_START/END do not resolve time field correctly
 Key: CALCITE-1761
 URL: https://issues.apache.org/jira/browse/CALCITE-1761
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Timo Walther
Assignee: Julian Hyde


It seems there is a bug in the resolution logic of SqlToRelConverter.

The problem is as follows:

Input: {{MyTable(INTEGER a, VARCHAR b, BIGINT c, TIMESTAMP proctime, TIMESTAMP 
rowtime)}}

SQL: {{SELECT COUNT(*), TUMBLE_START(rowtime, INTERVAL '15' MINUTE), 
TUMBLE_END(rowtime, INTERVAL '15' MINUTE) FROM MyTable GROUP BY TUMBLE(rowtime, 
INTERVAL '15' MINUTE)}}

Exception:

{code}
java.lang.RuntimeException: while converting TUMBLE_START(`MyTable`.`rowtime`, INTERVAL '15' MINUTE)
at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:134)
at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:61)
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4415)
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3783)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:137)
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4317)
at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2723)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2541)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:654)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:616)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2951)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552)
.
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:131)
... 42 more
Caused by: java.lang.AssertionError
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:4132)
at org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3446)
at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3421)
at org.apache.calcite.sql2rel.SqlToRelConverter.access$1800(SqlToRelConverter.java:207)
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4424)
at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java
{code}

Additionally, tests were added to
SqlToRelConverterTest.xml but not to SqlToRelConverterTest.java. We've
been running without tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: TUMBLE/HOP/SESSION_START/END do not resolve time field correctly

2017-04-25 Thread Timo Walther
Thanks for your quick response. Flink does not use the monotonicity 
property yet and we are also not using the STREAM keyword. Could 
this be a problem?



On 25/04/17 at 19:39, Julian Hyde wrote:

I've added a test case (and the test missed in CALCITE-1615) in
https://github.com/julianhyde/calcite/tree/-hop. I cannot
reproduce your problem. Please still log a jira case.

On Tue, Apr 25, 2017 at 10:13 AM, Julian Hyde <jh...@apache.org> wrote:

I just noticed that in
https://issues.apache.org/jira/browse/CALCITE-1615 tests were added to
SqlToRelConverterTest.xml but not to SqlToRelConverterTest.java. We've
been running without tests.

On Tue, Apr 25, 2017 at 10:01 AM, Julian Hyde <jh...@apache.org> wrote:

Can you log a bug please? I will help out if I can.

When this is fixed, I presume you will need a Calcite release at the
appropriate time so that you can release Flink. Can you start a
separate email thread when you know that timing?

On Tue, Apr 25, 2017 at 7:13 AM, Timo Walther <twal...@apache.org> wrote:

Hi all,


I'm working on integrating START and END for TUMBLE/HOP/SESSION in Flink SQL
with logical time indicator columns (e.g. rowtime, proctime). It seems there
is a bug in the resolution logic of SqlToRelConverter. Since our feature
freeze is next week and this feature should be part of Flink 1.3, it would
be great if you can help me with at least a hint for a quick fix.


The problem is as follows:


Input: MyTable(INTEGER a, VARCHAR b, BIGINT c, TIMESTAMP proctime, TIMESTAMP
rowtime)

SQL: SELECT COUNT(*), TUMBLE_START(rowtime, INTERVAL '15' MINUTE),
TUMBLE_END(rowtime, INTERVAL '15' MINUTE) FROM MyTable GROUP BY
TUMBLE(rowtime, INTERVAL '15' MINUTE)

Exception:

java.lang.RuntimeException: while converting TUMBLE_START(`MyTable`.`rowtime`, INTERVAL '15' MINUTE)
 at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:134)
 at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:61)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4415)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3783)
 at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:137)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4317)
 at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2723)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2541)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:654)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:616)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2951)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552)
 .
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:131)
 ... 42 more
Caused by: java.lang.AssertionError
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:4132)
 at org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3446)
 at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3421)
 at org.apache.calcite.sql2rel.SqlToRelConverter.access$1800(SqlToRelConverter.java:207)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4424)
 at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java


The tests in Calcite only cover the case where rowtime is at the beginning
of a row. Once rowtime is somewhere else, the indices are messed up. I tried
to debug it, but I'm stuck at SqlToRelConverter#convertIdentifier().


Any help is very welcome.


Regards,

Timo





TUMBLE/HOP/SESSION_START/END do not resolve time field correctly

2017-04-25 Thread Timo Walther

Hi all,


I'm working on integrating START and END for TUMBLE/HOP/SESSION in Flink 
SQL with logical time indicator columns (e.g. rowtime, proctime). It 
seems there is a bug in the resolution logic of SqlToRelConverter. Since 
our feature freeze is next week and this feature should be part of Flink 
1.3, it would be great if you can help me with at least a hint for a 
quick fix.



The problem is as follows:


Input: MyTable(INTEGER a, VARCHAR b, BIGINT c, TIMESTAMP proctime, 
TIMESTAMP rowtime)


SQL: SELECT COUNT(*), TUMBLE_START(rowtime, INTERVAL '15' MINUTE), 
TUMBLE_END(rowtime, INTERVAL '15' MINUTE) FROM MyTable GROUP BY 
TUMBLE(rowtime, INTERVAL '15' MINUTE)


Exception:

java.lang.RuntimeException: while converting TUMBLE_START(`MyTable`.`rowtime`, INTERVAL '15' MINUTE)
    at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:134)
    at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:61)
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4415)
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3783)
    at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:137)
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4317)
    at org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2723)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2541)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:654)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:616)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2951)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:552)
    .
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:131)
    ... 42 more
Caused by: java.lang.AssertionError
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.getRootField(SqlToRelConverter.java:4132)
    at org.apache.calcite.sql2rel.SqlToRelConverter.adjustInputRef(SqlToRelConverter.java:3446)
    at org.apache.calcite.sql2rel.SqlToRelConverter.convertIdentifier(SqlToRelConverter.java:3421)
    at org.apache.calcite.sql2rel.SqlToRelConverter.access$1800(SqlToRelConverter.java:207)
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4424)
    at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java



The tests in Calcite only cover the case where rowtime is at the 
beginning of a row. Once rowtime is somewhere else, the indices are 
messed up. I tried to debug it, but I'm stuck at 
SqlToRelConverter#convertIdentifier().
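
For reference, the values the query above is expected to produce once the 
conversion succeeds can be sketched in plain Java. This is only a minimal 
illustration of tumbling-window arithmetic, assuming windows aligned to the 
epoch; the class and method names (TumbleBounds, tumbleStart, tumbleEnd) are 
hypothetical and are not part of Calcite or Flink, and the sketch says 
nothing about the cause of the AssertionError.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not Calcite or Flink code: bounds of an epoch-aligned tumbling window. */
final class TumbleBounds {
    private TumbleBounds() {}

    /** Start of the window containing ts: ts rounded down to a multiple of size. */
    static Instant tumbleStart(Instant ts, Duration size) {
        long sizeMs = size.toMillis();
        long tsMs = ts.toEpochMilli();
        // floorMod keeps the rounding correct for timestamps before the epoch.
        return Instant.ofEpochMilli(tsMs - Math.floorMod(tsMs, sizeMs));
    }

    /** End of the window containing ts: window start plus the window size. */
    static Instant tumbleEnd(Instant ts, Duration size) {
        return tumbleStart(ts, size).plus(size);
    }

    public static void main(String[] args) {
        Instant rowtime = Instant.parse("2017-04-25T10:22:30Z"); // sample value, assumed
        Duration size = Duration.ofMinutes(15);                  // INTERVAL '15' MINUTE
        System.out.println(tumbleStart(rowtime, size)); // prints 2017-04-25T10:15:00Z
        System.out.println(tumbleEnd(rowtime, size));   // prints 2017-04-25T10:30:00Z
    }
}
```

Every row whose rowtime falls in the same 15-minute bucket yields the same 
pair of bounds, regardless of where rowtime sits in the row type.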



Any help is very welcome.


Regards,

Timo



Handling of system attributes in a row

2017-02-15 Thread Timo Walther

Hi everyone,

We (from Flink) are currently discussing how to express time semantics 
(event time or processing time) in a SQL query. The optimal solution 
would be to have two system attributes that are part of every table 
schema/every row data type. We could then access them like 
`SELECT * FROM MyTable ORDER BY rowtime`. However, they should not be 
part of the result of a star expansion (`*`), and users should not be 
able to modify them (no aliasing, read-only). I had a look into 
SqlValidator, and there are several lines that contain things like 
`includeSystemVars` or `isSystemField`, but nothing concrete. Am I right 
that this feature is not entirely implemented yet? Which parts would you 
touch/override to implement this feature?
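
The intended behavior can be sketched in plain Java. This is a minimal 
model under stated assumptions, not Calcite's SqlValidator: the class 
SystemFieldSketch, its Field type, the methods expandStar and resolve, and 
the sample columns a, b and c are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model, not Calcite's SqlValidator: system fields resolve by name but are skipped by `*`. */
final class SystemFieldSketch {
    private SystemFieldSketch() {}

    /** A column; `system` marks read-only attributes such as rowtime. */
    static final class Field {
        final String name;
        final boolean system;

        Field(String name, boolean system) {
            this.name = name;
            this.system = system;
        }
    }

    /** Expands `SELECT *`: only regular (non-system) fields are returned. */
    static List<String> expandStar(List<Field> rowType) {
        List<String> names = new ArrayList<>();
        for (Field f : rowType) {
            if (!f.system) {
                names.add(f.name);
            }
        }
        return names;
    }

    /** Resolves an explicit reference such as `ORDER BY rowtime`; returns the field index or -1. */
    static int resolve(List<Field> rowType, String name) {
        for (int i = 0; i < rowType.size(); i++) {
            if (rowType.get(i).name.equals(name)) {
                return i;
            }
        }
        return -1;
    }

    /** Sample row type (assumed): three user columns plus the two system attributes. */
    static List<Field> myTable() {
        return Arrays.asList(
            new Field("a", false),
            new Field("b", false),
            new Field("c", false),
            new Field("proctime", true),
            new Field("rowtime", true));
    }

    public static void main(String[] args) {
        System.out.println(expandStar(myTable()));         // prints [a, b, c]
        System.out.println(resolve(myTable(), "rowtime")); // prints 4
    }
}
```

The system attributes stay addressable by name but never show up in the 
star expansion. The read-only and no-aliasing rules are not modeled here.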


Thanks in advance.

Regards,
Timo





[jira] [Created] (CALCITE-1435) Wrong comparison of TIMESTAMP literals

2016-10-13 Thread Timo Walther (JIRA)
Timo Walther created CALCITE-1435:
-

 Summary: Wrong comparison of TIMESTAMP literals
 Key: CALCITE-1435
 URL: https://issues.apache.org/jira/browse/CALCITE-1435
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Timo Walther
Assignee: Julian Hyde


The following expression always evaluates to true when using 
{{ProjectToCalcRule}}, which is incorrect:

{code}
TIMESTAMP '2011-01-01 00:00:00.001' >= TIMESTAMP '2011-01-01 00:00:00.005'
{code}

The reason is that a wrong {{RexProgram}} is built, since 
{{RexUtil#makeKey}} returns the same key for both timestamps. The key is the 
same because both literals have the same digest (fractional seconds are missing).
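
The effect can be illustrated with a minimal sketch in plain Java. It does 
not use Calcite; the class DigestCollisionSketch and the methods 
lossyDigest, fullDigest and distinct are hypothetical stand-ins for a 
registry that deduplicates expressions by their digest string.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration, not Calcite code: deduplicating expressions by a lossy digest. */
final class DigestCollisionSketch {
    private DigestCollisionSketch() {}

    private static final DateTimeFormatter SECONDS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    private static final DateTimeFormatter MILLIS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Digest that drops fractional seconds, as described in the report. */
    static String lossyDigest(LocalDateTime ts) {
        return "TIMESTAMP '" + ts.format(SECONDS) + "'";
    }

    /** Digest that keeps millisecond precision. */
    static String fullDigest(LocalDateTime ts) {
        return "TIMESTAMP '" + ts.format(MILLIS) + "'";
    }

    /** Registers each digest once and returns how many distinct expressions remain. */
    static int distinct(String... digests) {
        Map<String, Integer> registry = new LinkedHashMap<>();
        for (String d : digests) {
            registry.putIfAbsent(d, registry.size());
        }
        return registry.size();
    }

    public static void main(String[] args) {
        LocalDateTime a = LocalDateTime.parse("2011-01-01T00:00:00.001");
        LocalDateTime b = LocalDateTime.parse("2011-01-01T00:00:00.005");
        // Both literals collapse into one entry, so a >= b degenerates to x >= x.
        System.out.println(distinct(lossyDigest(a), lossyDigest(b))); // prints 1
        // With full precision the two literals stay distinct.
        System.out.println(distinct(fullDigest(a), fullDigest(b)));   // prints 2
    }
}
```

Once both operands map to the same entry, the comparison is effectively 
between an expression and itself, which would explain why `>=` always 
returns true.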


