Re: Updating the Website

2019-12-11 Thread Stamatis Zampetakis
I guess it will require some effort to set up and automate the process for
supporting multiple versions, but afterwards it may be easier to maintain.
If the only thing that a committer has to do to update the site is
commit to master, then there is no need for a particular workflow.

On Mon, Dec 9, 2019 at 10:31 PM Julian Hyde  wrote:

> We might be inventing requirements here, in order to justify a “cool”
> technical change.
>
> I don’t think there is a strong requirement for multiple versions of the
> site. (Sure, it would be nice.)
>
> This thread started with Stamatis pointing out that it was complicated to
> update the site. If we support multiple versions, will this actually make
> things less complicated?
>
> Julian
>
>
>
> > On Dec 9, 2019, at 1:23 PM, Stamatis Zampetakis wrote:
> >
> > In the short term we should try to do our best to follow the existing
> > workflow.
> >
> > In the medium term we shall hope that things will be easier with the
> > automated build of the website.
> >
> > In the longer term, I would really prefer to migrate towards a solution
> > like the one proposed by Vladimir.
> > As I also mentioned in a previous email, there are many projects that
> > publish multiple versions of the doc, and I find this very helpful.
> > People usually wait some time before updating their libraries to the
> > latest release; in this and other cases it is helpful to have a couple
> > of versions of the doc available online.
> >
> >
> > On Sun, Dec 8, 2019 at 11:02 PM Vladimir Sitnikov <
> > sitnikov.vladi...@gmail.com> wrote:
> >
> >> Francis> There are also links to Avatica docs in the side bar and it
> >> Francis> would be a bit strange to have them always point to the master
> >> Francis> version of Avatica.
> >>
> >> gradle.properties references the Avatica version, so we could print the
> >> appropriate links.
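
A minimal sketch of the idea (the property name calcite.avatica.version and
the URL layout below are assumptions, not verified against the repo):

{code:java}
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

public class AvaticaDocLink {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Read the Avatica version that gradle.properties references.
    try (Reader r = Files.newBufferedReader(Paths.get("gradle.properties"))) {
      props.load(r);
    }
    String version = props.getProperty("calcite.avatica.version", "current");
    // Emit a version-appropriate link for the site's side bar.
    System.out.println("https://calcite.apache.org/avatica/docs/" + version);
  }
}
{code}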
> >>
> >> Michael> that need to be made that are independent of a particular
> >> Michael> release (e.g. adding a committer)? Would I go back and edit the
> >> Michael> previous release branch?
> >>
> >> No. You update committers on the master branch.
> >>
> >> Michael>Do we somehow label parts of the site as being
> >> Michael>release-independent?
> >>
> >> It makes little sense to discuss this now. The answer will be obvious
> >> once someone tries.
> >>
> >> Michael> Even if this is the case, consider when we might have to
> >> Michael> correct documentation errors from a previous release.
> >>
> >> The current ASF rule is to have a rel/... tag for each release.
> >> That is, the site build script could use rel/vX.Y tags to get the
> >> "released versions".
> >>
> >> Then there are at least two strategies.
> >> a) If we want to update documentation for calcite-1.10.0, then we could
> >> release calcite-v1.10.1.
> >> b) If a "silent" update is required (e.g. a typo fix), then we could
> >> invent "support/vX.Y" branches, and commit the fix to that branch.
> >>
> >> Note: the current release process does not require a "release branch".
> >> The build script does NOT create new commits to the source repository.
> >> However, we could create one on demand (e.g. in case we really need to
> >> patch an old site version or back-port a fix).
> >>
> >> Vladimir
> >>
>
>


Re: [Discuss] Make flattening on Struct/Row optional

2019-12-11 Thread Stamatis Zampetakis
So basically thanks to Igor :)

On Wed, Dec 11, 2019 at 9:56 PM Rui Wang  wrote:

> Thanks for Stamatis's suggestion. Indeed, a recent effort in [1] enhanced
> the support for reconstructing ROW in the top SELECT, which is supposed to
> solve the problem.
>
>
>
> [1]: https://jira.apache.org/jira/browse/CALCITE-3138
>
> On Mon, Dec 9, 2019 at 3:21 PM Rui Wang  wrote:
>
> > Hello,
> >
> > Sorry for the long delay on this thread. Recently I heard requests about
> > how to deal with STRUCT without flattening it in BeamSQL. Also, I
> > realized Flink has already disabled it in their codebase [1]. I did try
> > to remove STRUCT flattening and run the unit tests of calcite-core to see
> > how many tests break: it was 25, which wasn't that bad. So I would like
> > to pick up this effort again.
> >
> > Before I do it, I just want to ask whether the Calcite community supports
> > this effort (or thinks it is a good idea).
> >
> > My current execution plan is the following (see the sketch after this
> > list):
> > 1. Add a new flag to FrameworkConfig to specify whether to flatten
> > STRUCT. By default, it is enabled.
> > 2. When the struct flattener is disabled, add more tests to test STRUCT
> > support in general. For example, test STRUCT support in projections, join
> > conditions, filtering, etc. If something breaks, try to fix it.
> > 3. Check the 25 failed tests above and see why they fail when the struct
> > flattener is gone. Duplicate those failed tests but with the necessary
> > fixes to make sure they pass without STRUCT flattening.
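
A rough sketch of what step 1 might look like; structFlattening(boolean) is
the proposed flag, not an existing Calcite API:

{code:java}
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;

public class StructFlatteningExample {
  // Hypothetical: structFlattening(false) would keep ROW/STRUCT values
  // nested instead of flattening them; it would default to true to
  // preserve current behavior.
  public static Planner plannerWithoutFlattening(SchemaPlus rootSchema) {
    FrameworkConfig config = Frameworks.newConfigBuilder()
        .defaultSchema(rootSchema)
        .structFlattening(false)   // proposed flag, does not exist today
        .build();
    return Frameworks.getPlanner(config);
  }
}
{code}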
> >
> >
> > [1]:
> >
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala#L166
> >
> >
> > -Rui
> >
> > On Wed, Sep 5, 2018 at 11:59 AM Julian Hyde  wrote:
> >
> >> It might not be minor, but it’s worth a try. At optimization time we
> >> treat all fields as fields, regardless of whether they have complex
> >> types (maps, arrays, multisets, records), so there should not be too
> >> many problems. The flattening was mainly for the benefit of the runtime.
> >>
> >>
> >> > On Sep 5, 2018, at 11:32 AM, Rui Wang wrote:
> >> >
> >> > Thanks for your helpful response! It seems like disabling the
> >> > flattening will at least affect some rules in optimization. It might
> >> > not be a minor change.
> >> >
> >> >
> >> > -Rui
> >> >
> >> > On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis wrote:
> >> >
> >> >> Hi Rui,
> >> >>
> >> >> Disabling flattening in some cases seems reasonable.
> >> >>
> >> >> If I am not mistaken, even in the existing code it is not used all
> >> >> the time, so it makes sense to become configurable.
> >> >> For example, Calcite prepared statements (CalcitePrepareImpl) are
> >> >> using the flattener only for DDL operations that create materialized
> >> >> views (and this is because this code at some point passes through the
> >> >> PlannerImpl).
> >> >> On the other hand, any query that is using the Planner will also
> >> >> pass through the flattener.
> >> >>
> >> >> Disabling the flattener does not mean that all rules will work
> >> >> without problems. The Javadoc of the RelStructuredTypeFlattener at
> >> >> some point says "This approach has the benefit that real optimizer
> >> >> and codegen rules never have to deal with structured types.". Due to
> >> >> this, it is very likely that some rules were written based on the
> >> >> assumption that there are no structured types.
> >> >>
> >> >> Best,
> >> >> Stamatis
> >> >>
> >> >>
> >> >> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde <jh...@apache.org> wrote:
> >> >>
> >> >>> Flattening was introduced mainly because the original engine used
> >> >>> flat column-oriented storage. Now we have several ways of executing,
> >> >>> including generating java code.
> >> >>>
> >> >>> Adding a mode to disable flattening might make sense.
> >> >>> On Tue, Sep 4, 2018 at 12:52 PM Rui Wang wrote:
> >> 
> >>  Hi Community,
> >> 
> >>  While trying to support the Row type in Apache Beam SQL on top of
> >>  Calcite, I realized the Row-flattening logic will make the structure
> >>  information of a Row lost after projections. There is a use case where
> >>  users want to mix the Beam programming model with Beam SQL to process
> >>  a dataset. The following is an example of the use case:
> >> 
> >>  dataset.apply(something user defined)
> >> .apply(SELECT ...)
> >> .apply(something user defined)
> >> 
> >>  As you can see, after the SQL statement is applied, the data structure
> >>  should be preserved for further processing.
> >> 
> >>  The most straightforward way to me is to make Struct flattening
> >>  optional so I could choose to disable it and the Row structure is
> >>  preserved. Can I ask if it is feasible to make it happen?

[jira] [Created] (CALCITE-3597) The conversion between java.sql.Timestamp and long is not symmetric

2019-12-11 Thread Zhenghua Gao (Jira)
Zhenghua Gao created CALCITE-3597:
-

 Summary: The conversion between java.sql.Timestamp and long is not 
symmetric
 Key: CALCITE-3597
 URL: https://issues.apache.org/jira/browse/CALCITE-3597
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Zhenghua Gao


In Calcite, we use SqlFunctions.toLong(Timestamp) and 
SqlFunctions.internalToTimestamp(long) to convert java.sql.Timestamp to the 
internal long representation and vice versa. The main logic inside is 
adding/subtracting the local time zone offset.

But according to the Javadoc of TimeZone.getOffset(long date), the parameter 
represents milliseconds since January 1, 1970 00:00:00 GMT. This means that one 
of the two conversions above does not satisfy this assumption.
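
An illustrative sketch (not the actual SqlFunctions code) of why the round 
trip can fail:

{code:java}
import java.util.TimeZone;

public class OffsetRoundTrip {
  // The forward shift queries the offset at a real UTC instant, but the
  // backward shift queries it at the shifted value, which is not a real
  // UTC instant; around a DST change the two offsets differ, so
  // roundTrip(t) != t.
  public static long roundTrip(long utcMillis, TimeZone tz) {
    long internal = utcMillis + tz.getOffset(utcMillis);
    return internal - tz.getOffset(internal);
  }
}
{code}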

 

This causes many surprises for users:

(1) some Daylight Saving Time changes:

 
{code:java}
@Test public void testDayLightingSaving() {
  TimeZone tz = TimeZone.getDefault();
  TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));

  java.sql.Timestamp dst2018Begin =
      java.sql.Timestamp.valueOf("2018-03-11 03:00:00");

  // round trip through the internal long representation
  assertThat(dst2018Begin, is(internalToTimestamp(toLong(dst2018Begin))));

  TimeZone.setDefault(tz);
}{code}
fails with:
{code:java}
java.lang.AssertionError: 
Expected: is <2018-03-11 04:00:00.0>
 but: was <2018-03-11 03:00:00.0>
Expected :is <2018-03-11 04:00:00.0>
Actual :<2018-03-11 03:00:00.0>{code}
 

(2) "1900-01-01 00:00:00" Changes in some TimeZone
{code:java}
@Test public void test() {
 TimeZone tz = TimeZone.getDefault();
 TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));

 java.sql.Timestamp ts = java.sql.Timestamp.valueOf("1900-01-01 00:00:00");
 assertThat(ts, is(internalToTimestamp(toLong(ts;
 TimeZone.setDefault(tz);
}{code}
fails with:
{code:java}
java.lang.AssertionError: 
Expected: is <1899-12-31 23:54:17.0>
 but: was <1900-01-01 00:00:00.0>
Expected :is <1899-12-31 23:54:17.0>
Actual :<1900-01-01 00:00:00.0>
{code}


--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Ziwei Liu wants to be a contributor in Calcite

2019-12-11 Thread Francis Chuang

I've added your account to the contributor role.

Francis

On 12/12/2019 2:13 pm, Liu Ziwei wrote:

Hi:
   I want to contribute to Apache Calcite.
  Would you please give me the contributor permission?
  My JIRA ID is Ziwei Liu.

Thanks.



Ziwei Liu wants to be a contributor in Calcite

2019-12-11 Thread Liu Ziwei
Hi:
  I want to contribute to Apache Calcite.
 Would you please give me the contributor permission?
 My JIRA ID is Ziwei Liu.

Thanks.



Re: Quicksql

2019-12-11 Thread Haisheng Yuan
Nope, it doesn't use any adapters. It just submits partial SQL queries to 
different engines.

If a query only contains tables from a single source, e.g.
select count(*) from hive_table1, hive_table2 where a=b;
then the whole query will be submitted to Hive.

Otherwise, e.g.
select distinct a,b from hive_table union select distinct a,b from mysql_table;

the following query will be submitted to Spark and executed by Spark:
select a,b from spark_tmp_table1 union select a,b from spark_tmp_table2;

spark_tmp_table1: select distinct a,b from hive_table 
spark_tmp_table2: select distinct a,b from mysql_table

On 2019/12/11 04:27:07, "Juan Pan"  wrote: 
> Hi Haisheng,
> 
> 
> > The query on different data source will then be registered as temp spark 
> > tables (with filter or join pushed in), the whole query is rewritten as SQL 
> > text over these temp tables and submitted to Spark.
> 
> 
> Does it mean QuickSQL also needs adapters to execute queries on different 
> data sources?
> 
> 
> > Yes, virtualization is one of Calcite’s goals. In fact, when I created 
> > Calcite I was thinking about virtualization + in-memory materialized views. 
> > Not only the Spark convention but any of the “engine” conventions (Drill, 
> > Flink, Beam, Enumerable) could be used to create a virtual query engine.
> 
> 
> Basically, I like and agree with Julian’s statement. It is a great idea 
> which I personally hope Calcite moves towards.
> 
> 
> I give my best wishes to the Calcite community.
> 
> 
> Thanks,
> Trista
> 
> 
>  Juan Pan
> 
> 
> panj...@apache.org
> Juan Pan(Trista), Apache ShardingSphere
> 
> 
> On 12/11/2019 10:53, Haisheng Yuan wrote:
> As far as I know, users still need to register tables from other data 
> sources before querying them. QuickSQL uses Calcite for parsing queries and 
> optimizing logical expressions with several transformation rules. The 
> queries on different data sources will then be registered as temp Spark 
> tables (with filters or joins pushed in); the whole query is rewritten as 
> SQL text over these temp tables and submitted to Spark.
> 
> - Haisheng
> 
> --
> From: Rui Wang
> Date: 2019-12-11 06:24:45
> To: 
> Subject: Re: Quicksql
> 
> The co-routine model sounds like it fits streaming cases well.
> 
> I was thinking about how the Enumerable interface should work with streaming
> cases, but now I should also check the Interpreter.
> 
> 
> -Rui
> 
> On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde  wrote:
> 
> The goal (or rather my goal) for the interpreter is to replace
> Enumerable as the quick, easy default convention.
> 
> Enumerable is efficient but not that efficient (compared to engines
> that work on off-heap data representing batches of records). And
> because it generates java byte code, there is a certain latency to
> getting a query prepared and ready to run.
> 
> It basically implements the old Volcano query evaluation model. It is
> single-threaded (because all work happens as a result of a call to
> 'next()' on the root node) and cannot handle branching data-flow
> graphs (DAGs).
> 
> The Interpreter uses a co-routine model (reading from queues,
> writing to queues, and yielding when there is no work to be done) and
> therefore could be more efficient than Enumerable in a single-node
> multi-core system. Also, there is little start-up time, which is
> important for small queries.
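
A minimal sketch of the queue-based co-routine style (illustrative only, not
Calcite's actual Interpreter API):

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class CoRoutineFilter {
  static final Object[] EOS = new Object[0];   // end-of-stream sentinel

  // A filter operator: reads rows from an input queue, writes matching
  // rows to an output queue, and yields when there is no work to do.
  public static Runnable filter(BlockingQueue<Object[]> in,
      BlockingQueue<Object[]> out) {
    return () -> {
      for (;;) {
        Object[] row = in.poll();
        if (row == null) { Thread.yield(); continue; }  // starved: yield
        if (row == EOS) { out.offer(EOS); return; }     // propagate end of stream
        if (((Integer) row[0]) > 0) { out.offer(row); } // keep rows with positive key
      }
    };
  }
}
{code}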
> 
> I would love to add another built-in convention that uses Arrow as
> data format and generates co-routines for each operator. Those
> co-routines could be deployed in a parallel and/or distributed data
> engine.
> 
> Julian
> 
> On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas wrote:
> 
> What is the ultimate goal of the Calcite Interpreter?
> 
> To provide some context, I have been playing around with calcite + REST
> (see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest for
> details of my experiments).
> 
> 
> —Z
> 
> On Dec 9, 2019, at 9:05 PM, Julian Hyde  wrote:
> 
> Yes, virtualization is one of Calcite’s goals. In fact, when I created
> Calcite I was thinking about virtualization + in-memory materialized views.
> Not only the Spark convention but any of the “engine” conventions (Drill,
> Flink, Beam, Enumerable) could be used to create a virtual query engine.
> 
> See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
> https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework
> 
> Julian
> 
> 
> 
> On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana wrote:
> 
> I recently contacted one of the active contributors asking about the
> purpose of the project and here's his reply:
> 
> From my understanding, Quicksql is a data virtualization platform. It can
> query multiple data sources altogether and in a distributed way; say, you
> can write a SQL with a

[jira] [Created] (CALCITE-3596) Sql in annotation of OverScope class has syntax error

2019-12-11 Thread Wang Yanlin (Jira)
Wang Yanlin created CALCITE-3596:


 Summary: Sql in annotation of OverScope class has syntax error
 Key: CALCITE-3596
 URL: https://issues.apache.org/jira/browse/CALCITE-3596
 Project: Calcite
  Issue Type: Bug
Reporter: Wang Yanlin


The SQL in the Javadoc of the OverScope class 
(https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/OverScope.java#L33)
 has a syntax error, causing a SqlParseException.

{code:java}
// SqlToRelConverterTest
@Test public void testOver() {
  /*
  final String sql = "SELECT * FROM\n"
      + "(SELECT deptno, count(*) OVER (ORDER BY empno RANGE BETWEEN "
      + "2 PRECEDING AND 2 FOLLOWING) FROM emp) ";
  */
  final String sql = "SELECT name FROM\n"
      + "(SELECT * FROM\n"
      + "emp OVER (ORDER BY empno RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING))";
  sql(sql).ok();
}
{code}
got
{code:java}
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered 
"OVER" at line 3, column 5.
Was expecting one of:
"AS" ...
"EXCEPT" ...
"EXTEND" ...
"FETCH" ...
"FOR" ...
"GROUP" ...
"HAVING" ...
"INTERSECT" ...
"LIMIT" ...
"MATCH_RECOGNIZE" ...
"OFFSET" ...
"ORDER" ...
"MINUS" ...
"TABLESAMPLE" ...
"UNION" ...
"WHERE" ...
"WINDOW" ...
"(" ...
")" ...
 ...
 ...
 ...
 ...
 ...
 ...
"NATURAL" ...
"JOIN" ...
"INNER" ...
"LEFT" ...
"RIGHT" ...
"FULL" ...
"CROSS" ...
"," ...
"OUTER" ...
"." ...
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3595) Test infrastructure overwrites reference log with wrong results

2019-12-11 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-3595:


 Summary: Test infrastructure overwrites reference log with wrong 
results
 Key: CALCITE-3595
 URL: https://issues.apache.org/jira/browse/CALCITE-3595
 Project: Calcite
  Issue Type: Bug
  Components: build
Reporter: Julian Hyde


When I add a test case to a test that is based on DiffRepository (such as 
SqlPrettyWriterTest), and run it in Intellij, the first time gives an error 
because the actual output does not match the expected output. This is good.

I would expect that if I run it again, it will fail again, but it does not. I'm 
not sure exactly what is going on, but maybe the test is reading from an output 
file instead of a reference log. This was probably broken during the migration 
to Gradle, because the paths for reference logs have all changed.

This bug prevents me from using the workflow where I add a test case, run it, 
and when it fails, change the code until the output looks right, then overwrite 
the reference log.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3594) Support Hot Groupby keys hint

2019-12-11 Thread Rui Wang (Jira)
Rui Wang created CALCITE-3594:
-

 Summary: Support Hot Groupby keys hint
 Key: CALCITE-3594
 URL: https://issues.apache.org/jira/browse/CALCITE-3594
 Project: Calcite
  Issue Type: Sub-task
Reporter: Rui Wang
Assignee: Rui Wang


It will be useful for Apache Beam if we support the following SQL hint:

SELECT * FROM t
GROUP BY t.key_column /* + hot_key(key1=fanout_factor, key2=fanout_factor) */

The hot-key strategy works on aggregation: it provides a list of hot keys with 
a fanout factor for a column. The fanout factor says how many partitions should 
be created for that specific key, such that we can have a per-partition 
aggregate and then a final aggregate. One example to explain it:

SELECT * FROM t
GROUP BY t.key_column /* + hot_key("value1"=2) */

// for the key_column, there is a "value1" which appears so many times (so it's 
hot); please consider splitting it into two partitions and processing them 
separately.


Such a problem is common in big data processing, where a hot key creates a 
straggler machine that either slows down the whole pipeline or causes retries. 
In such a case, one common resolution is to split the data into multiple 
partitions, aggregate per partition, and then have a final combine (see the 
sketch below).

Usually the execution engine won't know what the hot key(s) are. A SQL hint 
provides a good way to tell the engine which keys are worth dealing with.
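
A minimal sketch of the two-phase (shard, then combine) aggregation the hint 
describes, in plain Java for illustration only:

{code:java}
import java.util.Map;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class HotKeyFanout {
  // Phase 1 shards each key into `fanout` partitions and pre-aggregates
  // per (key, shard); phase 2 combines the partial counts per key.
  public static Map<String, Long> count(Stream<String> keys, int fanout) {
    Random rnd = new Random();
    Map<String, Long> partial = keys.collect(Collectors.groupingBy(
        k -> k + "#" + rnd.nextInt(fanout), Collectors.counting()));
    return partial.entrySet().stream().collect(Collectors.groupingBy(
        e -> e.getKey().substring(0, e.getKey().lastIndexOf('#')),
        Collectors.summingLong(Map.Entry::getValue)));
  }
}
{code}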




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3593) RelToSqlConverter changes target of ambiguous having clause with a Project on top of an Aggregate

2019-12-11 Thread Steven Talbot (Jira)
Steven Talbot created CALCITE-3593:
--

 Summary: RelToSqlConverter changes target of ambiguous having 
clause with a Project on top of an Aggregate
 Key: CALCITE-3593
 URL: https://issues.apache.org/jira/browse/CALCITE-3593
 Project: Calcite
  Issue Type: Bug
Reporter: Steven Talbot


Best shown with the shell of a test in RelToSqlConverterTest.

The following SQL on BigQuery
{code:java}
select product_id - 1000 as product_id
from (
 select product_id, avg(gross_weight) as agw
 from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as product
 where net_weight < 100
 group by product_id having product_id > 0){code}
produces one result, because the having filter applies to the product_id 
before subtraction, of course.

Running it through the machinery in that test translates it to:
{noformat}
SELECT product_id - 1000 AS product_id
from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as product
WHERE net_weight < 100
GROUP BY product_id
HAVING product_id > 0{noformat}

This changes the meaning of the query: now the HAVING is applied to the 
after-subtraction product_id and you get no results, rather than the one result.
Note that this is _not_ true in HyperSQL, as it has different semantics around 
the HAVING namespace.
{noformat}
select "product_id" - 1000 as "product_id"
from (
 select "product_id", avg("gross_weight") as agw
 from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" FROM 
(VALUES(0))) as product
 where "net_weight" < 100
 group by "product_id" having "product_id" > 0){noformat}
becomes

{noformat}
SELECT "product_id" - 1000 AS "product_id"
from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" FROM 
(VALUES(0))) as product
WHERE "net_weight" < 100
GROUP BY "product_id"
HAVING "product_id" > 0{noformat}
 

But the meaning is preserved: both return a row. 

 

I'm not enough of a SQL standards expert to know which one is being more 
compliant, but presumably both would have to be supported via some sort of flag?

 

I think the fix would be to force the subselect on dialects such as BigQuery 
that have this behavior. Probably something that looks a lot like 
[https://github.com/apache/calcite/blob/3530daaa8cad43aad6845b6c79e4bc1ca0e72f5f/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1043-L1047]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] Release apache-calcite-avatica-1.16.0 (release candidate 1)

2019-12-11 Thread Francis Chuang

Hi all,

I have created a build for Apache Calcite Avatica 1.16.0, release
candidate 1.

Thanks to everyone who has contributed to this release.

You can read the release notes here:
https://github.com/apache/calcite-avatica/blob/512bbee4aa24ef9fb8106d0286d1243679dce2d0/site/_docs/history.md

The commit to be voted upon:
https://gitbox.apache.org/repos/asf?p=calcite-avatica.git;a=commit;h=512bbee4aa24ef9fb8106d0286d1243679dce2d0

Its hash is 512bbee4aa24ef9fb8106d0286d1243679dce2d0

Tag:
https://gitbox.apache.org/repos/asf?p=calcite-avatica.git;a=tag;h=refs/tags/avatica-1.16.0-rc1

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-avatica-1.16.0-rc1
(revision 37181)

The hashes of the artifacts are as follows:
102d3ab0e90dd1db5e012a966d265bdfa8a0f24f9016a4187a6e5f0135a14770da124493dd2c7a18c9d8d8b9af5ecf4f5aceb90d48421251f38bc6ce6f5be697
*apache-calcite-avatica-1.16.0-src.tar.gz

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachecalcite-1071/org/apache/calcite/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/francischuang.asc
https://www.apache.org/dist/calcite/KEYS

N.B.
To create the jars and test Apache Calcite Avatica: "./gradlew build 
-PskipSigning".


If you do not have a Java environment available, you can run the tests
using docker. To do so, install docker and docker-compose, then run
"docker-compose run test" from the root of the directory.

Please vote on releasing this package as Apache Calcite Avatica 1.16.0.

The vote is open for the next 72 hours and passes if a majority of at
least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Calcite Avatica 1.16.0
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Francis


Re: [Discuss] Make flattening on Struct/Row optional

2019-12-11 Thread Rui Wang
Thanks for Stamatis's suggestion. Indeed, a recent effort in [1] enhanced the
support for reconstructing ROW in the top SELECT, which is supposed to solve
the problem.



[1]: https://jira.apache.org/jira/browse/CALCITE-3138

On Mon, Dec 9, 2019 at 3:21 PM Rui Wang  wrote:

> Hello,
>
> Sorry for the long delay on this thread. Recently I heard requests about
> how to deal with STRUCT without flattening it in BeamSQL. Also, I realized
> Flink has already disabled it in their codebase [1]. I did try to remove
> STRUCT flattening and run the unit tests of calcite-core to see how many
> tests break: it was 25, which wasn't that bad. So I would like to pick up
> this effort again.
>
> Before I do it, I just want to ask whether the Calcite community supports
> this effort (or thinks it is a good idea).
>
> My current execution plan is the following:
> 1. Add a new flag to FrameworkConfig to specify whether to flatten STRUCT.
> By default, it is enabled.
> 2. When the struct flattener is disabled, add more tests to test STRUCT
> support in general. For example, test STRUCT support in projections, join
> conditions, filtering, etc. If something breaks, try to fix it.
> 3. Check the 25 failed tests above and see why they fail when the struct
> flattener is gone. Duplicate those failed tests but with the necessary fixes
> to make sure they pass without STRUCT flattening.
>
>
> [1]:
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/calcite/FlinkPlannerImpl.scala#L166
>
>
> -Rui
>
> On Wed, Sep 5, 2018 at 11:59 AM Julian Hyde  wrote:
>
>> It might not be minor, but it’s worth a try. At optimization time we
>> treat all fields as fields, regardless of whether they have complex types
>> (maps, arrays, multisets, records), so there should not be too many
>> problems. The flattening was mainly for the benefit of the runtime.
>>
>>
>> > On Sep 5, 2018, at 11:32 AM, Rui Wang wrote:
>> >
>> > Thanks for your helpful response! It seems like disabling the flattening
>> > will at least affect some rules in optimization. It might not be a minor
>> > change.
>> >
>> >
>> > -Rui
>> >
>> > On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis 
>> > wrote:
>> >
>> >> Hi Rui,
>> >>
>> >> Disabling flattening in some cases seems reasonable.
>> >>
>> >> If I am not mistaken, even in the existing code it is not used all the
>> >> time, so it makes sense to become configurable.
>> >> For example, Calcite prepared statements (CalcitePrepareImpl) are
>> >> using the flattener only for DDL operations that create materialized
>> >> views (and this is because this code at some point passes through the
>> >> PlannerImpl).
>> >> On the other hand, any query that is using the Planner will also pass
>> >> through the flattener.
>> >>
>> >> Disabling the flattener does not mean that all rules will work without
>> >> problems. The Javadoc of the RelStructuredTypeFlattener at some point
>> >> says "This approach has the benefit that real optimizer and codegen
>> >> rules never have to deal with structured types.". Due to this, it is
>> >> very likely that some rules were written based on the assumption that
>> >> there are no structured types.
>> >>
>> >> Best,
>> >> Stamatis
>> >>
>> >>
>> >> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde wrote:
>> >>
>> >>> Flattening was introduced mainly because the original engine used flat
>> >>> column-oriented storage. Now we have several ways of executing,
>> >>> including generating java code.
>> >>>
>> >>> Adding a mode to disable flattening might make sense.
>> >>> On Tue, Sep 4, 2018 at 12:52 PM Rui Wang 
>> >>> wrote:
>> 
>>  Hi Community,
>> 
>>  While trying to support the Row type in Apache Beam SQL on top of
>>  Calcite, I realized the Row-flattening logic will make the structure
>>  information of a Row lost after projections. There is a use case where
>>  users want to mix the Beam programming model with Beam SQL to process a
>>  dataset. The following is an example of the use case:
>> 
>>  dataset.apply(something user defined)
>> .apply(SELECT ...)
>> .apply(something user defined)
>> 
>>  As you can see, after the SQL statement is applied, the data structure
>>  should be preserved for further processing.
>> 
>>  The most straightforward way to me is to make Struct flattening optional
>>  so I could choose to disable it and the Row structure is preserved. Can I
>>  ask if it is feasible to make it happen? What could happen if Calcite
>>  just doesn't flatten Struct in the flattener? (I tried to disable it but
>>  had exceptions in the optimizer. I wasn't sure if those were some minor
>>  things to fix or Struct flattening was a design choice so the impact of
>>  the change was huge.)
>> 
>>  Additionally, if there is a way to keep the information that I can use

[jira] [Created] (CALCITE-3592) Add "~" SqlPrefixOperator for bitwise not

2019-12-11 Thread hailong wang (Jira)
hailong wang created CALCITE-3592:
-

 Summary: Add "~" SqlPrefixOperator for bitwise not
 Key: CALCITE-3592
 URL: https://issues.apache.org/jira/browse/CALCITE-3592
 Project: Calcite
  Issue Type: Improvement
  Components: babel
Affects Versions: 1.21.0
Reporter: hailong wang
 Fix For: 1.22.0, 1.21.0


MySQL, PostgreSQL and SQL Server support the non-standard operator "~" for 
bitwise not. I think we should support a "~" SqlPrefixOperator for bitwise not.
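
A sketch of what the operator might look like, modeled on 
SqlStdOperatorTable.UNARY_MINUS; the kind, precedence and operand checker 
below are assumptions, not committed API:

{code:java}
import org.apache.calcite.sql.SqlKind;
import org.apache.calcite.sql.SqlPrefixOperator;
import org.apache.calcite.sql.type.InferTypes;
import org.apache.calcite.sql.type.OperandTypes;
import org.apache.calcite.sql.type.ReturnTypes;

public abstract class BitwiseOperators {
  public static final SqlPrefixOperator BITWISE_NOT =
      new SqlPrefixOperator("~",
          SqlKind.OTHER,          // no dedicated SqlKind yet (assumption)
          80,                     // precedence mirroring unary minus (assumption)
          ReturnTypes.ARG0,       // result type = operand type
          InferTypes.RETURN_TYPE,
          OperandTypes.NUMERIC);  // would likely be narrowed to integer types
}
{code}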



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3591) Add BIT_XOR BIT aggregate function

2019-12-11 Thread hailong wang (Jira)
hailong wang created CALCITE-3591:
-

 Summary: Add BIT_XOR BIT aggregate function
 Key: CALCITE-3591
 URL: https://issues.apache.org/jira/browse/CALCITE-3591
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.21.0
Reporter: hailong wang
 Fix For: 1.22.0, 1.21.0


In https://issues.apache.org/jira/browse/CALCITE-2770 we added support for the 
BIT_AND and BIT_OR aggregate functions. I think we should also support BIT_XOR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3590) Support SQL hints for Aggregate.

2019-12-11 Thread Shuo Cheng (Jira)
Shuo Cheng created CALCITE-3590:
---

 Summary: Support SQL hints for Aggregate.
 Key: CALCITE-3590
 URL: https://issues.apache.org/jira/browse/CALCITE-3590
 Project: Calcite
  Issue Type: Sub-task
  Components: core
Affects Versions: 1.21.0
Reporter: Shuo Cheng
 Fix For: 1.22.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3589) SqlValidatorException when conformance is set to LENIENT

2019-12-11 Thread bingfeng.guo (Jira)
bingfeng.guo created CALCITE-3589:
-

 Summary: SqlValidatorException when conformance is set to LENIENT
 Key: CALCITE-3589
 URL: https://issues.apache.org/jira/browse/CALCITE-3589
 Project: Calcite
  Issue Type: Bug
Reporter: bingfeng.guo


When conformance is set to LENIENT, and the alias of a select expression is 
the same as a column name used in the expression, and the expression appears 
in GROUP BY, a SqlValidatorException will occur:

SELECT MY_ID + 1 AS MY_ID, sum(PRICE)
FROM tableA
GROUP BY MY_ID + 1



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3588) Support more operators in Join condition when converting RelNode to SqlNode

2019-12-11 Thread Jin Xing (Jira)
Jin Xing created CALCITE-3588:
-

 Summary: Support more operators in Join condition when converting 
RelNode to SqlNode
 Key: CALCITE-3588
 URL: https://issues.apache.org/jira/browse/CALCITE-3588
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jin Xing
Assignee: Jin Xing


Currently Calcite fails to convert RelNode to SqlNode when the Join condition 
contains operators like 'SIMILAR_TO', 'IS_TRUE', 'IS_NOT_TRUE'. An 
AssertionError will be thrown from [1].

[1] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L290



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[VOTE] [CALCITE-3559] Drop HydromaticFileSetCheck, upgrade Checkstyle, v2

2019-12-11 Thread Vladimir Sitnikov
Hi,

https://github.com/apache/calcite/pull/1625 fixes concurrency issues with
Checkstyle and improves "code style" error messages.
If no one objects within three days, I'll assume lazy consensus and commit
it.

You can find more info here:
https://lists.apache.org/thread.html/62d8d7b5cdd73c14979bc4beef8472956caadd4a84e9dffa7c4123f1%40%3Cdev.calcite.apache.org%3E

Vladimir


Re: [VOTE] [CALCITE-3559] Drop HydromaticFileSetCheck, upgrade Checkstyle

2019-12-11 Thread Vladimir Sitnikov
Thanks to everybody for your participation.

The vote results are:

+1 Vladimir
-0 Michael
-1 Julian (it cannot count as a veto because there is no technical
justification)

Unfortunately, the vote is undecided.

Vladimir


[jira] [Created] (CALCITE-3587) RexBuilder may lose decimal fraction for creating literal with DECIMAL type

2019-12-11 Thread Wang Yanlin (Jira)
Wang Yanlin created CALCITE-3587:


 Summary: RexBuilder may lose decimal fraction for creating literal 
with DECIMAL type
 Key: CALCITE-3587
 URL: https://issues.apache.org/jira/browse/CALCITE-3587
 Project: Calcite
  Issue Type: Bug
Reporter: Wang Yanlin



This test:
{code:java}
// RexBuilderTest
@Test public void testDecimal() {
  final RelDataTypeFactory typeFactory =
      new SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);
  final RelDataType type = typeFactory.createSqlType(SqlTypeName.DECIMAL, 4, 2);
  final RexBuilder builder = new RexBuilder(typeFactory);
  final RexLiteral literal = (RexLiteral) builder.makeLiteral(12.3, type, false);
  Comparable value = literal.getValue();
  assertThat(value.toString(), is("12.3"));
}
{code}

fails with the message:

{code:java}
java.lang.AssertionError: 
Expected: is "12.3"
 but: was "12"
Expected :12.3
Actual   :12
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)