Hi Lekshmi,

Thanks for the interesting information. It is good to see more people
involved in benchmarking and optimizing Calcite.

However, I am not sure I understand what you are trying to achieve with an
overall comparison between Calcite and other databases (Postgres in this
particular case).
Calcite provides everything you need to build a database, but it is not
itself a database. Could you share a bit more about what you are expecting
to gain from this kind of experiment?

On the other hand, it would be very interesting to compare individual parts
of Calcite (e.g., the optimizer) with the respective parts of Postgres (or
another database), although this will not be easy.
If, for instance, you want to compare the optimizers in terms of
performance, the time spent optimizing may not be a good metric, since C
code will almost always be faster than Java code.
Another axis of comparison for the optimizer could be the quality of the
produced plans, but finding a good metric can also be challenging. Plan
quality could be measured by the execution time of the plan on the same
engine (all in Calcite or all in Postgres, for instance). In terms of
research, I guess it would be very nice to demonstrate that a Volcano
optimizer (Calcite) with a custom rule set can beat the built-in optimizer
of Postgres in terms of plan quality; it would also be very useful for many
end users of Calcite to have a rule set that simulates the optimizer of
Postgres (or another database).
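
To make the "execution time on the same engine" idea a bit more concrete,
here is a minimal sketch of how such a measurement could look. It is only an
illustration under assumptions: PlanTimer, measure, median and busyWork are
hypothetical names, and the two busyWork calls are stand-ins for "execute the
plan produced by optimizer A" and "execute the plan produced by optimizer B"
on one engine. It does not use any Calcite or Postgres API. It runs each task
a few times untimed first (JIT warm-up) and then compares medians, which are
less sensitive to outliers than a single run. For serious numbers a proper
harness such as JMH is still the better choice.

```java
import java.util.Arrays;

final class PlanTimer {
    // Written by the stand-in workload so the JIT cannot discard the loop.
    static volatile long sink;

    /** Runs the task `warmup` times untimed, then returns `runs` timings in nanoseconds. */
    static long[] measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] timings = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            timings[i] = System.nanoTime() - start;
        }
        return timings;
    }

    /** Median of the samples; for an even count, the mean of the two middle values. */
    static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    /** Hypothetical stand-in for executing one query plan; not a real query. */
    static void busyWork(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (long) i * i % 7;
        }
        sink = acc;
    }

    public static void main(String[] args) {
        // Both "plans" run on the same engine (here: the same JVM), so the
        // difference reflects the plan rather than the implementation language.
        long planA = median(measure(() -> busyWork(200_000), 20, 11));
        long planB = median(measure(() -> busyWork(400_000), 20, 11));
        System.out.printf("plan A median: %d ns, plan B median: %d ns%n", planA, planB);
    }
}
```

The actual numbers depend on the machine, so only the relative difference
between the two medians is meaningful.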

As a general comment, I think it would be easier to find good use cases in
favor of Calcite if you emphasize data integration scenarios,
cross-database queries, querying raw data (not in a database), and/or
systems without an optimizer.

Best,
Stamatis

On Mon, 31 Dec 2018 at 12:35 PM, Lekshmi <lekshmib...@gmail.com> wrote:

> Hi Julian,
>
> Thanks a lot for the prompt response and support. I will try running
> the test with JMH and will let you know the results.
>
> I wish you all a prosperous new year.
>
> Thanks and Regards
>
> Lekshmi B.G
> Email: lekshmib...@gmail.com
>
>
>
>
> On Mon, Dec 31, 2018 at 10:38 AM Julian Feinauer <
> j.feina...@pragmaticminds.de> wrote:
>
> > Hi Lekshmi,
> >
> > Your activity sounds very interesting.
> > One important thing to note is that performance testing in Java is always
> > tricky due to the JIT and the "warmup" phase of the JVM. Thus it is
> > generally recommended to do these tests with JMH (
> > https://openjdk.java.net/projects/code-tools/jmh/).
> >
> > I would assume that the time for sql2rel drops drastically (perhaps by
> > one or two orders of magnitude) when run with JMH.
> >
> > Best
> > Julian
> >
> > On 30.12.18, 23:12, "Lekshmi" <lekshmib...@gmail.com> wrote:
> >
> >     Hello Folks,
> >
> >     For my research activities, I was trying to perform a benchmark
> >     comparison between Calcite and other database systems. As an initial
> >     step, I was trying to do it for *Calcite* and *PostgreSQL*. So, I
> >     thought the TPC-H queries were the right thing to start with. I tried
> >     running the TpchTest (
> >     https://github.com/apache/calcite/blob/master/plus/src/test/java/org/apache/calcite/adapter/tpch/TpchTest.java
> >     ) by adding the *CalciteTimingTracer* in the JUnit tests to determine
> >     the execution time. While doing so, I could see that the execution
> >     time in Calcite is significantly higher than in PostgreSQL. On further
> >     investigation, I could see that we generate the data required for
> >     these queries (which comes to around 150,000 for some tables), and I
> >     was under the impression that most of the time was spent on data
> >     generation and that the query execution could be faster. So, I
> >     modified the relevant schema class (
> >     https://github.com/apache/calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java
> >     ) to perform the data generation and query execution separately.
> >     Then, I traced the time taken for just the query execution. Even
> >     then, there was a significant difference from PostgreSQL.
> >
> >     I also set *log4j.rootLogger* to *TRACE* to find the time spent in
> >     the sql2rel and optimization phases of the class Prepare
> >     <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/prepare/Prepare.java>.
> >     And, to my surprise, I could see that Calcite takes 355ms for sql2rel
> >     and 352ms for optimization for the JUnit test *testQuery01*. On the
> >     other hand, the same query gave a planning time of 0.163ms in
> >     Postgres.
> >
> >     I would like to know if this is the right way to test the
> >     performance of TPC-H queries using Apache Calcite. Can anyone let me
> >     know if there are any better ways to do it?
> >
> >     And, while searching through JIRA, I found a ticket,
> >     https://issues.apache.org/jira/browse/CALCITE-2169, which was created
> >     by Edmon Begoli for performing a comparative performance study of the
> >     Calcite framework. I think it's related to my current problem. I have
> >     no idea regarding the status of the ticket. It would be really great
> >     if someone could help me with some information on it.
> >
> >     Also, as a personal preference, I would like to continue my research
> >     in Calcite due to its simplicity and extensibility. But if I fail to
> >     give a good case study in favour of Calcite, I am afraid that I could
> >     lose the opportunity to work with Calcite.
> >
> >     Thanks and Regards
> >
> >     Lekshmi B.G
> >     Email: lekshmib...@gmail.com
> >
> >
> >
>
