Hello,

I have been working using Calcite with PostgreSQL as a database, and I tried to 
run the TPCH benchmark using Calcite.
I have the TPCH data populated in PostgreSQL, and I have two ways for querying 
the data:

  1.  Using the JDBC connector of PostgreSQL and evaluate the query directly on 
PostgreSQL without using Calcite.
  2.  Using Calcite to communicate with PostgreSQL and evaluating the query 
using Calcite’s API (I use the JDBC adapter in Calcite).

When running Query 12 in the TPCH benchmark, and I get two different results. I 
tried with other queries such as Query-3 and Query-4, and I get the same result 
with both approaches.
I also tried a third approach which is using Spark SQL to query the PostgreSQL 
database, and the result for Query 12 was the same result as in approach (1), 
but different from approach (2).

I would be happy to know if you are aware of any work in progress / bugs that 
might lead to this behavior.

This is the main JAVA code that I’m using with Calcite:
Connection connection = DriverManager.getConnection("jdbc:calcite:");
CalciteConnection calciteConnection = 
connection.unwrap(CalciteConnection.class);
SchemaPlus rootSchema = calciteConnection.getRootSchema();

BasicDataSource dataSource = new BasicDataSource();
dataSource.setUrl(Utils.POSTGRES_DB_JDBC_URL);
dataSource.setUsername(Utils.POSTGRES_USER);
dataSource.setPassword(Utils.POSTGRES_PASSWORD);

JdbcSchema jdbcSchema = JdbcSchema.create(rootSchema, "tpch", dataSource,
        null, "tpch");

SchemaPlus schema = rootSchema.add("tpch",jdbcSchema);

Statement statement = calciteConnection.createStatement();
String tpchQuery12 = "select\n"
        + "  \"l\".\"l_shipmode\",\n"
        + "  sum(case\n"
        + "    when \"o\".\"o_orderpriority\" = '1-URGENT'\n"
        + "      or \"o\".\"o_orderpriority\" = '2-HIGH'\n"
        + "      then 1\n"
        + "    else 0\n"
        + "  end) as \"high_line_count\",\n"
        + "  sum(case\n"
        + "    when \"o\".\"o_orderpriority\" <> '1-URGENT'\n"
        + "      and \"o\".\"o_orderpriority\" <> '2-HIGH'\n"
        + "      then 1\n"
        + "    else 0\n"
        + "  end) as \"low_line_count\"\n"
        + "from\n"
        + "  \"tpch\".\"orders\" \"o\",\n"
        + "  \"tpch\".\"lineitem\" \"l\"\n"
        + "where\n"
        + "  \"o\".\"o_orderkey\" = \"l\".\"l_orderkey\"\n"
        + "  and \"l\".\"l_shipmode\" in ('TRUCK', 'REG AIR')\n"
        + "  and \"l\".\"l_commitdate\" < \"l\".\"l_receiptdate\"\n"
        + "  and \"l\".\"l_shipdate\" < \"l\".\"l_commitdate\"\n"
        + "--  and \"l\".\"l_receiptdate\" >= date '1994-01-01'\n"
        + "--  and \"l\".\"l_receiptdate\" < date '1994-01-01' + interval '1' 
year\n"
        + "group by\n"
        + "  \"l\".\"l_shipmode\"\n"
        + "order by\n"
        + "  \"l\".\"l_shipmode\"";
ResultSet rs = statement.executeQuery(tpchQuery12);


Thanks in advance,
Robert

Reply via email to