Hi Talat,

I did get your test case running and added some logging to
RexProgramBuilder.mergePrograms. There is only one merge that occurs during
the test and it has an output type of RecordType(JavaType(int) ID,
JavaType(class java.lang.String) V). This does seem like the correct output
name but it doesn't match the final output name, so something is still
different than the Beam test case. I also modified mergePrograms to
purposely corrupt the output names, that did not cause the test to fail or
trip the 'assert mergedProg.getOutputRowType() ==
topProgram.getOutputRowType();' in mergePrograms. I could not find any
Calcite unit tests for RexProgramBuilder.mergePrograms or
CoreRules.CALC_MERGE rule so I think it is still probable that the problem
is in this area.

One minor issue I encountered. It took me a while to get your test case
running, it doesn't appear there are any calcite gradle rules to run
CoreQuidemTest and constructing the classpath manually was tedious. Did I
miss something?

I'm still working on this but I'm out today and Monday, it will probably be
Wednesday before I make any more progress.

Andrew

On Fri, Jan 27, 2023 at 10:40 AM Talat Uyarer <tuya...@paloaltonetworks.com>
wrote:

> Hi Andrew,
>
> Yes This aligned also with my debugging. In My Kenn's reply you can see a
> sql test which I wrote in Calcite. Somehow Calcite does not have this issue
> with the 1.28 version.
>
> !use post
> !set outputformat mysql
>
> #Test aliases with with clause
> WITH tempTable(id, v) AS (select "hr"."emps"."empid" as id, 
> "hr"."emps"."name" as v from "hr"."emps")
> SELECT tempTable.id as id, tempTable.v as "value" FROM tempTable WHERE 
> tempTable.v <> '11' ;
> +-----+-----------+
> | ID  | value     |
> +-----+-----------+
> | 100 | Bill      |
> | 110 | Theodore  |
> | 150 | Sebastian |
> | 200 | Eric      |
> +-----+-----------+
> (4 rows)
>
> !ok
>
>
> On Wed, Jan 25, 2023 at 6:08 PM Andrew Pilloud <apill...@google.com>
> wrote:
>
>> Yes, that worked.
>>
>> The issue does not occur if I disable all of the following planner rules:
>> CoreRules.FILTER_CALC_MERGE, CoreRules.PROJECT_CALC_MERGE,
>> LogicalCalcMergeRule.INSTANCE (which wraps CoreRules.CALC_MERGE),
>> and BeamCalcMergeRule.INSTANCE (which wraps CoreRules.CALC_MERGE).
>>
>> All the rules share a common call to RexProgramBuilder.mergePrograms, so
>> I suspect the problem lies there. I spent some time looking but wasn't able
>> to find it by code inspection, it looks like this code path is doing the
>> right thing with names. I'll spend some time tomorrow trying to reproduce
>> this on pure Calcite.
>>
>> Andrew
>>
>>
>> On Tue, Jan 24, 2023 at 8:24 PM Talat Uyarer <
>> tuya...@paloaltonetworks.com> wrote:
>>
>>> Hi Andrew,
>>>
>>> Thanks for writing a test for this use case. Without Where clause it
>>> works as expected on our test cases also too. Please add where clause on
>>> second select. With the below query it does not return column names. I
>>> tested on my local also.
>>>
>>> WITH tempTable (id, v) AS (SELECT f_int as id, f_string as v FROM
>>> PCOLLECTION) SELECT id AS fout_int, v AS fout_string FROM tempTable WHERE
>>> id > 1
>>>
>>> Thanks
>>>
>>> On Tue, Jan 24, 2023 at 5:28 PM Andrew Pilloud <apill...@google.com>
>>> wrote:
>>>
>>>> +dev@beam.apache.org <dev@beam.apache.org>
>>>>
>>>> I tried reproducing this but was not successful, the output schema was
>>>> as expected. I added the following to BeamSqlMultipleSchemasTest.java at
>>>> head. (I did discover that  PAssert.that(result).containsInAnyOrder(output)
>>>> doesn't validate column names however.)
>>>>
>>>>   @Test
>>>>   public void testSelectAs() {
>>>>     PCollection<Row> input = pipeline.apply(create(row(1, "strstr")));
>>>>
>>>>     PCollection<Row> result =
>>>>         input.apply(SqlTransform.query("WITH tempTable (id, v) AS
>>>> (SELECT f_int as id, f_string as v FROM PCOLLECTION) SELECT id AS fout_int,
>>>> v AS fout_string FROM tempTable"));
>>>>
>>>>     Schema output_schema =
>>>>
>>>> Schema.builder().addInt32Field("fout_int").addStringField("fout_string").build();
>>>>     assertThat(result.getSchema(), equalTo(output_schema));
>>>>
>>>>     Row output = Row.withSchema(output_schema).addValues(1,
>>>> "strstr").build();
>>>>     PAssert.that(result).containsInAnyOrder(output);
>>>>     pipeline.run();
>>>>   }
>>>>
>>>> On Tue, Jan 24, 2023 at 8:13 AM Talat Uyarer <
>>>> tuya...@paloaltonetworks.com> wrote:
>>>>
>>>>> Hi Kenn,
>>>>>
>>>>> Thank you for replying back to my email.
>>>>>
>>>>> I was under the same impression about Calcite. But I wrote a test on
>>>>> Calcite 1.28 too. It is working without issue that I see on BEAM
>>>>>
>>>>> Here is my test case. If you want you can also run on Calcite. Please
>>>>> put under core/src/test/resources/sql as text file. and Run CoreQuidemTest
>>>>> class.
>>>>>
>>>>> !use post
>>>>> !set outputformat mysql
>>>>>
>>>>> #Test aliases with with clause
>>>>> WITH tempTable(id, v) AS (select "hr"."emps"."empid" as id, 
>>>>> "hr"."emps"."name" as v from "hr"."emps")
>>>>> SELECT tempTable.id as id, tempTable.v as "value" FROM tempTable WHERE 
>>>>> tempTable.v <> '11' ;
>>>>> +-----+-----------+
>>>>> | ID  | value     |
>>>>> +-----+-----------+
>>>>> | 100 | Bill      |
>>>>> | 110 | Theodore  |
>>>>> | 150 | Sebastian |
>>>>> | 200 | Eric      |
>>>>> +-----+-----------+
>>>>> (4 rows)
>>>>>
>>>>> !ok
>>>>>
>>>>>
>>>>> On Mon, Jan 23, 2023 at 10:16 AM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Looking at the code that turns a logical CalcRel into a BeamCalcRel I
>>>>>> do not see any obvious cause for this:
>>>>>> https://github.com/apache/beam/blob/b3aa2e89489898f8c760294ba4dba2310ac53e70/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamCalcRule.java#L69
>>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_beam_blob_b3aa2e89489898f8c760294ba4dba2310ac53e70_sdks_java_extensions_sql_src_main_java_org_apache_beam_sdk_extensions_sql_impl_rule_BeamCalcRule.java-23L69&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=BkW1L6EF7ergAVYDXCo-3Vwkpy6qjsWAz7_GD7pAR8g&m=KXc2qSceL6qFbFnQ_2qUOHr9mKuc6zYY8rJTNZC8p_wTcNs4M6mHQoCuoc4JfeaA&s=KjzplEf29oFB6uivvdjixpQiArWtfV-1SXpALL-ugEM&e=>
>>>>>>
>>>>>> I don't like to guess that upstream libraries have the bug, but in
>>>>>> this case I wonder if the alias is lost in the Calcite optimizer rule for
>>>>>> merging the projects and filters into a Calc.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Mon, Jan 23, 2023 at 10:13 AM Kenneth Knowles <k...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> I am not sure I understand the question, but I do see an issue.
>>>>>>>
>>>>>>> Context: "CalcRel" is an optimized relational operation that is
>>>>>>> somewhat like ParDo, with a small snippet of a single-assignment DSL
>>>>>>> embedded in it. Calcite will choose to merge all the projects and 
>>>>>>> filters
>>>>>>> into the node, and then generates Java bytecode to directly execute the 
>>>>>>> DSL.
>>>>>>>
>>>>>>> Problem: it looks like the CalcRel has output columns with aliases
>>>>>>> "id" and "v" where it should have output columns with aliases "id" and
>>>>>>> "value".
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Thu, Jan 19, 2023 at 6:01 PM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Adding: @Andrew Pilloud <apill...@google.com> @Kenneth Knowles
>>>>>>>> <k...@google.com>
>>>>>>>>
>>>>>>>> On Thu, Jan 12, 2023 at 12:31 PM Talat Uyarer via user <
>>>>>>>> u...@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I am using Beam 2.43 with Calcite SQL with Java.
>>>>>>>>>
>>>>>>>>> I have a query with a WITH clause and some aliasing. Looks like
>>>>>>>>> Beam Query optimizer after optimizing my query, it drops Select 
>>>>>>>>> statement's
>>>>>>>>> aliases. Can you help me to identify where the problem is ?
>>>>>>>>>
>>>>>>>>> This is my query
>>>>>>>>> INFO: SQL:
>>>>>>>>> WITH `tempTable` (`id`, `v`) AS (SELECT
>>>>>>>>> `PCOLLECTION`.`f_nestedRow`.`f_nestedInt` AS `id`,
>>>>>>>>> `PCOLLECTION`.`f_nestedRow`.`f_nestedString` AS `v`
>>>>>>>>> FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`) (SELECT
>>>>>>>>> `tempTable`.`id` AS `id`, `tempTable`.`v` AS `value`
>>>>>>>>> FROM `tempTable` AS `tempTable`
>>>>>>>>> WHERE `tempTable`.`v` <> '11')
>>>>>>>>>
>>>>>>>>> This is Calcite Plan look at LogicalProject(id=[$0], value=[$1])
>>>>>>>>> in SQL plan.
>>>>>>>>>
>>>>>>>>> Jan 12, 2023 12:19:08 PM
>>>>>>>>> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner 
>>>>>>>>> convertToBeamRel
>>>>>>>>> INFO: SQLPlan>
>>>>>>>>> LogicalProject(id=[$0], value=[$1])
>>>>>>>>>   LogicalFilter(condition=[<>($1, '11')])
>>>>>>>>>     LogicalProject(id=[$1.f_nestedInt], v=[$1.f_nestedString])
>>>>>>>>>       BeamIOSourceRel(table=[[beam, PCOLLECTION]])
>>>>>>>>>
>>>>>>>>> But Beam Plan does not have a LogicalProject(id=[$0], value=[$1])
>>>>>>>>> or similar.
>>>>>>>>>
>>>>>>>>> Jan 12, 2023 12:19:08 PM
>>>>>>>>> org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner 
>>>>>>>>> convertToBeamRel
>>>>>>>>> INFO: BEAMPlan>
>>>>>>>>> BeamCalcRel(expr#0..1=[{inputs}], expr#2=[$t1.f_nestedInt],
>>>>>>>>> expr#3=[$t1.f_nestedString], expr#4=['11':VARCHAR], expr#5=[<>($t3, 
>>>>>>>>> $t4)],
>>>>>>>>> id=[$t2], v=[$t3], $condition=[$t5])
>>>>>>>>>   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>

Reply via email to