[jira] [Commented] (CALCITE-6393) Byte code of SqlFunctions is invalid

2024-05-03 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843139#comment-17843139
 ] 

Ruben Q L commented on CALCITE-6393:


Thanks for creating the PR [~zabetak]!

[~MasseGuillaume], do you think this new check should verify both cases from 
your original code snippet (checkClass true/false), considering that each has 
revealed a different exception?
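
For reference, a minimal, hypothetical sketch of the two cases in question. This 
is only an illustration of remapping with and without CheckClassAdapter, not the 
actual reproducer from [2], and the name mapping below is made up:
{code}
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.commons.ClassRemapper;
import org.objectweb.asm.commons.SimpleRemapper;
import org.objectweb.asm.util.CheckClassAdapter;

public class RemapSqlFunctions {
  // Remaps a class, optionally verifying the visited code with CheckClassAdapter.
  static byte[] remap(byte[] classBytes, boolean checkClass) {
    ClassReader reader = new ClassReader(classBytes);
    ClassWriter writer = new ClassWriter(reader, 0);
    ClassVisitor target = checkClass ? new CheckClassAdapter(writer) : writer;
    // Illustrative mapping only: relocate SqlFunctions under a "shaded" package.
    ClassVisitor remapper = new ClassRemapper(target,
        new SimpleRemapper("org/apache/calcite/runtime/SqlFunctions",
            "shaded/org/apache/calcite/runtime/SqlFunctions"));
    reader.accept(remapper, ClassReader.EXPAND_FRAMES);
    return writer.toByteArray();
  }
}
{code}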

> Byte code of SqlFunctions is invalid
> 
>
> Key: CALCITE-6393
> URL: https://issues.apache.org/jira/browse/CALCITE-6393
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.36.0
>Reporter: Sergey Nuyanzin
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> The issue is a result of testing of Apache Calcite 1.37.0 rc 4 in this thread 
> [1]
> There is a test project and procedure provided by [~MasseGuillaume] [2] (see 
> also the original thread where this was first discussed [3]).
> It shows that since Calcite 1.36.0 it starts failing with
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: Index 65536 out of bounds for 
> length 297
> at org.objectweb.asm.ClassReader.readLabel(ClassReader.java:2695)
> at org.objectweb.asm.ClassReader.createLabel(ClassReader.java:2711)
> at 
> org.objectweb.asm.ClassReader.readTypeAnnotations(ClassReader.java:2777)
> at org.objectweb.asm.ClassReader.readCode(ClassReader.java:1929)
> at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
> at org.objectweb.asm.ClassReader.accept(ClassReader.java:745)
> {noformat}
> Also, since Calcite 1.27.0 it starts failing with
> {noformat}
> java.lang.IllegalArgumentException: Invalid end label (must be visited 
> first)
> at 
> org.objectweb.asm.util.CheckMethodAdapter.checkLabel(CheckMethodAdapter.java:1453)
> at 
> org.objectweb.asm.util.CheckMethodAdapter.visitLocalVariableAnnotation(CheckMethodAdapter.java:996)
> at 
> org.objectweb.asm.MethodVisitor.visitLocalVariableAnnotation(MethodVisitor.java:757)
> at 
> org.objectweb.asm.commons.MethodRemapper.visitLocalVariableAnnotation(MethodRemapper.java:257)
> at org.objectweb.asm.ClassReader.readCode(ClassReader.java:2614)
> at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
> {noformat}
> [1] https://lists.apache.org/thread/n6cs1l86mt6fc5q8pcxr97czs3p6w65f
> [2] https://github.com/MasseGuillaume/asm-remapper-bug
> [3] https://lists.apache.org/thread/o736wz4qnr4l285bj5gv073cy0qll9t0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6393) Byte code of SqlFunctions is invalid

2024-04-30 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6393:
---
Description: 
The issue is a result of testing of Apache Calcite 1.37.0 rc 4 in this thread 
[1]
There is a test project and procedure provided by [~MasseGuillaume] [2] (see also 
the original thread where this was first discussed [3]).

It shows that since Calcite 1.36.0 it starts failing with
{noformat}
java.lang.ArrayIndexOutOfBoundsException: Index 65536 out of bounds for 
length 297
at org.objectweb.asm.ClassReader.readLabel(ClassReader.java:2695)
at org.objectweb.asm.ClassReader.createLabel(ClassReader.java:2711)
at 
org.objectweb.asm.ClassReader.readTypeAnnotations(ClassReader.java:2777)
at org.objectweb.asm.ClassReader.readCode(ClassReader.java:1929)
at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
at org.objectweb.asm.ClassReader.accept(ClassReader.java:745)
{noformat}

Also, since Calcite 1.27.0 it starts failing with
{noformat}
java.lang.IllegalArgumentException: Invalid end label (must be visited 
first)
at 
org.objectweb.asm.util.CheckMethodAdapter.checkLabel(CheckMethodAdapter.java:1453)
at 
org.objectweb.asm.util.CheckMethodAdapter.visitLocalVariableAnnotation(CheckMethodAdapter.java:996)
at 
org.objectweb.asm.MethodVisitor.visitLocalVariableAnnotation(MethodVisitor.java:757)
at 
org.objectweb.asm.commons.MethodRemapper.visitLocalVariableAnnotation(MethodRemapper.java:257)
at org.objectweb.asm.ClassReader.readCode(ClassReader.java:2614)
at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
{noformat}
[1] https://lists.apache.org/thread/n6cs1l86mt6fc5q8pcxr97czs3p6w65f
[2] https://github.com/MasseGuillaume/asm-remapper-bug
[3] https://lists.apache.org/thread/o736wz4qnr4l285bj5gv073cy0qll9t0

  was:
The issue is a result of testing of Apache Calcite 1.37.0 rc 4 in this thread 
[1]
There is a test project and procedure provided by [~MasseGuillaume]

it shows that since Calcite 1.36.0 it starts failing as 
{noformat}
java.lang.ArrayIndexOutOfBoundsException: Index 65536 out of bounds for 
length 297
at org.objectweb.asm.ClassReader.readLabel(ClassReader.java:2695)
at org.objectweb.asm.ClassReader.createLabel(ClassReader.java:2711)
at 
org.objectweb.asm.ClassReader.readTypeAnnotations(ClassReader.java:2777)
at org.objectweb.asm.ClassReader.readCode(ClassReader.java:1929)
at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
at org.objectweb.asm.ClassReader.accept(ClassReader.java:745)
{noformat}

Also  since Calcite 1.27.0 it starts failing as 
{noformat}
java.lang.IllegalArgumentException: Invalid end label (must be visited 
first)
at 
org.objectweb.asm.util.CheckMethodAdapter.checkLabel(CheckMethodAdapter.java:1453)
at 
org.objectweb.asm.util.CheckMethodAdapter.visitLocalVariableAnnotation(CheckMethodAdapter.java:996)
at 
org.objectweb.asm.MethodVisitor.visitLocalVariableAnnotation(MethodVisitor.java:757)
at 
org.objectweb.asm.commons.MethodRemapper.visitLocalVariableAnnotation(MethodRemapper.java:257)
at org.objectweb.asm.ClassReader.readCode(ClassReader.java:2614)
at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
{noformat}
[1] https://lists.apache.org/thread/n6cs1l86mt6fc5q8pcxr97czs3p6w65f
[2] https://github.com/MasseGuillaume/asm-remapper-bug


> Byte code of SqlFunctions is invalid
> 
>
> Key: CALCITE-6393
> URL: https://issues.apache.org/jira/browse/CALCITE-6393
> Project: Calcite
>  Issue Type: Bug
>Reporter: Sergey Nuyanzin
>Priority: Major
>
> The issue is a result of testing of Apache Calcite 1.37.0 rc 4 in this thread 
> [1]
> There is a test project and procedure provided by [~MasseGuillaume] [2] (see 
> also the original thread where this was first discussed [3]).
> It shows that since Calcite 1.36.0 it starts failing with
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: Index 65536 out of bounds for 
> length 297
> at org.objectweb.asm.ClassReader.readLabel(ClassReader.java:2695)
> at org.objectweb.asm.ClassReader.createLabel(ClassReader.java:2711)
> at 
> org.objectweb.asm.ClassReader.readTypeAnnotations(ClassReader.java:2777)
> at org.objectweb.asm.ClassReader.readCode(ClassReader.java:1929)
> at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
> at org.objectweb.asm.ClassReader.accept(ClassReader.java:745)
> {noformat}
> Also, since Calcite 1.27.0 it starts failing with
> {noformat}
> java.lang.IllegalArgumentException: Invalid end label (must be visited 
> first)
> at 
> 

[jira] [Commented] (CALCITE-6393) Byte code of SqlFunctions is invalid

2024-04-30 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842388#comment-17842388
 ] 

Ruben Q L commented on CALCITE-6393:


Thanks [~Sergey Nuyanzin] for creating the ticket and describing the issue.
{quote}... by the way ArrayIndexOutOfBoundsException is only if it was built 
with jdk1.8, in case it was built with other jdk e.g. 11+ then there will 
another Exception
{quote}
What other exception do you get with jdk11+?

Correct me if I'm wrong, [~MasseGuillaume], but the main blocking issue is #1: 
this problem prevents ASM from processing Calcite's class files correctly. Issue 
#2 does not happen "by default", since it requires the use of CheckClassAdapter 
(but it is still a problem that needs to be investigated).

> Byte code of SqlFunctions is invalid
> 
>
> Key: CALCITE-6393
> URL: https://issues.apache.org/jira/browse/CALCITE-6393
> Project: Calcite
>  Issue Type: Bug
>Reporter: Sergey Nuyanzin
>Priority: Major
>
> The issue is a result of testing of Apache Calcite 1.37.0 rc 4 in this thread 
> [1]
> There is a test project and procedure provided by [~MasseGuillaume].
> It shows that since Calcite 1.36.0 it starts failing with
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: Index 65536 out of bounds for 
> length 297
> at org.objectweb.asm.ClassReader.readLabel(ClassReader.java:2695)
> at org.objectweb.asm.ClassReader.createLabel(ClassReader.java:2711)
> at 
> org.objectweb.asm.ClassReader.readTypeAnnotations(ClassReader.java:2777)
> at org.objectweb.asm.ClassReader.readCode(ClassReader.java:1929)
> at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
> at org.objectweb.asm.ClassReader.accept(ClassReader.java:745)
> {noformat}
> Also, since Calcite 1.27.0 it starts failing with
> {noformat}
> java.lang.IllegalArgumentException: Invalid end label (must be visited 
> first)
> at 
> org.objectweb.asm.util.CheckMethodAdapter.checkLabel(CheckMethodAdapter.java:1453)
> at 
> org.objectweb.asm.util.CheckMethodAdapter.visitLocalVariableAnnotation(CheckMethodAdapter.java:996)
> at 
> org.objectweb.asm.MethodVisitor.visitLocalVariableAnnotation(MethodVisitor.java:757)
> at 
> org.objectweb.asm.commons.MethodRemapper.visitLocalVariableAnnotation(MethodRemapper.java:257)
> at org.objectweb.asm.ClassReader.readCode(ClassReader.java:2614)
> at org.objectweb.asm.ClassReader.readMethod(ClassReader.java:1515)
> {noformat}
> [1] https://lists.apache.org/thread/n6cs1l86mt6fc5q8pcxr97czs3p6w65f
> [2] https://github.com/MasseGuillaume/asm-remapper-bug



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6390) ArrowAdapterTest fails on Windows

2024-04-29 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841925#comment-17841925
 ] 

Ruben Q L commented on CALCITE-6390:


+1 on [~zabetak]'s idea.
Would it make sense to cancel the vote for 1.37 RC3, merge the PR and create a 
new RC?

> ArrowAdapterTest fails on Windows
> -
>
> Key: CALCITE-6390
> URL: https://issues.apache.org/jira/browse/CALCITE-6390
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> -That seems to somehow highlight the difference between Windows Server and 
> non Server-
> -we have tests against Windows Server on GHA (windows-latest) and they are 
> green-
> -At the same time, local tests on Windows 11 show that {{ArrowAdapterTest}} 
> fails like-
> Based on deeper analysis, the Arrow module was never tested on Windows, since 
> the Windows configuration on GHA uses {{--exclude-task :arrow:build}}, which 
> skips the tests for this module 
> https://github.com/apache/calcite/blob/aa8d81bf1ff39e3632aeb856fc4cc247ce9727e5/.github/workflows/main.yml#L110C60-L110C88
> Any attempt to test it leads to
> {noformat}
> FAILURE   0.0sec, org.apache.calcite.adapter.arrow.ArrowAdapterTest > 
> executionError
>     java.io.IOException: Failed to delete temp directory 
> D:\MyConfiguration\cancai.cai\AppData\Local\Temp\junit5105379620525559011. 
> The following paths could not be deleted (see suppressed exceptions for 
> details): , arrow
>         at 
> org.junit.jupiter.engine.extension.TempDirectory$CloseablePath.createIOExceptionWithAttachedFailures(TempDirectory.java:350)
>         at 
> org.junit.jupiter.engine.extension.TempDirectory$CloseablePath.close(TempDirectory.java:251)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.lambda$closeAllStoredCloseableValues$3(ExtensionValuesStore.java:68)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.ArrayList.forEach(ArrayList.java:1259)
>         at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:390)
>         at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
>         at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483)
>         at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>         at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>         at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>         at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.closeAllStoredCloseableValues(ExtensionValuesStore.java:68)
>         at 
> org.junit.jupiter.engine.descriptor.AbstractExtensionContext.close(AbstractExtensionContext.java:80)
>         at 
> org.junit.jupiter.engine.execution.JupiterEngineExecutionContext.close(JupiterEngineExecutionContext.java:53)
>         at 
> org.junit.jupiter.engine.descriptor.JupiterTestDescriptor.cleanUp(JupiterTestDescriptor.java:222)
>         at 
> org.junit.jupiter.engine.descriptor.JupiterTestDescriptor.cleanUp(JupiterTestDescriptor.java:57)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$cleanUp$10(NodeTestTask.java:167)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.cleanUp(NodeTestTask.java:167)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:98)
>         at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
>         at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService.invokeAll(ForkJoinPoolHierarchicalTestExecutorService.java:129)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
>         at 
> 

[jira] [Commented] (CALCITE-6390) ArrowAdapterTest fails on Windows

2024-04-29 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841912#comment-17841912
 ] 

Ruben Q L commented on CALCITE-6390:


[~Sergey Nuyanzin] thanks for investigating this. 
Wouldn't it be more precise to just exclude {{arrow:test}}?

> ArrowAdapterTest fails on Windows
> -
>
> Key: CALCITE-6390
> URL: https://issues.apache.org/jira/browse/CALCITE-6390
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Sergey Nuyanzin
>Priority: Major
>
> -That seems to somehow highlight the difference between Windows Server and 
> non Server-
> -we have tests against Windows Server on GHA (windows-latest) and they are 
> green-
> -At the same time, local tests on Windows 11 show that {{ArrowAdapterTest}} 
> fails like-
> Based on deeper analysis, the Arrow module was never tested on Windows, since 
> the Windows configuration on GHA uses {{--exclude-task :arrow:build}}, which 
> skips the tests for this module 
> https://github.com/apache/calcite/blob/aa8d81bf1ff39e3632aeb856fc4cc247ce9727e5/.github/workflows/main.yml#L110C60-L110C88
> Any attempt to test it leads to
> {noformat}
> FAILURE   0.0sec, org.apache.calcite.adapter.arrow.ArrowAdapterTest > 
> executionError
>     java.io.IOException: Failed to delete temp directory 
> D:\MyConfiguration\cancai.cai\AppData\Local\Temp\junit5105379620525559011. 
> The following paths could not be deleted (see suppressed exceptions for 
> details): , arrow
>         at 
> org.junit.jupiter.engine.extension.TempDirectory$CloseablePath.createIOExceptionWithAttachedFailures(TempDirectory.java:350)
>         at 
> org.junit.jupiter.engine.extension.TempDirectory$CloseablePath.close(TempDirectory.java:251)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.lambda$closeAllStoredCloseableValues$3(ExtensionValuesStore.java:68)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
>         at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.ArrayList.forEach(ArrayList.java:1259)
>         at java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:390)
>         at java.util.stream.Sink$ChainedReference.end(Sink.java:258)
>         at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483)
>         at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>         at 
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
>         at 
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
>         at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at 
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
>         at 
> org.junit.jupiter.engine.execution.ExtensionValuesStore.closeAllStoredCloseableValues(ExtensionValuesStore.java:68)
>         at 
> org.junit.jupiter.engine.descriptor.AbstractExtensionContext.close(AbstractExtensionContext.java:80)
>         at 
> org.junit.jupiter.engine.execution.JupiterEngineExecutionContext.close(JupiterEngineExecutionContext.java:53)
>         at 
> org.junit.jupiter.engine.descriptor.JupiterTestDescriptor.cleanUp(JupiterTestDescriptor.java:222)
>         at 
> org.junit.jupiter.engine.descriptor.JupiterTestDescriptor.cleanUp(JupiterTestDescriptor.java:57)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$cleanUp$10(NodeTestTask.java:167)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.cleanUp(NodeTestTask.java:167)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:98)
>         at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService$ExclusiveTask.compute(ForkJoinPoolHierarchicalTestExecutorService.java:185)
>         at 
> org.junit.platform.engine.support.hierarchical.ForkJoinPoolHierarchicalTestExecutorService.invokeAll(ForkJoinPoolHierarchicalTestExecutorService.java:129)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
>         at 
> org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
>         at 
> org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
>         at 
> org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
>         at 
> 

[jira] [Commented] (CALCITE-6385) LintTest fails when run in source distribution

2024-04-24 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840408#comment-17840408
 ] 

Ruben Q L commented on CALCITE-6385:


Nice catch [~zabetak]! 
To fix this we basically need to add
{code}
assumeTrue(TestUnsafe.haveGit(), "Invalid git environment");
{code}
at the beginning of both tests, right?
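
For reference, a minimal sketch of how that could look, assuming JUnit 5's 
Assumptions API and the existing TestUnsafe helper (the actual test bodies are 
elided here):
{code}
import static org.junit.jupiter.api.Assumptions.assumeTrue;

import org.apache.calcite.util.TestUnsafe;
import org.junit.jupiter.api.Test;

class LintTestSketch {
  @Test void testContributorsFileIsSorted() {
    // Skip (rather than fail) when we are not inside a git checkout,
    // e.g. when running from the extracted source distribution.
    assumeTrue(TestUnsafe.haveGit(), "Invalid git environment");
    // ... existing test body ...
  }

  @Test void testMailmapFile() {
    assumeTrue(TestUnsafe.haveGit(), "Invalid git environment");
    // ... existing test body ...
  }
}
{code}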

> LintTest fails when run in source distribution
> --
>
> Key: CALCITE-6385
> URL: https://issues.apache.org/jira/browse/CALCITE-6385
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.36.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> Steps to reproduce:
> # Download 
> [https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-1.37.0-rc0/apache-calcite-1.37.0-src.tar.gz]
> # tar -xvf apache-calcite-1.37.0-src.tar.gz
> # cd apache-calcite-1.37.0-src
> #  /opt/gradle/gradle-7.6.1/bin/gradle cleanTest :core:test --tests LintTest
> {noformat}
> FAILURE   0.1sec, org.apache.calcite.test.LintTest > 
> testContributorsFileIsSorted()
> java.lang.RuntimeException: command [git, ls-files, *.bat, *.cmd, *.csv, 
> *.fmpp, *.ftl, *.iq, *.java, *.json, *.jj, *.kt, *.kts, .mailmap, *.md, 
> *.properties, *.sh, *.sql, *.txt, *.xml, *.yaml, *.yml]: failed with exception
> at org.apache.calcite.util.TestUnsafe.getGitFiles(TestUnsafe.java:185)
> at 
> org.apache.calcite.util.TestUnsafe.getTextFiles(TestUnsafe.java:158)
> at 
> org.apache.calcite.test.LintTest.testContributorsFileIsSorted(LintTest.java:400)
> Caused by: java.lang.RuntimeException: command [git, ls-files, *.bat, 
> *.cmd, *.csv, *.fmpp, *.ftl, *.iq, *.java, *.json, *.jj, *.kt, *.kts, 
> .mailmap, *.md, *.properties, *.sh, *.sql, *.txt, *.xml, *.yaml, *.yml]: 
> exited with status 128
> at 
> org.apache.calcite.util.TestUnsafe.getGitFiles(TestUnsafe.java:180)
> ... 2 more
> FAILURE   0.0sec, org.apache.calcite.test.LintTest > testMailmapFile()
> java.lang.RuntimeException: command [git, ls-files, *.bat, *.cmd, *.csv, 
> *.fmpp, *.ftl, *.iq, *.java, *.json, *.jj, *.kt, *.kts, .mailmap, *.md, 
> *.properties, *.sh, *.sql, *.txt, *.xml, *.yaml, *.yml]: failed with exception
> at org.apache.calcite.util.TestUnsafe.getGitFiles(TestUnsafe.java:185)
> at 
> org.apache.calcite.util.TestUnsafe.getTextFiles(TestUnsafe.java:158)
> at org.apache.calcite.test.LintTest.testMailmapFile(LintTest.java:419)
> Caused by: java.lang.RuntimeException: command [git, ls-files, *.bat, 
> *.cmd, *.csv, *.fmpp, *.ftl, *.iq, *.java, *.json, *.jj, *.kt, *.kts, 
> .mailmap, *.md, *.properties, *.sh, *.sql, *.txt, *.xml, *.yaml, *.yml]: 
> exited with status 128
> at 
> org.apache.calcite.util.TestUnsafe.getGitFiles(TestUnsafe.java:180)
> ... 2 more
> FAILURE   0.1sec,6 completed,   2 failed,   2 skipped, 
> org.apache.calcite.test.LintTest
> {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-17 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838044#comment-17838044
 ] 

Ruben Q L commented on CALCITE-6265:


Thanks for taking care of the merge [~mbudiu]! I confirm my project's tests are 
back to stable with the latest Calcite main.

> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-15 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836560#comment-17836560
 ] 

Ruben Q L edited comment on CALCITE-6265 at 4/15/24 3:10 PM:
-

I have [opened a PR|https://github.com/apache/calcite/pull/3758] to fix the 
problems (partially reverting some of the original changes, partially making 
some adjustments):
- It seems the original issue could be solved more easily in RexToLixTranslator, 
by simply converting to Number and then to the storageType in the case of 
numeric types (a small sketch of the idea follows at the end of this comment).
- All the bind* tests that were added in the first PR still work, plus a new 
test that I added to illustrate the issues that I found (with setBigDecimal).
- The problems that I detected in my downstream project are also fixed.
- The only thing missing from the original solution (I didn't have the time to 
dig deeper) is the "check for overflow and throw". I have left the auxiliary 
method that generates this dynamic code (Expressions#convertChecked) and left a 
TODO on EnumUtils#convert for future work. I think this was not part of the 
strict scope of the current ticket's description, so IMHO it would be 
acceptable to open a separate ticket and work on that in the future, adding 
more thorough tests in this regard (and not just the one, 
JdbcTest#bindOverflowingTinyIntParameter, that was originally added, which I 
disabled on my branch). UPDATE: created CALCITE-6366 for this purpose.

I'd appreciate some feedback, and to know whether you think I can move on and 
merge my proposal.
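
To make the first bullet more concrete, here is a hypothetical sketch of the 
"go through Number, then narrow to the storage type" idea. It is not the actual 
RexToLixTranslator change in the PR (which emits equivalent logic as generated 
linq4j expressions), and the helper name is made up:
{code}
// Illustration only: convert a prepared-statement parameter to the storage type.
static Object convertNumeric(Object value, Class<?> storageType) {
  if (value == null) {
    return null;
  }
  Number number = (Number) value;  // parameters arrive as some Number subtype
  if (storageType == Integer.class || storageType == int.class) {
    return number.intValue();
  } else if (storageType == Long.class || storageType == long.class) {
    return number.longValue();
  } else if (storageType == Short.class || storageType == short.class) {
    return number.shortValue();
  } else if (storageType == java.math.BigDecimal.class) {
    // Keep the full precision: do not round-trip through longValue().
    return number instanceof java.math.BigDecimal
        ? number
        : new java.math.BigDecimal(number.toString());
  }
  return number;
}
{code}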



was (Author: rubenql):
I have [opened a PR|https://github.com/apache/calcite/pull/3758] to fix the 
problems (partially reverting some of the original changes, partially making 
some adjustments):
- It seems the original issue could be solved easier in RexToLixTranslator, by 
simply converting to Number and then to storageType in case of Numeric types.
- All the bind* tests that were added on the first PR still work; plus a new 
test that I added to illustrate the issues that I found (with setBigDecimal)
- The problems that I detected on my downstream project are also fixed.
- The only thing that is missing (didn't have the time to dig deeper) from the 
original solution is the "check for overflow and throw". I have left the 
auxiliary method that generates this dynamic code (Expressions#convertChecked), 
and left a TODO on EnumUtils#convert for future work. I think that this was not 
part of the strict scope of the current ticket's description, so IMHO it would 
be acceptable open a separate ticket and work on that in the future, adding 
more thorough tests on this regard (and not just the one 
JdbcTest#bindOverflowingTinyIntParameter that was originally added, which I 
disabled on my branch). 

I'd appreciate some feedback, and if you think I can move on and merge my 
proposal.


> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6366) Code generated by EnumUtils#convert should throw an exception if the target type is overflowed

2024-04-15 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-6366:
--

 Summary: Code generated by EnumUtils#convert should throw an 
exception if the target type is overflowed
 Key: CALCITE-6366
 URL: https://issues.apache.org/jira/browse/CALCITE-6366
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Ruben Q L


Code generated by EnumUtils#convert should throw an exception if the target 
type is overflowed (consider using Expressions#convertChecked)
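
For illustration, a hypothetical sketch of what such a checked narrowing could 
look like at runtime; this is not the code that EnumUtils#convert or 
Expressions#convertChecked actually emits, and the helper name is made up:
{code}
// Narrow a long to int, throwing on overflow instead of silently wrapping.
static int toIntChecked(long value) {
  if ((int) value != value) {
    throw new ArithmeticException("value " + value + " overflows INTEGER");
  }
  return (int) value;
}

// Example: toIntChecked(3_000_000_000L) throws ArithmeticException,
// whereas an unchecked (int) cast would silently yield -1294967296.
{code}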



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-12 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reopened CALCITE-6265:

  Assignee: Ruben Q L  (was: Tim Nieradzik)

> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-12 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836560#comment-17836560
 ] 

Ruben Q L edited comment on CALCITE-6265 at 4/12/24 11:16 AM:
--

I have [opened a PR|https://github.com/apache/calcite/pull/3758] to fix the 
problems (partially reverting some of the original changes, partially making 
some adjustments):
- It seems the original issue could be solved easier in RexToLixTranslator, by 
simply converting to Number and then to storageType in case of Numeric types.
- All the bind* tests that were added on the first PR still work; plus a new 
test that I added to illustrate the issues that I found (with setBigDecimal)
- The problems that I detected on my downstream project are also fixed.
- The only thing that is missing (didn't have the time to dig deeper) from the 
original solution is the "check for overflow and throw". I have left the 
auxiliary method that generates this dynamic code (Expressions#convertChecked), 
and left a TODO on EnumUtils#convert for future work. I think that this was not 
part of the strict scope of the current ticket's description, so IMHO it would 
be acceptable open a separate ticket and work on that in the future, adding 
more thorough tests on this regard (and not just the one 
JdbcTest#bindOverflowingTinyIntParameter that was originally added, which I 
disabled on my branch). 

I'd appreciate some feedback, and if you think I can move on and merge my 
proposal.



was (Author: rubenql):
I have [opened a PR|https://github.com/apache/calcite/pull/3758] to fix the 
problems (partially reverting some of the original changes, partially making 
some adjustments):
- It seems the original issue could be solved easier in RexToLixTranslator, by 
simply converting to Number and then to storageType in case of Numeric types.
- All the bind* tests that were added on the first PR still work; plus a new 
test that I added to illustrate the issues that I found (with setBigDecimal)
- The problems that I detected on my downstream project are also fixed.
- The only thing that is missing (didn't have the time to dig deeper) from the 
original solution is the "check for overflow and throw". I have left the 
auxiliary method that generates this dynamic code, and left a TODO on 
EnumUtils#convert for future work. I think that this was not part of the strict 
scope of the current ticket's description, so IMHO it would be ok open a 
separate ticket and work on that in the future, adding more thorough tests (and 
not just the one that was originally added, which I disabled on my branch). 

I'd appreciate some feedback, and if you think I can move on and merge my 
proposal.


> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Tim Nieradzik
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-12 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836560#comment-17836560
 ] 

Ruben Q L commented on CALCITE-6265:


I have [opened a PR|https://github.com/apache/calcite/pull/3758] to fix the 
problems (partially reverting some of the original changes, partially making 
some adjustments):
- It seems the original issue could be solved easier in RexToLixTranslator, by 
simply converting to Number and then to storageType in case of Numeric types.
- All the bind* tests that were added on the first PR still work; plus a new 
test that I added to illustrate the issues that I found (with setBigDecimal)
- The problems that I detected on my downstream project are also fixed.
- The only thing that is missing (didn't have the time to dig deeper) from the 
original solution is the "check for overflow and throw". I have left the 
auxiliary method that generates this dynamic code, and left a TODO on 
EnumUtils#convert for future work. I think that this was not part of the strict 
scope of the current ticket's description, so IMHO it would be ok open a 
separate ticket and work on that in the future, adding more thorough tests (and 
not just the one that was originally added, which I disabled on my branch). 

I'd appreciate some feedback, and if you think I can move on and merge my 
proposal.


> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Tim Nieradzik
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-12 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836464#comment-17836464
 ] 

Ruben Q L commented on CALCITE-6265:


I just found a simple test (to be added to JdbcTest.java), related to my case A, 
that would have worked fine before this commit but is now broken:
{code}
  @Test void bindDecimalParameter() {
    final String sql =
        "with cte as (select 2500.55 as val)"
            + "select * from cte where val = ?";

    CalciteAssert.hr()
        .query(sql)
        .consumesPreparedStatement(p -> {
          p.setBigDecimal(1, new BigDecimal("2500.55"));
        })
        .returnsUnordered("VAL=2500.55");
  }
{code}

I'll try to provide a PR with a fix...

> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Tim Nieradzik
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{{}INT{}}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {code}
> final String sql =
>     "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";
> CalciteAssert.hr()
>     .query(sql)
>     .consumesPreparedStatement(p -> {
>       p.setShort(1, (short) 100);
>       p.setShort(2, (short) 110);
>     })
>     .returnsUnordered("empid=100", "empid=110");
> {code}
> h2. Stack trace
> {noformat}
> java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-11 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836187#comment-17836187
 ] 

Ruben Q L edited comment on CALCITE-6265 at 4/11/24 2:06 PM:
-

FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems to be a real regression in Calcite, 
whereas B is arguably on my side; but I share this info because both could 
probably happen to other projects if we release the next version with this 
patch as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field; one row has the value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalField = ?
- I set the parameter value to DECIMAL1.
- Execute the query: it should return 1 row, but it returns nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not necessarily bring back the original value of myBigDecimal, since 
BigDecimal#longValue can lose information, and I guess this is what is 
happening here.
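
A quick standalone illustration of that information loss, using the value from 
the example above (the class name is hypothetical):
{code}
import java.math.BigDecimal;

public class LongValueLoss {
  public static void main(String[] args) {
    BigDecimal original = new BigDecimal("299792.45799985");
    // longValue() truncates the fractional part, so the round trip loses data.
    BigDecimal roundTripped = BigDecimal.valueOf(original.longValue());
    System.out.println(original);      // 299792.45799985
    System.out.println(roundTripped);  // 299792
    System.out.println(original.compareTo(roundTripped) == 0);  // false
  }
}
{code}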

B) The issue with the ClassCastException (BigDecimal cannot be cast to Integer) 
is also related to the new casting in the dynamic code, but it is a bit more 
subtle, and it seems related to some internal hoisting that we have in place in 
my project (transforming literals into parameters to increase the chances of 
internal cache hits when converting a logical plan into a physical plan). The 
thing is, in the literal=>parameter conversion, we use 
[RexLiteral#getValue3|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/rex/RexLiteral.java#L989]
 (which, as stated in its javadoc, returns "the value of this literal, in the 
form that the rex-to-lix translator wants it") to extract the value (and use it 
as the parameter value). So far this worked like a charm. But with this change, 
the ClassCastException arises because getValue3 of a RexLiteral with type = 
BasicSqlType INTEGER actually returns the value as a BigDecimal (as it is 
stored in the RexLiteral#value field), because its typeName is 
SqlTypeName#DECIMAL; by the way, this RexLiteral is created by the standard 
behavior we get when creating a RexLiteral from an Integer via Calcite's 
[RelBuilder#literal(Integer value)|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L481].

PS: Also, I know I'm late to the party here, but one could argue whether this 
bug is _really_ a bug; I mean, if there is an INT field, then the user should 
use the appropriate [setInt method, and not 
setShort|https://github.com/apache/calcite-avatica/blob/275a082d364e00a9f2e9d50a4468fef0782e3c81/core/src/main/java/org/apache/calcite/avatica/AvaticaSite.java#L93].
 I agree that having an exception at runtime in the dynamic code in this case 
is ugly, but maybe a check-and-fail could be added earlier? Or maybe it should 
be Avatica's responsibility to perform the necessary conversions (as 
[~julianhyde] suggested)?
Is this bug fixing an error on Calcite's side, which only happens in a certain 
scenario triggered by Avatica?






was (Author: rubenql):
FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalField = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets 

[jira] [Comment Edited] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-11 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836187#comment-17836187
 ] 

Ruben Q L edited comment on CALCITE-6265 at 4/11/24 1:47 PM:
-

FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalField = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not necessarily bring back the original value of myBigDecimal, since 
BigDecimal#longValue can lose information, and I guess this is what is 
happening here.

B) The issue with the ClassCastException (BigDecimal cannot be cast to Integer) 
is also related to the new casting in dynamic code, but it is a bit more 
subtle, and it seems related to some internal Hoisting that we have in place on 
my project (transforming literals into parameters to increase the chances of 
internal cache hits when converting a logical plan into physical plan). The 
thing is, in the literal=>parameter conversion, we use 
[RexLiteral#getValue3|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/rex/RexLiteral.java#L989]
 (which as stated on its javadoc returns "the value of this literal, in the 
form that the rex-to-lix translator wants it") to extract the value (and use it 
as parameter value). So far it worked like charm. But with this change, the 
ClassCastException arises because actually getValue3 of a RexLiteral with type 
= BasicSqlTypeINTEGER, returns the value as a BigDecimal (as it is stored in 
RexLiteral#value field), because its typeName is SqlTypeName#DECIMAL; btw this 
RexLiteral is created by the standard behavior that we get if we create a 
RexLiteral from an Integer via Calcite's 
[RelBuilder#literal(IntegerValue)|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L481].

 






was (Author: rubenql):
FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalField = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not 

[jira] [Comment Edited] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-11 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836187#comment-17836187
 ] 

Ruben Q L edited comment on CALCITE-6265 at 4/11/24 1:45 PM:
-

FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalField = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not necessarily bring back the original value of myBigDecimal, since 
BigDecimal#longValue can lose information, and I guess this is what is 
happening here.

B) The issue with the ClassCastException (BigDecimal cannot be cast to Integer) 
is also related to the new casting in dynamic code, but it is a bit more 
subtle, and it seems related to some internal Hoisting that we have in place on 
my project (transforming literals into parameters to increase the chances of 
internal cache hits when converting a logical plan into physical plan). The 
thing is, in the literal=>parameter conversion, we use 
[RexLiteral#getValue3|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/rex/RexLiteral.java#L989]
 (which as stated on its javadoc returns "the value of this literal, in the 
form that the rex-to-lix translator wants it") to extract the value (and use it 
as parameter value). So far it worked like charm. But with this change, the 
ClassCastException arises because actually getValue3 of a RexLiteral with type 
= BasicSqlTypeINTEGER, returns the value as a BigDecimal (as it is stored in 
RexLiteral#value field), because its typeName is SqlTypeName#DECIMAL; btw this 
RexLiteral is created by the standard behavior that we get if we create a 
RexLiteral from an Integer via Calcite's RelBuilder#literal(IntegerValue).

 






was (Author: rubenql):
FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalValue = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not necessarily bring back the original value of myBigDecimal, since 
BigDecimal#longValue can lose information, and I guess this is what is 
happening 

[jira] [Commented] (CALCITE-6265) Type coercion is failing for numeric values in prepared statements

2024-04-11 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836187#comment-17836187
 ] 

Ruben Q L commented on CALCITE-6265:


FYI, this fix caused a few regressions on my project:
A) one test that used to return 1 row now returns 0 rows
B) a couple of tests that used to work fine now fail with a ClassCastException 
on the dynamic code

After some investigation, I'd say A seems a real regression on Calcite, whereas 
B could be arguably on my side; but I share this info because probably both 
could happen to other projects if we release the next version with this patch 
as it is.

A) The problem with the test that no longer returns the expected 1 row is as 
follows:
- A table with a decimal field, there is one column with value {{BigDecimal 
DECIMAL1 = new BigDecimal("299792.45799985");}}
- A simple query: SELECT t.id FROM myTable t WHERE t.myDecimalValue = ?
- I set the parameter value to DECIMAL1 
- Execute the query, should get 1 row, it gets nothing.

The problem is that, before the patch, the dynamic code would simply "read" the 
parameter value from the root context and use it, [but now with the new code 
there is this cast logic into Number and back to 
BigDecimal|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/adapter/enumerable/RexToLixTranslator.java#L1396]
 (unnecessary in this case, since everything is BigDecimal), which leads to the 
following dynamic code:
{code}
(Number) root.get("?1") == null ? (java.math.BigDecimal) null : 
java.math.BigDecimal.valueOf(((Number) root.get("?1")).longValue())
{code}
The problem is that doing {{BigDecimal.valueOf(myBigDecimal.longValue())}} does 
not necessarily bring back the original value of myBigDecimal, since 
BigDecimal#longValue can lose information, and I guess this is what is 
happening here.
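
As a minimal, self-contained sketch (plain JDK, hypothetical class name) of why 
the round trip through {{longValue()}} breaks the equality predicate:
{code:java}
import java.math.BigDecimal;

public class LongValueRoundTrip {
  public static void main(String[] args) {
    BigDecimal original = new BigDecimal("299792.45799985");
    // Mirrors the generated expression: BigDecimal.valueOf(((Number) value).longValue())
    BigDecimal roundTripped = BigDecimal.valueOf(original.longValue());
    System.out.println(original);                         // 299792.45799985
    System.out.println(roundTripped);                     // 299792
    System.out.println(original.compareTo(roundTripped)); // != 0, so the '=' filter matches nothing
  }
}
{code}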

B) The issue with the ClassCastException (BigDecimal cannot be cast to Integer) 
is also related to the new casting in dynamic code, but it is a bit more 
subtle, and it seems related to some internal Hoisting that we have in place in 
my project (transforming literals into parameters to increase the chances of 
internal cache hits when converting a logical plan into a physical plan). The 
thing is, in the literal=>parameter conversion, we use 
[RexLiteral#getValue3|https://github.com/apache/calcite/blob/e7c1f0c9baecef62289facf9aaa03bdff950b938/core/src/main/java/org/apache/calcite/rex/RexLiteral.java#L989]
 (which, as stated in its javadoc, returns "the value of this literal, in the 
form that the rex-to-lix translator wants it") to extract the value (and use it 
as the parameter value). So far it worked like a charm. But with this change, 
the ClassCastException arises because getValue3 of a RexLiteral with type = 
BasicSqlType INTEGER actually returns the value as a BigDecimal (as it is 
stored in the RexLiteral#value field), because its typeName is 
SqlTypeName#DECIMAL; btw this RexLiteral is created by the standard behavior 
that we get if we create a 
RexLiteral from an Integer via Calcite's RelBuilder#literal(IntegerValue).
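
A tiny sketch (hypothetical values, not Calcite API) of the resulting mismatch: 
the hoisted parameter holds the BigDecimal returned by getValue3, while the 
generated code now casts it to Integer:
{code:java}
import java.math.BigDecimal;

public class HoistedParamMismatch {
  public static void main(String[] args) {
    // What getValue3 hands back for the INTEGER literal (stored internally as DECIMAL).
    Object hoistedValue = new BigDecimal(42);
    // What the generated code now expects for the INTEGER parameter.
    Integer i = (Integer) hoistedValue; // throws ClassCastException at runtime
    System.out.println(i);
  }
}
{code}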

 





> Type coercion is failing for numeric values in prepared statements
> --
>
> Key: CALCITE-6265
> URL: https://issues.apache.org/jira/browse/CALCITE-6265
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Tim Nieradzik
>Assignee: Tim Nieradzik
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Given a column of type {{INT}}. When providing a {{short}} value as a 
> placeholder in a prepared statement, a {{ClassCastException}} is thrown.
> h2. Test case
> {{final String sql =}}
> {{    "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)";}}
> {{CalciteAssert.hr()}}
> {{    .query(sql)}}
> {{    .consumesPreparedStatement(p -> {}}
> {{      p.setShort(1, (short) 100);}}
> {{      p.setShort(2, (short) 110);}}
> {{    })}}
> {{    .returnsUnordered("empid=100", "empid=110");}}
> h2. Stack trace
> {{java.lang.ClassCastException: class java.lang.Short cannot be cast to class 
> java.lang.Integer (java.lang.Short and java.lang.Integer are in module 
> java.base of loader 'bootstrap')}}
> {{    at Baz$1$1.moveNext(Unknown Source)}}
> {{    at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679)}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5390) RelDecorrelator throws NullPointerException

2024-04-10 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835683#comment-17835683
 ] 

Ruben Q L commented on CALCITE-5390:


Recently I stumbled upon this bug; in my case it's a LogicalFilter with 2 
RexSubQuery expressions using the same $cor0, which triggers the NPE when calling the 
Decorrelator. It's a far more complicated query than the ones already mentioned 
previously (so far I haven't been able to reproduce my issue with a 
minimalistic test), but I think the principle is the same.

I'm not sure trimming plays a role here (at least in my case I get the error 
without trimming the plan). I pretty much agree with [~libenchao]'s analysis 
here: it seems strange that the SubQueryRemoveRule generates a plan with two 
LogicalCorrelate using the same corr variable (it's exactly the same in my 
case). The plan looks odd, but... if I am not mistaken, if we skip the 
decorrelation, that odd plan seems to run fine (at least in my case, where we 
run the decorrelator within a try-catch, given that it's been traditionally a 
problematic module), and in case of exception, we simply go with the original 
(non-decorrelated) plan, and everything is fine. Actually, it seems that this 
"bizarre plan structure" of having 2 LogicalCorrelate with the same variable is 
not that uncommon: if we look inside {{RelOptRulesTest.xml}} we can see around 
10 occurrences of said pattern.

Moreover, it'd seem that this pattern is a necessary condition for the 
Decorrelator NPE, but not sufficient (according to my tests, some of these 
plans can be correctly decorrelated; it probably depends on which fields are 
accessed on each "instance" of the corr variable, as [~julianhyde] mentioned 
here).

To sum up, I agree with [~libenchao] that one potential solution could be 
modifying SubQueryRemoveRule to introduce different correlating variables 
instead of one in these cases (and that will probably make the Decorrelator 
happy). But, given that the current plan generated by SubQueryRemoveRule seems 
"odd but correct" (it gets executed correctly if we skip the Decorrelator and 
its potential NPE), I wonder if perhaps our efforts should rather focus on 
fixing the Decorrelator and making it work with this type of plan. I'm not sure 
which approach would be simpler, but I just wanted to mention it.

PS: I haven't looked in detail, but I feel that on the Decorrelator side, the 
problem (or at least one of them) might originate because its {{CorelMap}} 
contains just a Map for {{<CorrelationId, LogicalCorrelate>}}, and it would seem 
that in a case like this, a MultiMap would be required (two different 
LogicalCorrelate values having the same CorrelationId key)?
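
A small sketch of that last point (plain Strings in place of CorrelationId/RelNode, 
Guava Multimap, hypothetical class name): a Map silently keeps only the last 
LogicalCorrelate registered under {{$cor0}}, whereas a Multimap would keep both:
{code:java}
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;
import java.util.HashMap;
import java.util.Map;

public class CorelMapSketch {
  public static void main(String[] args) {
    // Plain Map: the second put overwrites the first entry for the same key.
    Map<String, String> mapCorToCorRel = new HashMap<>();
    mapCorToCorRel.put("$cor0", "LogicalCorrelate#1");
    mapCorToCorRel.put("$cor0", "LogicalCorrelate#2");
    System.out.println(mapCorToCorRel.get("$cor0")); // LogicalCorrelate#2 only

    // Multimap: both LogicalCorrelate entries sharing $cor0 are retained.
    Multimap<String, String> multimap = ArrayListMultimap.create();
    multimap.put("$cor0", "LogicalCorrelate#1");
    multimap.put("$cor0", "LogicalCorrelate#2");
    System.out.println(multimap.get("$cor0")); // [LogicalCorrelate#1, LogicalCorrelate#2]
  }
}
{code}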

> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Priority: Major
>
> The current query throws NullPointerException
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> 

[jira] [Resolved] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations in the presence of aliasing

2024-03-25 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6338.

Resolution: Fixed

Fixed via 
[{{d5b6b5c}}|https://github.com/apache/calcite/commit/d5b6b5c01ff17cac100a591a4cc4c9e83d4e1ace]

> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> Let us say we have a Project (or a Calc) that projects the following 
> expressions (notice that $2 will become $1 and $2 after the projection): $0, 
> $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>     ...
>     for (RelFieldCollation ifc : ic.getFieldCollations()) {
>       final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>       if (integers.isEmpty()) {
>         continue loop; // cannot do this collation
>       }
>       fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>     }
> {code}
> Because of this, the Project's collation will be [1 3], but there is also 
> another valid one ([2 3]), so the correct (complete) result should be: [1 3] 
> [2 3]
> This seems a minor problem, but it can be the root cause of more relevant 
> issues. For instance, at the moment I have a scenario (not so easy to 
> reproduce with a unit test) where a certain plan with a certain combination 
> of rules in a HepPlanner results in a StackOverflow due to 
> SortJoinTransposeRule being fired infinitely. The root cause is that, after 
> the first application, the rule does not detect that the Join's left input is 
> already sorted (due to the previous application of the rule), because there 
> is a "problematic" Project on it (that shows the problem described above), 
> which returns only one collation, whereas the second collation (the one being 
> discarded) is the Sort's collation, so it would be one that would prevent the 
> SortJoinTransposeRule from being re-applied over and over.
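
A rough sketch of the complete-collation idea (hypothetical helper, plain Java 
collections, not the actual fix applied in d5b6b5c): expand the input collation 
through all targets instead of only the first one, so [2, 3] yields both [1, 3] 
and [2, 3] for the projection above:
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class AllCollationsSketch {
  /** Expands an input collation (field indexes) into every projected collation,
   *  taking all targets per source field rather than just the first one. */
  static List<List<Integer>> expand(List<Integer> inputCollation,
      Map<Integer, List<Integer>> targets) {
    List<List<Integer>> result = new ArrayList<>();
    result.add(new ArrayList<>());
    for (Integer source : inputCollation) {
      List<Integer> candidates = targets.getOrDefault(source, List.of());
      if (candidates.isEmpty()) {
        return List.of(); // cannot map this collation at all
      }
      List<List<Integer>> next = new ArrayList<>();
      for (List<Integer> prefix : result) {
        for (Integer target : candidates) {
          List<Integer> extended = new ArrayList<>(prefix);
          extended.add(target);
          next.add(extended);
        }
      }
      result = next;
    }
    return result;
  }

  public static void main(String[] args) {
    // Projects $0, $2, $2, $3: source 2 maps to outputs 1 and 2, source 3 to output 3.
    Map<Integer, List<Integer>> targets =
        Map.of(0, List.of(0), 2, List.of(1, 2), 3, List.of(3));
    System.out.println(expand(Arrays.asList(2, 3), targets)); // [[1, 3], [2, 3]]
  }
}
{code}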



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations in the presence of aliasing

2024-03-22 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Description: 
{{RelMdCollation#project}} can return an incomplete list of collations.

Let us say we have a Project (or a Calc) that projects the following 
expressions (notice that $2 will become $1 and $2 after the projection): $0, 
$2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
    for (RelFieldCollation ifc : ic.getFieldCollations()) {
      final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
      if (integers.isEmpty()) {
        continue loop; // cannot do this collation
      }
      fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
    }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: [1 3] [2 
3]

This seems a minor problem, but it can be the root cause of more relevant 
issues. For instance, at the moment I have a scenario (not so easy to reproduce 
with a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflow due to SortJoinTransposeRule being fired 
infinitely. The root cause is that, after the first application, the rule does 
not detect that the Join's left input is already sorted (due to the previous 
application of the rule), because there is a "problematic" Project on it (that 
shows the problem described above), which returns only one collation, whereas 
the second collation (the one being discarded) is the Sort's collation, so it 
would be one that would prevent the SortJoinTransposeRule from being re-applied 
over and over.



  was:
{{RelMdCollation#project}} can return an incomplete list of collations.

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
    for (RelFieldCollation ifc : ic.getFieldCollations()) {
      final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
      if (integers.isEmpty()) {
        continue loop; // cannot do this collation
      }
      fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
    }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: [1 3] [2 
3]

This seems a minor problem, but it can be the root cause of more relevant 
issues. For instance, at the moment I have a scenario (not so easy to reproduce 
with a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflow due to SortJoinTransposeRule being fired 
infinitely. The root cause is that, after the first application, the rule does 
not detect that the Join's left input is already sorted (due to the previous 
application of the rule), because there is a "problematic" Project on it (that 
shows the problem described above), which returns only one collation, whereas 
the second collation (the one being discarded) is the Sort's collation, so it 
would be one that would prevent the SortJoinTransposeRule from being re-applied 
over and over.




> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list 

[jira] [Commented] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations in the presence of aliasing

2024-03-22 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829792#comment-17829792
 ] 

Ruben Q L commented on CALCITE-6338:


Thanks for the feedback [~julianhyde], I have improved the tests and refactored 
the implementation to make it simpler.

> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> Let us say we have a Project that projects the following expressions (notice 
> that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>     ...
>     for (RelFieldCollation ifc : ic.getFieldCollations()) {
>       final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>       if (integers.isEmpty()) {
>         continue loop; // cannot do this collation
>       }
>       fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>     }
> {code}
> Because of this, the Project's collation will be [1 3], but there is also 
> another valid one ([2 3]), so the correct (complete) result should be: [1 3] 
> [2 3]
> This seems a minor problem, but it can be the root cause of more relevant 
> issues. For instance, at the moment I have a scenario (not so easy to 
> reproduce with a unit test) where a certain plan with a certain combination 
> of rules in a HepPlanner results in a StackOverflow due to 
> SortJoinTransposeRule being fired infinitely. The root cause is that, after 
> the first application, the rule does not detect that the Join's left input is 
> already sorted (due to the previous application of the rule), because there 
> is a "problematic" Project on it (that shows the problem described above), 
> which returns only one collation, whereas the second collation (the one being 
> discarded) is the Sort's collation, so it would be one that would prevent the 
> SortJoinTransposeRule from being re-applied over and over.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations in the presence of aliasing

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Summary: RelMdCollation#project can return an incomplete list of collations 
in the presence of aliasing  (was: RelMdCollation#project can return an 
incomplete list of collations)

> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> Let us say we have a Project that projects the following expressions (notice 
> that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>     ...
>     for (RelFieldCollation ifc : ic.getFieldCollations()) {
>       final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>       if (integers.isEmpty()) {
>         continue loop; // cannot do this collation
>       }
>       fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>     }
> {code}
> Because of this, the Project's collation will be [1 3], but there is also 
> another valid one ([2 3]), so the correct (complete) result should be: [1 3] 
> [2 3]
> This seems a minor problem, but it can be the root cause of more relevant 
> issues. For instance, at the moment I have a scenario (not so easy to 
> reproduce with a unit test) where a certain plan with a certain combination 
> of rules in a HepPlanner results in a StackOverflow due to 
> SortJoinTransposeRule being fired infinitely. The root cause is that, after 
> the first application, the rule does not detect that the Join's left input is 
> already sorted (due to the previous application of the rule), because there 
> is a "problematic" Project on it (that shows the problem described above), 
> which returns only one collation, whereas the second collation (the one being 
> discarded) is the Sort's collation, so it would be one that would prevent the 
> SortJoinTransposeRule from being re-applied over and over.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Description: 
{{RelMdCollation#project}} can return an incomplete list of collations.

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
    for (RelFieldCollation ifc : ic.getFieldCollations()) {
      final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
      if (integers.isEmpty()) {
        continue loop; // cannot do this collation
      }
      fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
    }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: [1 3] [2 
3]

This seems a minor problem, but it can be the root cause of more relevant 
issues. For instance, at the moment I have a scenario (not so easy to reproduce 
with a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflow due to SortJoinTransposeRule being fired 
infinitely. The root cause is that, after the first application, the rule does 
not detect that the Join's left input is already sorted (due to the previous 
application of the rule), because there is a "problematic" Project on it (that 
shows the problem described above), which returns only one collation, whereas 
the second collation (the one being discarded) is the Sort's collation, so it 
would be one that would prevent the SortJoinTransposeRule from being re-applied 
over and over.



  was:
{{RelMdCollation#project}} can return an incomplete list of collations.

(I'll try to produce a unit test, for now I'll just describe the situation)

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
    for (RelFieldCollation ifc : ic.getFieldCollations()) {
      final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
      if (integers.isEmpty()) {
        continue loop; // cannot do this collation
      }
      fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
    }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: [1 3] [2 
3]

This seems a minor problem, but it can be the root cause of more relevant 
issues. For instance, at the moment I have a scenario (not so easy to reproduce 
with a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflow due to SortJoinTransposeRule being fired 
infinitely. The root cause is that, after the first application, the rule does 
not detect that the Join's left input is already sorted (due to the previous 
application of the rule), because there is a "problematic" Project on it (that 
shows the problem described above), which returns only one collation, whereas 
the second collation (the one being discarded) is the Sort's collation, so it 
would be one that would prevent the SortJoinTransposeRule from being re-applied 
over and over.




> RelMdCollation#project can return an incomplete list of collations
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete 

[jira] [Updated] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
---
Fix Version/s: 1.37.0

> RelMdCollation#project can return an incomplete list of collations
> --
>
> Key: CALCITE-6338
> URL: https://issues.apache.org/jira/browse/CALCITE-6338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.36.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> (I'll try to produce a unit test, for now I'll just describe the situation)
> Let us say we have a Project that projects the following expressions (notice 
> that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>     ...
>     for (RelFieldCollation ifc : ic.getFieldCollations()) {
>       final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>       if (integers.isEmpty()) {
>         continue loop; // cannot do this collation
>       }
>       fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>     }
> {code}
> Because of this, the Project's collation will be [1 3], but there is also 
> another valid one ([2 3]), so the correct (complete) result should be: [1 3] 
> [2 3]
> This seems a minor problem, but it can be the root cause of more relevant 
> issues. For instance, at the moment I have a scenario (not so easy to 
> reproduce with a unit test) where a certain plan with a certain combination 
> of rules in a HepPlanner results in a StackOverflow due to 
> SortJoinTransposeRule being fired infinitely. The root cause is that, after 
> the first application, the rule does not detect that the Join's left input is 
> already sorted (due to the previous application of the rule), because there 
> is a "problematic" Project on it (that shows the problem described above), 
> which returns only one collation, whereas the second collation (the one being 
> discarded) is the Sort's collation, so it would be one that would prevent the 
> SortJoinTransposeRule from being re-applied over and over.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6338) RelMdCollation#project can return an incomplete list of collations

2024-03-21 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-6338:
--

 Summary: RelMdCollation#project can return an incomplete list of 
collations
 Key: CALCITE-6338
 URL: https://issues.apache.org/jira/browse/CALCITE-6338
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.36.0
Reporter: Ruben Q L
Assignee: Ruben Q L


{{RelMdCollation#project}} can return an incomplete list of collations.

(I'll try to produce a unit test, for now I'll just describe the situation)

Let us say we have a Project that projects the following expressions (notice 
that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
The Project's input has collation [2, 3]
In order to calculate the Project's own collation, {{RelMdCollation#project}} 
will be called, and a MultiMap targets will be computed because, as in this 
case, a certain "source field" (e.g. 2) can have multiple project targets (e.g. 
1 and 2). However, when the collation is being computed, *only the first target 
will be considered* (and the rest will be discarded):
{code}
  public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
      RelNode input, List<? extends RexNode> projects) {
    ...
    for (RelFieldCollation ifc : ic.getFieldCollations()) {
      final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
      if (integers.isEmpty()) {
        continue loop; // cannot do this collation
      }
      fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
    }
{code}
Because of this, the Project's collation will be [1 3], but there is also 
another valid one ([2 3]), so the correct (complete) result should be: [1 3] [2 
3]

This seems a minor problem, but it can be the root cause of more relevant 
issues. For instance, at the moment I have a scenario (not so easy to reproduce 
with a unit test) where a certain plan with a certain combination of rules in a 
HepPlanner results in a StackOverflow due to SortJoinTransposeRule being fired 
infinitely. The root cause is that, after the first application, the rule does 
not detect that the Join's left input is already sorted (due to the previous 
application of the rule), because there is a "problematic" Project on it (that 
shows the problem described above), which returns only one collation, whereas 
the second collation (the one being discarded) is the Sort's collation, so it 
would be one that would prevent the SortJoinTransposeRule from being re-applied 
over and over.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6249) RelNode::estimatedRowCount should not be used in computeSelfCost

2024-02-19 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6249.

Resolution: Fixed

Done in 
[{{a19a954}}|https://github.com/apache/calcite/commit/a19a954738e3405f621843a9df71e51002e1609b]
 

Thanks [~kramerul] for the patch!

> RelNode::estimatedRowCount should not be used in computeSelfCost
> 
>
> Key: CALCITE-6249
> URL: https://issues.apache.org/jira/browse/CALCITE-6249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelNode::estimatedRowCount}} is still used in many places inside 
> {{computeSelfCost}}.
> To make it possible to implement the estimation of the row count 
> completely outside Calcite, these calls should be replaced by 
> {{mq.getRowCount(rel)}}.
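
A short sketch of that substitution (hypothetical helper class; assumes the usual 
Calcite imports) showing the call to avoid and its replacement:
{code:java}
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

final class RowCountCalls {
  private RowCountCalls() {}

  static double rowCountFor(RelNode rel, RelMetadataQuery mq) {
    // Discouraged inside computeSelfCost: bypasses the metadata query, so a
    // custom row-count handler registered by the caller is never consulted.
    // double rows = rel.estimateRowCount(mq);

    // Preferred: dispatches through RelMetadataQuery to RelMdRowCount or any
    // user-supplied handler in the metadata provider chain.
    Double rows = mq.getRowCount(rel);
    return rows == null ? 1d : rows;
  }
}
{code}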



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6249) RelNode::estimatedRowCount should not be used in computeSelfCost

2024-02-14 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6249:
---
Fix Version/s: 1.37.0

> RelNode::estimatedRowCount should not be used in computeSelfCost
> 
>
> Key: CALCITE-6249
> URL: https://issues.apache.org/jira/browse/CALCITE-6249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> {{RelNode::estimatedRowCount}} is still used in many places inside 
> {{computeSelfCost}}.
> To make it possible to implement the estimation of the row count 
> completely outside Calcite, these calls should be replaced by 
> {{mq.getRowCount(rel)}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6251) innerEnumerator in EnumerableDefaults::correlateBatchJoin is not closed

2024-02-12 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6251.

Fix Version/s: 1.37.0
   Resolution: Fixed

Fixed via 
[{{abf462a}}|https://github.com/apache/calcite/commit/abf462a7ddab11a44610827b0cd69510c099d9b6]

Thanks [~kramerul] for the patch!

> innerEnumerator in EnumerableDefaults::correlateBatchJoin is not closed
> ---
>
> Key: CALCITE-6251
> URL: https://issues.apache.org/jira/browse/CALCITE-6251
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.36.0
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> The 
> [innerEnumerator|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1681]
>  is only closed at the end. But if there are multiple loops, [innerEnumerator 
> is just assigned to a different value without closing 
> it|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1720].
> It should look like 
> [here|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1547-L1550].
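
A small sketch of the pattern the last link points at (plain Java, not the 
Enumerator API; names are hypothetical): close the previous inner resource before 
the variable is rebound in the loop, and still close the last one at the end:
{code:java}
import java.util.List;

final class CloseBeforeReassign {
  interface InnerEnumerator extends AutoCloseable {
    @Override void close();
  }

  static void consumeBatches(List<InnerEnumerator> batches) {
    InnerEnumerator inner = null;
    try {
      for (InnerEnumerator next : batches) {
        if (inner != null) {
          inner.close(); // the close that is missing when the loop runs more than once
        }
        inner = next;
        // ... iterate over 'inner' for the current batch ...
      }
    } finally {
      if (inner != null) {
        inner.close(); // the existing end-of-enumeration close
      }
    }
  }
}
{code}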



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6251) innerEnumerator in EnumerableDefaults::correlateBatchJoin is not closed

2024-02-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815294#comment-17815294
 ] 

Ruben Q L commented on CALCITE-6251:


Well spotted [~kramerul]! Would you open a PR to fix this?

> innerEnumerator in EnumerableDefaults::correlateBatchJoin is not closed
> ---
>
> Key: CALCITE-6251
> URL: https://issues.apache.org/jira/browse/CALCITE-6251
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.36.0
>Reporter: Ulrich Kramer
>Priority: Major
>
> The 
> [innerEnumerator|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1681]
>  is only closed at the end. But if there are multiple loops, [innerEnumerator 
> is just assigned to a different value without closing 
> it|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1720].
> It should look like 
> [here|https://github.com/apache/calcite/blob/f7069cc5245c22f816c565669f52b4f30b046f4d/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1547-L1550].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5647) RelMdPopulationSize should use mq.getRowCount(rel) instead of rel.estimateRowCount(mq)

2024-02-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815157#comment-17815157
 ] 

Ruben Q L edited comment on CALCITE-5647 at 2/7/24 8:33 AM:


Done in 
[{{f7069cc}}|https://github.com/apache/calcite/commit/f7069cc5245c22f816c565669f52b4f30b046f4d]
 

Thanks [~adamkennedy77] and [~jduong] for working on this!


was (Author: rubenql):
Done in 
[{{f7069cc}}|https://github.com/apache/calcite/commit/f7069cc5245c22f816c565669f52b4f30b046f4d]
 



> RelMdPopulationSize should use mq.getRowCount(rel) instead of 
> rel.estimateRowCount(mq)
> --
>
> Key: CALCITE-5647
> URL: https://issues.apache.org/jira/browse/CALCITE-5647
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Adam Kennedy
>Assignee: Adam Kennedy
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> A few locations in Calcite call rel.estimateRowCount(mq) when they should 
> instead call mq.getRowCount(rel).
> We detected this because we implemented row count estimation entirely within 
> an alternative handler instead of RelMdRowCount, and then overrode 
> estimateRowCount to ensure the custom handler is used, by throwing an 
> unreachable-code exception.
> A few places in Calcite trigger these unreachable exceptions because they do 
> not use mq.getRowCount.
> The most easily triggered one is in RelMdPopulationSize for the Values 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5647) RelMdPopulationSize should use mq.getRowCount(rel) instead of rel.estimateRowCount(mq)

2024-02-07 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5647:
---
Fix Version/s: 1.37.0

> RelMdPopulationSize should use mq.getRowCount(rel) instead of 
> rel.estimateRowCount(mq)
> --
>
> Key: CALCITE-5647
> URL: https://issues.apache.org/jira/browse/CALCITE-5647
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Adam Kennedy
>Assignee: Adam Kennedy
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> A few locations in Calcite call rel.estimateRowCount(mq) when they should 
> instead call mq.getRowCount(rel).
> We detected this because we implemented row count estimation entirely within 
> an alternative handler instead of RelMdRowCount, and then overrode 
> estimateRowCount to ensure the custom handler is used, by throwing an 
> unreachable-code exception.
> A few places in Calcite trigger these unreachable exceptions because they do 
> not use mq.getRowCount.
> The most easily triggered one is in RelMdPopulationSize for the Values 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5647) RelMdPopulationSize should use mq.getRowCount(rel) instead of rel.estimateRowCount(mq)

2024-02-07 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5647.

Resolution: Fixed

Done in 
[{{f7069cc}}|https://github.com/apache/calcite/commit/f7069cc5245c22f816c565669f52b4f30b046f4d]
 



> RelMdPopulationSize should use mq.getRowCount(rel) instead of 
> rel.estimateRowCount(mq)
> --
>
> Key: CALCITE-5647
> URL: https://issues.apache.org/jira/browse/CALCITE-5647
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Adam Kennedy
>Assignee: Adam Kennedy
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> A few locations in Calcite call rel.estimateRowCount(mq) when they should 
> instead call mq.getRowCount(rel).
> We detected this because we implemented row count estimation entirely within 
> an alternative handler instead of RelMdRowCount, and then overrode 
> estimateRowCount to ensure the custom handler is used, by throwing an 
> unreachable-code exception.
> A few places in Calcite trigger these unreachable exceptions because they do 
> not use mq.getRowCount.
> The most easily triggered one is in RelMdPopulationSize for the Values 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6249) RelNode::estimatedRowCount should not be used in computeSelfCost

2024-02-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815152#comment-17815152
 ] 

Ruben Q L commented on CALCITE-6249:


[~kramerul] I think the current ticket is related to CALCITE-5647, but not a 
duplicate of it (I have updated the link between both accordingly).
I think the commit that you propose seems reasonable, could you create the 
corresponding PR?

> RelNode::estimatedRowCount should not be used in computeSelfCost
> 
>
> Key: CALCITE-6249
> URL: https://issues.apache.org/jira/browse/CALCITE-6249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Minor
>
> {{RelNode::estimatedRowCount}} is still used in many places inside 
> {{computeSelfCost}}.
> To make it possible to implement the estimation of the row count 
> completely outside Calcite, these calls should be replaced by 
> {{mq.getRowCount(rel)}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5647) RelMdPopulationSize should use mq.getRowCount(rel) instead of rel.estimateRowCount(mq)

2024-02-05 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814239#comment-17814239
 ] 

Ruben Q L commented on CALCITE-5647:


[~jduong] done.

> RelMdPopulationSize should use mq.getRowCount(rel) instead of 
> rel.estimateRowCount(mq)
> --
>
> Key: CALCITE-5647
> URL: https://issues.apache.org/jira/browse/CALCITE-5647
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Adam Kennedy
>Assignee: Adam Kennedy
>Priority: Minor
>  Labels: pull-request-available
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> A few locations in Calcite call rel.estimateRowCount(mq) when they should 
> instead call mq.getRowCount(rel).
> We detected this because we implemented row count estimation entirely within 
> an alternative handler instead of RelMdRowCount, and then overrode 
> estimateRowCount to ensure the custom handler is used, by throwing an 
> unreachable-code exception.
> A few places in Calcite trigger these unreachable exceptions because they do 
> not use mq.getRowCount.
> The most easily triggered one is in RelMdPopulationSize for the Values 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5647) RelMdPopulationSize should use mq.getRowCount(rel) instead of rel.estimateRowCount(mq)

2024-02-05 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5647:
---
Summary: RelMdPopulationSize should use mq.getRowCount(rel) instead of 
rel.estimateRowCount(mq)  (was: RelMdPopulationSize calls rel.estimateRowCount 
instead of mq.getRowCount)

> RelMdPopulationSize should use mq.getRowCount(rel) instead of 
> rel.estimateRowCount(mq)
> --
>
> Key: CALCITE-5647
> URL: https://issues.apache.org/jira/browse/CALCITE-5647
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Adam Kennedy
>Assignee: Adam Kennedy
>Priority: Minor
>  Labels: pull-request-available
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> A few locations in Calcite call rel.estimateRowCount(mq) when they should 
> instead call mq.getRowCount(rel).
> We detected this because we implemented row count estimation entirely within 
> an alternative handler instead of RelMdRowCount, and then overrode 
> estimateRowCount to ensure the custom handler is used, by throwing an 
> unreachable-code exception.
> A few places in Calcite trigger these unreachable exceptions because they do 
> not use mq.getRowCount.
> The most easily triggered one is in RelMdPopulationSize for the Values 
> parameter.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6234) Add tests on SqlOperatorTest for to_char function

2024-02-02 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6234.

Resolution: Fixed

Done in  
[{{2aabf21}}|https://github.com/apache/calcite/commit/2aabf210dc1918c6ca20e63b39661ff445535eb8]

Thanks [~caicancai] for the patch!

> Add tests on SqlOperatorTest for to_char function
> -
>
> Key: CALCITE-6234
> URL: https://issues.apache.org/jira/browse/CALCITE-6234
> Project: Calcite
>  Issue Type: Test
>  Components: tests
>Affects Versions: 1.36.0
>Reporter: Caican Cai
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 1.37.0
>
>
> Since the implementations of pg's format_date and big_query's format_date 
> are the same but their usage is different, I will add the corresponding tests 
> to show this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813610#comment-17813610
 ] 

Ruben Q L edited comment on CALCITE-6236 at 2/2/24 10:37 AM:
-

{quote}I am afraid I am just adding negative karma to the discussion :)
{quote}
Of course not, [~zabetak], your comments are very relevant, as always :)

FYI [~kramerul], even if we don't reach an agreement here, as a workaround you 
can fix this issue in your project by creating your own {{MyRelMdRowCount}}, 
redefining there the rowCount computation for EBNLJ via a {{public Double 
getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq)}} with 
your preferred solution, and including {{MyRelMdRowCount}} in your metadata 
provider.


was (Author: rubenql):
{quote}I am afraid I am just adding negative karma to the discussion :)
{quote}
Of course not, [~zabetak], your comments are very relevant, as always :)

FYI [~kramerul], even if we don't reach an agreement here, as a workaround, you 
can fix this issue in your project by creating your own {{MyRelMdRowCount}}, 
redefining there the rowCount computation for EBNLJ via a {{public Double 
getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq)}} with 
your preferred solution, and including {{MyRelMdRowCount}} in your metadata 
provider.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813610#comment-17813610
 ] 

Ruben Q L commented on CALCITE-6236:


{quote}I am afraid I am just adding negative karma to the discussion :)
{quote}
Of course not, [~zabetak], your comments are very relevant, as always :)

FYI [~kramerul], even if we don't reach an agreement here, as a workaround, you 
can fix this issue in your project by creating your own {{MyRelMdRowCount}}, 
redefining there the rowCount computation for EBNLJ via a {{public Double 
getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq)}} with 
your preferred solution, and including {{MyRelMdRowCount}} in your metadata 
provider.
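
A hedged sketch of that workaround (the class body and the chosen estimate are 
purely illustrative, and registration with the metadata provider is omitted):
{code:java}
import org.apache.calcite.adapter.enumerable.EnumerableBatchNestedLoopJoin;
import org.apache.calcite.rel.metadata.RelMdRowCount;
import org.apache.calcite.rel.metadata.RelMdUtil;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

/** Project-side override; must be registered ahead of the default handlers
 *  in the cluster's metadata provider. */
public class MyRelMdRowCount extends RelMdRowCount {
  /** Overload picked up by reflection for the batch nested loop join. */
  public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
    // Example policy (assumption): estimate it like a plain join, ignoring the
    // correlate filter that was pushed into the right-hand side.
    return RelMdUtil.getJoinRowCount(mq, join, join.getCondition());
  }
}
{code}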

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813593#comment-17813593
 ] 

Ruben Q L commented on CALCITE-6236:


Thanks [~kramerul]. I also started to look into this idea, but unfortunately I 
have the impression it might not be so simple for the case of SEMI or ANTI 
join. You can take a look at the latest version of the [unit 
test|https://github.com/apache/calcite/pull/3661/commits/ea2b758aff14ce7c8ea213d245b455a931272bd0]
 in [PR#3661|https://github.com/apache/calcite/pull/3661]; I have the 
impression that your new PR would break that test.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-01 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813276#comment-17813276
 ] 

Ruben Q L commented on CALCITE-6236:


{quote}Rules create semantically equivalent plans. Someone could argue that 
equivalent means that they should have the same number of rows/costs.
{quote}
I'd argue that they would have the same rowCount, but can have different costs 
(e.g. a MergeJoin can have higher cost than its equivalent HashJoin, since the 
former requires the inputs to be sorted).

Circling back to the "correction factor" approach for EBNLJ, what if:
 - When creating the EBNLJ, we store the selectivity of the correlate filter 
upon the (original) RHS.
 - We know that, from that point on, the EBNLJ's new RHS will have its rowCount 
reduced due to the correlate filter that has been applied.
 - For the rowCount estimation of the EBNLJ, we can get back the original 
rowCount of the RHS by doing something like:
{code:java}
adjusted_rowCount_RHS = rowCount_RHS / selectivity_of_correlate_filter
{code}
And we use that adjustedRowCount in the computation of EBNLJ's rowCount?
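
Using the numbers from the cost example in the description as an illustration 
(the correlate filter cuts the RHS from 100 to 25 rows, i.e. selectivity 0.25):
{code}
adjusted_rowCount_RHS = 25 / 0.25 = 100   // back to the row count of the original RHS
{code}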

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-01 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813259#comment-17813259
 ] 

Ruben Q L edited comment on CALCITE-6236 at 2/1/24 2:40 PM:


[~zabetak] I agree with you that my solution is not a good pattern, I was just 
presenting it as an alternative solution vs the initial PR, knowing that it 
might not be acceptable :)

As [~kramerul] says, the Correlate is not a valid model, because in fact it 
suffers from the same problem (due to its correlate filter pushed down on its 
RHS): a Join (with rowCount X) that gets converted via JoinToCorrelateRule will 
result in a Correlate with rowCount Y (different from the original X), which 
does not make much sense, because the fact that a certain Join gets implemented 
in a Correlate fashion does not impact its rowCount.

We could go back to the solution of the original approach, which was trying to 
compute the rowCount by "subtracting" the correlate filter (but this can get 
tricky, especially if we have several nested EBNLJ/Correlate, and potentially 
different correlate filters merged into a single Filter operator).

What if, instead of storing the Join inside the EnumerableBatchNestedLoopJoin, 
we precompute and store just its rowCount (mq.getRowCount(join)), and we just 
return that value as the rowCount of the EnumerableBatchNestedLoopJoin? I guess 
it would be a valid assumption that, even if the inputs of the 
EnumerableBatchNestedLoopJoin undergo further transformations due to subsequent 
optimization rules, its rowCount should always be the same as the rowCount of 
the Join that originated it, shouldn't it? In fact, I'd argue that any 
EnumerableX operator that originates from a LogicalJoin should always return 
the same rowCount as said LogicalJoin (and currently this does not happen for 
EBNLJ/EnumerableCorrelate), or am I missing something?
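
A minimal sketch of what that could look like, in the spirit of a 
RelMdRowCount-style handler; the "getPrecomputedRowCount()" accessor is an 
assumption (it does not exist in Calcite today) standing in for a value that 
the rule would capture via mq.getRowCount(join) when it creates the EBNLJ:
{code:java}
import org.apache.calcite.adapter.enumerable.EnumerableBatchNestedLoopJoin;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

/**
 * Hedged sketch only; "getPrecomputedRowCount()" is a hypothetical accessor
 * for the row count captured from the original Join at rule-firing time.
 */
class PrecomputedRowCountHandlerSketch {
  public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
    // Return the captured estimate instead of re-deriving it from the
    // filter-reduced RHS that the EBNLJ now has as input.
    return join.getPrecomputedRowCount();  // hypothetical accessor, see above
  }
}
{code}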

UPDATE:
bq. EnumerableBatchNestedLoopJoin estimation could be multiplied by some 
constant factor (may be in correlation with the batch size) 
I think that might be a valid solution for EBNLJ (and Correlate) too, but 
possibly this constant factor should be related to the selectivity of the 
correlated filter introduced on the RHS.


was (Author: rubenql):
[~zabetak] I agree with you that my solution is not a good pattern, I was just 
presenting it as an alternative solution vs the initial PR, knowing that it 
might not be acceptable :)

As [~kramerul] says, the Correlate is not a valid model, because in fact it 
suffers from the same problem (due to its correlate filter pushed down on its 
RHS): a Join (with rowCount X), that gets converted via JoinToCorralteRule will 
result in a Correlate with rowCount Y (different from the original X), and this 
does not make too much sense, because the fact that a certain Join gets 
implemented in a Correlate fashion does not impact its rowCount.

We could go back to the solution of the original approach, which was trying to 
compute the rowCount by "subtracting" the correlate filter (but this can get 
tricky, especially if we have several nested EBNLJ/Correlate, and potentially 
different correlate filters merged into a single Filter operator).

What if, instead of storing the Join inside the EnumerableBatchNestedLoopJoin 
we precompute and store just its rowCount (mq.getRowCount(join))? And we just 
return that value as rowCount of the EnumerableBatchNestedLoopJoin ? I guess it 
would be a valid assumption that, even if the inputss of the 
EnumerableBatchNestedLoopJoin have further transformations due to subsequent 
optimization rules, its rowCount should always be the same as the rowCount of 
the Join that originated it, shouldn't it? In fact, I'd argue that any 
EnumerableX operator that is originated from a LogicalJoin should always return 
the same rowCount as said LogicalJoin (and currently this does not happen for 
EBNLJ/EnumerableCorrelate), or am I missing something?


> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 

[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-01 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813259#comment-17813259
 ] 

Ruben Q L edited comment on CALCITE-6236 at 2/1/24 2:40 PM:


[~zabetak] I agree with you that my solution is not a good pattern, I was just 
presenting it as an alternative solution vs the initial PR, knowing that it 
might not be acceptable :)

As [~kramerul] says, the Correlate is not a valid model, because in fact it 
suffers from the same problem (due to its correlate filter pushed down on its 
RHS): a Join (with rowCount X) that gets converted via JoinToCorrelateRule will 
result in a Correlate with rowCount Y (different from the original X), which 
does not make much sense, because the fact that a certain Join gets implemented 
in a Correlate fashion does not impact its rowCount.

We could go back to the solution of the original approach, which was trying to 
compute the rowCount by "subtracting" the correlate filter (but this can get 
tricky, especially if we have several nested EBNLJ/Correlate, and potentially 
different correlate filters merged into a single Filter operator).

What if, instead of storing the Join inside the EnumerableBatchNestedLoopJoin, 
we precompute and store just its rowCount (mq.getRowCount(join)), and we just 
return that value as the rowCount of the EnumerableBatchNestedLoopJoin? I guess 
it would be a valid assumption that, even if the inputs of the 
EnumerableBatchNestedLoopJoin undergo further transformations due to subsequent 
optimization rules, its rowCount should always be the same as the rowCount of 
the Join that originated it, shouldn't it? In fact, I'd argue that any 
EnumerableX operator that originates from a LogicalJoin should always return 
the same rowCount as said LogicalJoin (and currently this does not happen for 
EBNLJ/EnumerableCorrelate), or am I missing something?

UPDATE:
bq. EnumerableBatchNestedLoopJoin estimation could be multiplied by some 
constant factor (may be in correlation with the batch size) 
I think that might be a valid solution for EBNLJ (and Correlate) too, but 
possibly this constant factor should be related to the selectivity of the 
correlated filter introduced on the RHS.


was (Author: rubenql):
[~zabetak] I agree with you that my solution is not a good pattern, I was just 
presenting it as an alternative solution vs the initial PR, knowing that it 
might not be acceptable :)

As [~kramerul] says, the Correlate is not a valid model, because in fact it 
suffers from the same problem (due to its correlate filter pushed down on its 
RHS): a Join (with rowCount X), that gets converted via JoinToCorralteRule will 
result in a Correlate with rowCount Y (different from the original X), and this 
does not make too much sense, because the fact that a certain Join gets 
implemented in a Correlate fashion does not impact its rowCount.

We could go back to the solution of the original approach, which was trying to 
compute the rowCount by "subtracting" the correlate filter (but this can get 
tricky, especially if we have several nested EBNLJ/Correlate, and potentially 
different correlate filters merged into a single Filter operator).

What if, instead of storing the Join inside the EnumerableBatchNestedLoopJoin 
we precompute and store just its rowCount (mq.getRowCount(join))? And we just 
return that value as rowCount of the EnumerableBatchNestedLoopJoin ? I guess it 
would be a valid assumption that, even if the inputss of the 
EnumerableBatchNestedLoopJoin have further transformations due to subsequent 
optimization rules, its rowCount should always be the same as the rowCount of 
the Join that originated it, shouldn't it? In fact, I'd argue that any 
EnumerableX operator that is originated from a LogicalJoin should always return 
the same rowCount as said LogicalJoin (and currently this does not happen for 
EBNLJ/EnumerableCorrelate), or am I missing something?

UPDATE:
bq. EnumerableBatchNestedLoopJoin estimation could be multiplied by some 
constant factor (may be in correlation with the batch size) 
I think that might be a valid solution for EBNLJ (and Correlate) too, but 
possibly this constant facto should be related to the selectivity of the 
correlated filter introduced on the RHS.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, 

[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-02-01 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17813259#comment-17813259
 ] 

Ruben Q L commented on CALCITE-6236:


[~zabetak] I agree with you that my solution is not a good pattern, I was just 
presenting it as an alternative solution vs the initial PR, knowing that it 
might not be acceptable :)

As [~kramerul] says, the Correlate is not a valid model, because in fact it 
suffers from the same problem (due to its correlate filter pushed down on its 
RHS): a Join (with rowCount X) that gets converted via JoinToCorrelateRule will 
result in a Correlate with rowCount Y (different from the original X), which 
does not make much sense, because the fact that a certain Join gets implemented 
in a Correlate fashion does not impact its rowCount.

We could go back to the solution of the original approach, which was trying to 
compute the rowCount by "subtracting" the correlate filter (but this can get 
tricky, especially if we have several nested EBNLJ/Correlate, and potentially 
different correlate filters merged into a single Filter operator).

What if, instead of storing the Join inside the EnumerableBatchNestedLoopJoin, 
we precompute and store just its rowCount (mq.getRowCount(join)), and we just 
return that value as the rowCount of the EnumerableBatchNestedLoopJoin? I guess 
it would be a valid assumption that, even if the inputs of the 
EnumerableBatchNestedLoopJoin undergo further transformations due to subsequent 
optimization rules, its rowCount should always be the same as the rowCount of 
the Join that originated it, shouldn't it? In fact, I'd argue that any 
EnumerableX operator that originates from a LogicalJoin should always return 
the same rowCount as said LogicalJoin (and currently this does not happen for 
EBNLJ/EnumerableCorrelate), or am I missing something?


> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812799#comment-17812799
 ] 

Ruben Q L commented on CALCITE-6236:


See https://github.com/apache/calcite/pull/3661

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812730#comment-17812730
 ] 

Ruben Q L edited comment on CALCITE-6236 at 1/31/24 3:23 PM:
-

[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original plan, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the BNLJ's Right Hand Side there is another join, and the 
BatchNestedLoop filter has been pushed inside this inside join (by the relevant 
rule for that purpose), shall we examine the Left or the Right hand side of 
this inside join? What if this inside join is also a BatchNestedLoop with its 
own filter inside, how can we distinguish the outside BNL filter from the 
inside's?

We faced this issue in our project; the solution that we put in place was:
- Inside EnumerableBatchNestedLoopJoin, add a new field "originalJoin"; include 
it in the constructor and create methods, and add a getter for it.
- In EnumerableBatchNestedLoopJoinRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin".
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin's rowCount and cost estimation 
using the same rowCount value as the original join that generated it. I think 
it would be a valid approach to fix your problem.
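
For completeness, a hedged sketch of the rule-side step (the second bullet 
above); only RelOptRuleCall#rel is existing API here, and the extended 
constructor/factory it mentions is an assumption:
{code:java}
import org.apache.calcite.plan.RelOptRuleCall;
import org.apache.calcite.rel.core.Join;

/**
 * Hedged sketch, not the actual Calcite rule: it only shows where the matched
 * Join would be captured and handed to the new operator as "originalJoin".
 */
class BatchNestedLoopJoinRuleSketch {
  void onMatch(RelOptRuleCall call) {
    final Join originalJoin = call.rel(0);  // the Join that fired the rule
    // The real rule would build the batched condition and the filtered RHS as
    // it does today, and then pass 'originalJoin' to the (hypothetical)
    // extended EnumerableBatchNestedLoopJoin constructor/create method.
  }
}
{code}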

I can prepare a PR with this solution; if it is accepted by the community, we 
can consider it in order to fix this situation.


was (Author: rubenql):
[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original plan, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the BNLJ's Right Hand Side there is another join, and the 
BatchNestedLoop filter has been pushed inside this inside join (by the relevant 
rule for that purpose), shall we examine the Left or the Right hand side of 
this inside join? What if this inside join is also a BatchNestedLoop with its 
own filter inside, how can we distinguish the outside BNL filter from the 
inside's?

We faced this issue in our project, the solution that we put in place was:
- Inside EnumerableBatchNestedLoop add a new field "originalJoin", include it 
on the constructor and create methods, add a getter for it.
- In EnumerableBatchNestedLoopRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin".
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery 
mq)
{
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin rowCount and cost estimation to 
use the same rowCount value as the original join that generated it. I think it 
would be a valid approach to fix your problem.

I can prepare a PR with this solution, if it is accepted by the community we 
can consider is in order to fix this situation.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> 

[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812747#comment-17812747
 ] 

Ruben Q L commented on CALCITE-6236:


Working on it...

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812730#comment-17812730
 ] 

Ruben Q L edited comment on CALCITE-6236 at 1/31/24 3:03 PM:
-

[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original plan, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the BNLJ's Right Hand Side there is another join, and the 
BatchNestedLoop filter has been pushed inside this inside join (by the relevant 
rule for that purpose), shall we examine the Left or the Right hand side of 
this inside join? What if this inside join is also a BatchNestedLoop with its 
own filter inside, how can we distinguish the outside BNL filter from the 
inside's?

We faced this issue in our project; the solution that we put in place was:
- Inside EnumerableBatchNestedLoopJoin, add a new field "originalJoin"; include 
it in the constructor and create methods, and add a getter for it.
- In EnumerableBatchNestedLoopJoinRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin".
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin's rowCount and cost estimation 
using the same rowCount value as the original join that generated it. I think 
it would be a valid approach to fix your problem.

I can prepare a PR with this solution; if it is accepted by the community, we 
can consider it in order to fix this situation.


was (Author: rubenql):
[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original plan, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the RHS there is another join, and the BatchNestedLoop filter has 
been pushed inside the join (by the relevant rule for that purpose), shall we 
examine the Left or the Right hand side of this inner join? What if this inner 
join is also a BatchNestedLoop with its own filter inside, how can we 
distinguish the outer BNL filter from the inner's?

We face this issue in our project, the solution that we put in place was:
- Inside EnumerableBatchNestedLoop add a new field "originalJoin", include it 
on the constructor and create methods, add a getter for it.
- In EnumerableBatchNestedLoopRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin"
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery 
mq)
{
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin rowCount and cost estimation to 
use the same rowCount value as the original join that generated it.

I can prepare a PR with this solution, if it is accepted by the community we 
can consider is in order to fix this situation.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 

[jira] [Comment Edited] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812730#comment-17812730
 ] 

Ruben Q L edited comment on CALCITE-6236 at 1/31/24 3:01 PM:
-

[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original plan, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the RHS there is another join, and the BatchNestedLoop filter has 
been pushed inside the join (by the relevant rule for that purpose), shall we 
examine the Left or the Right hand side of this inner join? What if this inner 
join is also a BatchNestedLoop with its own filter inside, how can we 
distinguish the outer BNL filter from the inner's?

We faced this issue in our project; the solution that we put in place was:
- Inside EnumerableBatchNestedLoopJoin, add a new field "originalJoin"; include 
it in the constructor and create methods, and add a getter for it.
- In EnumerableBatchNestedLoopJoinRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin".
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin's rowCount and cost estimation 
using the same rowCount value as the original join that generated it.

I can prepare a PR with this solution; if it is accepted by the community, we 
can consider it in order to fix this situation.


was (Author: rubenql):
[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original one, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the RHS there is another join, and the BatchNestedLoop filter has 
been pushed inside the join (by the relevant rule for that purpose), shall we 
examine the Left or the Right hand side of this inner join? What if this inner 
join is also a BatchNestedLoop with its own filter inside, how can we 
distinguish the outer BNL filter from the inner's?

We face this issue in our project, the solution that we put in place was:
- Inside EnumerableBatchNestedLoop add a new field "originalJoin", include it 
on the constructor and create methods, add a getter for it.
- In EnumerableBatchNestedLoopRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin"
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery 
mq)
{
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin rowCount and cost estimation to 
use the same rowCount value as the original join that generated it.

I can prepare a PR with this solution, if it is accepted by the community we 
can consider is in order to fix this situation.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 

[jira] [Commented] (CALCITE-6236) EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation

2024-01-31 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812730#comment-17812730
 ] 

Ruben Q L commented on CALCITE-6236:


[~kramerul], I have taken a quick look at the 
[PR#3660|https://github.com/apache/calcite/pull/3660], I'm afraid this solution 
might not be 100% bullet-proof:
- What if the filter that you find is not the filter introduced by the 
BatchNestedLoop? (but a different one that was part of the original one, or 
maybe a combination of both after FilterMergeRule was applied).
- What if in the RHS there is another join, and the BatchNestedLoop filter has 
been pushed inside the join (by the relevant rule for that purpose), shall we 
examine the Left or the Right hand side of this inner join? What if this inner 
join is also a BatchNestedLoop with its own filter inside, how can we 
distinguish the outer BNL filter from the inner's?

We faced this issue in our project; the solution that we put in place was:
- Inside EnumerableBatchNestedLoopJoin, add a new field "originalJoin"; include 
it in the constructor and create methods, and add a getter for it.
- In EnumerableBatchNestedLoopJoinRule, when it creates the 
EnumerableBatchNestedLoopJoin, pass the Join that fired the rule as 
"originalJoin".
- In RelMdRowCount, "override" the rowCount computation for BNLJ so that:
{code}
public Double getRowCount(EnumerableBatchNestedLoopJoin join, RelMetadataQuery mq) {
  return mq.getRowCount(join.getOriginalJoin());
}
{code}

This results in EnumerableBatchNestedLoopJoin's rowCount and cost estimation 
using the same rowCount value as the original join that generated it.

I can prepare a PR with this solution; if it is accepted by the community, we 
can consider it in order to fix this situation.

> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> ---
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
>  Issue Type: Bug
>Reporter: Ulrich Kramer
>Priority: Major
>  Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right 
> relation.
> This filter reduces the number of rows by it's selectivity (in our case by a 
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower 
> compared to the one returned for a {{JdbcJoin}}. 
> This leads to the fact that in most cases {{EnumerableBatchNestedLoopJoin}} 
> is preferred over {{JdbcJoin}}.
> This is an example for the different costs
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 
> cumulative_costs=1005.0
> JdbcToEnumerableConverter rows=100.0 self_costs=10.0 
> cumulative_costs=190.0
>   JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>   JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
> JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>   JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-08 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784096#comment-17784096
 ] 

Ruben Q L commented on CALCITE-6087:


I have the impression that both the current ticket and CALCITE-6089 appear 
because we try to apply EnumerableSortedAggregate on an aggregation with empty 
groupSet. 

I'm not sure if the EnumerableSortedAggregate implementation is supposed to 
support this case, but even if it were, it seems to me that applying a sorted 
aggregate on an empty groupSet kind of defeats its purpose, because there will 
be no sorted input for the aggregation to exploit; so, if I am not mistaken, 
EnumerableSortedAggregate's behavior would "degenerate" into the standard 
EnumerableAggregate.

Perhaps the simplest solution would be to simply avoid this case in 
EnumerableSortedAggregateRule:
{code}
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg)) {
return null;
  }
  ...
=>
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg) || agg.getGroupSet().isEmpty()) {  // <-- ***
return null;
  }
  ...
{code}

[~hyuan], [~amaliujia], you worked on the original implem of 
EnumerableSortedAggregate + EnumerableSortedAggregateRule, wdyt?

> EnumerableSortedAggregate returns incorrect result when input is empty
> --
>
> Key: CALCITE-6087
> URL: https://issues.apache.org/jira/browse/CALCITE-6087
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Priority: Major
>
> Performing a MAX on an empty table (or on a table where we apply a filter 
> which does not return any value) shall return NULL, we can verify this with 
> our "standard" {{EnumerableAggregate}} operator:
> {code:java}
>   @Test void enumerableAggregateOnEmptyInput() {
> tester(false, new HrSchema())
> .query("select max(deptno) as m from emps where deptno>100")
> .explainContains(
> "EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
> + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
> expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
> + "EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered("m=null");
>   }
> {code}
> However, if we use {{Hook.PLANNER}} to force the aggregation to be 
> implemented via {{EnumerableSortedAggregate}} on the same query:
> {code:java}
>   @Test void enumerableSortedAggregateOnEmptyInput() {
> tester(false, new HrSchema())
> .query("select max(deptno) as m from emps where deptno>100")
> .withHook(Hook.PLANNER, (Consumer) planner -> {
>   planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>   planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
> })
> .explainContains(
> "EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
> + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
> expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
> + "EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered("m=null");
>   }
> {code}
> It fails, because the {{EnumerableSortedAggregate}} returns an empty result 
> set rather than NULL:
> {noformat}
> java.lang.AssertionError: 
> Expected: "m=null"
>  but: was ""
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6089) EnumerableSortedAggregate fails with ClassCastException: class X cannot be cast to class org.apache.calcite.runtime.FlatLists$ComparableList

2023-11-03 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782466#comment-17782466
 ] 

Ruben Q L commented on CALCITE-6089:


We can see similar instances of the same exception if we force Calcite tests to 
run with EnumerableSortedAggregate instead of EnumerableAggregate, e.g. 
changing {{EnumerableRules#ENUMERABLE_RULES}}:
{code}
...
// EnumerableRules.ENUMERABLE_AGGREGATE_RULE,
EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE,
...
{code}

And executing {{CalciteSqlOperatorTest}}.

> EnumerableSortedAggregate fails with ClassCastException: class X cannot be 
> cast to class org.apache.calcite.runtime.FlatLists$ComparableList
> 
>
> Key: CALCITE-6089
> URL: https://issues.apache.org/jira/browse/CALCITE-6089
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Priority: Major
>
> The problem can be reproduced with this test (to be added e.g. into 
> {{EnumerableSortedAggregateTest.java}}):
> {code}
>   @Test void sortedAggCountUnion() {
> tester(false, new HrSchema())
> .query(
> "select count(*) as c from ( "
> + "select * from emps where deptno=10 "
> + "union all "
> + "select * from emps where deptno=20)")
> .withHook(Hook.PLANNER, (Consumer) planner -> {
>   planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>   planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
> })
> .explainContains(
> "EnumerableSortedAggregate(group=[{}], c=[COUNT()])\n"
> + "  EnumerableUnion(all=[true])\n"
> + "EnumerableCalc(expr#0..4=[{inputs}], 
> expr#5=[CAST($t1):INTEGER NOT NULL], expr#6=[10], expr#7=[=($t5, $t6)], 
> commission=[$t4], $condition=[$t7])\n"
> + "  EnumerableTableScan(table=[[s, emps]])\n"
> + "EnumerableCalc(expr#0..4=[{inputs}], 
> expr#5=[CAST($t1):INTEGER NOT NULL], expr#6=[20], expr#7=[=($t5, $t6)], 
> commission=[$t4], $condition=[$t7])\n"
> + "  EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered(
> "c=4");
>   }
> {code}
> Which fails with:
> {noformat}
> ...
> Caused by: java.lang.ClassCastException: class java.lang.Integer cannot be 
> cast to class org.apache.calcite.runtime.FlatLists$ComparableList 
> (java.lang.Integer is in module java.base of loader 'bootstrap'; 
> org.apache.calcite.runtime.FlatLists$ComparableList is in unnamed module of 
> loader 'app')
>   at Baz$6.compare(Unknown Source)
>   at 
> org.apache.calcite.linq4j.EnumerableDefaults$SortedAggregateEnumerator.moveNext(EnumerableDefaults.java:938)
>   at 
> org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.(Linq4j.java:679)
>   at org.apache.calcite.linq4j.Linq4j.enumeratorIterator(Linq4j.java:97)
> ...
> {noformat}
> The same test with EnumerableAggregate (instead of EnumerableSortedAggregate) 
> will execute the query correctly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6089) EnumerableSortedAggregate fails with ClassCastException: class X cannot be cast to class org.apache.calcite.runtime.FlatLists$ComparableList

2023-11-03 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-6089:
--

 Summary: EnumerableSortedAggregate fails with ClassCastException: 
class X cannot be cast to class 
org.apache.calcite.runtime.FlatLists$ComparableList
 Key: CALCITE-6089
 URL: https://issues.apache.org/jira/browse/CALCITE-6089
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Ruben Q L


The problem can be reproduced with this test (to be added e.g. into 
{{EnumerableSortedAggregateTest.java}}):

{code}
  @Test void sortedAggCountUnion() {
tester(false, new HrSchema())
.query(
"select count(*) as c from ( "
+ "select * from emps where deptno=10 "
+ "union all "
+ "select * from emps where deptno=20)")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
  planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
})
.explainContains(
"EnumerableSortedAggregate(group=[{}], c=[COUNT()])\n"
+ "  EnumerableUnion(all=[true])\n"
+ "EnumerableCalc(expr#0..4=[{inputs}], 
expr#5=[CAST($t1):INTEGER NOT NULL], expr#6=[10], expr#7=[=($t5, $t6)], 
commission=[$t4], $condition=[$t7])\n"
+ "  EnumerableTableScan(table=[[s, emps]])\n"
+ "EnumerableCalc(expr#0..4=[{inputs}], 
expr#5=[CAST($t1):INTEGER NOT NULL], expr#6=[20], expr#7=[=($t5, $t6)], 
commission=[$t4], $condition=[$t7])\n"
+ "  EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered(
"c=4");
  }
{code}

Which fails with:
{noformat}
...
Caused by: java.lang.ClassCastException: class java.lang.Integer cannot be cast 
to class org.apache.calcite.runtime.FlatLists$ComparableList (java.lang.Integer 
is in module java.base of loader 'bootstrap'; 
org.apache.calcite.runtime.FlatLists$ComparableList is in unnamed module of 
loader 'app')
at Baz$6.compare(Unknown Source)
at 
org.apache.calcite.linq4j.EnumerableDefaults$SortedAggregateEnumerator.moveNext(EnumerableDefaults.java:938)
at 
org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.(Linq4j.java:679)
at org.apache.calcite.linq4j.Linq4j.enumeratorIterator(Linq4j.java:97)
...
{noformat}

The same test with EnumerableAggregate (instead of EnumerableSortedAggregate) 
will execute the query correctly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-02 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6087:
---
Description: 
Performing a MAX on an empty table (or on a table where we apply a filter which 
does not return any value) shall return NULL, we can verify this with our 
"standard" {{EnumerableAggregate}} operator:
{code:java}
  @Test void enumerableAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.explainContains(
"EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
However, if we use {{Hook.PLANNER}} to force the aggregation to be implemented 
via {{EnumerableSortedAggregate}} on the same query:
{code:java}
  @Test void enumerableSortedAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
  planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
})
.explainContains(
"EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
It fails, because the {{EnumerableSortedAggregate}} returns an empty result set 
rather than NULL:
{noformat}
java.lang.AssertionError: 
Expected: "m=null"
 but: was ""
{noformat}

  was:
Performing a MAX on an empty table (or on a table where we apply a filter which 
does not return any value) shall return NULL, we can verify this with our 
"standard" {{EnumerableAggregate}} operator:
{code:java}
  @Test void enumerableAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.explainContains(
"EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
However, if we use {{Hook.PLANNER}} to force the aggregation to be implemented 
via {{EnumerableSortedAggregate}} on the same query:
{code:java}
  @Test void enumerableSortedAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
  planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
})
.explainContains(
"EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
It fails, because the EnumerableSortedAggregate returns an empty result set 
rather than NULL:
{noformat}
java.lang.AssertionError: 
Expected: "m=null"
 but: was ""
{noformat}


> EnumerableSortedAggregate returns incorrect result when input is empty
> --
>
> Key: CALCITE-6087
> URL: https://issues.apache.org/jira/browse/CALCITE-6087
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Priority: Major
>
> Performing a MAX on an empty table (or on a table where we apply a filter 
> which does not return any value) shall return NULL, we can verify this with 
> our "standard" {{EnumerableAggregate}} operator:
> {code:java}
>   @Test void enumerableAggregateOnEmptyInput() {
> tester(false, new HrSchema())
> .query("select max(deptno) as m from emps where deptno>100")
> .explainContains(
> "EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
> + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
> expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
> + "EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered("m=null");
>   }
> {code}
> However, if we use {{Hook.PLANNER}} to force the aggregation to be 
> implemented via {{EnumerableSortedAggregate}} on the same query:
> {code:java}
>   @Test void 

[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782234#comment-17782234
 ] 

Ruben Q L commented on CALCITE-6087:


Same thing happens if we perform a COUNT on an empty input (which shall return 
0):
{code}
  @Test void enumerableAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select count(1) as c from emps where deptno>100")
.explainContains(
"EnumerableAggregate(group=[{}], c=[COUNT()])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("c=0");  // ok
  }

  @Test void enumerableSortedAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select count(1) as c from emps where deptno>100")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
  planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
})
.explainContains(
"EnumerableSortedAggregate(group=[{}], c=[COUNT()])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("c=0");  // KO!!! Returns empty!!
  }
{code}

The second test (EnumerableSortedAggregate) will fail with:
{noformat}
java.lang.AssertionError: 
Expected: "c=0"
 but: was ""
{noformat}

> EnumerableSortedAggregate returns incorrect result when input is empty
> --
>
> Key: CALCITE-6087
> URL: https://issues.apache.org/jira/browse/CALCITE-6087
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Priority: Major
>
> Performing a MAX on an empty table (or on a table where we apply a filter 
> which does not return any value) shall return NULL, we can verify this with 
> our "standard" {{EnumerableAggregate}} operator:
> {code:java}
>   @Test void enumerableAggregateOnEmptyInput() {
> tester(false, new HrSchema())
> .query("select max(deptno) as m from emps where deptno>100")
> .explainContains(
> "EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
> + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
> expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
> + "EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered("m=null");
>   }
> {code}
> However, if we use {{Hook.PLANNER}} to force the aggregation to be 
> implemented via {{EnumerableSortedAggregate}} on the same query:
> {code:java}
>   @Test void enumerableSortedAggregateOnEmptyInput() {
> tester(false, new HrSchema())
> .query("select max(deptno) as m from emps where deptno>100")
> .withHook(Hook.PLANNER, (Consumer) planner -> {
>   planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>   planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
> })
> .explainContains(
> "EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
> + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
> expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
> + "EnumerableTableScan(table=[[s, emps]])")
> .returnsOrdered("m=null");
>   }
> {code}
> It fails, because the {{EnumerableSortedAggregate}} returns an empty result 
> set rather than NULL:
> {noformat}
> java.lang.AssertionError: 
> Expected: "m=null"
>  but: was ""
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-02 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-6087:
--

 Summary: EnumerableSortedAggregate returns incorrect result when 
input is empty
 Key: CALCITE-6087
 URL: https://issues.apache.org/jira/browse/CALCITE-6087
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Ruben Q L


Performing a MAX on an empty table (or on a table where we apply a filter which 
does not return any value) shall return NULL, we can verify this with our 
"standard" {{EnumerableAggregate}} operator:
{code:java}
  @Test void enumerableAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.explainContains(
"EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
However, if we use {{Hook.PLANNER}} to force the aggregation to be implemented 
via {{EnumerableSortedAggregate}} on the same query:
{code:java}
  @Test void enumerableSortedAggregateOnEmptyInput() {
tester(false, new HrSchema())
.query("select max(deptno) as m from emps where deptno>100")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
  planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
})
.explainContains(
"EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
+ "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], 
expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
+ "EnumerableTableScan(table=[[s, emps]])")
.returnsOrdered("m=null");
  }
{code}
It fails, because the EnumerableSortedAggregate returns an empty result set 
rather than NULL:
{noformat}
java.lang.AssertionError: 
Expected: "m=null"
 but: was ""
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-10-27 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5949.

Resolution: Fixed

Fixed in 
[{{417a1b5}}|https://github.com/apache/calcite/commit/417a1b53ec9fd6bd449601036535bdb323229059]

Thanks [~cbrisson] for the patch!

> RexExecutable should return unchanged original expressions when it fails
> 
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because correct expressions that were already handled have 
> been added to the returned list, which can end up longer than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.
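
A minimal sketch of the fallback behavior described above (hypothetical names, not the 
actual patch): on any failure, discard whatever was partially reduced and hand back a 
copy of the original expressions, so the output always has the same size as the input.

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the "return the originals unchanged on failure"
 * contract; Expr and Reducer are hypothetical stand-ins, not Calcite types. */
final class ReduceSketch {
  interface Expr {}
  interface Reducer { Expr reduce(Expr e) throws Exception; }

  static List<Expr> reduceAll(List<Expr> originals, Reducer reducer) {
    List<Expr> reduced = new ArrayList<>();
    try {
      for (Expr e : originals) {
        reduced.add(reducer.reduce(e));
      }
      return reduced;
    } catch (Exception e) {
      // On failure, drop the partially reduced results; the caller must see
      // exactly the input expressions, in the same order and quantity.
      return new ArrayList<>(originals);
    }
  }
}
{code}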



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-10-26 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reassigned CALCITE-5949:
--

Assignee: Ruben Q L

> RexExecutable should return unchanged original expressions when it fails
> 
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because correct expressions that were already handled have 
> been added to the returned list, which can end up longer than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-129) Support recursive WITH queries

2023-10-26 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779781#comment-17779781
 ] 

Ruben Q L commented on CALCITE-129:
---

I agree with [~julianhyde].

I have started to take a look at the PR; I have left some minor comments, but 
IMO it is in pretty good shape, with plenty of new tests for this 
improvement. However, I'm not an expert on the SQL parser & validator, so 
I'd appreciate it if someone else could help with that part. 

> Support recursive WITH queries
> --
>
> Key: CALCITE-129
> URL: https://issues.apache.org/jira/browse/CALCITE-129
> Project: Calcite
>  Issue Type: Improvement
>Reporter: GitHub Import
>Assignee: Hanumath Rao Maduri
>Priority: Major
>  Labels: github-import, pull-request-available
> Fix For: 1.36.0
>
>
> Building on https://github.com/julianhyde/optiq/issues/128, enable the 
> RECURSIVE keyword.
> This feature allows relations to be computed iteratively. Whereas 
> ([#128|https://github.com/JulianHyde/optiq/issues/128] | 
> [OPTIQ-128|https://issues.apache.org/jira/browse/OPTIQ-128]) was mainly a 
> parser/validator change, this feature requires significant changes to the 
> planner and runtime.
>  Imported from GitHub 
> Url: https://github.com/julianhyde/optiq/issues/129
> Created by: [julianhyde|https://github.com/julianhyde]
> Labels: enhancement, 
> Created at: Tue Feb 11 20:29:35 CET 2014
> State: open
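
For readers unfamiliar with the feature, this is the kind of query the RECURSIVE 
keyword enables (a generic example, not one taken from the ticket):

{code:java}
// A classic recursive query: generate the integers 1..10 by repeatedly
// unioning the anchor row with the rows produced by the previous iteration.
String sql = "WITH RECURSIVE t(n) AS (\n"
    + "  SELECT 1\n"
    + "  UNION ALL\n"
    + "  SELECT n + 1 FROM t WHERE n < 10)\n"
    + "SELECT n FROM t";
{code}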



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779639#comment-17779639
 ] 

Ruben Q L commented on CALCITE-5921:


No problem at all [~mbudiu], happy to help.
Adding potential corrections to the PR of the related ticket CALCITE-5990 
sounds good; I'll take a look at that one as well.

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.
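
To make the gap concrete, here is a small self-contained sketch (not the fixture's 
real code) of the difference between a validation-only check and one that actually 
executes the expression when the {{runtime}} flag is true:

{code:java}
import java.util.function.Supplier;

/** Illustrative sketch only: a check helper that can observe a runtime
 * failure only if it actually runs the computation. */
final class RuntimeCheckSketch {
  static void checkFails(Supplier<?> computation,
      Class<? extends Throwable> expected, boolean runtime) {
    if (!runtime) {
      return; // validation-only mode: nothing runs, so runtime errors go unnoticed
    }
    try {
      computation.get();
      throw new AssertionError("expected " + expected.getSimpleName());
    } catch (RuntimeException e) {
      if (!expected.isInstance(e)) {
        throw new AssertionError("unexpected failure: " + e, e);
      }
    }
  }

  public static void main(String[] args) {
    // 1 / 0 only fails when evaluated; skipping execution means the expected
    // ArithmeticException is never observed, which mirrors the reported bug.
    checkFails(() -> 1 / 0, ArithmeticException.class, true);
  }
}
{code}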



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779523#comment-17779523
 ] 

Ruben Q L commented on CALCITE-5921:


The PR has been merged (thanks [~libenchao] for reviewing it so quickly!); 
CI on the master branch is back to stable.

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779490#comment-17779490
 ] 

Ruben Q L commented on CALCITE-5921:


I have created a new PR that aims to fix the broken tests: 
https://github.com/apache/calcite/pull/3482

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779451#comment-17779451
 ] 

Ruben Q L edited comment on CALCITE-5921 at 10/25/23 12:10 PM:
---

I have the impression that the problem comes from the recent changes introduced 
by CALCITE-5923 (UPDATE: and CALCITE-6014), which were not part of [~mbudiu]'s 
PR (because the branch was created before CALCITE-5923/CALCITE-6014 was 
merged). Probably some adjustments are needed to make the ensemble work fine, 
wip...


was (Author: rubenql):
I have the impression that the problem comes from the recent changes introduced 
by CALCITE-5923, which were not part of [~mbudiu]'s PR (because the branch was 
created before CALCITE-5923 was merged). Probably some adjustments are needed 
to make the ensemble work fine, wip...

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779451#comment-17779451
 ] 

Ruben Q L commented on CALCITE-5921:


I have the impression that the problem comes from the recent changes introduced 
by CALCITE-5923, which were not part of [~mbudiu]'s PR (because the branch was 
created before CALCITE-5923 was merged). Probably some adjustments are needed 
to make the ensemble work fine, wip...

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779445#comment-17779445
 ] 

Ruben Q L edited comment on CALCITE-5921 at 10/25/23 10:31 AM:
---

It seems that, even though all the PR checks were green, after the merge some tests 
(involving CAST) are broken on master
https://github.com/apache/calcite/actions/runs/6638635554/

Does anyone have any idea why? Looking into that


was (Author: rubenql):
It seems that, even though all the PR checks were green, after the merge some tests 
are broken on master
https://github.com/apache/calcite/actions/runs/6638635554/

Does anyone have any idea why? Looking into that

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779445#comment-17779445
 ] 

Ruben Q L commented on CALCITE-5921:


It seems that, even though all the PR checks were green, after the merge some tests 
are broken on master
https://github.com/apache/calcite/actions/runs/6638635554/

Does anyone have any idea why? Looking into that

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5921.

Resolution: Fixed

Fixed via  
[{{0bec957}}|https://github.com/apache/calcite/commit/0bec957071468a2e54a22519290ac101a752fcad]

Thanks [~Runking] and [~mbudiu] for your work on this one!

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5921) SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure

2023-10-25 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5921:
---
Fix Version/s: 1.36.0

> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure
> ---
>
> Key: CALCITE-5921
> URL: https://issues.apache.org/jira/browse/CALCITE-5921
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> SqlOperatorFixture.checkFails and checkAggFails don't check runtime failure. 
> See more in [code 
> line|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/testkit/src/main/java/org/apache/calcite/test/SqlOperatorFixtureImpl.java#L160].
>  When the parameter `runtime` of SqlOperatorFixture.checkFails is true, it 
> should execute the query and check runtime failure, but currently it ignores 
> this, and only checks the parse and validation failure.
> When this is fixed, there are 4 failing test cases in CalciteSqlOperatorTest.
> Lastly, this issue was found while implementing the `BIT_GET` function in 
> CALCITE-5848.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6061) MapValueConstructor/MapQueryConstructor use LinkedHashMap erroneously

2023-10-19 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1429#comment-1429
 ] 

Ruben Q L commented on CALCITE-6061:


Agree with [~julianhyde] and [~mbudiu]. Even if the specification says that 
"the order is not guaranteed", that does not mean that "the order must not be 
guaranteed" (as the description of the current ticket seems to imply).
Using a LinkedHashMap is totally valid, and has some advantages (as others have 
mentioned before).

> MapValueConstructor/MapQueryConstructor use LinkedHashMap erroneously
> -
>
> Key: CALCITE-6061
> URL: https://issues.apache.org/jira/browse/CALCITE-6061
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Ran Tao
>Priority: Major
>
> when we call:
> {code:java}
> select map[1,2,3,4,5,6];{code}
> The order of results returned is the same every time. 
> {code:java}
> +-----------------+
> |     EXPR$0      |
> +-----------------+
> | {1=2, 3=4, 5=6} |
> +-----------------+
>  {code}
> because Calcite uses a LinkedHashMap for storage. But semantically, the order 
> should not be guaranteed, just like for MULTISET.
> We can see that other engines such as Apache Spark/Flink just use a HashMap in 
> this case.
> {code:java}
> Flink SQL> select map[1,2,3,4,5,6];
> +----+--------------------------------+
> | op |                         EXPR$0 |
> +----+--------------------------------+
> | +I |                {5=6, 1=2, 3=4} |
> +----+--------------------------------+
> Received a total of 1 row {code}
> It may return a different order each time you call it.
> [https://github.com/apache/flink/blob/a2681f6a85aaad21179f91e03a91b4a05158841e/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/codegen/ExprCodeGenerator.scala#L711]
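
A tiny illustration of the point being debated (plain JDK code, nothing 
Calcite-specific): both map types hold the same entries; LinkedHashMap merely also 
preserves insertion order, which is compatible with a specification that leaves the 
order unspecified.

{code:java}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapOrderSketch {
  public static void main(String[] args) {
    Map<Integer, Integer> linked = new LinkedHashMap<>();
    Map<Integer, Integer> hashed = new HashMap<>();
    for (int[] e : new int[][] {{1, 2}, {3, 4}, {5, 6}}) {
      linked.put(e[0], e[1]);
      hashed.put(e[0], e[1]);
    }
    System.out.println(linked); // always {1=2, 3=4, 5=6}: insertion order is kept
    System.out.println(hashed); // some unspecified (but still deterministic) order
  }
}
{code}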



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6014) Create a SqlOperatorFixture that parses, unparses, and then parses again before executing

2023-10-16 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6014.

Resolution: Fixed

Fixed via 
[{{5151168}}|https://github.com/apache/calcite/commit/5151168e9a9035595939c2ae0f21a06984229209]
 

Thanks [~mbudiu] for your contribution!

> Create a SqlOperatorFixture that parses, unparses, and then parses again 
> before executing
> -
>
> Key: CALCITE-6014
> URL: https://issues.apache.org/jira/browse/CALCITE-6014
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Such a fixture will help catch bugs in the unparsing code.
> Several bugs were found using this technique, e.g., CALCITE-5997.
> This is related to CALCITE-5891, CALCITE-6000.
> The SqlParserFixture UnparsingTesterImpl provides a similar service, but 
> since it does not validate the code after unparsing, it will catch fewer bugs.
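
A rough sketch of the parse-unparse-parse round trip such a fixture performs, written 
against the public parser API (this is an approximation, not the fixture's actual 
code; validation and execution would follow the second parse):

{code:java}
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.dialect.CalciteSqlDialect;
import org.apache.calcite.sql.parser.SqlParseException;
import org.apache.calcite.sql.parser.SqlParser;

/** Hedged sketch of the round trip: parse, unparse back to SQL, then parse
 * the unparsed text again so that unparsing bugs surface as parse errors. */
final class RoundTripSketch {
  private RoundTripSketch() {}

  static SqlNode roundTrip(String sql) throws SqlParseException {
    SqlNode parsed = SqlParser.create(sql).parseQuery();
    String unparsed = parsed.toSqlString(CalciteSqlDialect.DEFAULT).getSql();
    return SqlParser.create(unparsed).parseQuery(); // fails if unparsing was wrong
  }
}
{code}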



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6033) Broken links on adapter page

2023-10-09 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6033.

Resolution: Fixed

Fixed via 
[{{0291e1e}}|https://github.com/apache/calcite/commit/0291e1eeb4c5e01dccbac14e1a6eeeb79d87282b]

Thanks [~duanzhengqiang] for the patch! 

> Broken links on adapter page
> 
>
> Key: CALCITE-6033
> URL: https://issues.apache.org/jira/browse/CALCITE-6033
> Project: Calcite
>  Issue Type: Bug
>  Components: site
>Affects Versions: 1.35.0
>Reporter: Zhengqiang Duan
>Assignee: Zhengqiang Duan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Hi, Calcite community, I found the following dead links in the adapter 
> document:
>  * 
> [https://calcite.apache.org/avatica/apidocs/org/apache/calcite/avatica/ConnectStringParser.html]
>  * 
> [https://calcite.apache.org/javadocAggregate/org/apache/calcite/sql/fun/SqlGroupedWindowFunction.html]
>  * [https://pig.apache.org/docs/r0.7.0/piglatin_ref1.html]
>  * 
> [https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/cassandra/CassandraProject.html]
>  * 
> https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/metadata/RelMetadataQuery.html{_}#getSelectivity-org.apache.calcite.rel.RelNode-org.apache.calcite.rex.RexNode-{_}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5989) Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect

2023-09-29 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5989.

Resolution: Fixed

Fixed via 
[{{3bb7117}}|https://github.com/apache/calcite/commit/3bb71174460d10542c3c9892c6a1a8effb573f9e]

Thanks [~mbudiu] for the patch!

> Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect
> --
>
> Key: CALCITE-5989
> URL: https://issues.apache.org/jira/browse/CALCITE-5989
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> The type inference uses the type `ReturnTypes.ARG0_NULLABLE_VARYING` for the 
> output.
> This means that the output cannot be longer than arg0. This bug surfaces when 
> the query is optimized.
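
A concrete illustration of the symptom (the query below is made up for this note, not 
taken from the ticket): the padded string is longer than the first argument, so a 
return type derived from arg0's length truncates the value once the call is 
constant-folded.

{code:java}
// With the return type inferred as ARG0_NULLABLE_VARYING the result would be
// typed VARCHAR(3), even though the expected value has 10 characters.
String sql = "SELECT LPAD('abc', 10, '*')";  // expected: '*******abc'
{code}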



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6028) Join on with more than 20 in conditions will report a null pointer error.

2023-09-27 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17769591#comment-17769591
 ] 

Ruben Q L commented on CALCITE-6028:


I agree with [~libenchao]: if this is valid SQL, then Calcite should return 
the expected result, regardless of the number of values inside the IN or of the 
{{DEFAULT_IN_SUB_QUERY_THRESHOLD}}. The fact that there is a workaround (moving 
the IN condition from the ON to the WHERE) does not change the fact that there 
is a bug in the original query.
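
For context, the threshold mentioned above lives in the SQL-to-rel conversion config. 
Here is a sketch of how one could raise it, assuming the standard 
{{SqlToRelConverter.Config}} API (this only sidesteps the symptom; it is not the fix):

{code:java}
import org.apache.calcite.sql2rel.SqlToRelConverter;

// Above the threshold (default 20) the IN list is rewritten into a sub-query
// join, which is the code path that hits the NPE; raising the threshold keeps
// the list expanded as a disjunction instead.
SqlToRelConverter.Config config = SqlToRelConverter.config()
    .withInSubQueryThreshold(Integer.MAX_VALUE);
{code}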

> Join on with more than 20 in conditions will report a null pointer error.
> -
>
> Key: CALCITE-6028
> URL: https://issues.apache.org/jira/browse/CALCITE-6028
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: loukey_j
>Priority: Critical
> Attachments: image-2023-09-27-20-59-27-654.png, 
> image-2023-09-27-21-03-21-074.png
>
>
> final String sql = "select t1.x from (values (1, 'a'), (2, 'b')) as t1(x, y) 
> left join (values (1, 'a'), (2, 'b')) as t2(x, y) on t1.x=t2.x and t1.x in 
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21)";   
>  
>  
> java.lang.RuntimeException: while converting `T1`.`X` = `T2`.`X` AND `T1`.`X` 
> IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
> at 
> org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$1(ReflectiveConvertletTable.java:99)
> at 
> org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:59)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5656)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4827)
> at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:166)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:5469)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertOnCondition(SqlToRelConverter.java:3261)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertJoin(SqlToRelConverter.java:3182)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2401)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2285)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:698)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:679)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3748)
> at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:599)
> at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:257)
> at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:220)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:666)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:519)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:487)
> at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:236)
> at 
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:630)
> at 
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:677)
> at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:157)
> at 
> org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:228)
> at 
> org.apache.calcite.examples.foodmart.java.JdbcExample.run(JdbcExample.java:52)
> at 
> org.apache.calcite.examples.foodmart.java.JdbcExample.main(JdbcExample.java:36)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Caused by: java.lang.reflect.InvocationTargetException
>  
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$1(ReflectiveConvertletTable.java:95)
> ... 25 more
> Caused by: java.lang.NullPointerException
> at java.util.Objects.requireNonNull(Objects.java:203)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-6017) Update the GitHub link of released versions

2023-09-27 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-6017.

Resolution: Fixed

Fixed via 
[{{56f7b82}}|https://github.com/apache/calcite/commit/56f7b8248352568539cbfe0222606903d69e521c]

Thanks [~taoran] for the patch!

> Update the GitHub link of released versions
> ---
>
> Key: CALCITE-6017
> URL: https://issues.apache.org/jira/browse/CALCITE-6017
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.35.0
>Reporter: Ran Tao
>Assignee: Ran Tao
>Priority: Minor
>  Labels: doc, pull-request-available
> Fix For: 1.36.0
>
> Attachments: image-2023-09-21-16-03-16-132.png, 
> image-2023-09-21-16-04-00-579.png, image-2023-09-21-16-04-54-398.png
>
>
> On the Calcite history page, we have this description:
> {noformat}
> For a full list of releases, see
> <a href="https://github.com/apache/calcite/releases">github</a>. 
> {noformat}
> however, there is nothing on this page, because Calcite does not use GitHub to 
> manage the released versions. So this description may seem a bit strange to 
> end users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6017) The content of github link about the released versions in history doc is empty

2023-09-25 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768725#comment-17768725
 ] 

Ruben Q L commented on CALCITE-6017:


Thanks for digging into this, [~taoran]. Looking at your analysis, IMHO 
replacing the link with "tags" seems a reasonable way to fix this issue.

> The content of github link about the released versions in history doc is empty
> --
>
> Key: CALCITE-6017
> URL: https://issues.apache.org/jira/browse/CALCITE-6017
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Ran Tao
>Assignee: Ran Tao
>Priority: Minor
>  Labels: doc
> Attachments: image-2023-09-21-16-03-16-132.png, 
> image-2023-09-21-16-04-00-579.png, image-2023-09-21-16-04-54-398.png
>
>
> On the Calcite history page, we have this description:
> {noformat}
> For a full list of releases, see
> <a href="https://github.com/apache/calcite/releases">github</a>. 
> {noformat}
> however, there is nothing on this page, because Calcite does not use GitHub to 
> manage the released versions. So this description may seem a bit strange to 
> end users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5989) Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect

2023-09-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5989:
---
Fix Version/s: 1.36.0

> Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect
> --
>
> Key: CALCITE-5989
> URL: https://issues.apache.org/jira/browse/CALCITE-5989
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> The type inference uses the type `ReturnTypes.ARG0_NULLABLE_VARYING` for the 
> output.
> This means that the output cannot be longer than arg0. This bug surfaces when 
> the query is optimized.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5994) Add optimization rule to remove Sort when its input's row number is less or equal to one

2023-09-21 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5994.

Resolution: Fixed

Done via 
[{{1775baf}}|https://github.com/apache/calcite/commit/1775baf2184bc2d10b4991a2dea7e60c474bb5fc]
 

Thanks [~shenlang] for the patch!

> Add optimization rule to remove Sort when its input's row number is less or 
> equal to one
> 
>
> Key: CALCITE-5994
> URL: https://issues.apache.org/jira/browse/CALCITE-5994
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: LakeShen
>Assignee: LakeShen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> When the maximum row count of a Sort's input is 1, we can remove the 
> redundant Sort; such a Sort may be a purely sorting Sort (offset and fetch are 
> null).
> For example, the SQL:
> {code:java}
> select * from (select * from tableA limit 1)  order by c ;
> {code}
> because the maximum row count of `(select * from tableA limit 1)` is 1, we 
> could remove the ORDER BY c:
> {code:java}
> select * from tableA limit 1;
> {code}
> The SQL:
> {code:java}
> select max(totalprice) from orders order by 1 {code}
> could be converted to:
> {code:java}
> select max(totalprice) from orders{code}
> The above logic is the same as Presto/Trino's 
> [RemoveRedundantSort|https://github.com/prestodb/presto/blob/c21fc28846252cd910d90f046514bf586d7bb5c6/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantSort.java#L27]
>  rule.
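
The core check behind such a rule, as a hedged sketch in helper form (the real rule 
wires this into the planner via a RelOptRule, which is omitted here):

{code:java}
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.core.Sort;
import org.apache.calcite.rel.metadata.RelMetadataQuery;

/** Hedged sketch, not the actual Calcite rule. */
final class RedundantSortSketch {
  private RedundantSortSketch() {}

  /** Returns the Sort's input if the Sort is provably redundant,
   * otherwise returns the Sort itself. */
  static RelNode removeIfRedundant(Sort sort, RelMetadataQuery mq) {
    Double maxRows = mq.getMaxRowCount(sort.getInput());
    boolean pureSort = sort.offset == null && sort.fetch == null;
    if (pureSort && maxRows != null && maxRows <= 1.0D) {
      return sort.getInput(); // at most one row: the ordering cannot matter
    }
    return sort;
  }
}
{code}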



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-09-11 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5949:
---
Fix Version/s: 1.36.0

> RexExecutable should return unchanged original expressions when it fails
> 
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because correct expressions that were already handled have 
> been added to the returned list, which can end up longer than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-09-11 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5949:
---
Summary: RexExecutable should return unchanged original expressions when it 
fails  (was: RexExecutable correct handling of invalid constant expressions)

> RexExecutable should return unchanged original expressions when it fails
> 
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Priority: Major
>  Labels: pull-request-available
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because correct expressions that were already handled have 
> been added to the returned list, which can end up longer than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-09 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5980.

Resolution: Fixed

Fixed via  
[{{7176860}}|https://github.com/apache/calcite/commit/717686088265aeb1a1b3ba561748c6cb64e1b1b0]
 

> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems caused by the changes introduced in 
> CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But in my case it results in {*}both files being the same{*}, thus when the 
> outFile is created, it actually erases all the tests that were contained 
> inside the inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.
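
One possible shape of the fix, sketched under the assumption that normalizing the 
separators before the textual replacement is acceptable (the committed patch may 
differ):

{code:java}
import java.io.File;

/** Hedged sketch of a Windows-safe replaceDir. */
final class ReplaceDirSketch {
  private ReplaceDirSketch() {}

  static File replaceDir(File file, String target, String replacement) {
    // Normalize backslashes to '/' first, so the "/resources/" pattern is
    // found even on Windows, then swap in the replacement directory.
    String unixPath = file.getAbsolutePath().replace('\\', '/');
    return new File(unixPath.replace('/' + target + '/', '/' + replacement + '/'));
  }
}
{code}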



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5732) EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'

2023-09-08 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763005#comment-17763005
 ] 

Ruben Q L edited comment on CALCITE-5732 at 9/8/23 9:43 AM:


Fixed via 
[{{3aee0b8}}|https://github.com/apache/calcite/commit/3aee0b86aa23476cbdecc75ad5d43b936a6fff7b]
 

Thanks [~viggoc] for reporting this issue, and [~libenchao] for the review!


was (Author: rubenql):
Fixed via 
[{{3aee0b8}}|https://github.com/apache/calcite/commit/3aee0b86aa23476cbdecc75ad5d43b936a6fff7b]
 

 Thanks [~viggoc] for reporting this issue, and [~libenchao] for the review!

> EnumerableHashJoin and EnumerableMergeJoin on composite key return rows 
> matching condition 'null = null'
> 
>
> Key: CALCITE-5732
> URL: https://issues.apache.org/jira/browse/CALCITE-5732
> Project: Calcite
>  Issue Type: Bug
>  Components: core, linq4j
>Reporter: Viggo Chen
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> In `EnumerableJoinTest#testMergeJoinWithCompositeKeyAndNullValues`, the query 
> is like 
> {code:java}
> select 
>   emps.empid
> from 
>   emps a join emps b
> on a.deptno = b.deptno
> and a.commission = b.commission;{code}
> and the data is like 
> {code:java}
>   INSERT INTO "emps" VALUES (100, 10, 'Bill', 1, 1000);
>   INSERT INTO "emps" VALUES (200, 20, 'Eric', 8000, 500);
>   INSERT INTO "emps" VALUES (150, 10, 'Sebastian', 7000, null);
>   INSERT INTO "emps" VALUES (110, 10, 'Theodore', 11500, 250); {code}
> And the row with empid = 150 is in the expected result. Is this the expected 
> result of a join with a null condition?
> What's more, the hash join result with the condition a.deptno = b.deptno and 
> a.commission = b.commission is the same as the merge join result. And if there 
> is just one condition, a.commission = b.commission, the result does not include 
> empid = 150.
>  
> Here is a unit test for it
> {code:java}
> @Test void testHashJoinWithCompositeKeyAndNullValues() {
>   // Both join sides' 'commission' values are limited to null, so a.commission = 
> b.commission should always be false.
>   // So all columns in the right table b are expected to be null, and this SQL 
> should result in 0 rows.
>   final String sql = "select * from\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as a\n"
>   + " left join\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as b\n"
>   + " on a.salary = b.salary and a.commission = b.commission\n"
>   + " where b.empid is not null";
>   CalciteAssert.that()
>   .with(CalciteConnectionProperty.LEX, Lex.JAVA)
>   .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
>   .withSchema("s", new ReflectiveSchema(new HrSchemaBig()))
>   .query(sql)
>   .withHook(Hook.PLANNER, (Consumer) planner -> {
> planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
> planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
>   })
>   .explainContains("EnumerableHashJoin")
>   .returnsCount(0)
>   ;
> } {code}
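
The key semantics the join implementations must respect, shown as an illustrative 
sketch (Calcite's actual join handling is generated linq4j code, not this helper): a 
composite equi-join key containing NULL can never match, because NULL = NULL 
evaluates to UNKNOWN rather than TRUE.

{code:java}
/** Illustrative sketch only. */
final class NullSafeKeySketch {
  private NullSafeKeySketch() {}

  static boolean keysMatch(Object[] leftKey, Object[] rightKey) {
    for (int i = 0; i < leftKey.length; i++) {
      if (leftKey[i] == null || rightKey[i] == null) {
        return false; // NULL = NULL is UNKNOWN, so the pair must not join
      }
      if (!leftKey[i].equals(rightKey[i])) {
        return false;
      }
    }
    return true;
  }
}
{code}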



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5732) EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'

2023-09-08 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5732.

Resolution: Fixed

Fixed via 
[{{3aee0b8}}|https://github.com/apache/calcite/commit/3aee0b86aa23476cbdecc75ad5d43b936a6fff7b]
 

 Thanks [~viggoc] for reporting this issue, and [~libenchao] for the review!

> EnumerableHashJoin and EnumerableMergeJoin on composite key return rows 
> matching condition 'null = null'
> 
>
> Key: CALCITE-5732
> URL: https://issues.apache.org/jira/browse/CALCITE-5732
> Project: Calcite
>  Issue Type: Bug
>  Components: core, linq4j
>Reporter: Viggo Chen
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> In `EnumerableJoinTest#testMergeJoinWithCompositeKeyAndNullValues`, the query 
> is like 
> {code:java}
> select 
>   emps.empid
> from 
>   emps a join emps b
> on a.deptno = b.deptno
> and a.commission = b.commission;{code}
> and the data is like 
> {code:java}
>   INSERT INTO "emps" VALUES (100, 10, 'Bill', 1, 1000);
>   INSERT INTO "emps" VALUES (200, 20, 'Eric', 8000, 500);
>   INSERT INTO "emps" VALUES (150, 10, 'Sebastian', 7000, null);
>   INSERT INTO "emps" VALUES (110, 10, 'Theodore', 11500, 250); {code}
> And the row with empid = 150 is in the expected result. Is this the expected 
> result of a join with a null condition?
> What's more, the hash join result with the condition a.deptno = b.deptno and 
> a.commission = b.commission is the same as the merge join result. And if there 
> is just one condition, a.commission = b.commission, the result does not include 
> empid = 150.
>  
> Here is a unit test for it
> {code:java}
> @Test void testHashJoinWithCompositeKeyAndNullValues() {
>   // Both join sides' 'commission' values are limited to null, so a.commission = 
> b.commission should always be false.
>   // So all columns in the right table b are expected to be null, and this SQL 
> should result in 0 rows.
>   final String sql = "select * from\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as a\n"
>   + " left join\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as b\n"
>   + " on a.salary = b.salary and a.commission = b.commission\n"
>   + " where b.empid is not null";
>   CalciteAssert.that()
>   .with(CalciteConnectionProperty.LEX, Lex.JAVA)
>   .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
>   .withSchema("s", new ReflectiveSchema(new HrSchemaBig()))
>   .query(sql)
>   .withHook(Hook.PLANNER, (Consumer) planner -> {
> planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
> planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
>   })
>   .explainContains("EnumerableHashJoin")
>   .returnsCount(0)
>   ;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762840#comment-17762840
 ] 

Ruben Q L commented on CALCITE-5980:


Sure. PR updated.

> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems caused by the changes introduced in 
> CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But in my case it results in {*}both files being the same{*}, thus when the 
> outFile is created, it actually erases all the tests that were contained 
> inside the inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762596#comment-17762596
 ] 

Ruben Q L commented on CALCITE-5980:


What about checking that each inFile must not be empty? That check fails on 
current master, but passes on my branch.

> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems caused by the changes introduced in 
> CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But in my case it results in {*}both files being the same{*}, thus when the 
> outFile is created, it actually erases all the tests that were contained 
> inside the inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-07 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762596#comment-17762596
 ] 

Ruben Q L edited comment on CALCITE-5980 at 9/7/23 7:28 AM:


What about checking that each inFile must not be empty? On my env, that check 
fails on current master, but passes on my branch.


was (Author: rubenql):
What about checking that each inFile must not be empty? That check fails on 
current master, but passes on my branch.

> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems caused by the changes introduced in 
> CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But in my case it results in {*}both files being the same{*}, thus when the 
> outFile is created, it actually erases all the tests that were contained 
> inside the inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5977) RexLiteral returns the same value for TIMESTAMP and TIMESTAMP_WITH_LOCAL_TIME_ZONE

2023-09-06 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762424#comment-17762424
 ] 

Ruben Q L commented on CALCITE-5977:


My bad, everything is fine. I just realized that, for my case to work as 
expected, I needed to use the explicit "CAST" operation with the appropriate 
timezone code in the literal:
{code:sql}
CAST('2023-09-04 17:44:00 Europe/Paris' AS TIMESTAMP WITH LOCAL TIME ZONE)
{code}

Closing this ticket...

> RexLiteral returns the same value for TIMESTAMP and 
> TIMESTAMP_WITH_LOCAL_TIME_ZONE
> --
>
> Key: CALCITE-5977
> URL: https://issues.apache.org/jira/browse/CALCITE-5977
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>
> Perhaps I'm missing something, but this seems odd to me.
> If we want to get the value in Long format (i.e. milliseconds since 
> 1970-01-01 00:00:00) of a TIMESTAMP / TIMESTAMP_WITH_LOCAL_TIME_ZONE 
> RexLiteral, their code is exactly the same (even if they are in different 
> "case" blocks):
> {code}
>   public @Nullable Object getValue2() {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return getValueAs(Long.class);
> ...
>   }
>   public <T> @Nullable T getValueAs(Class<T> clazz) {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
>   ...
>   break;
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
> ...
> {code}
> In case of a TIMESTAMP_WITH_LOCAL_TIME_ZONE, shouldn't this code include some 
> extra processing to "shift" the value with the proper offset / daylight 
> savings?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (CALCITE-5977) RexLiteral returns the same value for TIMESTAMP and TIMESTAMP_WITH_LOCAL_TIME_ZONE

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L closed CALCITE-5977.
--
Resolution: Not A Problem

> RexLiteral returns the same value for TIMESTAMP and 
> TIMESTAMP_WITH_LOCAL_TIME_ZONE
> --
>
> Key: CALCITE-5977
> URL: https://issues.apache.org/jira/browse/CALCITE-5977
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>
> Perhaps I'm missing something, but this seems odd to me.
> If we want to get the value in Long format (i.e. milliseconds since 
> 1970-01-01 00:00:00) of a TIMESTAMP / TIMESTAMP_WITH_LOCAL_TIME_ZONE 
> RexLiteral, their code is exactly the same (even if they are in different 
> "case" blocks):
> {code}
>   public @Nullable Object getValue2() {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return getValueAs(Long.class);
> ...
>   }
>   public  @Nullable T getValueAs(Class clazz) {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
>   ...
>   break;
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
> ...
> {code}
> In case of a TIMESTAMP_WITH_LOCAL_TIME_ZONE, shouldn't this code include some 
> extra processing to "shift" the value with the proper offset / daylight 
> savings?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5980:
---
Description: 
Discovered by accident on my Windows+IntelliJ environment while I was trying to 
add new tests on a *.iq file. My new tests did not seem to be executed. I even 
tried adding syntax errors on purpose into different iq files to force them to 
fail, but the tests were still successful. The reason seems to be that, at 
least on my environment (Windows), the test files do not execute any of their 
statements. This seems caused by the changes introduced in CALCITE-5786.

While debugging, I found this line in QuidemTest.java (that aims to create the 
inFile and outFile):
{code:java}
protected void checkRun(String path) throws Exception {
  ...
  // e.g. path = "sql/agg.iq"
  // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
  // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
  inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
  outFile = replaceDir(inFile, "resources", "quidem");
  ...
}
{code}
But in my case it results in {*}both files being the same{*}; thus, when the 
outFile is created, it actually erases all the tests that were contained inside 
the inFile, so the test runs nothing.

The reason for that is that the auxiliary method {{{}replaceDir{}}}:
{code:java}
  private static File replaceDir(File file, String target, String replacement) {
return new File(
file.getAbsolutePath().replace(n2u('/' + target + '/'),
n2u('/' + replacement + '/')));
  }
{code}
is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
path, but in my case this path does not contain the pattern to be replaced, 
since it contains backslashes: 
"C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
operation does nothing.

  was:
Discovered by accident on my Windows+IntelliJ environment while I was trying to 
add new tests on a *.iq file. My new tests did not seem to be executed. I even 
tried adding syntax errors on purpose into different iq files to force them to 
fail, but the tests were still successful. The reason seems to be that, at 
least on my environment (Windows), the test files do not execute any of their 
statements. This seems a consequence of CALCITE-5786.


While debugging, I found this line in QuidemTest.java (that aims to create the 
inFile and outFile):
{code:java}
protected void checkRun(String path) throws Exception {
  ...
  // e.g. path = "sql/agg.iq"
  // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
  // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
  inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
  outFile = replaceDir(inFile, "resources", "quidem");
  ...
}
{code}
But it results on {*}both files being the same{*}, thus when the outFile is 
created, it actually erases all the tests that were contained inside the 
inFile, so the test runs nothing.

The reason for that is that the auxiliary method {{{}replaceDir{}}}:
{code:java}
  private static File replaceDir(File file, String target, String replacement) {
return new File(
file.getAbsolutePath().replace(n2u('/' + target + '/'),
n2u('/' + replacement + '/')));
  }
{code}
is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
path, but in my case this path does not contain the pattern to be replaced, 
since it contains backslashes: 
"C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
operation does nothing.


> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems caused by the changes introduced in 
> CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> 

[jira] [Updated] (CALCITE-5980) QuidemTests are not effectively executed on Windows

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5980:
---
Summary: QuidemTests are not effectively executed on Windows  (was: 
QuidemTest are not effectively executed on Windows)

> QuidemTests are not effectively executed on Windows
> ---
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems a consequence of CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But it results on {*}both files being the same{*}, thus when the outFile is 
> created, it actually erases all the tests that were contained inside the 
> inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (CALCITE-5980) QuidemTest are not effectively executed on Windows

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reassigned CALCITE-5980:
--

Assignee: Ruben Q L

> QuidemTest are not effectively executed on Windows
> --
>
> Key: CALCITE-5980
> URL: https://issues.apache.org/jira/browse/CALCITE-5980
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.36.0
>
>
> Discovered by accident on my Windows+IntelliJ environment while I was trying 
> to add new tests on a *.iq file. My new tests did not seem to be executed. I 
> even tried adding syntax errors on purpose into different iq files to force 
> them to fail, but the tests were still successful. The reason seems to be 
> that, at least on my environment (Windows), the test files do not execute any 
> of their statements. This seems a consequence of CALCITE-5786.
> While debugging, I found this line in QuidemTest.java (that aims to create 
> the inFile and outFile):
> {code:java}
> protected void checkRun(String path) throws Exception {
>   ...
>   // e.g. path = "sql/agg.iq"
>   // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
>   // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
>   // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
>   inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
>   outFile = replaceDir(inFile, "resources", "quidem");
>   ...
> }
> {code}
> But it results on {*}both files being the same{*}, thus when the outFile is 
> created, it actually erases all the tests that were contained inside the 
> inFile, so the test runs nothing.
> The reason for that is that the auxiliary method {{{}replaceDir{}}}:
> {code:java}
>   private static File replaceDir(File file, String target, String 
> replacement) {
> return new File(
> file.getAbsolutePath().replace(n2u('/' + target + '/'),
> n2u('/' + replacement + '/')));
>   }
> {code}
> is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
> path, but in my case this path does not contain the pattern to be replaced, 
> since it contains backslashes: 
> "C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
> operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5980) QuidemTest are not effectively executed on Windows

2023-09-06 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-5980:
--

 Summary: QuidemTest are not effectively executed on Windows
 Key: CALCITE-5980
 URL: https://issues.apache.org/jira/browse/CALCITE-5980
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Ruben Q L
 Fix For: 1.36.0


Discovered by accident on my Windows+IntelliJ environment while I was trying to 
add new tests on a *.iq file. My new tests did not seem to be executed. I even 
tried adding syntax errors on purpose into different iq files to force them to 
fail, but the tests were still successful. The reason seems to be that, at 
least on my environment (Windows), the test files do not execute any of their 
statements. This seems a consequence of CALCITE-5786.


While debugging, I found this line in QuidemTest.java (that aims to create the 
inFile and outFile):
{code:java}
protected void checkRun(String path) throws Exception {
  ...
  // e.g. path = "sql/agg.iq"
  // inUrl = "file:/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // inFile = "/home/fred/calcite/core/build/resources/test/sql/agg.iq"
  // outDir = "/home/fred/calcite/core/build/quidem/test/sql"
  // outFile = "/home/fred/calcite/core/build/quidem/test/sql/agg.iq"
  inFile = Sources.of(requireNonNull(inUrl, "inUrl")).file();
  outFile = replaceDir(inFile, "resources", "quidem");
  ...
}
{code}
But it results in {*}both files being the same{*}; thus, when the outFile is 
created, it actually erases all the tests that were contained inside the 
inFile, so the test runs nothing.

The reason for that is that the auxiliary method {{{}replaceDir{}}}:
{code:java}
  private static File replaceDir(File file, String target, String replacement) {
return new File(
file.getAbsolutePath().replace(n2u('/' + target + '/'),
n2u('/' + replacement + '/')));
  }
{code}
is trying to replace "/resources/" with "/quidem/" from the inFile absolute 
path, but in my case this path does not contain the pattern to be replaced, 
since it contains backslashes: 
"C:\...\calcite\core\build\resources\test\sql\agg.iq"; so the replacement 
operation does nothing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5967) UnsupportedOperationException while implementing a call that requires a special collator

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5967.

Resolution: Fixed

Fixed via 
[{{164ff0a}}|https://github.com/apache/calcite/commit/164ff0a27e243850d294908dc5cff90760d0a35a]
 

> UnsupportedOperationException while implementing a call that requires a 
> special collator
> 
>
> Key: CALCITE-5967
> URL: https://issues.apache.org/jira/browse/CALCITE-5967
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Regression introduced by a minor change within CALCITE-5914, detected while 
> testing a downstream project with the latest Calcite main.
> CALCITE-5914 (see 
> [2a96512c|https://github.com/apache/calcite/commit/2a96512c352bda4a5d9c0c80730f5c115ac363d6])
>  introduced this apparently innocuous change in 
> {{RexImpTable#AbstractRexCallImplementor#unboxIfNecessary}}:
> {code}
> // old
> argValueList.stream()
> .map(AbstractRexCallImplementor::unboxExpression)
> .collect(Collectors.toList());
> =>
> // new
> Util.transform(argValueList,
> AbstractRexCallImplementor::unboxExpression);
> {code}
> Both expressions seem equivalent, however there is a subtle difference: the 
> old one returns an {{ArrayList}} (where we can add new elements); whereas the 
> new one returns a {{TransformingList}} that extends {{AbstractList}} and that 
> does not support {{List#add}}.
> After calling {{unboxIfNecessary}}, we might need to modify the argument list 
> if we need a special collator to perform the operation:
> {code}
> private ParameterExpression genValueStatement(...) {
>   ...
>   optimizedArgValueList = unboxIfNecessary(optimizedArgValueList);
>   final Expression callValue =
>   implementSafe(translator, call, optimizedArgValueList);
>   ...
> }
> @Override Expression implementSafe(...) {
>   ...
>   final Expression fieldComparator =
>   generateCollatorExpression(relDataType0.getCollation());
>   if (fieldComparator != null) {
> argValueList.add(fieldComparator);  // <--- 
> UnsupportedOperationException!
>   }
>   ...
> {code}
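To make the difference concrete, here is a small JDK-only sketch (no Calcite
classes; the anonymous {{AbstractList}} view below plays the role that the
description attributes to {{TransformingList}}, and the "COLLATOR" string is
just a stand-in for the fieldComparator expression):

{code:java}
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class TransformingListSketch {
  /** A lazy, fixed-size transforming view; AbstractList#add throws
   * UnsupportedOperationException by default. */
  static <F, T> List<T> transformView(List<F> source, Function<F, T> fn) {
    return new AbstractList<T>() {
      @Override public T get(int index) { return fn.apply(source.get(index)); }
      @Override public int size() { return source.size(); }
    };
  }

  public static void main(String[] args) {
    List<String> argValueList = Arrays.asList("a", "b");

    // Old style: collect into a fresh ArrayList, which accepts add()
    List<String> copied = argValueList.stream()
        .map(String::toUpperCase)
        .collect(Collectors.toList());
    copied.add("COLLATOR");   // fine

    // New style: a transforming view, which rejects add()
    List<String> view = transformView(argValueList, String::toUpperCase);
    try {
      view.add("COLLATOR");
    } catch (UnsupportedOperationException e) {
      System.out.println("transforming view rejects add(): " + e);
    }
  }
}
{code}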



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5952) SemiJoinJoinTransposeRule should check if JoinType supports pushing predicates into its inputs

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5952.

Resolution: Fixed

Fixed via 
[{{d667123}}|https://github.com/apache/calcite/commit/d667123585bf518edd6a9bf93e23c1785fe03376]

Thanks [~lchistov1987] for the patch!

> SemiJoinJoinTransposeRule should check if JoinType supports pushing 
> predicates into its inputs
> --
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5952) SemiJoinJoinTransposeRule should check if JoinType supports pushing predicates into its inputs

2023-09-06 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5952:
---
Fix Version/s: 1.36.0

> SemiJoinJoinTransposeRule should check if JoinType supports pushing 
> predicates into its inputs
> --
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5967) UnsupportedOperationException while implementing a call that requires a special collator

2023-09-06 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762296#comment-17762296
 ] 

Ruben Q L commented on CALCITE-5967:


Sure, I'll apply those suggestions before merging the patch.

> UnsupportedOperationException while implementing a call that requires a 
> special collator
> 
>
> Key: CALCITE-5967
> URL: https://issues.apache.org/jira/browse/CALCITE-5967
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Regression introduced by a minor change within CALCITE-5914, detected while 
> testing a downstream project with the latest Calcite main.
> CALCITE-5914 (see 
> [2a96512c|https://github.com/apache/calcite/commit/2a96512c352bda4a5d9c0c80730f5c115ac363d6])
>  introduced this apparently innocuous change in 
> {{RexImpTable#AbstractRexCallImplementor#unboxIfNecessary}}:
> {code}
> // old
> argValueList.stream()
> .map(AbstractRexCallImplementor::unboxExpression)
> .collect(Collectors.toList());
> =>
> // new
> Util.transform(argValueList,
> AbstractRexCallImplementor::unboxExpression);
> {code}
> Both expressions seem equivalent, however there is a subtle difference: the 
> old one returns an {{ArrayList}} (where we can add new elements); whereas the 
> new one returns a {{TransformingList}} that extends {{AbstractList}} and that 
> does not support {{List#add}}.
> After calling {{unboxIfNecessary}}, we might need to modify the argument list 
> if we need a special collator to perform the operation:
> {code}
> private ParameterExpression genValueStatement(...) {
>   ...
>   optimizedArgValueList = unboxIfNecessary(optimizedArgValueList);
>   final Expression callValue =
>   implementSafe(translator, call, optimizedArgValueList);
>   ...
> }
> @Override Expression implementSafe(...) {
>   ...
>   final Expression fieldComparator =
>   generateCollatorExpression(relDataType0.getCollation());
>   if (fieldComparator != null) {
> argValueList.add(fieldComparator);  // <--- 
> UnsupportedOperationException!
>   }
>   ...
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5977) RexLiteral returns the same value for TIMESTAMP and TIMESTAMP_WITH_LOCAL_TIME_ZONE

2023-09-05 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762129#comment-17762129
 ] 

Ruben Q L commented on CALCITE-5977:


Thanks for the feedback [~julianhyde].

But then, if I am not mistaken, that would mean that if we run these two 
queries in Calcite:
{code:sql}
SELECT * FROM myTable t WHERE t.lastUpdate > TIMESTAMP'2023-08-28 09:56:54'

SELECT * FROM myTable t WHERE t.lastUpdate > TIMESTAMP WITH LOCAL TIME 
ZONE'2023-08-28 09:56:54'
{code}

The dynamic code that will be generated to produce the query result with 
Enumerable convention will be exactly the same, right? Is that correct?



> RexLiteral returns the same value for TIMESTAMP and 
> TIMESTAMP_WITH_LOCAL_TIME_ZONE
> --
>
> Key: CALCITE-5977
> URL: https://issues.apache.org/jira/browse/CALCITE-5977
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Priority: Major
>
> Perhaps I'm missing something, but this seems odd to me.
> If we want to get the value in Long format (i.e. milliseconds since 
> 1970-01-01 00:00:00) of a TIMESTAMP / TIMESTAMP_WITH_LOCAL_TIME_ZONE 
> RexLiteral, their code is exactly the same (even if they are in different 
> "case" blocks):
> {code}
>   public @Nullable Object getValue2() {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return getValueAs(Long.class);
> ...
>   }
>   public  @Nullable T getValueAs(Class clazz) {
> ...
> switch (typeName) {
> ...
> case TIMESTAMP:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
>   ...
>   break;
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   if (clazz == Long.class) {
> // Milliseconds since 1970-01-01 00:00:00
> return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
>   }
> ...
> {code}
> In case of a TIMESTAMP_WITH_LOCAL_TIME_ZONE, shouldn't this code include some 
> extra processing to "shift" the value with the proper offset / daylight 
> savings?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5977) RexLiteral returns the same value for TIMESTAMP and TIMESTAMP_WITH_LOCAL_TIME_ZONE

2023-09-05 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-5977:
--

 Summary: RexLiteral returns the same value for TIMESTAMP and 
TIMESTAMP_WITH_LOCAL_TIME_ZONE
 Key: CALCITE-5977
 URL: https://issues.apache.org/jira/browse/CALCITE-5977
 Project: Calcite
  Issue Type: Bug
  Components: core
Reporter: Ruben Q L


Perhaps I'm missing something, but this seems odd to me.

If we want to get the value in Long format (i.e. milliseconds since 1970-01-01 
00:00:00) of a TIMESTAMP / TIMESTAMP_WITH_LOCAL_TIME_ZONE RexLiteral, their 
code is exactly the same (even if they are in different "case" blocks):

{code}
  public @Nullable Object getValue2() {
...
switch (typeName) {
...
case TIMESTAMP:
case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
  return getValueAs(Long.class);
...
  }

  public  @Nullable T getValueAs(Class clazz) {
...
switch (typeName) {
...
case TIMESTAMP:
  if (clazz == Long.class) {
// Milliseconds since 1970-01-01 00:00:00
return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
  }
  ...
  break;
case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
  if (clazz == Long.class) {
// Milliseconds since 1970-01-01 00:00:00
return clazz.cast(((TimestampString) value).getMillisSinceEpoch());
  }
...
{code}

In case of a TIMESTAMP_WITH_LOCAL_TIME_ZONE, shouldn't this code include some 
extra processing to "shift" the value with the proper offset / daylight savings?
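For reference, a small {{java.time}} sketch (not Calcite code) of the kind of
"shift" the question is about: reinterpreting a wall-clock millis value in a
given zone, with its offset and daylight savings applied. Names are
illustrative only.

{code:java}
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class LocalTimeZoneShiftSketch {
  /** Treats localMillis as a wall-clock value (millis since 1970-01-01
   * 00:00:00, no zone) and returns the corresponding instant in the given
   * zone, applying its offset and DST rules. */
  static long shiftToInstantMillis(long localMillis, ZoneId zone) {
    LocalDateTime local = LocalDateTime.ofEpochSecond(
        localMillis / 1000, (int) (localMillis % 1000) * 1_000_000,
        ZoneOffset.UTC);
    return local.atZone(zone).toInstant().toEpochMilli();
  }

  public static void main(String[] args) {
    long localMillis = Instant.parse("2023-09-04T17:44:00Z").toEpochMilli();
    System.out.println("wall-clock millis:        " + localMillis);
    System.out.println("shifted for Europe/Paris: "
        + shiftToInstantMillis(localMillis, ZoneId.of("Europe/Paris")));
  }
}
{code}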




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5967) UnsupportedOperationException while implementing a call that requires a special collator

2023-09-04 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761863#comment-17761863
 ] 

Ruben Q L commented on CALCITE-5967:


[~julianhyde] do you have any other remark? Otherwise I'll move forward and 
merge the patch into main.

> UnsupportedOperationException while implementing a call that requires a 
> special collator
> 
>
> Key: CALCITE-5967
> URL: https://issues.apache.org/jira/browse/CALCITE-5967
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> Regression introduced by a minor change within CALCITE-5914, detected while 
> testing a downstream project with the latest Calcite main.
> CALCITE-5914 (see 
> [2a96512c|https://github.com/apache/calcite/commit/2a96512c352bda4a5d9c0c80730f5c115ac363d6])
>  introduced this apparently innocuous change in 
> {{RexImpTable#AbstractRexCallImplementor#unboxIfNecessary}}:
> {code}
> // old
> argValueList.stream()
> .map(AbstractRexCallImplementor::unboxExpression)
> .collect(Collectors.toList());
> =>
> // new
> Util.transform(argValueList,
> AbstractRexCallImplementor::unboxExpression);
> {code}
> Both expressions seem equivalent, however there is a subtle difference: the 
> old one returns an {{ArrayList}} (where we can add new elements); whereas the 
> new one returns a {{TransformingList}} that extends {{AbstractList}} and that 
> does not support {{List#add}}.
> After calling {{unboxIfNecessary}}, we might need to modify the argument list 
> if we need a special collator to perform the operation:
> {code}
> private ParameterExpression genValueStatement(...) {
>   ...
>   optimizedArgValueList = unboxIfNecessary(optimizedArgValueList);
>   final Expression callValue =
>   implementSafe(translator, call, optimizedArgValueList);
>   ...
> }
> @Override Expression implementSafe(...) {
>   ...
>   final Expression fieldComparator =
>   generateCollatorExpression(relDataType0.getCollation());
>   if (fieldComparator != null) {
> argValueList.add(fieldComparator);  // <--- 
> UnsupportedOperationException!
>   }
>   ...
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-09-04 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761850#comment-17761850
 ] 

Ruben Q L commented on CALCITE-5952:


bq. My new proposal is "SemiJoinJoinTransposeRule should check if JoinType does 
support pushing predicates into its inputs"

Sounds good (y)   
(nitpick: I'd simplify it a bit by using just "supports" instead of "does 
support")

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-09-04 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761830#comment-17761830
 ] 

Ruben Q L commented on CALCITE-5952:


{quote}Maybe just "SemiJoinJoinTransposeRule should not push SemiJoin to the 
right (left) input of the Left (Right) Join"
{quote}
Mmmm, I think that is not entirely accurate; there are cases where a Left 
(Right) Join can apply the rule: if the SemiJoin keys contain keys from X (the 
Join's LHS in the SemiJoinJoinTransposeRule example), then the rule can be 
applied on a LeftJoin (but not on a RightJoin), and vice versa; right?
{noformat}
LogicalJoin(condition: emp.bonusId = bonus.id, joinType=[semi])
  LogicalJoin(condition: emp.deptId = dept.id, joinType=[left])
LogicalTableScan(table=[[scott, EMP]])
LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) 
=> valid transformation
LogicalJoin(condition: emp.deptId = dept.id, joinType=[left])
  LogicalJoin(condition: emp.bonusId = bonus.id, joinType=[semi])
LogicalTableScan(table=[[scott, EMP]])
LogicalTableScan(table=[[scott, BONUS]]) 
  LogicalTableScan(table=[[scott, DEPT]])
{noformat}
If they do not exist, perhaps it would be nice to add these examples of 
Left/Right Join where the rule can be applied (still after your patch) in 
RelOptRulesTest.java, wdyt?
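For illustration, a hypothetical sister test along those lines, with the
semi-join keys taken from EMP (the left join's LHS) via the ENAME column that
also exists in BONUS. It assumes the same relFn/withRule helpers as the test
quoted below (the generic parameters of {{Function}} are spelled out here,
since they are stripped in the quoted snippets), and an expected plan would
still need to be recorded for {{check()}}:

{code:java}
@Test void testPushSemiJoinPastLeftJoinWhenKeysComeFromLeftInput() {
  final Function<RelBuilder, RelNode> relFn = b -> b
      .scan("EMP")
      .scan("DEPT")
      .join(JoinRelType.LEFT,
          b.equals(
              b.field(2, 0, "DEPTNO"),
              b.field(2, 1, "DEPTNO")))
      .scan("BONUS")
      // semi join only relates to LHS (EMP) fields of the left join
      .semiJoin(b.equals(
          b.field(2, 0, "ENAME"),
          b.field(2, 1, "ENAME")))
      .build();
  // the rule may still fire here, pushing the semi-join onto EMP
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).check();
}
{code}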

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-09-04 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761799#comment-17761799
 ] 

Ruben Q L commented on CALCITE-5952:


Thanks [~lchistov1987].
IMO both tests are complementary (and relevant); if I had to choose, I'd prefer 
the runtime test, but the more tests, the better.
Normally, the process that we follow to fix a bug is:
- Produce a unit test that shows the bug, i.e. it fails.
- Correct the issue.
- The test from the 1st point now passes.

In terms of the test for the 1st point, IMO it is better to show the issue on 
an actual query (ideally in SQL, but I agree that Quidem tests do have some 
limitations and there is some discussion going on). So I think it is clearer 
(and closer to a real-life scenario) to produce a query plan that "suffers from 
the problem", whether it throws an exception or produces a wrong result.
I don't know what others might think, but these are my 2 cents on this topic.

Regarding the PR, it looks good. Perhaps the Jira/PR title (and the commit 
message) is not entirely accurate: it talks about LeftJoin, but the problem can 
also occur with RightJoin. What about something along the lines of: 
"SemiJoinJoinTransposeRule must not be applied if SemiJoin has keys from Join's 
LHS/RHS and Join type does not support pushing predicates into its left/right 
input"? (a bit long, I know, but I can't come up with a better title, 
suggestions are welcome :)

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

