[
https://issues.apache.org/jira/browse/DERBY-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610473#comment-17610473
]
Stamatis Zampetakis commented on DERBY-7132:
--------------------------------------------
I spend some more time on this problem and it turns out that the problem is not
really specific to CASE WHEN evaluation as I initially thought. As [~rhillegas]
mentioned the problem comes from the fact that the filter/restriction including
the CAST is pushed down too far.
In [^Derby-7132-plan.pdf], I sketched how the query plan looks like
highlighting the {{ProjectRestrictResultSet}} operator where the CASE WHEN
clause is evaluated. I naively assumed that the CAST will be evaluated after
all joins but that's not the case here. According to SQL standard the WHERE
clause is applied on the result of the FROM clause so pushing the CAST so far
down is still a bug. Derby, as most (if not all) well-known databases, is
pushing filters aggressively below inner joins. In 99% of the time this is a
valid optimization but when the condition can throw an exception (e.g., CAST)
this optimization shouldn't be done (or at least should be done differently).
Just for the sake of completeness the combination of rows triggering the
exception is the following:
{noformat}
@2723: { 1, default, hive, 8, 1, src_bucket_tbl } @2239 { 1, 2008-04-08, 0 }
{noformat}
Observed that all conditions in the {{CASE WHEN}} are satisfied so what happens
is that we will attempt to cast 2008-04-08 (PART_KEY_VAL) to a DECIMAL(21,0)
and the latter obviously fails.
Apologies for not attaching the actual Derby plan and coming up with this
sketch but I didn't find a way to get the plan when there is an
{{RuntimeException}}. If you have any tips around this I may use it in the
future.
> SQLDataException when executing CAST inside a CASE WHEN clause
> --------------------------------------------------------------
>
> Key: DERBY-7132
> URL: https://issues.apache.org/jira/browse/DERBY-7132
> Project: Derby
> Issue Type: Bug
> Components: SQL
> Affects Versions: 10.14.2.0, 10.15.2.0
> Reporter: Stamatis Zampetakis
> Priority: Major
> Attachments: Derby-7132-plan.pdf, derby-dump.tar.gz, schemaddl.sql,
> uml_schema.svg
>
>
> {code:sql}
> SELECT "PARTITIONS"."PART_ID"
> FROM "PARTITIONS"
> INNER JOIN "TBLS" ON "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
> INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID"
> INNER JOIN "PARTITION_KEY_VALS" "FILTER0" ON "FILTER0"."PART_ID" =
> "PARTITIONS"."PART_ID"
> WHERE "DBS"."CTLG_NAME" = 'hive'
> AND "TBLS"."TBL_NAME" = 'src_bucket_tbl'
> AND "DBS"."NAME" = 'default'
> AND "FILTER0"."INTEGER_IDX" = 0
> AND (((CASE
> WHEN "FILTER0"."PART_KEY_VAL" <> '__HIVE_DEFAULT_PARTITION__'
> AND "TBLS"."TBL_NAME" = 'src_bucket_tbl'
> AND "DBS"."NAME" = 'default'
> AND "DBS"."CTLG_NAME" = 'hive'
> AND "FILTER0"."INTEGER_IDX" = 0 THEN
> cast("FILTER0"."PART_KEY_VAL" AS decimal(21, 0))
> END) = 10))
> {code}
> The SQL query above fails with the following stacktrace when attempting to
> evaluate the CAST expression. Note that the condition inside the CASE WHEN
> clause guarantees that only legal values (numbers) should be passed inside
> the CAST function. Apparently, the operations are somehow re-ordered and the
> CAST is evaluated before the condition in the WHEN clause which has a result
> a non-number to be passed in the CAST and cause the exception below.
> {noformat}
> Exception in thread "main" java.sql.SQLDataException: Invalid character
> string format for type DECIMAL.
> at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:84)
> at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:230)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:424)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:353)
> at
> org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:2405)
> at
> org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:88)
> at
> org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1436)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1709)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(EmbedPreparedStatement.java:286)
> at com.github.zabetak.CaseProblem.main(CaseProblem.java:63)
> Caused by: ERROR 22018: Invalid character string format for type DECIMAL.
> at
> org.apache.derby.iapi.error.StandardException.newException(StandardException.java:290)
> at
> org.apache.derby.iapi.error.StandardException.newException(StandardException.java:285)
> at
> org.apache.derby.iapi.types.DataType.invalidFormat(DataType.java:1280)
> at org.apache.derby.iapi.types.DataType.setValue(DataType.java:552)
> at
> org.apache.derby.exe.acf81e0010x017fx0812xbaa5x00003a07fe880.e3(Unknown
> Source)
> at
> org.apache.derby.impl.services.reflect.DirectCall.invoke(ReflectGeneratedClass.java:107)
> at
> org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java:302)
> at
> org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(NestedLoopJoinResultSet.java:119)
> at
> org.apache.derby.impl.sql.execute.JoinResultSet.openCore(JoinResultSet.java:149)
> at
> org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.openCore(ProjectRestrictResultSet.java:182)
> at
> org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(BasicNoPutResultSetImpl.java:266)
> at
> org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(GenericPreparedStatement.java:472)
> at
> org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:351)
> at
> org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1344)
> {noformat}
> The problem can be reproduced by running the query above in the derby
> database attached to the case.
> {code:sql}
> try (Connection c =
> DriverManager.getConnection("jdbc:derby:;databaseName=repro_derby_db")) {
> try (PreparedStatement ps = c.prepareStatement(sql)) {
> try (ResultSet rs = ps.executeQuery()) {
> while (rs.next()) {
> System.out.println(rs.getInt(1));
> }
> }
> }
> }
> {code}
> Unfortunately, I couldn't write a minimal reproducer cause slight changes to
> the order of performing the operations in the database has an impact on the
> plan and may hide the problem.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)