[ https://issues.apache.org/jira/browse/DRILL-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882203#comment-16882203 ]
Arina Ielchiieva commented on DRILL-7322:
-----------------------------------------
The results differ because cast uses {{BooleanType.get}} while the schema file
uses {{BooleanType.fromString}}. They process boolean literals differently: the
first trims the input, accepts only the supported literals for true and false,
and fails if no match is found; the second does not trim, converts every
supported true literal to true, and treats everything else as false.
I think it would be more natural to use one common way to convert boolean
literals.
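To make the difference concrete, here is a minimal sketch of the two behaviors as described above. It is not the actual Drill code: the class and method names ({{BooleanLiteralSketch}}, {{strictParse}}, {{lenientParse}}) are hypothetical, the literal sets are simply the ones used in the test data of this issue, and case handling is not modeled because it is not discussed here.

```java
import java.util.Set;

// Hypothetical illustration only; not the real BooleanType implementation.
final class BooleanLiteralSketch {

  // Assumption: literal sets copied from the all_types.csvh test data in this issue.
  private static final Set<String> TRUE_LITERALS =
      Set.of("true", "1", "t", "y", "yes", "on");
  private static final Set<String> FALSE_LITERALS =
      Set.of("false", "0", "f", "n", "no", "off");

  private BooleanLiteralSketch() {
  }

  // Cast-like behavior: trim, accept only known literals, fail otherwise.
  static boolean strictParse(String input) {
    String value = input.trim();
    if (TRUE_LITERALS.contains(value)) {
      return true;
    }
    if (FALSE_LITERALS.contains(value)) {
      return false;
    }
    throw new IllegalArgumentException("Invalid value for boolean: " + input);
  }

  // Schema-like behavior: no trim, known true literals map to true,
  // every other input (including unknown strings) maps to false.
  static boolean lenientParse(String input) {
    return TRUE_LITERALS.contains(input);
  }
}
```

With this sketch, {{strictParse("a")}} throws {{IllegalArgumentException}} while {{lenientParse("a")}} returns false, which mirrors the mismatch reported in the issue. A padded value such as {{" yes "}} also diverges: the strict variant trims it and returns true, while the lenient variant returns false.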
[~paul-rogers] what do you think?
> Align cast boolean and schema boolean conversion
> ------------------------------------------------
>
> Key: DRILL-7322
> URL: https://issues.apache.org/jira/browse/DRILL-7322
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.16.0
> Reporter: Denys Ordynskiy
> Priority: Major
>
> The information schema file allows converting any string to the boolean data type,
> but the "cast(.. as boolean)" statement throws an error:
> {color:#d04437}UserRemoteException : SYSTEM ERROR: IllegalArgumentException:
> Invalid value for boolean: a
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: Invalid value for boolean: a{color}
> *The information schema file should accept the same range of boolean
> literals as the cast statement.*
> *Steps to reproduce:*
> Upload text file all_types.csvh to the DFS /tmp/ischema/all_types:
> {noformat}
> boolean_col,boolean_col_for_cast
> true,true
> 1,1
> t,t
> y,y
> yes,yes
> on,on
> false,false
> 0,0
> f,f
> n,n
> no,no
> off,off
> a,a
> -,-
> !,!
> `,`
> 7,7
> @,@
> ^,^
> *,*
> {noformat}
> *Create schema:*
> {noformat}
> create schema (boolean_col boolean, boolean_col_for_cast varchar) for table
> dfs.tmp.`ischema/all_types`
> {noformat}
> *Run the query without cast:*
> select boolean_col, sqlTypeOf(boolean_col) boolean_col_type,
> boolean_col_for_cast, sqlTypeOf(boolean_col_for_cast)
> boolean_col_for_cast_type from dfs.tmp.`ischema/all_types`
> |boolean_col|boolean_col_type|boolean_col_for_cast|boolean_col_for_cast_type|
> |true|BOOLEAN|true|CHARACTER VARYING|
> |true|BOOLEAN|1|CHARACTER VARYING|
> |true|BOOLEAN|t|CHARACTER VARYING|
> |true|BOOLEAN|y|CHARACTER VARYING|
> |true|BOOLEAN|yes|CHARACTER VARYING|
> |true|BOOLEAN|on|CHARACTER VARYING|
> |false|BOOLEAN|false|CHARACTER VARYING|
> |false|BOOLEAN|0|CHARACTER VARYING|
> |false|BOOLEAN|f|CHARACTER VARYING|
> |false|BOOLEAN|n|CHARACTER VARYING|
> |false|BOOLEAN|no|CHARACTER VARYING|
> |false|BOOLEAN|off|CHARACTER VARYING|
> |false|BOOLEAN|a|CHARACTER VARYING|
> |false|BOOLEAN|-|CHARACTER VARYING|
> |false|BOOLEAN|!|CHARACTER VARYING|
> |false|BOOLEAN|`|CHARACTER VARYING|
> |false|BOOLEAN|7|CHARACTER VARYING|
> |false|BOOLEAN|@|CHARACTER VARYING|
> |false|BOOLEAN|^|CHARACTER VARYING|
> |false|BOOLEAN|*|CHARACTER VARYING|
> *Run the query with cast:*
> select boolean_col, sqlTypeOf(boolean_col) boolean_col_type,
> cast(boolean_col_for_cast as boolean) boolean_col_for_cast,
> sqlTypeOf(cast(boolean_col_for_cast as boolean)) boolean_col_for_cast_type
> from dfs.tmp.`ischema/all_types`
> {color:#d04437}UserRemoteException : SYSTEM ERROR: IllegalArgumentException:
> Invalid value for boolean: a
>
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: Invalid value for boolean: *a*
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: b9deab6f-7fd4-40c0-acdf-b2e31747e16f on cv1:31010]{color}