[ 
https://issues.apache.org/jira/browse/DRILL-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975198#comment-16975198
 ] 

benj commented on DRILL-7375:
-----------------------------

Waiting for a possible official version of such a feature, it is possible to 
use an own UDF like:
{code:java}
package org.apache.drill.contrib.function;

import io.netty.buffer.DrillBuf;
import org.apache.drill.exec.expr.DrillSimpleFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.vector.complex.reader.FieldReader;
import org.apache.drill.exec.expr.holders.*;
import javax.inject.Inject;

public class ToString {
 
@FunctionTemplate(
    name = "to_string",
    scope = FunctionScope.SIMPLE,
    nulls = NullHandling.NULL_IF_NULL)
public static class NullableVarChar_Field_ToString implements DrillSimpleFunc {
  @Param FieldReader input;
  @Output VarCharHolder out;
  @Inject DrillBuf buffer;

  @Override public void setup() {
  }
  
  @Override public void eval() {
      String rowString = input.readObject().toString();

      buffer = buffer.reallocIfNeeded(rowString.length());
      buffer.setBytes(0, rowString.getBytes(), 0, rowString.length());
      out.start  = 0;
      out.end    = rowString.length();
      out.buffer = buffer;
  }
}
}
{code}

Example of use:
{code:sql}
apache drill> SELECT j, typeof(j) AS tj, to_string(j) AS strj, 
typeof(to_string(j)) AS tstrj 
FROM (SELECT convert_fromJSON('{a:["1","2","3"]}' ) j);
+---------------------+-----+---------------------+---------+
|          j          | tj  |        strj         |  tstrj  |
+---------------------+-----+---------------------+---------+
| {"a":["1","2","3"]} | MAP | {"a":["1","2","3"]} | VARCHAR |
+---------------------+-----+---------------------+---------+
1 row selected (0.132 seconds)
{code}

With this function it's possible to "cast" anything in varchar and avoid 
storage problem in Parquet due to certain types. And it is eventually possible 
to cast the other way when requesting the Parquet file. 



> composite/nested type map/array convert_to/cast to varchar
> ----------------------------------------------------------
>
>                 Key: DRILL-7375
>                 URL: https://issues.apache.org/jira/browse/DRILL-7375
>             Project: Apache Drill
>          Issue Type: Wish
>          Components: Functions - Drill
>    Affects Versions: 1.16.0
>            Reporter: benj
>            Priority: Major
>
> As it possible to cast varchar to map (convert_from + JSON) with convert_from 
> or transform a varchar to array (split)
> {code:sql}
> SELECT a, typeof(a), sqltypeof(a) from (SELECT CONVERT_FROM('{a : 100, b: 
> 200}' ,'JSON') a);
> +-------------------+---------+--------+
> |         a         | EXPR$1  | EXPR$2 |
> +-------------------+---------+--------+
> | {"a":100,"b":200} | MAP     | STRUCT |
> +-------------------+---------+--------+
> SELECT a, typeof(a), sqltypeof(a)FROM (SELECT split(str,',') AS a FROM ( 
> SELECT 'foo,bar' AS str));
> +-------------------+---------+--------+
> |        a          | EXPR$1  | EXPR$2 |
> +-------------------+---------+--------+
> | ["foo","bar"]     | VARCHAR | ARRAY  |
> +-------------------+---------+--------+
> {code}
> It will be very usefull :
>  # to have the capacity to "cast" the +_MAP_ into VARCHAR+ with a "cast 
> syntax" or with a "convert_to" possibility
>  Expected:
> {code:sql}
> SELECT a, typeof(a) ta, va, typeof(va) tva FROM (
> SELECT a, CAST(a AS varchar) va FROM (SELECT CONVERT_FROM('{a : 100, b: 200}' 
> ,'JSON') a));
> +-------------------+------+-------------------+---------+
> |         a         | ta   | va                | tva     |
> +-------------------+------+-------------------+---------+
> | {"a":100,"b":200} | MAP  | {"a":100,"b":200} | VARCHAR |
> +-------------------+------+-------------------+---------+
> {code}
>  # to have the capacity to "cast" the +_ARRAY_ into VARCHAR+ with a "cast 
> syntax" or any other method
>  Expected
> {code:sql}
> SELECT a, sqltypeof(a) ta, va, sqltypeof(va) tva FROM (
> SELECT a, CAST(a AS varchar) va FROM (SELECT split(str,',') AS a FROM ( 
> SELECT 'foo,bar' AS str));
> +-------------------+------+-------------------+---------+
> |         a         | ta   | va                | tva     |
> +-------------------+------+-------------------+---------+
> | ["foo","bar"]     | ARRAY| ["foo","bar"]     | VARCHAR |
> +-------------------+------+-------------------+---------+
> {code}
> Please note that these possibility of course exists in other database systems
>  Example with Postgres:
> {code:sql}
> SELECT '{"a":100,"b":200}'::json::text;
> => {"a":100,"b":200}
> SELECT array[1,2,3]::text;
> => {1,2,3}
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to