[ 
https://issues.apache.org/jira/browse/HIVE-18524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16376347#comment-16376347
 ] 

Ke Jia commented on HIVE-18524:
-------------------------------

[~mmccline]:

{quote}I want to see the IF expr support for THEN/ELSE expressions general not 
just for a special case for VectorUDFAdaptor children.

{quote}

I think the IF expr support for THEN/ELSE expressions has two cases:row mode 
and vector mode. Here, the specicial case for VectorUDFAdaptor is to handle the 
row mode. And For the vector mode is handled in the IfExprXX classes.

{quote}Ok, so the test framework is on me.  I'm still concerned about quality 
here.  Historically our vectorization testing has been too minimal.

{quote}

Thanks for your help.

{quote}I don't want to see IfExprColumnNull and IfExprNullColumn and other 
classes made subclasses of IfExprConditionalFilter and made more complicated.  
I'd rather see new classes IfExprExprExpr and perhaps IfExprExprNull and 
IfExprNullExpr each of which can be read straight through.

{quote}

I will add three new classes  IfExprExprExpr , IfExprExprNull and 
IfExprNullExpr to handle different case later.

{quote}Why are any changes to VectorUDFAdaptor necessary at all?

{quote}

The change in VectorUDFApaptor is to handle If expression in row mode. In hive, 
some row mode expression(deciminal operation) does not support vector mode. And 
if the IF expression contain decimal operation, hive will make vectorized 
operator by VectorUDFAdaptor.

If any questions, Please tell me. Thanks for your help.

> Vectorization: Execution failure related to non-standard embedding of 
> IfExprConditionalFilter inside VectorUDFAdaptor (Revert HIVE-17139)
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18524
>                 URL: https://issues.apache.org/jira/browse/HIVE-18524
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.0.0
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18524.01.patch, HIVE-18524.02.patch
>
>
> {noformat}
> insert overwrite table insert_10_1
>     select cast(gpa as float),
>            age,
>            IF(age>40,cast('2011-01-01 01:01:01' as timestamp),NULL),
>            IF(LENGTH(name)>10,cast(name as binary),NULL)
>     from studentnull10k
> vectorizationSchemaColumns: [0:name:string, 1:age:int, 2:gpa:double]
> ExprNodeDescs:
>     UDFToFloat(gpa) (type: float),
>     age (type: int),
>     if((age > 40), 2011-01-01 01:01:01.0, null) (type: timestamp),
>     if((length(name) > 10), CAST( name AS BINARY), null) (type: binary)
> selectExpressions:
>     VectorUDFAdaptor(if((age > 40), 2011-01-01 01:01:01.0, null))
>         (children: LongColGreaterLongScalar(col 1:int, val 40) -> 4:boolean) 
> -> 5:timestamp,
>     VectorUDFAdaptor(if((length(name) > 10), CAST( name AS BINARY), null))
>         (children: LongColGreaterLongScalar(col 4:int, val 10)(children: 
> StringLength(col 0:string) -> 4:int) -> 6:boolean,
>         VectorUDFAdaptor(CAST( name AS BINARY)) -> 7:binary) -> 8:binary
> {noformat}
> *// Notice there is no vector expression shown for the last IF stmt.*  It has 
> been magically embedded inside the VectorUDFAdaptor object...
> Execution results in this call stack.
> {nocode}
> Caused by: java.lang.NullPointerException
>       at java.util.Arrays.copyOfRange(Arrays.java:3521)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$9.writeValue(VectorExpressionWriterFactory.java:1101)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterBytes.writeValue(VectorExpressionWriterFactory.java:343)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFArgDesc.getDeferredJavaObject(VectorUDFArgDesc.java:123)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:211)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:177)
>       at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:145)
>       ... 22 more
> {nocode}
> Change is due to:
> HIVE-17139: Conditional expressions optimization: skip the expression 
> evaluation if the condition is not satisfied for vectorization engine. (Jia 
> Ke, reviewed by Ferdinand Xu)
> Embedding a raw vector expression outside of VectorizationContext is quite 
> non-standard and evidently buggy.
> [~Ferd] [~Ke Jia] I am inclined to revert this change.  Comments?  CC: 
> [~ashutoshc] [~hagleitn]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to