[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905520#comment-16905520
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


Yes, it should work. The result will be in the same column.
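
To make the "same column" point concrete, here is a minimal plain-Java model of the foo(x) example over the data [2,4,5,null,8]. It does not use any Drill classes. The class and method names are hypothetical, and the list that foo returns ([x, 2x]) is a placeholder, since the thread does not say what foo computes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java model only: no Drill API is used and every name here is hypothetical.
final class NullableFooSketch {

    // Stand-in for the implementation that takes a required (never null) input.
    // The returned list [x, 2x] is a placeholder for whatever foo really produces.
    static List<Integer> fooRequired(int x) {
        return Arrays.asList(x, x * 2);
    }

    // Stand-in for the implementation that takes a nullable input.
    // A null input yields an empty list instead of an error.
    static List<Integer> fooNullable(Integer x) {
        if (x == null) {
            return Collections.emptyList();
        }
        return fooRequired(x);
    }

    // Applies foo to each row. The output is one column with one list per input row.
    static List<List<Integer>> applyToColumn(List<Integer> column) {
        List<List<Integer>> out = new ArrayList<>();
        for (Integer value : column) {
            out.add(fooNullable(value));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> column = Arrays.asList(2, 4, 5, null, 8);
        System.out.println(applyToColumn(column));
    }
}
```

In this model the null row simply becomes an empty list in the same output column, so no extra column appears.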

> Strange Behavior for UDFs with ComplexWriter Output
> ---
>
> Key: DRILL-7345
> URL: https://issues.apache.org/jira/browse/DRILL-7345
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Priority: Minor
>
> I wrote some UDFs recently and noticed some strange behavior when debugging 
> them. 
> This behavior only occurs when there is ComplexWriter as output.  
> Basically, if the input to the UDF is nullable, Drill doesn't recognize the 
> UDF at all.  I've found that the only way to get Drill to recognize UDFs that 
> have ComplexWriters as output is:
> * Use a non-nullable holder as input
> * Remove the null setting completely from the function parameters.
> This approach has a drawback in that if the function receives a null value, 
> it will throw an error and halt execution.  My preference would be to allow 
> null handling, but I've not figured out how to make that happen.
> Note:  This behavior ONLY occurs when using a ComplexWriter as output.  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905514#comment-16905514
 ] 

Charles Givre commented on DRILL-7345:
--

So, let's say we have a UDF called foo(x) which returns a list.  Let's say our 
data looks like this:  [2,4,5,null,8]

Are you saying that for that to work, I'd have to create an additional UDF with 
nullable input that returns an empty list or something like that?  Would that 
return an additional column or would the result be in the same column?
Thanks,
-- C






[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905511#comment-16905511
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


{quote}
If I were to create an additional UDF which perhaps accepts a NullableVarChar 
as an input parameter, and returns null, wouldn't that cause Drill to either 
add extra columns or otherwise cause problems?
{quote}
You can create an additional UDF that accepts NullableVarChar as an input 
parameter, and Drill will choose which of the two implementations to use. Many 
built-in UDFs that use internal null handling take this approach.
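
As a rough illustration of the idea of choosing between two implementations, the sketch below registers a required-input and a nullable-input variant under one function and picks one by the input's data mode. It is a plain-Java model, not Drill's actual function resolver, which is more involved. The names `UdfResolutionSketch`, `DataMode`, and `resolve`, and the single "browser" field, are assumptions made for this example.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model only: this is not Drill's resolver and all names are hypothetical.
final class UdfResolutionSketch {

    // Simplified stand-in for the data mode of the input column.
    enum DataMode { REQUIRED, OPTIONAL }

    // Two implementations of the same logical function, keyed by input data mode.
    private static final Map<DataMode, Function<String, Map<String, String>>> IMPLS =
            new EnumMap<>(DataMode.class);

    static {
        // Required variant: assumes the input is never null.
        IMPLS.put(DataMode.REQUIRED,
                s -> Collections.singletonMap("browser", s));
        // Nullable variant: handles null internally by returning an empty map.
        IMPLS.put(DataMode.OPTIONAL,
                s -> s == null
                        ? Collections.<String, String>emptyMap()
                        : Collections.singletonMap("browser", s));
    }

    // Picks the implementation that matches the input column's data mode.
    static Function<String, Map<String, String>> resolve(DataMode inputMode) {
        return IMPLS.get(inputMode);
    }

    public static void main(String[] args) {
        System.out.println(resolve(DataMode.REQUIRED).apply("Firefox"));
        System.out.println(resolve(DataMode.OPTIONAL).apply(null));
    }
}
```

The caller always sees one function. Only the variant selected behind it differs, depending on whether the input can be null.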



[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905462#comment-16905462
 ] 

Charles Givre commented on DRILL-7345:
--

Hi Volodymyr, 
I've been working on a bunch of UDFs, but let's take a simple one as an example: 
`parse_user_agent`.  This function takes a user agent string as its argument and 
returns a map of the various fields, such as browser name, version, OS, etc. 
The issue arises when there are blank or null rows in the data.  If that 
happens, the function errors out. I would prefer to include null handling so 
that if the function encounters an empty row, it simply returns an empty list 
or map, but right now that doesn't seem feasible.  Here is the UDF: 
https://github.com/apache/drill/pull/1840/files 
I have a few others as well, but this one is basically done.

If you add any null handling instruction to the function header (either 
NULL_IF_NULL or INTERNAL), the function will not be recognized.  If you set the 
input parameter to NullableVarChar, you get an error about Drill not finding the 
function.
If I were to create an additional UDF which perhaps accepts a NullableVarChar 
as an input parameter, and returns null, wouldn't that cause Drill to either 
add extra columns or otherwise cause problems?
-- C








[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905443#comment-16905443
 ] 

Volodymyr Vysotskyi commented on DRILL-7345:


Thanks, [~IhorHuzenko], for pointing to the Javadoc and identifying the Jira 
ticket where this limitation was added.

[~cgivre], as pointed out in DRILL-6810, Drill currently does not support NULL 
values for lists and maps, so it would be incorrect to allow NULL_IF_NULL 
NullHandling for functions with a ComplexWriter.
But you can create two UDF implementations, one that accepts nullable values 
and one that accepts required values, and then handle nulls inside the UDF in 
whatever way you choose: set default values, return empty lists, etc.

In the Jira description, you wrote:
{quote}
if the input to the UDF is nullable, Drill doesn't recognize the UDF at all
{quote}

Do you mean that this UDF wasn't used for values with the required data mode? 
Otherwise, it may be a different issue, and we would need to see the UDF 
implementation to find the root cause.



[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Charles Givre (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905389#comment-16905389
 ] 

Charles Givre commented on DRILL-7345:
--

Hi [~IhorHuzenko], 
I took a look at the Javadoc. I think that is new behavior, and an annoying 
one.  

In practice, this means that if you have a UDF that returns a complex 
field, you really have no way to deal with empty or null rows.  Is there an 
approach you could suggest for dealing with this situation?



[jira] [Commented] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output

2019-08-12 Thread Igor Guzenko (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905362#comment-16905362
 ] 

Igor Guzenko commented on DRILL-7345:
-

Hi [~cgivre], could you please check that the issue is not caused by the changes 
that were added as part of DRILL-6810? The Javadoc 
[here|https://github.com/apache/drill/blob/85c77134d5d1bb9f96a5417036cccfb263ae8ae7/exec/java-exec/src/main/java/org/apache/drill/exec/expr/annotations/FunctionTemplate.java#L150]
describes some limitations related to ComplexWriter output. 
