[ 
https://issues.apache.org/jira/browse/IMPALA-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541808#comment-16541808
 ] 

Tim Armstrong commented on IMPALA-7278:
---------------------------------------

[~shabnam] that UDF code is buggy regardless of what release you're running 
against. You may have gotten lucky for some reason in 2.3 but the code was 
still buggy. So yes, you need to fix the UDFs, Impala's runtime can't determine 
that you didn't *actually* mean to free the memory that your code freed.

> distinct clause is not working as expected with custom UDFs
> -----------------------------------------------------------
>
>                 Key: IMPALA-7278
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7278
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0
>            Reporter: shabnam perween
>            Priority: Critical
>
> Distinct clause when executed with custom UDF returns unexpected results.
> Custom UDF Definition:
> udf.h file:
> {code}
> #ifndef IMPALA_UDF_SAMPLE_UDF_H
> #define IMPALA_UDF_SAMPLE_UDF_H
> #include "udf.h"
> using namespace impala_udf;
> #ifdef __cplusplus
> extern "C"
> {
> #endif
> StringVal udf_clear(FunctionContext* context, StringVal& sInput);
> #ifdef __cplusplus
> }
> #endif
> #endif
> {code}
> udf.cpp:
> {code}
> #include "clear.h"
> StringVal udf_clear(
>  FunctionContext* context,
>  StringVal& sInput /* String to encrypt */
>  )
> {
>  unsigned char* pReturnData = context->Allocate( 100 );
>  memset( pReturnData, NULL, 100);
>  memcpy(pReturnData, sInput.ptr, sInput.len );
>  StringVal sResult( pReturnData );
>  sResult.len = sInput.len;
>  context->Free( (uint8_t*)pReturnData );
>  return sResult;
> }
> {code}
> CMakeLists.txt:
> {code}
> project (clear)
>  ADD_LIBRARY (clear2.8_RHEL SHARED clear.cpp )
>  TARGET_LINK_LIBRARIES (clear2.8_RHEL libImpalaUdf.a )
>  SET_TARGET_PROPERTIES (clear2.8_RHEL PROPERTIES SUFFIX ".so")
>  SET_TARGET_PROPERTIES (clear2.8_RHEL PROPERTIES PREFIX "")
>  INSTALL ( TARGETS clear2.8_RHEL DESTINATION . )
> Query Syntax:
> CREATE TABLE clear (c1 STRING, c2 STRING) row format delimited fields 
> terminated by ',' stored as textfile;
> LOAD DATA INPATH '/user/clear.csv' OVERWRITE INTO TABLE clear;
> Query: describe clear
> +------+--------+---------+
> | name | type | comment |
> +------+--------+---------+
> | c1 | string | |
> | c2 | string | |
> +------+--------+---------+
> Fetched 2 row(s) in 0.04s
> select * from clear;
> +---------+---------+
> | c1 | c2 |
> +---------+---------+
> | 1111111 | 1111111 |
> | 1111111 | 1111111 |
> | 222222 | 222222 |
> | 444444 | 444444 |
> | 222222 | 222222 |
> | 3333333 | 3333333 |
> | 3333333 | 3333333 |
> +---------+---------+
> Fetched 7 row(s) in 0.14s
> select distinct udf_clear(c1),c2 from clear;
> +-----------------------+---------+
> | default.udf_clear(c1) | c2 |
> +-----------------------+---------+
> | {color:#d04437}*222222* {color}| 444444 |   <== this should be *444444* 
> | 222222 | 222222 |
> | 3333333 | 3333333 |
> | 1111111 | 1111111 |
> +-----------------------+---------+
> Fetched 4 row(s) in 0.24s
> {code}
>  
> Expected result:
> {code}
> select distinct c1,c2 from clear;
> +---------+---------+
> | c1 | c2 |
> +---------+---------+
> | 444444 | 444444 |
> | 222222 | 222222 |
> | 3333333 | 3333333 |
> | 1111111 | 1111111 |
> +---------+---------+
> Fetched 4 row(s) in 0.25s
>  {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to