The first string in concat_ws is used as a separator. Since the third value
in the last concat_ws is null, it skipped that and simply concactenated
second and fourth value by **. Hence you got  abc****8675309

Ritesh


On Fri, Dec 27, 2013 at 12:13 AM, Alexey Zotov <[email protected]>wrote:

> Sorry, but in my previous email there was some misleading due to text
> formatting.
>
> So, there are tests for concat_ws() UDF:
> https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out.
> One of tests (78 line) contains the following query:
>
>> SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>>        concat_ws('**', dest1.c1, NULL, dest1.c3) FROM dest1
>
> and expects the following results:
>
>> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309
>
>
> I'm confused by the result of the last concat_ws. Why it returns abc****
> 8675309 instead of abc8675309? It looks like NULL VALUE has not been
> skipped.
>
> Thanks.
>
>
> On Thu, Dec 26, 2013 at 3:36 PM, Alexey Zotov <[email protected]>wrote:
>
>> Hello Guys,
>>
>> historically concat_ws() UDF was added in the scope of
>> https://issues.apache.org/jira/browse/HIVE-682 ticket. I have a simple
>> question about its implementation. According to above ticket's description
>> it should have behavior like MySQL implementation (
>> http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws).
>> MySQL documentation says: 
>> "CONCAT_WS()<http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat-ws>
>>  does
>> not skip empty strings. However, it does skip any NULL values after the
>> separator argument.". I have performed a simple test:
>>
>>> create table test (col1 string);
>>> select concat_ws(',', '10', NULL, '4', '', 'a') from test limit 1;
>>
>>
>> as a result I have got:
>>
>>> 10,4,,a
>>
>> which looks good for me.
>>
>> But in tests (
>> https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/udf_concat_ws.q.out
>>  :
>> 78 line) I see the following lines:
>>
>>
>>>
>>> POSTHOOK: query: SELECT concat_ws(dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws(',', dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws(NULL, dest1.c1, dest1.c2, dest1.c3),
>>>        concat_ws('**', dest1.c1, *NULL*, dest1.c3) FROM dest1
>>>
>>>
>>> POSTHOOK: type: QUERY
>>> POSTHOOK: Input: default@dest1
>>> #### A masked pattern was here ####
>>> POSTHOOK: Lineage: dest1.c1 SIMPLE []
>>> POSTHOOK: Lineage: dest1.c2 SIMPLE []
>>> POSTHOOK: Lineage: dest1.c3 SIMPLE []
>>>
>>>
>>> xyzabc8675309        abc,xyz,8675309        NULL        abc****8675309
>>
>> which looks like NULLs are not skipped.
>>
>> What have I missed?
>>
>> Thanks, Alexey.
>>
>
>
>
> --
>
> Best regards
>
> Zotov Alexey
> Grid Dynamics
> Skype: azotcsit
>

Reply via email to