Wrong result in select with multiple identical UDF call

François Méthot Thu, 14 Apr 2016 10:21:18 -0700

I was able to reproduce this on 1.5 running on a cluster
and on 1.6 in embedded mode.


Within a single select, if I select the same udf(value) multiple times,
different results may be output for each column.

ex:
select name, ilike(name, 'jack'), ilike(name, 'jack'), ilike(name, 'jack'),
ilike(name, 'jack'), ilike(name, 'jack') from hdfs.`/data/` where
ilike(name, 'jack');

I get

jack | false | true | false
jack | true | true | true
jack | true | true | false
.....
most of them are jack | true | true | true
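
To count how many rows are affected, a small script along these lines could
scan the output. This is only a sketch: it assumes the result was saved as
pipe-delimited text like the listing above, and the function name
inconsistent_rows is made up for illustration.

```python
def inconsistent_rows(lines):
    """Return rows whose repeated UDF columns do not all agree."""
    bad = []
    for line in lines:
        cells = [c.strip() for c in line.split("|")]
        # Skip blank lines and anything without at least one result column.
        if len(cells) < 2 or not cells[0]:
            continue
        results = cells[1:]
        # Only consider rows made up purely of true/false values.
        if any(r not in ("true", "false") for r in results):
            continue
        # Identical calls on the same value should give one distinct result.
        if len(set(results)) > 1:
            bad.append(line.strip())
    return bad


# Assumed sample input, copied from the three rows shown above.
sample = [
    "jack | false | true | false",
    "jack | true | true | true",
    "jack | true | true | false",
]
print(inconsistent_rows(sample))
```

Any row it returns is one where identical calls on the same value disagreed.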

I observed this on Parquet files as well as CSV files. I restart Drill,
run the query, and it happens. Sometimes it does not!



If I do
select count(1) from hdfs.`/data/` where ilike(name, 'jack') = true;
or
select count(1) from hdfs.`/data/` where ilike(name, 'jack') = true and
like(name, 'jack') = true and like(name, 'jack') = true and like(name,
'jack') = true;

The count is always the same, which is good. It looks like the problem is
limited to the select list.

Francois
P.S. I ended up doing these weird tests because I was getting the same
inconsistent results from my own UDF. At some point I started testing the
built-in UDFs in Drill for my own sanity, because I could not see what
could be wrong with my code...
