Re: Using an UDF in the WHERE (IN) clause

Petter von Dolwitz (Hem) Tue, 11 Mar 2014 00:34:33 -0700

Hi Young,

I must argue that the partition pruning do actually work if I don't use the
IN clause. What I wanted to achieve in my original query was to specify a
range of partitions in a simple way. The same query can be expressed as


SELECT * FROM mytable WHERE partitionCol >= UDF("2014-03-10") and
partitionCol <= UDF("2014-03-11");

This UDF returns an INT (rather than an INT array). Both this UDF and the
original one are annotated with @UDFType(deterministic = true) (if that has
any impact) . This variant works fine and does partition pruning. Note that
I don't have another column as input to my UDF but a static value.

Thanks,
Petter




2014-03-11 0:16 GMT+01:00 java8964 <java8...@hotmail.com>:

> I don't know from syntax point of view, if Hive will allow to do "columnA
> IN UDF(columnB)".
>
> What I do know that even let's say above work, it won't do the partition
> pruning.
>
> The partition pruning in Hive is strict static, any dynamic values
> provided to partition column won't enable partition pruning, even though it
> is a feature I missed too.
>
> Yong
>
> ------------------------------
> Date: Mon, 10 Mar 2014 16:23:01 +0100
> Subject: Using an UDF in the WHERE (IN) clause
> From: petter.von.dolw...@gmail.com
> To: user@hive.apache.org
>
>
> Hi,
>
> I'm trying to get the following query to work. The parser don't like it.
> Anybody aware of a workaround?
>
> SELECT * FROM mytable WHERE partitionCol IN my_udf("2014-03-10");
>
> partitionCol is my partition column of type INT and I want to achieve
> early pruning. I've tried returning an array of INTs from my_udf and also a
> plain string in the format (1,2,3). It seems like the parser wont allow me
> to put an UDF in this place.
>
> Any help appreciated.
>
> Thanks,
> Petter
>

Re: Using an UDF in the WHERE (IN) clause

Reply via email to