Hi Young, I must argue that the partition pruning do actually work if I don't use the IN clause. What I wanted to achieve in my original query was to specify a range of partitions in a simple way. The same query can be expressed as
SELECT * FROM mytable WHERE partitionCol >= UDF("2014-03-10") and partitionCol <= UDF("2014-03-11"); This UDF returns an INT (rather than an INT array). Both this UDF and the original one are annotated with @UDFType(deterministic = true) (if that has any impact) . This variant works fine and does partition pruning. Note that I don't have another column as input to my UDF but a static value. Thanks, Petter 2014-03-11 0:16 GMT+01:00 java8964 <java8...@hotmail.com>: > I don't know from syntax point of view, if Hive will allow to do "columnA > IN UDF(columnB)". > > What I do know that even let's say above work, it won't do the partition > pruning. > > The partition pruning in Hive is strict static, any dynamic values > provided to partition column won't enable partition pruning, even though it > is a feature I missed too. > > Yong > > ------------------------------ > Date: Mon, 10 Mar 2014 16:23:01 +0100 > Subject: Using an UDF in the WHERE (IN) clause > From: petter.von.dolw...@gmail.com > To: user@hive.apache.org > > > Hi, > > I'm trying to get the following query to work. The parser don't like it. > Anybody aware of a workaround? > > SELECT * FROM mytable WHERE partitionCol IN my_udf("2014-03-10"); > > partitionCol is my partition column of type INT and I want to achieve > early pruning. I've tried returning an array of INTs from my_udf and also a > plain string in the format (1,2,3). It seems like the parser wont allow me > to put an UDF in this place. > > Any help appreciated. > > Thanks, > Petter >