Re: Partition Pruning using UDF

2018-05-20 Thread Alberto Ramón
Is this behavior specific to Hive on Tez, or does it apply to Hive on
MapReduce too?

On Tez, I think yes: it evaluates functions and sub-queries and then
recompiles to apply partition pruning.
But in CDH, I think everything is compiled only once.


On 16 May 2018 at 07:58, Furcy Pin  wrote:

> Yes, I believe that's what Constant Object Inspectors are used for.
>
> The initialize method returns object inspectors that are used at query
> compilation time to check the type safety and perform potential
> optimizations. If you return a StringConstantObjectInspector containing the
> value that your method is supposed to return, the Hive optimiser will know
> that it can safely perform partition pruning on it.
>
> On Tue, 15 May 2018, 23:32 Alberto Ramón, 
> wrote:
>
>> Yes, I checked: by default all UDFs are deterministic (LINK)
>>
>> I think that I need something like 'eager evaluation' --> evaluate UDFs
>> before building the physical plan (otherwise you can't do partition pruning)
>>
>> On 15 May 2018 at 09:21, Furcy Pin  wrote:
>>
>>> Hi Alberto,
>>>
>>>
>>> If I'm not mistaken, to make sure that this works you need to add the
>>> proper annotations to your UDF code (deterministic, and maybe some others).
>>> You may also need to return a Constant Object Inspector from the initialize
>>> method so that Hive knows that it can perform partition pruning with it.
>>>
>>> On Wed, 9 May 2018, 19:23 Alberto Ramón, 
>>> wrote:
>>>
>>>> Hello
>>>>
>>>> We have a UDF, 'FindPartition', to select the correct partition to read:
>>>> Select * from TB where partitionCol = FindPartition();
>>>>
>>>> How can I avoid a full scan of all partitions?
>>>>
>>>>
>>>> (Set MyPartition=FindPartition();  // Is not valid in Hive)

>>>
>>


Re: Partition Pruning using UDF

2018-05-16 Thread Furcy Pin
Yes, I believe that's what Constant Object Inspectors are used for.

The initialize method returns object inspectors that are used at query
compilation time to check the type safety and perform potential
optimizations. If you return a StringConstantObjectInspector containing the
value that your method is supposed to return, the Hive optimiser will know
that it can safely perform partition pruning on it.
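
To make the reasoning above concrete, here is a toy model in plain Java. It
is only a sketch and does not use any Hive classes: `PruningSketch`, `Expr`
and `partitionsToScan` are hypothetical names invented for illustration. The
assumption it encodes is the one described above: a planner can only prune
partitions when the value on the right-hand side of the predicate is known
before execution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Toy model only: none of these types are Hive classes.
final class PruningSketch {

    /** Right-hand side of a predicate such as partitionCol = <expr>. */
    static final class Expr {
        // Non-null only when the planner knows the value before execution.
        final String constantValue;
        // Always available at run time, whatever the planner knows.
        final Supplier<String> runtimeValue;

        private Expr(String constantValue, Supplier<String> runtimeValue) {
            this.constantValue = constantValue;
            this.runtimeValue = runtimeValue;
        }

        /** An expression whose value is exposed to the planner. */
        static Expr constant(String value) {
            return new Expr(value, () -> value);
        }

        /** An expression the planner cannot see through (an opaque function call). */
        static Expr opaque(Supplier<String> function) {
            return new Expr(null, function);
        }
    }

    /** Returns the partitions that must be scheduled for reading. */
    static List<String> partitionsToScan(List<String> allPartitions, Expr rhs) {
        if (rhs.constantValue == null) {
            // Value unknown while planning: every partition must be read
            // and filtered row by row at run time.
            return new ArrayList<>(allPartitions);
        }
        List<String> kept = new ArrayList<>();
        for (String partition : allPartitions) {
            if (partition.equals(rhs.constantValue)) {
                kept.add(partition);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("2018-05-01", "2018-05-02", "2018-05-03");
        // Opaque function: prints [2018-05-01, 2018-05-02, 2018-05-03]
        System.out.println(partitionsToScan(parts, Expr.opaque(() -> "2018-05-02")));
        // Constant exposed to the planner: prints [2018-05-02]
        System.out.println(partitionsToScan(parts, Expr.constant("2018-05-02")));
    }
}
```

In this model, returning a constant object inspector corresponds to building
the expression with `Expr.constant(...)` instead of `Expr.opaque(...)`.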

On Tue, 15 May 2018, 23:32 Alberto Ramón,  wrote:

> Yes, I checked: by default all UDFs are deterministic (LINK)
>
> I think that I need something like 'eager evaluation' --> evaluate UDFs
> before building the physical plan (otherwise you can't do partition pruning)
>
> On 15 May 2018 at 09:21, Furcy Pin  wrote:
>
>> Hi Alberto,
>>
>>
>> If I'm not mistaken, to make sure that this works you need to add the
>> proper annotations to your UDF code (deterministic, and maybe some others).
>> You may also need to return a Constant Object Inspector from the initialize
>> method so that Hive knows that it can perform partition pruning with it.
>>
>> On Wed, 9 May 2018, 19:23 Alberto Ramón, 
>> wrote:
>>
>>> Hello
>>>
>>> We have a UDF, 'FindPartition', to select the correct partition to read:
>>> Select * from TB where partitionCol = FindPartition();
>>>
>>> How can I avoid a full scan of all partitions?
>>>
>>>
>>> (Set MyPartition=FindPartition();  // Is not valid in Hive)
>>>
>>
>


Re: Partition Pruning using UDF

2018-05-15 Thread Alberto Ramón
Yes, I checked: by default all UDFs are deterministic (LINK)

I think that I need something like 'eager evaluation' --> evaluate UDFs
before building the physical plan (otherwise you can't do partition pruning)

On 15 May 2018 at 09:21, Furcy Pin  wrote:

> Hi Alberto,
>
>
> If I'm not mistaken, to make sure that this works you need to add the
> proper annotations to your UDF code (deterministic, and maybe some others).
> You may also need to return a Constant Object Inspector from the initialize
> method so that Hive knows that it can perform partition pruning with it.
>
> On Wed, 9 May 2018, 19:23 Alberto Ramón, 
> wrote:
>
>> Hello
>>
>> We have a UDF, 'FindPartition', to select the correct partition to read:
>> Select * from TB where partitionCol = FindPartition();
>>
>> How can I avoid a full scan of all partitions?
>>
>>
>> (Set MyPartition=FindPartition();  // Is not valid in Hive)
>>
>


Re: Partition Pruning using UDF

2018-05-15 Thread Furcy Pin
Hi Alberto,


If I'm not mistaken, to make sure that this works you need to add the
proper annotations to your UDF code (deterministic, and maybe some others).
You may also need to return a Constant Object Inspector from the initialize
method so that Hive knows that it can perform partition pruning with it.

On Wed, 9 May 2018, 19:23 Alberto Ramón,  wrote:

> Hello
>
> We have a UDF, 'FindPartition', to select the correct partition to read:
> Select * from TB where partitionCol = FindPartition();
>
> How can I avoid a full scan of all partitions?
>
>
> (Set MyPartition=FindPartition();  // Is not valid in Hive)
>