Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-07 Thread Lefty Leverenz
Sounds like the wikidoc needs some work.  I'm open to suggestions.  If
Sanjay's simple UDF helps, I could put it in the wiki along with any advice
you think would help.

Does anyone else have use cases to contribute?

-- Lefty


On Mon, Aug 5, 2013 at 2:45 PM, Sanjay Subramanian 
sanjay.subraman...@wizecommerce.com wrote:

  Hi Ritesh,

  To help you get started, I am writing a simple HelloWorld-ish UDF that
  might help. If it doesn't, please ask for more clarification.

  Good Luck
  Thanks

  sanjay


 
 ToUpperCase.java

   package com.sanjaysubramanian.utils.hive.udf;

   import org.apache.commons.logging.Log;
   import org.apache.commons.logging.LogFactory;
   import org.apache.hadoop.hive.ql.exec.UDF;

   public final class ToUpperCase extends UDF {

     protected final Log logger = LogFactory.getLog(ToUpperCase.class);

     // Hive calls evaluate() once per row; a NULL input is returned as NULL.
     public String evaluate(final String inputString) {
       if (inputString != null) {
         return inputString.toUpperCase();
       } else {
         return inputString;
       }
     }
   }

 

 Usage in a Hive script

 hive -e "
 create temporary function toupper as
 'com.sanjaysubramanian.utils.hive.udf.ToUpperCase';
 SELECT
   first_name,
   toupper(first_name)
 FROM
   company_names;
 "
 




Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-07 Thread Ritesh Agrawal
Hi Sanjay, Lefty

Thanks for the help, but none of the above responses directly answers my
question (probably I am not asking clearly enough :-( ).

Below are two different structures for a UDAF (aggregation function). My
question is: which one is the preferred/right approach?

http://pastebin.com/QCgd4Hxc : This version is based on what I could
understand from the API docs about the UDAF class.

http://pastebin.com/Uctamtek : This version is based on the book Hadoop: The
Definitive Guide. Notice that the function names are different from the
first one.

I hope this clarifies my question.

Thanks
Ritesh








Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-07 Thread Sanjay Subramanian
Please follow the guidance for UDFs provided in the Programming Hive book by
Wampler/Capriolo. That will work for you.

I can say with confidence that their book was mighty helpful to me in my
project, from start to production.

I would recommend picking one approach, implementing it, and then
fine-tuning; otherwise you will end up in analysis-paralysis mode.

We are all on a path of discovery here.

Regards

sanjay


Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-05 Thread Ritesh Agrawal
Hi Lefty,

I used the wiki you sent to write my first version of the UDAF. However, I
found it utterly complex, especially for storing partial results, as I am
not very familiar with the Hive API. Then I found another UDAF example in
the book Hadoop: The Definitive Guide; its code was much simpler but used
different methods. Instead of iterate it used an evaluate method, so I am
getting confused.

Ritesh





Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-04 Thread Ritesh Agrawal
Hi all,

I am trying to write a UDAF. I found an example that shows how to implement
one in the book Hadoop: The Definitive Guide. However, I am a little
confused. In the book, the author extends the UDAF class and implements the
init, iterate, terminatePartial, merge and terminate functions. However,
looking at the Hive docs
(http://hive.apache.org/docs/r0.11.0/api/org/apache/hadoop/hive/ql/exec/UDAF.html),
it seems I need to implement the init, aggregate, evaluatePartial,
aggregatePartial and evaluate functions. Please let me know which are the
right functions to implement.

Ritesh
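
For readers following the thread, the method set attributed to the book above
(init, iterate, terminatePartial, merge, terminate) can be sketched without
Hive on the classpath. This is only a minimal illustration, not the book's
code or either pastebin: the class names MaxIntEvaluatorSketch and
MaxIntEvaluatorDemo are hypothetical, and in an actual UDAF the evaluator
would normally be a nested class implementing
org.apache.hadoop.hive.ql.exec.UDAFEvaluator inside a class that extends UDAF.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Dependency-free stand-in for a Hive UDAF evaluator (hypothetical class).
 * It computes the maximum of a column of integers and mirrors the five
 * lifecycle methods named in the question.
 */
class MaxIntEvaluatorSketch {
    private Integer result; // null means "no non-null input seen yet"

    // Reset state so the instance can be reused for another group.
    public void init() {
        result = null;
    }

    // Called once per input row; null inputs are skipped.
    public boolean iterate(Integer value) {
        if (value != null) {
            result = (result == null) ? value : Math.max(result, value);
        }
        return true;
    }

    // Hand off the partial aggregate, e.g. at the end of a map task.
    public Integer terminatePartial() {
        return result;
    }

    // Fold another task's partial aggregate into this one.
    public boolean merge(Integer other) {
        return iterate(other);
    }

    // Produce the final value for the group.
    public Integer terminate() {
        return result;
    }
}

class MaxIntEvaluatorDemo {
    // Runs one evaluator over a slice of rows and returns its partial result.
    static Integer partialMax(List<Integer> rows) {
        MaxIntEvaluatorSketch e = new MaxIntEvaluatorSketch();
        e.init();
        for (Integer row : rows) {
            e.iterate(row);
        }
        return e.terminatePartial();
    }

    // Merges partial results the way a reduce-side evaluator would.
    static Integer mergedMax(List<Integer> partials) {
        MaxIntEvaluatorSketch e = new MaxIntEvaluatorSketch();
        e.init();
        for (Integer p : partials) {
            e.merge(p);
        }
        return e.terminate();
    }

    public static void main(String[] args) {
        Integer p1 = partialMax(Arrays.asList(3, null, 7)); // first slice
        Integer p2 = partialMax(Arrays.asList(5, 2));       // second slice
        System.out.println(mergedMax(Arrays.asList(p1, p2))); // prints 7
    }
}
```

The split into partialMax and mergedMax is there to show why terminatePartial
and merge exist at all: each slice of rows is aggregated independently, and
only the partial results are combined at the end.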

Re: Hive UDAF extending UDAF class: iterate or evaluate method

2013-08-04 Thread Lefty Leverenz
You might find this wikidoc useful: GenericUDAFCaseStudy
(https://cwiki.apache.org/confluence/display/Hive/GenericUDAFCaseStudy).

The O'Reilly book Programming Hive also has a section called
User-Defined Aggregate Functions in chapter 13 (Functions), pages 172 to
176.

-- Lefty

