Re: How can I get the constant value from the ObjectInspector in the UDF

2012-09-26 Thread Chen Song
With my limited knowledge of Hive, I don't think it is possible to get the
actual value of the argument, and I don't think Hive is, or should be,
designed to provide that information. *initialize* is intended only for
decoding the meta structure (the type and its associated evaluation mechanism)
of the arguments. Storing specific argument values at runtime is an
anti-pattern in my opinion. Can you elaborate on why you really need
the constant value in your case?

On your 2nd question, you can get the type information from the object
inspector. For example, if you expect the 1st argument to be a string, you
can use the following code snippet.


> // Inside initialize(ObjectInspector[] arguments).
> // Category is ObjectInspector.Category; Constants is
> // org.apache.hadoop.hive.serde.Constants.
> Category category = arguments[0].getCategory();
> String typeName = arguments[0].getTypeName();
> // Compare type names with equals(); == only tests reference identity.
> boolean isString = Constants.STRING_TYPE_NAME.equals(typeName);
> boolean isVoid = Constants.VOID_TYPE_NAME.equals(typeName);
> if (category == Category.PRIMITIVE && (isString || isVoid)) {
>   if (isString) {
>     stringObjectInspector = (StringObjectInspector) arguments[0];
>   }
> } else {
>   throw new UDFArgumentTypeException(0, "The "
>       + GenericUDFUtils.getOrdinal(1) + " argument is expected to be \""
>       + Constants.STRING_TYPE_NAME + "|" + Constants.VOID_TYPE_NAME
>       + "\" but \"" + typeName + "\" is found");
> }
>
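For the numeric check raised in the question, the same idea can be sketched
without the Hive classes. This is only an illustration under assumptions:
NumericTypeCheck and requireNumeric are hypothetical names,
IllegalArgumentException stands in for UDFArgumentTypeException so the snippet
runs without Hive on the classpath, and the set of names is my understanding of
the primitive numeric type names in Hive 0.7. Only the argument position is
reported, because the type name is all this helper receives.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of the Hive API.
final class NumericTypeCheck {
    // Assumed list of primitive numeric type names, as they would be
    // returned by ObjectInspector.getTypeName().
    private static final Set<String> NUMERIC = new HashSet<String>(
        Arrays.asList("tinyint", "smallint", "int", "bigint", "float", "double"));

    static boolean isNumeric(String typeName) {
        return NUMERIC.contains(typeName);
    }

    // position is zero-based: 0 means the first argument.
    static void requireNumeric(int position, String typeName) {
        if (!isNumeric(typeName)) {
            throw new IllegalArgumentException("Argument " + (position + 1)
                + " is expected to be numeric but \"" + typeName + "\" is found");
        }
    }
}
```

In a real UDF the call would sit in initialize(), for example
NumericTypeCheck.requireNumeric(0, arguments[0].getTypeName()).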
Chen


RE: How can I get the constant value from the ObjectInspector in the UDF

2012-09-26 Thread java8964 java8964

I understand your message, but in this situation I want to do the following:

1) I want to get the value 10 in the initialization stage. I understand your
point that the value is only available in the evaluate stage, but keep in mind
that the 10 in my example is a constant value. It won't change between
evaluations. It is the kind of value I should be able to get in the
initialization stage, right? The Hive query analyzer should understand that
this function parameter is in fact a constant, and should be able to provide
it to me during the initialization stage.

2) A further question: can I get more information from the object inspector?
For example, when I write the UDF, I want to make sure the first parameter is
a numeric type. I can get the type and validate against it. But if I want to
raise an error in some case, I want to show the end user the NAME of the
parameter in my error message, instead of just its position.

For example, in the UDF call msum(column_name, 10), if I find that the type of
column_name is NOT numeric, I want the error message I give to the end user to
say that 'column_name' should be a numeric type. But right now I cannot get
this information from the API. The only thing I can get is the category type
information, but I want more.

Is it possible to do that in Hive 0.7.1?

Thanks for your help.

Yong


Re: How can I get the constant value from the ObjectInspector in the UDF

2012-09-26 Thread Chen Song
Hi Yong

The way GenericUDF works is as follows.

*ObjectInspector initialize(ObjectInspector[] arguments)* is called only
once for each GenericUDF instance used in your Hive query. This phase is for
the preparation steps of the UDF, such as syntax checking and type inference.

*Object evaluate(DeferredObject[] arguments)* is called to evaluate against
actual arguments. This should be where the actual calculation happens and
where you can get the real values you talked about.
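
One way to live with this split is to create per-query state lazily on the
first evaluate call, when the constant is finally visible. A minimal sketch,
assuming stand-in types rather than the real Hive classes: LazyMsum and its
initialize(String[]) and evaluate(long, int) signatures are hypothetical and
only mimic the two phases described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a GenericUDF; it does not use the Hive API.
final class LazyMsum {
    private Deque<Long> window;   // allocated on the first evaluate() call
    private int windowSize;
    private long sum;

    // Phase 1: called once; only type names are available here.
    void initialize(String[] argumentTypeNames) {
        if (argumentTypeNames.length != 2) {
            throw new IllegalArgumentException("msum expects exactly 2 arguments");
        }
        window = null;
        sum = 0L;
    }

    // Phase 2: called per row; actual values, including the constant, arrive here.
    long evaluate(long value, int size) {
        if (window == null) {
            if (size <= 0) {
                throw new IllegalArgumentException("window size must be positive");
            }
            windowSize = size;
            window = new ArrayDeque<Long>(size);
        }
        if (window.size() == windowSize) {
            sum -= window.removeFirst();   // drop the oldest value
        }
        window.addLast(value);
        sum += value;
        return sum;
    }
}
```

Whether rows reach a single instance in the order a moving sum needs depends on
the query plan, which this sketch does not address.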

Thanks,
Chen



How can I get the constant value from the ObjectInspector in the UDF

2012-09-25 Thread java8964 java8964

Hi, I am using Cloudera release cdh3u3, which ships Hive 0.7.1.

I am trying to write a Hive UDF to calculate a moving sum. Right now, I am
having trouble getting the constant value passed in during the initialization
stage.

For example, let's assume the function has the following format:

msum(salary, 10) - salary is an int type column

which means the end user wants to calculate the sum over the last 10 rows of
salary.

I kind of know how to implement this UDF, but I have one problem right now.

1) This is not a UDAF, as each row returns one value as the moving sum.
2) I created a UDF class that extends GenericUDF.
3) I can get the column type from the ObjectInspector[] passed to me in the
initialize() method to verify that 'salary' and 10 are both numeric types
(the latter needs to be an integer).
4) But I also want to get the real value of 10, in this case, in the
initialize() stage, so I can create the corresponding data structure based on
the value the end user specified here.
5) I looked through the javadoc of the ObjectInspector class. I know that at
run time the real class of the 2nd parameter is WritableIntObjectInspector. I
can get the type, but how can I get its real value?
6) This is a kind of ConstantsObjectInspector, which should be able to give me
the value, as it already knows the type is int. But how?
7) I don't want to get the value at the evaluate stage. Can I get this value
at the initialize stage?
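
The data structure mentioned in 4) could be a fixed-size ring buffer sized once
from the constant. This is a sketch under assumptions, not Hive code:
MovingSumWindow and its methods are hypothetical names, and values are treated
as long.

```java
// Hypothetical helper; not part of the Hive API.
final class MovingSumWindow {
    private final long[] slots;  // the last `size` values, overwritten in a circle
    private int next;            // index of the slot to overwrite next
    private int count;           // number of slots filled so far
    private long sum;

    MovingSumWindow(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        slots = new long[size];
    }

    // Adds one row's value and returns the sum over the last `size` rows seen.
    long add(long value) {
        if (count == slots.length) {
            sum -= slots[next];   // evict the oldest value
        } else {
            count++;
        }
        slots[next] = value;
        sum += value;
        next = (next + 1) % slots.length;
        return sum;
    }
}
```

For msum(salary, 10) this would be new MovingSumWindow(10), with add() called
once per row.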
Thanks
Yong