(shifting to public list)


The plan is to support implicit type conversions as desired. Here's the last 
discussion on this topic I could find. It seems from the initial mail from 
Zheng and the issue you described that this has not been fixed yet ..



[hive-devel] A question about implicit type conversions
Ashish Thusoo athusoo at facebook.com 
<mailto:hive-devel%40lists.facebook.com?Subject=Re%3A%20%5Bhive-devel%5D%20A%20question%20about%20implicit%20type%20conversions&In-Reply-To=%3C68B7689C98024D43B4C2709456F0B520079D68F5AA%40SC-MBXC1.TheFacebook.com%3E>
Tue Oct 14 10:39:32 PDT 2008
________________________________

+1 on that. I thought the double->string rule was a stop gap solution because 
we are only passing strings between map/reduce and all the partial aggregate 
functions generate strings for that reason. And the final aggregates take 
string as input. For that reason we needed double->string conversion. Correct 
me if this is not the case...



Otherwise I don't see a case where a user would want to implicitly convert from 
double to string.



What it means though is that we should support implicit conversions for



string->x where x is a number type.



That would take care of lossy transformations... no?



Ashish



________________________________

From: hive-devel-bounces at 
lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel> 
[mailto:hive-devel-bounces at 
lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel>] On 
Behalf Of Joydeep Sen Sarma

Sent: Tuesday, October 14, 2008 12:17 AM

To: Zheng Shao; hive

Subject: Re: [hive-devel] A question about implicit type conversions



Dunno. So I guess the number type hierarchy is pretty clear.



>From whatever I have seen so far - when a user compares a string to a byte - 
>they are likely implying that the string encodes a byte. If the string does 
>not encode a byte - then the comparison is problematic no matter even if the 
>byte is converted to a string ..



Mysql seems even more brute force about this:



http://mysql2.mirrors-r-us.net/doc/refman/5.0/en/type-conversion.html



but the general direction seems to upconvert strings to numbers ..



________________________________

From: Zheng Shao

Sent: Tuesday, October 14, 2008 12:05 AM

To: Joydeep Sen Sarma; hive

Subject: RE: A question about implicit type conversions



Yes, my option B looks very similar to this (special logic for comparison).



The only difference is that my current partial rules are:

byte -> short -> int -> long -> float -> double -> string



I agree I should disallow long -> float conversions because that might cause 
precision problems.

byte -> short -> int -> long -> double -> string

int -> float -> double



But I still feel I should put string at the end instead of the beginning, 
because string -> byte is lossy. "1.1" = CAST(1 AS BYTE) should evaluate to 
false.

Makes sense?



Zheng

From: Joydeep Sen Sarma

Sent: Monday, October 13, 2008 11:52 PM

To: Zheng Shao; hive

Subject: RE: A question about implicit type conversions



Why do we allow int->string implicit conversion (at least for comparison)?



there seems to be a partial order in terms of promotion rules for the types 
listed below. Would it be fair to say that:

-          we need to implement eva(x,x) for all x

-          for all eval (x,y) - figure out a common type (z) to promote to so 
we can convert to form eval(z,z)



the partial order in terms of promotion rules seems to be



string -> byte -> short -> int -> long -> double

                                     int -> float -> double



(combine the int nodes together)



eval(int, double) should find double as the common type that's the minimum 
distance away. Is this option B?



________________________________

From: hive-devel-bounces at 
lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel> 
[mailto:hive-devel-bounces at 
lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel>] On 
Behalf Of Zheng Shao

Sent: Monday, October 13, 2008 9:13 PM

To: hive

Subject: [hive-devel] A question about implicit type conversions



Hi,



I've hit a problem with type conversion that I didn't think of when first 
implement it.



REQUIREMENT: The thing is for comparison operators, we want:



1.       evaluate(int, int) to use java int comparison



2.       evaluate(int, double) to convert int to double



3.       evaluate(string, double) to convert string to double



4.       evaluate(string, string) to do java string comparison



The implicit conversions that we allow are (the implicitConvertable function):



1.       byte -> short -> int -> long -> float -> double -> string



2.       string -> double



The rule of finding the matching method is to choose the one that does the 
least number of implicit conversions. If there are multiple with the same least 
number of implicit conversions, reports ambiguity.





The methods implemented in comparison operators are:



1.       evaluate(int, int)



2.       [evaluate(x, x) for x in byte short long float]



3.       evaluate(double, double)



4.       evaluate(string, string)



5.       evaluate(double, string)



6.       evaluate(string, double)

4 and 5 are used to make sure REQUIREMENT 3 is supported (otherwise there will 
be ambiguity). However, I didn't notice that 4 and 5 created ambiguity for 
evaluate(double, int). Now with 1 implicit type conversion, we can map that to 
either evaluate(double, double), and evaluate(double, string).





The potential solutions are:



A.      Implement all situations and don't rely on implicit type conversion. 
However given the 6 numeric types and 1 string, we will need 7*6/2 = 21 
methods. That does not seem to look good (although it is definitely doable, and 
we only need to do that in a base class and all comparison functions can derive 
from it)



B.      Add a special logic for comparison functions. For comparison functions, 
we call "getCommonClass" to get the "best" common class that both of the 
arguments can be converted to. We are already doing this for join conditions.



C.      Let the implicitConvertable method return a "distance" or "cost" 
instead of just a Boolean. Then we can tweak the "cost" to make sure what we 
want is achieve.



Thoughts?



BTW, as a result of this, the user will see the ambiguity right now if they do 
a comparison with a double and an int. (They can use CAST to avoid that 
problem, so it's not a blocker). But it's still a pretty fundamental question 
we need to answer.



Zheng











-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ding Zhou
Sent: Saturday, November 15, 2008 5:59 PM
To: Ding Zhou; [EMAIL PROTECTED]
Subject: Re: [hive-devel] Implicit type conversion in Hive2



As a follow up, the query went through with str->long conversion:



SELECT source, count(1)

WHERE a.ds='2008-11-05' AND

a.app_id <> '0'



Thanks,

-Ding.





On 11/15/08 5:54 PM, "Ding Zhou" <[EMAIL PROTECTED]> wrote:



> This is the related field:

>

> app_id    bigint    'from deserializer'

> key    bigint    'from deserializer'

>

>

> This is the query:

>

> DROP TABLE dzhou_exp;

> CREATE TABLE dzhou_exp (source string, counts int);

> FROM mf_user_actions a

> INSERT OVERWRITE TABLE dzhou_exp

> SELECT source, count(1)

> WHERE a.ds='2008-11-05' AND

> a.app_id <> 0

> GROUP BY source;

>

> This is the error:

>

> 08/11/15 17:45:36 INFO parse.SemanticAnalyzer: Completed getting MetaData in

> Semantic Analysis

> FAILED: Error in semantic analysis: line 1:120 Operator Argument Type

> Mismatch 0: Looking for Operator "<>" with parameters [class java.lang.Long,

> class java.lang.Integer]

> Setting total progress to FAILED - line 1:120 Operator Argument Type

> Mismatch 0: Looking for Operator <> with parameters [class java.lang.Long,

> class java.lang.Integer]

> 08/11/15 17:45:37 ERROR ql.Driver: FAILED: Error in semantic analysis: line

> 1:120 Operator Argument Type Mismatch 0: Looking for Operator "<>" with

> parameters [class java.lang.Long, class java.lang.Integer]

> org.apache.hadoop.hive.ql.parse.SemanticException: line 1:120 Operator

> Argument Type Mismatch 0: Looking for Operator "<>" with parameters [class

> java.lang.Long, class java.lang.Integer]

> Setting total progress to FAILED - line 1:120 Operator Argument Type

> Mismatch 0: Looking for Operator <> with parameters [class java.lang.Long,

> class java.lang.Integer]

>

>

> Shall we suport implicit conversion among int/bigint/smallint?

>

> Thanks,

> -Ding.

>

> _______________________________________________

> [EMAIL PROTECTED]

> http://lists.facebook.com/mailman/listinfo/hive-devel



_______________________________________________

[EMAIL PROTECTED]

http://lists.facebook.com/mailman/listinfo/hive-devel

Reply via email to