(shifting to public list)
The plan is to support implicit type conversions as desired. Here's the last discussion on this topic I could find. It seems from the initial mail from Zheng and the issue you described that this has not been fixed yet .. [hive-devel] A question about implicit type conversions Ashish Thusoo athusoo at facebook.com <mailto:hive-devel%40lists.facebook.com?Subject=Re%3A%20%5Bhive-devel%5D%20A%20question%20about%20implicit%20type%20conversions&In-Reply-To=%3C68B7689C98024D43B4C2709456F0B520079D68F5AA%40SC-MBXC1.TheFacebook.com%3E> Tue Oct 14 10:39:32 PDT 2008 ________________________________ +1 on that. I thought the double->string rule was a stop gap solution because we are only passing strings between map/reduce and all the partial aggregate functions generate strings for that reason. And the final aggregates take string as input. For that reason we needed double->string conversion. Correct me if this is not the case... Otherwise I don't see a case where a user would want to implicitly convert from double to string. What it means though is that we should support implicit conversions for string->x where x is a number type. That would take care of lossy transformations... no? Ashish ________________________________ From: hive-devel-bounces at lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel> [mailto:hive-devel-bounces at lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel>] On Behalf Of Joydeep Sen Sarma Sent: Tuesday, October 14, 2008 12:17 AM To: Zheng Shao; hive Subject: Re: [hive-devel] A question about implicit type conversions Dunno. So I guess the number type hierarchy is pretty clear. >From whatever I have seen so far - when a user compares a string to a byte - >they are likely implying that the string encodes a byte. If the string does >not encode a byte - then the comparison is problematic no matter even if the >byte is converted to a string .. Mysql seems even more brute force about this: http://mysql2.mirrors-r-us.net/doc/refman/5.0/en/type-conversion.html but the general direction seems to upconvert strings to numbers .. ________________________________ From: Zheng Shao Sent: Tuesday, October 14, 2008 12:05 AM To: Joydeep Sen Sarma; hive Subject: RE: A question about implicit type conversions Yes, my option B looks very similar to this (special logic for comparison). The only difference is that my current partial rules are: byte -> short -> int -> long -> float -> double -> string I agree I should disallow long -> float conversions because that might cause precision problems. byte -> short -> int -> long -> double -> string int -> float -> double But I still feel I should put string at the end instead of the beginning, because string -> byte is lossy. "1.1" = CAST(1 AS BYTE) should evaluate to false. Makes sense? Zheng From: Joydeep Sen Sarma Sent: Monday, October 13, 2008 11:52 PM To: Zheng Shao; hive Subject: RE: A question about implicit type conversions Why do we allow int->string implicit conversion (at least for comparison)? there seems to be a partial order in terms of promotion rules for the types listed below. Would it be fair to say that: - we need to implement eva(x,x) for all x - for all eval (x,y) - figure out a common type (z) to promote to so we can convert to form eval(z,z) the partial order in terms of promotion rules seems to be string -> byte -> short -> int -> long -> double int -> float -> double (combine the int nodes together) eval(int, double) should find double as the common type that's the minimum distance away. Is this option B? ________________________________ From: hive-devel-bounces at lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel> [mailto:hive-devel-bounces at lists.facebook.com<http://lists.facebook.com/mailman/listinfo/hive-devel>] On Behalf Of Zheng Shao Sent: Monday, October 13, 2008 9:13 PM To: hive Subject: [hive-devel] A question about implicit type conversions Hi, I've hit a problem with type conversion that I didn't think of when first implement it. REQUIREMENT: The thing is for comparison operators, we want: 1. evaluate(int, int) to use java int comparison 2. evaluate(int, double) to convert int to double 3. evaluate(string, double) to convert string to double 4. evaluate(string, string) to do java string comparison The implicit conversions that we allow are (the implicitConvertable function): 1. byte -> short -> int -> long -> float -> double -> string 2. string -> double The rule of finding the matching method is to choose the one that does the least number of implicit conversions. If there are multiple with the same least number of implicit conversions, reports ambiguity. The methods implemented in comparison operators are: 1. evaluate(int, int) 2. [evaluate(x, x) for x in byte short long float] 3. evaluate(double, double) 4. evaluate(string, string) 5. evaluate(double, string) 6. evaluate(string, double) 4 and 5 are used to make sure REQUIREMENT 3 is supported (otherwise there will be ambiguity). However, I didn't notice that 4 and 5 created ambiguity for evaluate(double, int). Now with 1 implicit type conversion, we can map that to either evaluate(double, double), and evaluate(double, string). The potential solutions are: A. Implement all situations and don't rely on implicit type conversion. However given the 6 numeric types and 1 string, we will need 7*6/2 = 21 methods. That does not seem to look good (although it is definitely doable, and we only need to do that in a base class and all comparison functions can derive from it) B. Add a special logic for comparison functions. For comparison functions, we call "getCommonClass" to get the "best" common class that both of the arguments can be converted to. We are already doing this for join conditions. C. Let the implicitConvertable method return a "distance" or "cost" instead of just a Boolean. Then we can tweak the "cost" to make sure what we want is achieve. Thoughts? BTW, as a result of this, the user will see the ambiguity right now if they do a comparison with a double and an int. (They can use CAST to avoid that problem, so it's not a blocker). But it's still a pretty fundamental question we need to answer. Zheng -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ding Zhou Sent: Saturday, November 15, 2008 5:59 PM To: Ding Zhou; [EMAIL PROTECTED] Subject: Re: [hive-devel] Implicit type conversion in Hive2 As a follow up, the query went through with str->long conversion: SELECT source, count(1) WHERE a.ds='2008-11-05' AND a.app_id <> '0' Thanks, -Ding. On 11/15/08 5:54 PM, "Ding Zhou" <[EMAIL PROTECTED]> wrote: > This is the related field: > > app_id bigint 'from deserializer' > key bigint 'from deserializer' > > > This is the query: > > DROP TABLE dzhou_exp; > CREATE TABLE dzhou_exp (source string, counts int); > FROM mf_user_actions a > INSERT OVERWRITE TABLE dzhou_exp > SELECT source, count(1) > WHERE a.ds='2008-11-05' AND > a.app_id <> 0 > GROUP BY source; > > This is the error: > > 08/11/15 17:45:36 INFO parse.SemanticAnalyzer: Completed getting MetaData in > Semantic Analysis > FAILED: Error in semantic analysis: line 1:120 Operator Argument Type > Mismatch 0: Looking for Operator "<>" with parameters [class java.lang.Long, > class java.lang.Integer] > Setting total progress to FAILED - line 1:120 Operator Argument Type > Mismatch 0: Looking for Operator <> with parameters [class java.lang.Long, > class java.lang.Integer] > 08/11/15 17:45:37 ERROR ql.Driver: FAILED: Error in semantic analysis: line > 1:120 Operator Argument Type Mismatch 0: Looking for Operator "<>" with > parameters [class java.lang.Long, class java.lang.Integer] > org.apache.hadoop.hive.ql.parse.SemanticException: line 1:120 Operator > Argument Type Mismatch 0: Looking for Operator "<>" with parameters [class > java.lang.Long, class java.lang.Integer] > Setting total progress to FAILED - line 1:120 Operator Argument Type > Mismatch 0: Looking for Operator <> with parameters [class java.lang.Long, > class java.lang.Integer] > > > Shall we suport implicit conversion among int/bigint/smallint? > > Thanks, > -Ding. > > _______________________________________________ > [EMAIL PROTECTED] > http://lists.facebook.com/mailman/listinfo/hive-devel _______________________________________________ [EMAIL PROTECTED] http://lists.facebook.com/mailman/listinfo/hive-devel