[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627638#comment-13627638
 ] 

Hudson commented on HIVE-4271:
--

Integrated in Hive-trunk-hadoop2 #149 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/149/])
HIVE-4271 : Limit precision of decimal type (Gunther Hagleitner via 
Ashutosh Chauhan) (Revision 1466305)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466305
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type
* 
/hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java
* /hive/trunk/data/files/kv8.txt
* /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAbs.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFloor.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMinus.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMultiply.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPlus.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPositive.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRound.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect2.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToDecimal.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java
* /hive/trunk/ql/src/test/queries/clientpositive/decimal_precision.q
* /hive/trunk/ql/src/test/results/clientpositive/decimal_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_precision.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_serde.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_udf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/literal_decimal.q.out
* /hive/trunk/ql/src/test/results/clientpositive/serde_regex.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/BigDecimalWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBigDecimal.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyHiveDecimal.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBigDecimalObjectInspector.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyHiveDecimalObjectInspector.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryBigDecimal.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryHiveDecimal.java
* 

[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628544#comment-13628544
 ] 

Hudson commented on HIVE-4271:
--

Integrated in Hive-trunk-h0.21 #2055 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2055/])
HIVE-4271 : Limit precision of decimal type (Gunther Hagleitner via 
Ashutosh Chauhan) (Revision 1466305)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466305
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type
* 
/hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java
* /hive/trunk/data/files/kv8.txt
* /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAbs.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFloor.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMinus.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMultiply.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPlus.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPositive.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRound.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect2.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToDecimal.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java
* /hive/trunk/ql/src/test/queries/clientpositive/decimal_precision.q
* /hive/trunk/ql/src/test/results/clientpositive/decimal_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_precision.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_serde.q.out
* /hive/trunk/ql/src/test/results/clientpositive/decimal_udf.q.out
* /hive/trunk/ql/src/test/results/clientpositive/literal_decimal.q.out
* /hive/trunk/ql/src/test/results/clientpositive/serde_regex.q.out
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/BigDecimalWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBigDecimal.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyHiveDecimal.java
* /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBigDecimalObjectInspector.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyHiveDecimalObjectInspector.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryBigDecimal.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryHiveDecimal.java
* 

[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Carter Shanklin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626636#comment-13626636
 ] 

Carter Shanklin commented on HIVE-4271:
---

Gunther, one thing to consider is Teradata interoperability: they support up 
to 38 digits rather than 36, and they claim to do this with 16 bytes. See 
http://developer.teradata.com/tools/articles/how-many-digits-in-a-decimal

I believe SQL Server is also 38, but I'm not sure. If we can get 38, that would 
be ideal from a compatibility point of view. If there is a big performance hit 
due to encoding or some other reason, that's a good reason to go with 36 
rather than 38, since there are probably not too many apps using 37 or 38 
digits. There are bound to be some out there somewhere, though.

Last thought: starting with 18 is fine since it's future-proofed from a DDL 
point of view, but there is good upside to being able to make stronger 
compatibility statements.
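
A quick back-of-the-envelope check of the storage math behind these numbers, 
in plain Java (illustrative only; this is not Hive or Teradata code, and the 
class and method names are invented for the sketch): the largest unscaled 
magnitude at precision p is 10^p - 1, and its bit length shows why 38 digits 
still fit in a signed 16-byte (128-bit) representation.

    import java.math.BigInteger;

    // Illustrative sketch: bits needed for the unscaled value of a decimal
    // with a given number of digits. Not Hive code.
    public class DecimalBitWidth {

        // Largest unscaled magnitude with `digits` decimal digits is 10^digits - 1.
        static int bitsForPrecision(int digits) {
            return BigInteger.TEN.pow(digits).subtract(BigInteger.ONE).bitLength();
        }

        public static void main(String[] args) {
            System.out.println("36 digits -> " + bitsForPrecision(36) + " bits"); // 120
            System.out.println("38 digits -> " + bitsForPrecision(38) + " bits"); // 127
            // 127 magnitude bits plus a sign bit fill exactly 128 bits = 16 bytes,
            // which matches the 16-byte DECIMAL(38) claim above.
        }
    }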

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.
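
To make the packing arithmetic in the description concrete, here is a small 
illustrative sketch (not the actual HiveDecimal implementation; the class and 
method names are invented): each base-10^9 limb holds at most 999,999,999, 
which fits in 4 bytes, so four such limbs - the same storage as two longs - 
cover 4 x 9 = 36 decimal digits.

    import java.math.BigInteger;

    // Illustrative sketch only: split a decimal's unscaled value into
    // base-10^9 "limbs", least significant first.
    public class DecimalPackingSketch {
        private static final BigInteger BASE = BigInteger.valueOf(1_000_000_000L);

        static int[] toLimbs(BigInteger unscaled) {
            if (unscaled.abs().compareTo(BASE.pow(4)) >= 0) {
                throw new IllegalArgumentException("more than 36 digits");
            }
            int[] limbs = new int[4];
            BigInteger rest = unscaled.abs();
            for (int i = 0; i < 4; i++) {
                BigInteger[] qr = rest.divideAndRemainder(BASE);
                limbs[i] = qr[1].intValue();   // 0 .. 999,999,999 fits in 4 bytes
                rest = qr[0];
            }
            return limbs;
        }

        public static void main(String[] args) {
            BigInteger v = new BigInteger("123456789012345678901234567890123456"); // 36 digits
            for (int limb : toLimbs(v)) {
                System.out.println(limb);
            }
        }
    }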

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627100#comment-13627100
 ] 

Gunther Hagleitner commented on HIVE-4271:
--

Thanks Carter and Eric for the feedback. I've opened HIVE-4320 in response to 
your comments. I'd like to think through how we're going to do math on these 
numbers, but you make a great point.

I'd still like to move forward with this jira and do the rest in HIVE-4320.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627112#comment-13627112
 ] 

Ashutosh Chauhan commented on HIVE-4271:


The exact precision, and how to implement it in a performant way, we can discuss 
on a separate jira. Let's use this jira to limit the max precision.
+1 to the latest patch.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-05 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624094#comment-13624094
 ] 

Eric Hanson commented on HIVE-4271:
---

I like this proposal. It will make life easier when the time comes to 
implement support for vectorized comparisons and arithmetic 
(https://issues.apache.org/jira/browse/HIVE-4160) for decimal, because the data 
can be stored in an array of LONG values or a pair of LONG values. This will 
enable faster query execution. If there are defaults, please make the default 
be such that the value will fit in 18 digits or less (a single LONG). Then the 
standard integer arithmetic code path can be used for vectorized query 
execution for the common case for decimal. Users should be coached to use 18 
digits or less for decimal unless their app really needs more.
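
A minimal sketch of the execution pattern Eric describes, assuming values of a 
decimal(18,s) column are carried as their unscaled long representation (this is 
not Hive's vectorization code, and the names are invented for illustration): 
column arithmetic then reduces to a tight loop over long[] arrays with no 
BigDecimal allocation.

    // Illustrative sketch: decimal(18,s) columns stored as unscaled longs.
    public class VectorizedDecimalSketch {

        // Adds two columns that share the same scale; the result keeps that scale.
        static void addColumns(long[] a, long[] b, long[] out) {
            for (int i = 0; i < out.length; i++) {
                out[i] = a[i] + b[i];   // plain integer add, no BigDecimal objects
            }
        }

        public static void main(String[] args) {
            // 10^18 - 1 = 999999999999999999 still fits in a signed 64-bit long,
            // so any 18-digit unscaled value can use this path.
            long[] prices    = {1050L, 2375L, 999999999999999999L}; // e.g. scale 2
            long[] discounts = {-50L, -375L, 0L};
            long[] result = new long[3];
            addColumns(prices, discounts, result);
            for (long r : result) {
                System.out.println(r);
            }
        }
    }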

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-05 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624237#comment-13624237
 ] 

Gunther Hagleitner commented on HIVE-4271:
--

[~ehans]: Thanks for the feedback. I like your idea of the single LONG default; 
however, I'm not sure we'll be able to do that easily. Right now there are no 
defaults, just one decimal type with a max precision (of 36) - reducing that 
to 18 would leave a really small range. The plan is to eventually extend to 
decimal(p)/decimal(p,s), but at that point I'm thinking the default decimal 
type will still have to be the max precision (e.g. that'll be the fallback for 
decimal operations where we don't know the exact precision returned). We'll 
see; hopefully we can at least coach folks to use as little precision as they 
need for improved performance.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-05 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624259#comment-13624259
 ] 

Eric Hanson commented on HIVE-4271:
---

Okay, that sounds fine. If we document from the beginning that schema designs 
should use 18 digits or less unless they really need more, that'll help. It 
should be possible to make the 19..36-digit case quite fast too, though there 
will be a penalty.
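
One way the 19..36-digit case could stay on a fixed-width integer path is 
sketched below; this is a guess at the kind of approach Eric has in mind, not 
Hive code, and the class and method names are invented: carry the unscaled 
value as a hi/lo pair of longs and do two-word integer arithmetic.

    // Illustrative sketch: 128-bit add over {hi, lo} pairs of longs,
    // with the lo words treated as unsigned.
    public class Int128AddSketch {

        static long[] add128(long aHi, long aLo, long bHi, long bLo) {
            long lo = aLo + bLo;
            // carry out of the unsigned low-word addition
            long carry = Long.compareUnsigned(lo, aLo) < 0 ? 1L : 0L;
            return new long[] { aHi + bHi + carry, lo };
        }

        public static void main(String[] args) {
            long[] r = add128(0L, -1L, 0L, 1L);    // (2^64 - 1) + 1 carries into hi
            System.out.println(r[0] + " " + r[1]); // prints "1 0"
        }
    }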

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, 
 HIVE-4271.4.patch, HIVE-4271.5.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621089#comment-13621089
 ] 

Ashutosh Chauhan commented on HIVE-4271:


+1 will commit if tests pass.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621662#comment-13621662
 ] 

Ashutosh Chauhan commented on HIVE-4271:


The following two tests failed in CliDriver:
* literal_decimal.q
[junit] Running: diff -a 
/home/ashutosh/hive/build/ql/test/logs/clientpositive/literal_decimal.q.out 
/home/ashutosh/hive/ql/src/test/results/clientpositive/literal_decimal.q.out
[junit] 61c61
[junit] < -10  1  3.14  -3.14  9  9.9  NULL  NULL
[junit] ---
[junit] > -10  1  3.14  -3.14  9  9.9  1E-99  1E+99
Looks like we need to update the golden file, now that we have lost the ability 
to read in values like 1E-99.

* serde_regex.q
The Hadoop job failed. We recently added support for decimals in regex_serde in 
HIVE-3951; this is possibly related to that.
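
Regarding the literal_decimal.q diff above, here is a rough illustration of why 
1E-99 and 1E+99 would now come back as NULL, assuming the new limit simply 
rejects values whose integer digits or scale exceed the cap (a guess at the 
behavior for illustration only, not the actual HiveDecimal code; names are 
invented):

    import java.math.BigDecimal;

    // Illustrative sketch of a precision/scale cap like the one proposed here.
    public class PrecisionLimitSketch {
        static final int MAX_PRECISION = 36; // limit proposed in this jira
        static final int MAX_SCALE = 36;

        static BigDecimal enforce(BigDecimal d) {
            // 1E+99 needs 100 integer digits; 1E-99 needs scale 99 -- both exceed 36
            if (d.precision() - d.scale() > MAX_PRECISION || d.scale() > MAX_SCALE) {
                return null;
            }
            return d;
        }

        public static void main(String[] args) {
            System.out.println(enforce(new BigDecimal("1E-99"))); // null
            System.out.println(enforce(new BigDecimal("1E+99"))); // null
            System.out.println(enforce(new BigDecimal("3.14")));  // 3.14
        }
    }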


 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620556#comment-13620556
 ] 

Ashutosh Chauhan commented on HIVE-4271:


Left some comments on phabricator.

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4271) Limit precision of decimal type

2013-04-01 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618687#comment-13618687
 ] 

Gunther Hagleitner commented on HIVE-4271:
--

Review: https://reviews.facebook.net/D9855

 Limit precision of decimal type
 ---

 Key: HIVE-4271
 URL: https://issues.apache.org/jira/browse/HIVE-4271
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4271.1.patch


 The current decimal implementation does not limit the precision of the 
 numbers. This has a number of drawbacks. A maximum precision would allow us 
 to:
 - Have SerDes/fileformats store decimals more efficiently
 - Speed up processing by implementing operations w/o generating Java 
 BigDecimals
 - Simplify extending the datatype to allow for decimal(p) and decimal(p,s)
 - Write a more efficient BinarySortable SerDe for sorting/grouping/joining
 Exact numeric datatypes are typically used to represent money, so if the limit 
 is high enough it doesn't really become an issue.
 A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 
 longs we can represent 36 digits - which is what I propose as the limit.
 Final thought: It's easier to restrict this now and have the option to do the 
 things above than to try to do so once people start using the datatype.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira