[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118896#comment-14118896 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

Thanks, [~sushanth]. Will you commit this or do you want me to do it? -Eric

> CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7901
>                 URL: https://issues.apache.org/jira/browse/HIVE-7901
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.14.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Eric Hanson
>        Attachments: hive-7901.01.patch
>
> This fails because the embedded metastore can't connect to the database: the command-line -D arguments passed to pig are not passed through to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore:
>
>   pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
>
> The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and the other necessary arguments to the embedded metastore.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
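The failure mode in this issue is generic to JVM configuration handling: `-D` arguments become JVM system properties, but a component only sees them if its configuration object explicitly copies them in. A minimal sketch of the pattern, with hypothetical names (this is not Hive's actual HiveConf or metastore code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Sketch of the failure mode: a config built without consulting JVM system
// properties silently ignores -D overrides passed on the command line.
public class EmbeddedConfigSketch {

    // Hypothetical stand-in for an embedded metastore configuration.
    static Map<String, String> buildConfigIgnoringSystemProps() {
        Map<String, String> conf = new HashMap<>();
        conf.put("javax.jdo.option.ConnectionPassword", "default");
        return conf;
    }

    // The fix pattern: overlay any matching JVM system properties, so a
    // -Djavax.jdo.option.ConnectionPassword=... on the command line wins.
    static Map<String, String> buildConfigWithSystemProps() {
        Map<String, String> conf = buildConfigIgnoringSystemProps();
        Properties sysProps = System.getProperties();
        for (String name : sysProps.stringPropertyNames()) {
            if (conf.containsKey(name)) {
                conf.put(name, sysProps.getProperty(name));
            }
        }
        return conf;
    }
}
```

This illustrates only the mechanism under discussion; the actual patch wires the command-line properties through to the embedded metastore inside Hive.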
[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118909#comment-14118909 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

Okay, thanks.
[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-7901:
------------------------------
    Attachment: hive-7901.01.patch

I modified the original HIVE-6633 patch to put the changes in the right place, under apache/hive. This is a new patch for those changes, based directly on the current hive trunk.
[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-7901:
------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115550#comment-14115550 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

[~sushanth], please have a look and +1/commit if you think it's ready. Thanks!
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114367#comment-14114367 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

Thanks, Sushanth, for tracking down the problem. I'll regenerate the patch and track that on HIVE-7901.

> pig -useHCatalog with embedded metastore fails to pass command line args to metastore
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-6633
>                 URL: https://issues.apache.org/jira/browse/HIVE-6633
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>             Fix For: 0.13.0
>        Attachments: HIVE-6633.01.patch
>
> This fails because the embedded metastore can't connect to the database: the command-line -D arguments passed to pig are not passed through to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore:
>
>   pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
>
> The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and the other necessary arguments to the embedded metastore.
[jira] [Commented] (HIVE-7357) Add vectorized support for BINARY data type
[ https://issues.apache.org/jira/browse/HIVE-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065831#comment-14065831 ]

Eric Hanson commented on HIVE-7357:
-----------------------------------

Hi Matt. This looks good overall. Please see my comments on ReviewBoard.

> Add vectorized support for BINARY data type
> -------------------------------------------
>
>                 Key: HIVE-7357
>                 URL: https://issues.apache.org/jira/browse/HIVE-7357
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>        Attachments: HIVE-7357.1.patch
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057078#comment-14057078 ]

Eric Hanson commented on HIVE-7262:
-----------------------------------

[~mmccline] put a code review at https://reviews.apache.org/r/23186/. Matt, if you could attach this to your JIRAs in the future, that'd be great.

> Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-7262
>                 URL: https://issues.apache.org/jira/browse/HIVE-7262
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>        Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
> In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true. Queries then fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and fail with an exception:
>
>   ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
>
> Jitendra pointed to the routine that returns the VectorizationContext in Vectorize.java as needing to add virtual columns to the map, too.
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057085#comment-14057085 ]

Eric Hanson commented on HIVE-7262:
-----------------------------------

Matt, can you upload your patch to your ReviewBoard page? I didn't see a View Diff button. I see you did include a link above -- sorry I missed that.
[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending
[ https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042330#comment-14042330 ]

Eric Hanson commented on HIVE-7266:
-----------------------------------

Also, I recall a past error that looked similar to this, which I think was related to incorrect column re-use within batches. The code for that was in VectorizationContext.

> Optimized HashTable with vectorized map-joins results in String columns extending
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-7266
>                 URL: https://issues.apache.org/jira/browse/HIVE-7266
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez, Vectorization
>    Affects Versions: 0.14.0
>            Reporter: Gopal V
>            Assignee: Matt McCline
>        Attachments: hive-7266-small-test.tgz
>
> The following query returns different results when both vectorized mapjoin and the new optimized hashtable are enabled.
>
> {code}
> hive> set hive.vectorized.execution.enabled=false;
> hive> select s_suppkey, n_name from supplier, nation where s_nationkey = n_nationkey limit 25;
> ...
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRAN
> 7236869  RUSSIA
> 2276869  INDIA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIA
> 2616869  FRANCE
> {code}
>
> But when vectorization is enabled, the results are
>
> {code}
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRANIA
> 7236869  RUSSIA
> 2276869  INDIAA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIAQUE
> 2616869  FRANCEAQUE
> {code}
>
> It works correctly with vectorization when the new optimized map-join hashtable is disabled:
>
> {code}
> hive> set hive.vectorized.execution.enabled=true;
> hive> set hive.mapjoin.optimized.hashtable=false;
> hive> select s_suppkey, n_name from supplier, nation where s_nationkey = n_nationkey limit 25;
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRAN
> 7236869  RUSSIA
> 2276869  INDIA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIA
> 2616869  FRANCE
> {code}
[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending
[ https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039509#comment-14039509 ]

Eric Hanson commented on HIVE-7266:
-----------------------------------

This looks like it might be related to using setRef() in BytesColumnVector when setVal() should be used. That is something to look into.
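The setRef()/setVal() distinction suspected in the comment above can be illustrated outside Hive. setRef() stores a reference into a caller-owned buffer, so if the caller reuses that buffer for the next (shorter) value, an earlier row silently changes, and its stale length makes the short value appear extended, much like the ROMANIAQUE/FRANCEAQUE output. A toy sketch, not the real BytesColumnVector:

```java
import java.util.Arrays;

// Toy column vector with setRef (alias the caller's buffer) vs
// setVal (copy the bytes), mimicking BytesColumnVector's two APIs.
public class ByteColSketch {
    byte[][] vector = new byte[16][];
    int[] start = new int[16];
    int[] length = new int[16];

    void setRef(int row, byte[] buf, int off, int len) {
        vector[row] = buf;            // alias: unsafe if buf is reused later
        start[row] = off;
        length[row] = len;
    }

    void setVal(int row, byte[] buf, int off, int len) {
        vector[row] = Arrays.copyOfRange(buf, off, off + len); // private copy
        start[row] = 0;
        length[row] = len;
    }

    String get(int row) {
        return new String(vector[row], start[row], length[row]);
    }
}
```

Writing "IRAN" over a reused scratch buffer that previously held "RUSSIA" leaves an aliased row reading six bytes: "IRANIA" — the exact extension pattern reported in this issue.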
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]

Eric Hanson commented on HIVE-7105:
-----------------------------------

I agree with Remus. If you want good performance with vectorization on the reduce side, you'll need to think carefully about how to efficiently create full VectorizedRowBatches. Single-row or small VectorizedRowBatches will not give performance gains. Also, if it is expensive to load rows into the batches on the reduce side, that cost could dominate total runtime.

> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
>                 Key: HIVE-7105
>                 URL: https://issues.apache.org/jira/browse/HIVE-7105
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Rajesh Balamohan
>            Assignee: Jitendra Nath Pandey
>        Attachments: HIVE-7105.1.patch
>
> Currently, ReduceRecordProcessor sends one key/value pair at a time to its operator pipeline. It would be beneficial to send a VectorizedRowBatch to downstream operators.
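The batching point above can be sketched in miniature: a batched operator pays its per-call overhead once per batch of rows instead of once per row, and both paths must produce identical results. This is a toy illustration with hypothetical method names, not Hive's actual Operator API:

```java
// Toy illustration of why full batches matter: the row-at-a-time path makes
// one method call per row, while the batched path touches its input in a
// tight inner loop once per 1024-row batch (Hive's default batch size).
public class BatchSketch {
    static final int BATCH_SIZE = 1024;

    static long addOne(long r) { return r; }   // stand-in per-row operator call

    static long sumPerRow(long[] rows) {
        long total = 0;
        for (long r : rows) {
            total += addOne(r);                // call overhead paid per row
        }
        return total;
    }

    static long sumBatched(long[] rows) {
        long total = 0;
        for (int i = 0; i < rows.length; i += BATCH_SIZE) {
            int end = Math.min(i + BATCH_SIZE, rows.length);
            for (int j = i; j < end; j++) {
                total += rows[j];              // tight loop, amortized overhead
            }
        }
        return total;
    }
}
```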
[jira] [Created] (HIVE-6918) ALTER TABLE using embedded metastore fails with duplicate key violation in 'dbo.SERDES'
Eric Hanson created HIVE-6918:
---------------------------------

             Summary: ALTER TABLE using embedded metastore fails with duplicate key violation in 'dbo.SERDES'
                 Key: HIVE-6918
                 URL: https://issues.apache.org/jira/browse/HIVE-6918
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.11.0
         Environment: hive-0.11.0.1.3.7.0-01272; HDInsight version: 2.1.4.0.661685
            Reporter: Eric Hanson

An HDInsight customer is doing some heavy metadata operations using an embedded metastore, and gets a duplicate-key error on the metastore table 'dbo.SERDES'. They have multiple jobs running ALTER TABLE concurrently (on different tables, I believe) against the same metastore database, with each job using an embedded metastore because they set hive.metastore.uris to the empty string. The script looks like:

set hive.metastore.uris=;
...
CREATE EXTERNAL TABLE IF NOT EXISTS InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 (
  ...
)
PARTITIONED BY (tenant string, d string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ...;
... (several more like this);
ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ADD IF NOT EXISTS PARTITION (tenant='8dddaf7c-2354-47ae-87a7-b781f14f8c11', d='20140414') LOCATION 'wasb://instancespaceb...@advisor27415020383770839.blob.core.windows.net/v0/tenant=8dddaf7c-2354-47ae-87a7-b781f14f8c11/d=20140414/';
... several more like the above (14 ALTER TABLE statements in a row) ...

Then they get this error:

...
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
NestedThrowablesStackTrace:
java.sql.BatchUpdateException: Violation of PRIMARY KEY constraint 'PK_serdes_SERDE_ID'. Cannot insert duplicate key in object 'dbo.SERDES'. The duplicate key value is (209703).
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1160)
at com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:469)
at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:583)
at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:291)
at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:267)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.getValue(RDBMSJoinMapStore.java:656)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.putAll(RDBMSJoinMapStore.java:195)
at org.datanucleus.store.mapped.mapping.MapMapping.postInsert(MapMapping.java:135)
at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:517)
...
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951057#comment-13951057 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

[~thejas] Can you commit this to 0.13 please?
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951169#comment-13951169 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

[~rhbutani] Can you approve this to go into 0.13 please?
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38958
-----------------------------------------------------------


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71328

    Please add a comment to explain why we use the sum of all the counts here to determine the array size.


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71329

    Consider, for readability/encapsulation, having a function to compute the offset, e.g.

        isNull[decimalOffset(index)] = false;

    Please add a comment to explain the offset logic. Does the addition of decimal affect any other offsets? I guess not.


ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71330

    Timestamp is supposed to be represented as a long (number of nanos since the epoch). So why is this using a FilterStringColumnBetween?


ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71331

    Again, why the string and not the long not-between operator?


- Eric Hanson


On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19718/
> -----------------------------------------------------------
> 
> (Updated March 28, 2014, 9:56 p.m.)
> 
> Review request for hive and Eric Hanson.
> 
> Bugs: HIVE-6752
>     https://issues.apache.org/jira/browse/HIVE-6752
> 
> Repository: hive-git
> 
> Description
> -------
> 
> Vectorized Between and IN expressions don't work with decimal, date types.
> 
> Diffs
> -----
> 
>   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
>   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 
>   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
>   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/19718/diff/
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jitendra Pandey
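The offset-helper suggestion in the review above can be sketched as follows. In a flat layout where one isNull array covers all key columns grouped by type, each type's slots begin at the sum of the preceding type counts. Field and method names here are hypothetical, not the actual VectorHashKeyWrapper members:

```java
// Sketch of the flat-layout indexing the review comments suggest factoring
// into helpers: one isNull array holds longs, then doubles, then strings,
// then decimals, so each type's slots start at the sum of earlier counts.
public class KeyWrapperLayoutSketch {
    final int longCount, doubleCount, stringCount, decimalCount;
    final boolean[] isNull;

    KeyWrapperLayoutSketch(int longs, int doubles, int strings, int decimals) {
        longCount = longs;
        doubleCount = doubles;
        stringCount = strings;
        decimalCount = decimals;
        // array sized by the sum of all the counts, per the first comment
        isNull = new boolean[longs + doubles + strings + decimals];
    }

    int longOffset(int i)    { return i; }
    int doubleOffset(int i)  { return longCount + i; }
    int stringOffset(int i)  { return longCount + doubleCount + i; }
    int decimalOffset(int i) { return longCount + doubleCount + stringCount + i; }
}
```

With helpers like these, a caller writes isNull[decimalOffset(index)] = false and adding a new key type only changes the offsets of the types that come after it.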
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951533#comment-13951533 ]

Eric Hanson commented on HIVE-6752:
-----------------------------------

Please see my comments on review board.

> Vectorized Between and IN expressions don't work with decimal, date types.
> --------------------------------------------------------------------------
>
>                 Key: HIVE-6752
>                 URL: https://issues.apache.org/jira/browse/HIVE-6752
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>        Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951552#comment-13951552 ]

Eric Hanson commented on HIVE-6752:
-----------------------------------

+1. Thanks for the response on review board. I agree that it is reasonable to take up the issues raised in separate JIRAs; they are not time-critical at this point.
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951603#comment-13951603 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

Sushanth, thanks for getting this into 0.13!
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-6546:
------------------------------
    Affects Version/s: 0.14.0

> WebHCat job submission for pig with -useHCatalog argument fails on Windows
> --------------------------------------------------------------------------
>
>                 Key: HIVE-6546
>                 URL: https://issues.apache.org/jira/browse/HIVE-6546
>             Project: Hive
>          Issue Type: Bug
>          Components: WebHCat
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
>         Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05. Also on Windows HDP 1.3 one-box configuration.
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>             Fix For: 0.14.0
>        Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch
>
> On a one-box Windows setup, do the following from a PowerShell prompt:
>
> {code}
> cmd /c curl.exe -s `
>   -d user.name=hadoop `
>   -d arg=-useHCatalog `
>   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
>   -d statusdir=/tmp/webhcat.output01 `
>   'http://localhost:50111/templeton/v1/pig' -v
> {code}
>
> The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, templeton.args is set to
>
>   cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;
>
> Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:
>
> {code}
> } else {
>   if (i < args.length - 1) {
>     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
>   }
> }
> {code}
>
> The bug is here:
>
> {code}
> if (prop != null) {
>   if (prop.contains("=")) {
>     // everything good
>   } else {
>     // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain "=", so this
>     // branch runs and appends "=-useHCatalog"
>     if (i < args.length - 1) {
>       prop += "=" + args[++i];
>     }
>   }
>   newArgs.add(prop);
> }
> {code}
>
> One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed.
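The proposed placeholder fix can be sketched by mimicking the quoted merging rule. The method below is a stand-in for, not a call into, Hadoop's GenericOptionsParser; it reproduces only the branch under discussion:

```java
import java.util.ArrayList;
import java.util.List;

// Mimics the quoted preProcessForWindows() rule: a -D argument that lacks
// '=' swallows the following argument as its value. Not the real Hadoop
// code, just the behavior described in the bug report.
public class WindowsArgSketch {
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop = null;
            if (args[i].equals("-D")) {
                if (i < args.length - 1) {
                    prop = args[++i];          // value follows as its own token
                }
            } else if (args[i].startsWith("-D")) {
                prop = args[i];                // -Dname or -Dname=value form
            }
            if (prop != null) {
                if (!prop.contains("=") && i < args.length - 1) {
                    prop += "=" + args[++i];   // the fusing step from the report
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }
}
```

Under this rule, a bare -D__WEBHCAT_TOKEN_FILE_LOCATION__ fuses with the following -useHCatalog, while a placeholder that already carries an = sign passes through untouched, which is why changing TOKEN_FILE_ARG_PLACEHOLDER would avoid the bug.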
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Fix Version/s: (was: 0.13.0) 0.14.0 WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch On a one-box Windows setup, do the following from a PowerShell prompt: cmd /c curl.exe -s ` -d user.name=hadoop ` -d arg=-useHCatalog ` -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; ` -d statusdir=/tmp/webhcat.output01 ` 'http://localhost:50111/templeton/v1/pig' -v The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp; Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434: {code} } else { if (i < args.length - 1) { prop += "=" + args[++i]; // RIGHT HERE!
at iterations i = 37, 38 } } {code} Bug is here: {code} if (prop != null) { if (prop.contains("=")) { // everything good } else { // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain equal, so else branch is run and appends =-useHCatalog if (i < args.length - 1) { prop += "=" + args[++i]; } } newArgs.add(prop); } {code} One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. -- This message was sent by Atlassian JIRA (v6.2#6252)
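The -D handling described above is easy to reproduce in isolation. Below is a minimal sketch, a simplified re-implementation for illustration only (it is not the actual Hadoop GenericOptionsParser source), showing how a -D option that lacks an = sign swallows the following, unrelated argument:

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessForWindowsSketch {
    // Simplified sketch of the Windows -D pre-processing logic discussed above.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("-D")) {
                String prop = args[i];
                if (!prop.contains("=") && i < args.length - 1) {
                    // A -D option with no '=' absorbs the NEXT argument: this is
                    // how -D__WEBHCAT_TOKEN_FILE_LOCATION__ (which has no '=')
                    // swallows the unrelated -useHCatalog flag.
                    prop += "=" + args[++i];
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }

    public static void main(String[] argv) {
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog", "-execute"}));
        // prints [-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog, -execute]
    }
}
```

This also shows why adding an = sign to the TOKEN_FILE_ARG_PLACEHOLDER constant would sidestep the problem: a placeholder that already contains '=' takes the first branch and never consumes the next argument.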
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/#review38752 --- Looks good overall. Only minor comments. ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71027 please remove all trailing whitespace in this file ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71034 add a blank after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/19718/#comment71038 "Couldn't determine common type ..." sounds better ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71053 Change comment. This is not a filter, it is a Boolean-valued expression. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71052 Remove the comment "This is optimized for lookup of the data type of the column." because that doesn't apply here since you're using the standard HashSet.
But it is still pretty good :-) ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71057 formatting: "j=0" should be "j = 0" ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71059 add a blank line before the comment and a space after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71062 remove "This is optimized" ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71061 see formatting comments for DecimalColumnInList - Eric Hanson On March 27, 2014, 7:02 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 27, 2014, 7:02 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types.
Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949765#comment-13949765 ] Eric Hanson commented on HIVE-6752: --- +1 Conditional on addressing my comments in the code review. All of them are minor. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948523#comment-13948523 ] Eric Hanson commented on HIVE-6546: --- I'm not sure I understand what you mean. Can you elaborate? The placeholder is getting substituted or eliminated by the templeton controller job. If I run this simple Pig script from WebHCat: emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; Then I see this in the templeton controller job configuration: templeton.args cmd,/c,call,C:\\apps\\dist\\pig-0.12.0.2.0.7.0-1551/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-execute,emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; And I see this in the Pig job configuration for the job spawned by the templeton controller job: pig.cmd.args -Dmapreduce.job.credentials.binary=/c:/hdfs/nm-local-dir/usercache/ehans/appcache/application_1395867453549_0007/container_1395867453549_0007_01_02/container_tokens -execute emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch Uploading patch yet again to try to kick off pre-commit tests. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945778#comment-13945778 ] Eric Hanson commented on HIVE-6546: --- [~thejas] Can you take a look? WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19218: Vectorization: some date expressions throw exception.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19218/#review37271 --- ql/src/test/results/clientpositive/vectorized_date_funcs.q.out https://reviews.apache.org/r/19218/#comment68694 it'd be good to remove trailing white space - Eric Hanson On March 14, 2014, 9:06 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19218/ --- (Updated March 14, 2014, 9:06 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6649 https://issues.apache.org/jira/browse/HIVE-6649 Repository: hive-git Description --- Query: select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; throws NPE. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java 901005e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringUnaryUDF.java 4875d0d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColCol.java 09f6e47 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColScalar.java 6578907 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddScalarCol.java d1156b6 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColCol.java 15e995c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColScalar.java 05b71ac ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffScalarCol.java 7c76901 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateString.java dd84de3 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldString.java 011a790 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java 8418587 
ql/src/test/queries/clientpositive/vectorized_date_funcs.q 6c9515c ql/src/test/results/clientpositive/vectorized_date_funcs.q.out a9d7dde Diff: https://reviews.apache.org/r/19218/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935621#comment-13935621 ] Eric Hanson commented on HIVE-6649: --- +1 Please see my minor comments on ReviewBoard Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19216: Vectorized variance computation differs from row mode computation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19216/#review37296 --- Ship it! Ship It! - Eric Hanson On March 14, 2014, 8:41 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19216/ --- (Updated March 14, 2014, 8:41 a.m.) Review request for hive, Eric Hanson and Remus Rusanu. Bugs: HIVE-6664 https://issues.apache.org/jira/browse/HIVE-6664 Repository: hive-git Description --- Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. Diffs - ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 507f798 Diff: https://reviews.apache.org/r/19216/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935852#comment-13935852 ] Eric Hanson commented on HIVE-6664: --- +1 Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935860#comment-13935860 ] Eric Hanson commented on HIVE-6664: --- In general, sum/avg/variance aggregate results that involve floating point arithmetic in the sum calculation will return different answers depending on execution order. This is due to the nature of floating point arithmetic, where it is easy to show examples where (a + b) + c != a + (b + c). So it is probably not critical that row-mode and vector mode have results that are compatible to the last decimal place. However, the change here is simple enough and it makes for better compatibility without any serious drawbacks for performance, so I think this is fine. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
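The non-associativity of floating-point addition mentioned in the comment above is easy to demonstrate. A minimal Java illustration (not part of the patch, just an example of (a + b) + c differing from a + (b + c)):

```java
public class FloatingPointAssociativity {
    public static void main(String[] args) {
        // 1.0 is smaller than half an ulp of 1e17, so b + c rounds back to
        // -1e17, while (a + b) cancels exactly before c is added.
        double a = 1e17, b = -1e17, c = 1.0;
        System.out.println((a + b) + c); // 1.0
        System.out.println(a + (b + c)); // 0.0
    }
}
```

The same effect explains why a sum accumulated per-batch in vectorized mode can differ in the last digits from a row-at-a-time sum: the operands are grouped differently.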
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933506#comment-13933506 ] Eric Hanson commented on HIVE-6649: --- Can you put this up on ReviewBoard if you're ready for a review? Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6633: - Assignee: Eric Hanson pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
Eric Hanson created HIVE-6633: - Summary: pig -useHCatalog with embedded metastore fails to pass command line args to metastore Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Fix For: 0.14.0 This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6633: -- Status: Patch Available (was: Open) pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6633.01.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19140: pig -useHCatalog with embedded metastore fails to pass command line args to metastore
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19140/ --- Review request for hive. Bugs: HIVE-6633 https://issues.apache.org/jira/browse/HIVE-6633 Repository: hive-git Description --- see JIRA Diffs - hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatLoader.java a32149c hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/PigHCatUtil.java a01d9e3 Diff: https://reviews.apache.org/r/19140/diff/ Testing --- Thanks, Eric Hanson
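For context on the HIVE-6633 review above, here is a hedged sketch of the general idea: let -D overrides from the job configuration reach the configuration used when the embedded metastore is created. All class and method names below are illustrative assumptions, not the actual HCatLoader/PigHCatUtil patch; plain maps stand in for the real Hadoop/Hive configuration objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetastoreConfForwardSketch {
    // Hypothetical helper (illustrative only): copy metastore-relevant -D
    // overrides from the job configuration into the settings that would be
    // used to instantiate the embedded metastore client.
    static Map<String, String> forwardOverrides(Map<String, String> jobConf,
                                                Map<String, String> hiveConf) {
        Map<String, String> merged = new LinkedHashMap<>(hiveConf);
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            // Forward only keys the metastore consumes, e.g. the JDO
            // connection settings named in the JIRA description.
            if (e.getKey().startsWith("javax.jdo.option.")
                    || e.getKey().startsWith("hive.metastore.")) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new LinkedHashMap<>();
        jobConf.put("hive.metastore.uris", "");                        // empty => embedded metastore
        jobConf.put("javax.jdo.option.ConnectionPassword", "AzureSQLDBXYZ");
        jobConf.put("mapreduce.job.name", "pig-script");               // unrelated, not forwarded
        System.out.println(forwardOverrides(jobConf, new LinkedHashMap<>()));
    }
}
```

With this kind of forwarding in place, the WebHCat-submitted pig job's -Dhive.metastore.uris= and -Djavax.jdo.option.ConnectionPassword=... arguments would be visible to the embedded metastore, which is the behavior the JIRA asks for.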
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932149#comment-13932149 ] Eric Hanson commented on HIVE-6633: --- Code review at https://reviews.apache.org/r/19140/
Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/#review36803 --- Ship it! Ship It! - Eric Hanson On March 10, 2014, 9:51 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/ --- (Updated March 10, 2014, 9:51 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- Vectorized cast of decimal to string and timestamp produces incorrect result. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 34bd9d0 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java debc270 common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 9ac68fe ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 2e8c3a4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 832463d ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 Diff: https://reviews.apache.org/r/18972/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6568) Vectorized cast of decimal to string and timestamp produces incorrect result.
[ https://issues.apache.org/jira/browse/HIVE-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930574#comment-13930574 ] Eric Hanson commented on HIVE-6568: --- +1 Vectorized cast of decimal to string and timestamp produces incorrect result. - Key: HIVE-6568 URL: https://issues.apache.org/jira/browse/HIVE-6568 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6568.1.patch, HIVE-6568.2.patch, HIVE-6568.3.patch A decimal value 1.23 with scale 5 is represented in string as 1.23000. This behavior is different from HiveDecimal behavior. The difference in cast to timestamp is due to more aggressive rounding in vectorized expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
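The scale-versus-rendering point in the HIVE-6568 description can be illustrated with the JDK's BigDecimal (used here only as a stand-in; Hive's code paths use HiveDecimal and Decimal128): the same unscaled value prints with trailing zeros once the scale is pinned at 5, which is the behavior the vectorized cast exhibited.

```java
import java.math.BigDecimal;

public class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("1.23");
        // Natural rendering keeps the minimal scale of 2.
        System.out.println(d.toPlainString());             // 1.23
        // Forcing scale 5 pads with trailing zeros.
        System.out.println(d.setScale(5).toPlainString()); // 1.23000
    }
}
```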
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch Upload again to try to kick off pre-commit tests WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch

On a one-box Windows setup, do the following from a PowerShell prompt:

cmd /c curl.exe -s `
 -d user.name=hadoop `
 -d arg=-useHCatalog `
 -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
 -d statusdir=/tmp/webhcat.output01 `
 'http://localhost:50111/templeton/v1/pig' -v

The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to

cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;

Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:

{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}

Bug is here:

{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equals sign,
    // so the else branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}

One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed.
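The quoted preProcessForWindows() logic can be reduced to a standalone sketch (simplified; not the actual Hadoop source): a -D argument that carries no '=' greedily consumes the NEXT argument as its value, which is exactly how -D__WEBHCAT_TOKEN_FILE_LOCATION__ swallows -useHCatalog, and why baking an '=' into the placeholder constant fixes it.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessSketch {
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop;
            if (args[i].startsWith("-D")) {
                prop = args[i];
            } else {
                newArgs.add(args[i]); // non -D args pass through unchanged
                continue;
            }
            if (prop.contains("=")) {
                // value already attached: everything good
            } else if (i < args.length - 1) {
                prop += "=" + args[++i]; // RIGHT HERE: next arg becomes the value
            }
            newArgs.add(prop);
        }
        return newArgs;
    }

    public static void main(String[] args) {
        // The placeholder has no '=', so -useHCatalog is swallowed:
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog" }));
        // With an '=' baked into the placeholder, both args survive:
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__=x", "-useHCatalog" }));
    }
}
```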
Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/#review36703 --- Overall it looks good. Please see my specific comments. common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java https://reviews.apache.org/r/18972/#comment67785 Please add one or more tests with a large integer with trailing zeros, e.g. 1234123000, to make sure that comes out right (no zeros get lopped off). ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java https://reviews.apache.org/r/18972/#comment67784 Please comment why you're using this logic. - Eric Hanson On March 10, 2014, 5:02 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/ --- (Updated March 10, 2014, 5:02 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- Vectorized cast of decimal to string and timestamp produces incorrect result. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 34bd9d0 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java debc270 common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 9ac68fe ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 2e8c3a4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 832463d ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 Diff: https://reviews.apache.org/r/18972/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/#review36558 --- Ship it! looks good to me! common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67561 Can you open a bug for scaleDownTenDestructive, based on what you found? You can make it low priority since it is not getting called in the current code paths. But it will be good to have a record of it. - Eric Hanson On March 7, 2014, 4:39 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/ --- (Updated March 7, 2014, 4:39 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6511 https://issues.apache.org/jira/browse/HIVE-6511 Repository: hive-git Description --- Casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 426c03d Diff: https://reviews.apache.org/r/18808/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924234#comment-13924234 ] Eric Hanson commented on HIVE-6511: --- +1 casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on Key: HIVE-6511 URL: https://issues.apache.org/jira/browse/HIVE-6511 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch

select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:

{code}
4619756289662.078125    -1628520834  -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352      688127224    -776    -8
4386447830839.337891     1286221623   12087    55
-3234165331139.458008     -54957251   27453    61
-488378613475.326172     1247658269  -16099    29
-493942492598.691406      -21253559  -19895    73
3101852523586.039062      886135874   23618    66
2544105595941.381836     1484956709  -23515    37
-3997512403067.0625      1102149509   30597  -123
-1183754978977.589355    1655994718   31070    94
1408783849655.676758       34576568  -26440   -72
-2993175106993.426758     417098319   27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133      -14511544  -28088    72
469767055288.485352      1615620024   26552   -72
-1263700791098.294434    -980406074   12486   -58
-4244889766496.484375   -1462078048   30112   -96
-3962729491139.782715    1525323068  -27332    60
NULL                           NULL    NULL  NULL
{code}

When vectorization is disabled, the result looks like this:

{code}
4619756289662.078125    -1628520834  -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352      688127224    -776    -8
4386447830839.337891     1286221623   12087    55
-3234165331139.458008     -54957251   27453    61
-488378613475.326172     1247658269  -16099    29
-493942492598.691406      -21253558  -19894    74
3101852523586.039062      886135874   23618    66
2544105595941.381836     1484956709  -23515    37
-3997512403067.0625      1102149509   30597  -123
-1183754978977.589355    1655994719   31071    95
1408783849655.676758       34576567  -26441   -73
-2993175106993.426758     417098319   27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133      -14511545  -28089    71
469767055288.485352      1615620024   26552   -72
-1263700791098.294434    -980406074   12486   -58
-4244889766496.484375   -1462078048   30112   -96
-3962729491139.782715    1525323069  -27331    61
NULL                           NULL    NULL  NULL
{code}

This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, and 15 generate different results.

vectortab10korc table schema:

{code}
t   tinyint          from deserializer
si  smallint         from deserializer
i   int              from deserializer
b   bigint           from deserializer
f   float            from deserializer
d   double           from deserializer
dc  decimal(38,18)   from deserializer
bo  boolean          from deserializer
s   string           from deserializer
s2  string           from deserializer
ts  timestamp        from deserializer

# Detailed Table Information
Database:       default
Owner:          xyz
CreateTime:     Tue Feb 25 21:54:28 UTC 2014
LastAccessTime: UNKNOWN
Protect Mode:   None
Retention:      0
Location:       hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:     MANAGED_TABLE
Table Parameters:
  COLUMN_STATS_ACCURATE  true
  numFiles               1
  numRows                1
  rawDataSize            0
  totalSize              344748
  transient_lastDdlTime  1393365281

# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
{code}
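For the rows that disagree, the non-vectorized numbers match plain Java narrowing semantics: discard the fraction (truncate toward zero), then keep the low-order bits at each narrower width. A sketch over row 7's value, which reproduces the non-vectorized output exactly and suggests the vectorized path was rounding before narrowing (this diagnosis is an inference from the output, not from the patch):

```java
import java.math.BigDecimal;

public class NarrowingDemo {
    public static void main(String[] args) {
        // Row 7 of the query output: dc = -493942492598.691406
        BigDecimal dc = new BigDecimal("-493942492598.691406");
        long asLong = dc.longValue();    // fraction discarded: -493942492598
        int asInt = (int) asLong;        // low 32 bits: -21253558
        short asShort = (short) asLong;  // low 16 bits: -19894
        byte asByte = (byte) asLong;     // low 8 bits:  74
        System.out.println(asInt + "\t" + asShort + "\t" + asByte);
    }
}
```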
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Fix Version/s: 0.13.0 Assignee: Eric Hanson Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.01.patch Changed constant placeholder to include = sign
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.02.patch removed trailing white space
Review Request 18816: WebHCat job submission for pig with -useHCatalog argument fails on Windows
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18816/ --- Review request for hive. Bugs: HIVE-6546 https://issues.apache.org/jira/browse/HIVE-6546 Repository: hive-git Description --- See JIRA Diffs - hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java 482e993 Diff: https://reviews.apache.org/r/18816/diff/ Testing --- Thanks, Eric Hanson
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921553#comment-13921553 ] Eric Hanson commented on HIVE-6546: --- Code review at https://reviews.apache.org/r/18816/
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch fix typo
Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/#review36290 --- common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67244 Nice idea to special-case signum==0 and scale==0 cases to speed it up. common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67243 Decimal128.divideDestructive had a bug that we worked around by just rewriting it to use HiveDecimal divide. I am worried that UnsignedInt128.divideDestructive could have been the original source of the bug. That makes me think it might be safer to just use the HiveDecimal code here to do the divide by 10**scale. - Eric Hanson On March 5, 2014, 9:39 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/ --- (Updated March 5, 2014, 9:39 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6511 https://issues.apache.org/jira/browse/HIVE-6511 Repository: hive-git Description --- Casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 426c03d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java d5f34d5 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee Diff: https://reviews.apache.org/r/18808/diff/ Testing --- Thanks, Jitendra Pandey
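For comparison with the divide-by-10**scale under discussion, the JDK's BigDecimal expresses the same scale-down as an exact decimal-point shift plus an explicit rounding step; a minimal illustration (not the Decimal128/UnsignedInt128 code under review):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ScaleDownDemo {
    public static void main(String[] args) {
        BigDecimal unscaled = new BigDecimal("123456");
        // Dividing by 10**3 is exactly a 3-place point shift; no division loop needed.
        BigDecimal shifted = unscaled.movePointLeft(3);                // 123.456
        // Truncating back to an integral value, rounding toward zero.
        BigDecimal truncated = shifted.setScale(0, RoundingMode.DOWN); // 123
        System.out.println(shifted + " " + truncated);
    }
}
```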
[jira] [Created] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
Eric Hanson created HIVE-6546: - Summary: WebHCat job submission for pig with -useHCatalog argument fails on Windows Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.11.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations. Reporter: Eric Hanson
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Description: added the full repro steps and root-cause analysis (the same text quoted in the other HIVE-6546 messages in this thread) Reporter: Eric Hanson
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. was:Windows Azure HDINSIGHT and Windows one-box installations. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson

On a one-box Windows setup, do the following from a PowerShell prompt:

cmd /c curl.exe -s `
  -d user.name=hadoop `
  -d arg=-useHCatalog `
  -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
  -d statusdir=/tmp/webhcat.output01 `
  'http://localhost:50111/templeton/v1/pig' -v

The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to

cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;

Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:

{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}

The bug is here:

{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain "=", so the else
    // branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}

One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. -- This message was sent by Atlassian JIRA (v6.2#6252)
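The consume-next-argument behavior described in this report can be sketched as follows. This is an illustrative reconstruction of the -D handling, not the actual org.apache.hadoop.util.GenericOptionsParser source (the class and argument names here are simplified); it shows why a placeholder that already contains an = sign would be left alone.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessSketch {
    // Simplified sketch of the -D handling described above: a bare "-Dkey"
    // with no "=" swallows the NEXT argument as its value, which is how
    // "-useHCatalog" gets fused into -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-D")) {
                newArgs.add(args[i]);
                continue;
            }
            String prop = args[i];
            if (!prop.contains("=") && i < args.length - 1) {
                prop += "=" + args[++i]; // the problematic fusion
            }
            newArgs.add(prop);
        }
        return newArgs;
    }

    public static void main(String[] a) {
        // Placeholder without "=": -useHCatalog is swallowed.
        System.out.println(preProcess(new String[]{"-D__TOKEN__", "-useHCatalog"}));
        // → [-D__TOKEN__=-useHCatalog]
        // With an "=" already present, -useHCatalog survives as its own argument.
        System.out.println(preProcess(new String[]{"-D__TOKEN__=dummy", "-useHCatalog"}));
        // → [-D__TOKEN__=dummy, -useHCatalog]
    }
}
```

The second case is the essence of the proposed TOKEN_FILE_ARG_PLACEHOLDER fix: a placeholder that already contains = never triggers the fusing branch.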
[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918290#comment-13918290 ] Eric Hanson commented on HIVE-6511: --- Can you put this up on ReviewBoard? casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on Key: HIVE-6511 URL: https://issues.apache.org/jira/browse/HIVE-6511 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6511.1.patch

select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:

{code}
4619756289662.078125   -1628520834  -16770  126
1553532646710.316406   -1245514442  -2762   54
3367942487288.360352   688127224    -776    -8
4386447830839.337891   1286221623   12087   55
-3234165331139.458008  -54957251    27453   61
-488378613475.326172   1247658269   -16099  29
-493942492598.691406   -21253559    -19895  73
3101852523586.039062   886135874    23618   66
2544105595941.381836   1484956709   -23515  37
-3997512403067.0625    1102149509   30597   -123
-1183754978977.589355  1655994718   31070   94
1408783849655.676758   34576568     -26440  -72
-2993175106993.426758  417098319    27215   79
3004723551798.100586   -1753555402  -8650   54
1103792083527.786133   -14511544    -28088  72
469767055288.485352    1615620024   26552   -72
-1263700791098.294434  -980406074   12486   -58
-4244889766496.484375  -1462078048  30112   -96
-3962729491139.782715  1525323068   -27332  60
NULL                   NULL         NULL    NULL
{code}

When vectorization is disabled, the result looks like this:

{code}
4619756289662.078125   -1628520834  -16770  126
1553532646710.316406   -1245514442  -2762   54
3367942487288.360352   688127224    -776    -8
4386447830839.337891   1286221623   12087   55
-3234165331139.458008  -54957251    27453   61
-488378613475.326172   1247658269   -16099  29
-493942492598.691406   -21253558    -19894  74
3101852523586.039062   886135874    23618   66
2544105595941.381836   1484956709   -23515  37
-3997512403067.0625    1102149509   30597   -123
-1183754978977.589355  1655994719   31071   95
1408783849655.676758   34576567     -26441  -73
-2993175106993.426758  417098319    27215   79
3004723551798.100586   -1753555402  -8650   54
1103792083527.786133   -14511545    -28089  71
469767055288.485352    1615620024   26552   -72
-1263700791098.294434  -980406074   12486   -58
-4244889766496.484375  -1462078048  30112   -96
-3962729491139.782715  1525323069   -27331  61
NULL                   NULL         NULL    NULL
{code}

This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, and 15 generate different results. vectortab10korc table schema:

{code}
t    tinyint         from deserializer
s    smallint        from deserializer
i    int             from deserializer
b    bigint          from deserializer
f    float           from deserializer
d    double          from deserializer
dc   decimal(38,18)  from deserializer
bo   boolean         from deserializer
s    string          from deserializer
s2   string          from deserializer
ts   timestamp       from deserializer

# Detailed Table Information
Database:        default
Owner:           xyz
CreateTime:      Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:  UNKNOWN
Protect Mode:    None
Retention:       0
Location:        hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:      MANAGED_TABLE
Table Parameters:
  COLUMN_STATS_ACCURATE  true
  numFiles               1
  numRows                1
  rawDataSize            0
  totalSize              344748
  transient_lastDdlTime  1393365281

# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat
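For reference, the non-vectorized row values are consistent with plain Java narrowing conversions applied to the decimal's integral part. The sketch below assumes the row-mode cast truncates the decimal toward zero to a long and then narrows to int/short/byte (keeping the low 32/16/8 bits); on that assumption it reproduces row 1 of the reference output exactly.

```java
import java.math.BigDecimal;

public class DecimalNarrowingDemo {
    public static void main(String[] args) {
        // Row 1 from the non-vectorized output above. Narrowing keeps the
        // low 32/16/8 bits of the truncated value, so large decimals wrap
        // around in two's complement.
        BigDecimal dc = new BigDecimal("4619756289662.078125");
        long truncated = dc.longValue();      // 4619756289662
        int asInt = (int) truncated;          // -1628520834
        short asSmallint = (short) truncated; // -16770
        byte asTinyint = (byte) truncated;    // 126
        System.out.println(asInt + " " + asSmallint + " " + asTinyint);
        // → -1628520834 -16770 126
    }
}
```

The off-by-one discrepancies in rows 7, 11, 12, and 15 are then plausibly a difference in how the vectorized path converts the decimal before truncation, rather than in the narrowing itself.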
RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang
Congratulations Xuefu! -Original Message- From: Remus Rusanu [mailto:rem...@microsoft.com] Sent: Friday, February 28, 2014 11:43 AM To: dev@hive.apache.org; u...@hive.apache.org Cc: Xuefu Zhang Subject: RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang Grats! From: Prasanth Jayachandran pjayachand...@hortonworks.com Sent: Friday, February 28, 2014 9:11 PM To: dev@hive.apache.org Cc: u...@hive.apache.org; Xuefu Zhang Subject: Re: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang Congratulations Xuefu! Thanks Prasanth Jayachandran On Feb 28, 2014, at 11:04 AM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congrats Xuefu! On Fri, Feb 28, 2014 at 9:20 AM, Prasad Mujumdar pras...@cloudera.com wrote: Congratulations Xuefu !! thanks Prasad On Fri, Feb 28, 2014 at 1:20 AM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Xuefu Zhang has been elected to the Hive Project Management Committee. Please join me in congratulating Xuefu! Thanks. Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
need advice on debugging into TempletonJobController.java
I want to attach a debugger to TempletonJobController.java (code that runs in a map job started by the templeton service, which in turn starts another job). Does anybody know how to make the job wait for a debugger to attach, i.e., which file to modify to change the Java opts? Eric

Details of what I tried: I tried adding it in %hadoop_home%/conf/mapred-site.xml but it didn't work:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -Xmx1024m</value>
</property>

I also tried this, in %hcatalog_home%\etc\webhcat\webhcat-default.xml, adding:

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

But the job doesn't appear to wait, and I keep seeing this in my job config: mapred.child.java.opts -server -Xmx256m -Djava.net.preferIPv4Stack=true
RE: need advice on debugging into TempletonJobController.java
Hey, I found the solution. You need to add this to webhcat-site.xml. -Eric

To attach the debugger to the templeton controller MR job started by the templeton service, go to %hcatalog_home%\conf\webhcat-site.xml and add the following block (copied from etc\webhcat\webhcat-default.xml, and enhanced with the highlighted options for debugging):

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

-Original Message- From: Eric Hanson (BIG DATA) [mailto:eric.n.han...@microsoft.com] Sent: Friday, February 28, 2014 12:06 PM To: dev@hive.apache.org Subject: need advice on debugging into TempletonJobController.java

I want to attach a debugger to TempletonJobController.java (code that runs in a map job started by the templeton service, which in turn starts another job). Does anybody know how to make the job wait for a debugger to attach, i.e., which file to modify to change the Java opts? Eric

Details of what I tried: I tried adding it in %hadoop_home%/conf/mapred-site.xml but it didn't work:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -Xmx1024m</value>
</property>

I also tried this, in %hcatalog_home%\etc\webhcat\webhcat-default.xml, adding:

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

But the job doesn't appear to wait, and I keep seeing this in my job config: mapred.child.java.opts -server -Xmx256m -Djava.net.preferIPv4Stack=true
Re: Review Request 18566: Queries fail to Vectorize.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18566/#review35682 --- Looks good. Please add unit tests to exercise the code you changed, or if this code is already covered by other tests, please explain in comments on the JIRA. common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java https://reviews.apache.org/r/18566/#comment66370 Please add comment saying the purpose of this method - Eric Hanson On Feb. 27, 2014, 6:43 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18566/ --- (Updated Feb. 27, 2014, 6:43 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6496 https://issues.apache.org/jira/browse/HIVE-6496 Repository: hive-git Description --- 1) NPE because row resolver is null. 2) VectorUDFAdapter doesn't handle decimal. 3) Decimal cast to boolean, timestamp, string fail because classes are not annotated appropriately. 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated. Diffs - common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java 09af28a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 4de9f9f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 842994e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFAdaptor.java 3bc9493 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java e6be03f ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 54c665e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java db4eafa ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTimestamp.java e2529d2 Diff: https://reviews.apache.org/r/18566/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6496) Queries fail to Vectorize.
[ https://issues.apache.org/jira/browse/HIVE-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914962#comment-13914962 ] Eric Hanson commented on HIVE-6496: --- +1 conditional on addressing my review comments Queries fail to Vectorize. -- Key: HIVE-6496 URL: https://issues.apache.org/jira/browse/HIVE-6496 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6496.1.patch, HIVE-6496.2.patch, HIVE-6496.3.patch Following issues are causing many queries to fail to vectorize: 1) NPE because row resolver is null. 2) VectorUDFAdapter doesn't handle decimal. 3) Decimal cast to boolean, timestamp, string fail because classes are not annotated appropriately. 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
RE: [ANNOUNCE] New Hive Committer - Remus Rusanu
Fantastic! Welcome aboard, Remus! Eric From: Carl Steinbach [mailto:cwsteinb...@gmail.com] Sent: Wednesday, February 26, 2014 8:59 AM To: u...@hive.apache.org; dev@hive.apache.org Cc: Remus Rusanu Subject: [ANNOUNCE] New Hive Committer - Remus Rusanu The Apache Hive PMC has voted to make Remus Rusanu a committer on the Apache Hive Project. Please join me in congratulating Remus! Thanks. Carl
Re: Review Request 18184: Vectorized mathematical functions for decimal type.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18184/#review34783 --- ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65070 format comment better (blank after //, blank line before first comment line) ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65066 I think you could speed this up with an array fill operation for outputIsNull before the loop, but that is a nice-to-have and not essential. ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65071 remove trailing white space ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65073 remove trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18184/#comment65132 Please add comment to explain what method does. 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65136 delete trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65137 delete trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65138 fix comment format ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65139 remove trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65140 remove trailing white space ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java https://reviews.apache.org/r/18184/#comment65144 please add cases for non-zero values close to 0 like -0.3 and 0.3 for floor and ceiling ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java https://reviews.apache.org/r/18184/#comment65147 Please add test to negate 0 and make sure you still get 0 ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q https://reviews.apache.org/r/18184/#comment65148 please remove trailing white space in .q file (several locations) - Eric Hanson On Feb. 17, 2014, 9:05 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18184/ --- (Updated Feb. 17, 2014, 9:05 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6416 https://issues.apache.org/jira/browse/HIVE-6416 Repository: hive-git Description --- Vectorized mathematical functions for decimal type. 
Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 1b76fc9 common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 2e0f058 ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f69bfc0 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalUtil.java 589450f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 628f06d ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java 1c1bcfe ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCeil.java ceb56bb ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFloor.java a95a263 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNegative.java f355a82 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java 5cc8025 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java PRE-CREATION ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q PRE-CREATION ql/src/test/results/clientpositive/vector_decimal_math_funcs.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18184/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13904829#comment-13904829 ] Eric Hanson commented on HIVE-6416: --- Looks good to me. +1 conditional on addressing my review comments (all of which are minor) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
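The failing product reported in this issue can be reproduced independently with java.math.BigInteger, which is a handy reference oracle when debugging fixed-width multiply routines like Decimal128:

```java
import java.math.BigInteger;

public class Decimal128MultiplyCheck {
    public static void main(String[] args) {
        // Reference check of the product from the bug report.
        BigInteger a = new BigInteger("-605044214913338382");
        BigInteger b = new BigInteger("55269579109718297360");
        BigInteger expected =
            new BigInteger("-33440539101030154945490585226577271520");
        System.out.println(a.multiply(b).equals(expected)); // → true
    }
}
```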
[jira] [Created] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
Eric Hanson created HIVE-6452: - Summary: fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
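For illustration, the kind of routine multiplyArrays4And4To8 implements — multiplying two little-endian arrays of 32-bit limbs into a double-width product with no per-operation allocation beyond the result — can be sketched as below. This is illustrative code, not the Hive implementation; it cross-checks against BigInteger the way the new tests in the starter patch could.

```java
import java.math.BigInteger;

public class LimbMultiplySketch {
    // Schoolbook multiply of two little-endian arrays of 32-bit limbs
    // (stored in ints, treated as unsigned) into an (n+m)-limb product.
    // Each partial product plus accumulator plus carry fits in an
    // unsigned 64-bit value, so a long holds it bit-exactly.
    static int[] multiply(int[] a, int[] b) {
        int[] r = new int[a.length + b.length];
        for (int i = 0; i < a.length; i++) {
            long ai = a[i] & 0xFFFFFFFFL;
            long carry = 0;
            for (int j = 0; j < b.length; j++) {
                long t = ai * (b[j] & 0xFFFFFFFFL)
                       + (r[i + j] & 0xFFFFFFFFL) + carry;
                r[i + j] = (int) t;   // low 32 bits
                carry = t >>> 32;     // high 32 bits
            }
            r[i + b.length] = (int) carry;
        }
        return r;
    }

    // Convert a little-endian limb array to a BigInteger for cross-checking.
    static BigInteger toBig(int[] limbs) {
        BigInteger v = BigInteger.ZERO;
        for (int i = limbs.length - 1; i >= 0; i--) {
            v = v.shiftLeft(32).or(BigInteger.valueOf(limbs[i] & 0xFFFFFFFFL));
        }
        return v;
    }

    public static void main(String[] args) {
        int[] a = {0xFFFFFFFF, 0x12345678, 0x9ABCDEF0, 0x0FEDCBA9};
        int[] b = {0x87654321, 0xFFFFFFFF, 0x00000001, 0xDEADBEEF};
        int[] p = multiply(a, b);
        System.out.println(toBig(p).equals(toBig(a).multiply(toBig(b)))); // → true
    }
}
```

Randomized comparison against BigInteger over many iterations is an effective way to flush out the rare carry-propagation failures this JIRA describes.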
[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
[ https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6452: -- Attachment: multiplyArrays4And4To8-start.patch fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive --- Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Attachments: multiplyArrays4And4To8-start.patch UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
[ https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6452: -- Assignee: Jitendra Nath Pandey fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive --- Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Attachments: multiplyArrays4And4To8-start.patch UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6435) Allow specification of alternate metastore in WebHCat job
[ https://issues.apache.org/jira/browse/HIVE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6435: -- Description: Allow a user to specify with their WebHCat Hive and Pig jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. (was: Allow a user to specify with their WebHCat jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster.) Allow specification of alternate metastore in WebHCat job - Key: HIVE-6435 URL: https://issues.apache.org/jira/browse/HIVE-6435 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify with their WebHCat Hive and Pig jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5759) Implement vectorized support for COALESCE conditional expression
[ https://issues.apache.org/jira/browse/HIVE-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901749#comment-13901749 ] Eric Hanson commented on HIVE-5759: --- +1 Also, the failure in testHighPrecisionDecimal128Multiply is external to this patch. Implement vectorized support for COALESCE conditional expression Key: HIVE-5759 URL: https://issues.apache.org/jira/browse/HIVE-5759 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Attachments: HIVE-5759.1.patch, HIVE-5759.2.patch Implement full, end-to-end support for COALESCE in vectorized mode, including new VectorExpression class(es), VectorizationContext translation to a VectorExpression, and unit tests for these, as well as end-to-end ad hoc testing. An end-to-end .q test is recommended. This is lower priority than IF and CASE but it is still a fairly popular expression. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
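A column-at-a-time COALESCE of the kind this issue asks for can be sketched as follows. The names and layout here are illustrative only — loosely modeled on Hive's column vectors with their isNull masks, not the actual VectorCoalesce or LongColumnVector classes.

```java
public class CoalesceSketch {
    // Minimal columnar batch column: values plus a null mask (illustrative,
    // loosely modeled on a Hive LongColumnVector).
    static final class Column {
        final long[] vector;
        final boolean[] isNull;
        Column(long[] v, boolean[] n) { vector = v; isNull = n; }
    }

    // COALESCE over a batch: each output row takes the first non-null input;
    // the output row is null only if every input is null at that row.
    static Column coalesce(int n, Column... inputs) {
        long[] out = new long[n];
        boolean[] outNull = new boolean[n];
        for (int row = 0; row < n; row++) {
            outNull[row] = true;
            for (Column c : inputs) {
                if (!c.isNull[row]) {
                    out[row] = c.vector[row];
                    outNull[row] = false;
                    break;
                }
            }
        }
        return new Column(out, outNull);
    }

    public static void main(String[] args) {
        Column a = new Column(new long[]{1, 0, 0}, new boolean[]{false, true, true});
        Column b = new Column(new long[]{9, 7, 0}, new boolean[]{false, false, true});
        Column r = coalesce(3, a, b);
        System.out.println(r.vector[0] + " " + r.vector[1] + " " + r.isNull[2]);
        // → 1 7 true
    }
}
```

The COALESCE(col1, ..., colK, 0) pattern mentioned in the review would simply add a constant column as the last input, guaranteeing a non-null fallback.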
[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6399: - Assignee: Eric Hanson (was: Remus Rusanu) bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901995#comment-13901995 ] Eric Hanson commented on HIVE-6399: --- Remus' patch is technically good. I have a question I'll raise with the PMC though about the comment about using the algorithm from BigInteger.multiplyToLen. For now I'm going to promote my original patch to get it in so we can get the bug failure out of trunk. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.05.patch Promoting patch 02 to first position to get committed, now as 05. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6435) Allow specification of alternate metastore in WebHCat job
Eric Hanson created HIVE-6435: - Summary: Allow specification of alternate metastore in WebHCat job Key: HIVE-6435 URL: https://issues.apache.org/jira/browse/HIVE-6435 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify with their WebHCat jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job
Eric Hanson created HIVE-6436: - Summary: Allow specification of one or more additional Windows Azure storage accounts in WebHCat job Key: HIVE-6436 URL: https://issues.apache.org/jira/browse/HIVE-6436 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Allow a user to specify one or more additional Windows Azure storage accounts, including account name and key, in a WebHCat Hive job submission. These would be in addition to any that were specified in the default cluster configuration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job
[ https://issues.apache.org/jira/browse/HIVE-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6436: - Assignee: Eric Hanson Allow specification of one or more additional Windows Azure storage accounts in WebHCat job --- Key: HIVE-6436 URL: https://issues.apache.org/jira/browse/HIVE-6436 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify one or more additional Windows Azure storage accounts, including account name and key, in a WebHCat Hive job submission. These would be in addition to any that were specified in the default cluster configuration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/#review34370 --- ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64447 Can you do one with 3 arguments too? Will that vectorize? ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64450 Please also test for smallint and timestamp. ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64451 Please also test for expressions as arguments, not just columns. ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64448 It is not unusual to use COALESCE like this: COALESCE(col1, ..., colK, 0) So if arguments 1..K are NULL, the default value is the constant at the end, 0 in this case. Could you please make that work in this patch, or open a separate JIRA to do it later? - Eric Hanson On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/ --- (Updated Feb. 12, 2014, 7 p.m.) Review request for hive and Eric Hanson. 
Bugs: HIVE-5759 https://issues.apache.org/jira/browse/HIVE-5759 Repository: hive-git Description --- Implement vectorized support for COALESCE conditional expression Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java f1eef14 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java d0d8597 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java cb23129 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java aa05b19 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 7141d63 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 21fe8ca ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18025/diff/ Testing --- Thanks, Jitendra Pandey
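[Editor's note] The review above asks for the common COALESCE(col1, ..., colK, 0) pattern: return the first non-NULL argument, so a trailing constant acts as a default when all columns are NULL. A minimal Python sketch of that row-wise semantics (illustration only, not Hive's vectorized implementation):

```python
def coalesce(*args):
    """Return the first argument that is not None (SQL NULL), else None."""
    for a in args:
        if a is not None:
            return a
    return None

# A trailing constant acts as a default when all column values are NULL:
rows = [(None, None), (None, 7), (3, None)]
results = [coalesce(c1, c2, 0) for c1, c2 in rows]  # → [0, 7, 3]
```

The vectorized version applies the same rule column-wise over a batch, which is why the reviewer asks how scalar (constant) arguments are handled.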
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.02.patch Uploading again to trigger precommit tests. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
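[Editor's note] The reported failure can be cross-checked independently of Decimal128: Python integers are arbitrary precision, so the exact reference product is one multiplication away. A quick sketch:

```python
# Operands and values from the HIVE-6399 report.
a = -605044214913338382
b = 55269579109718297360

expected = -33440539101030154945490585226577271520  # value the JUnit test expects
buggy    = -33440539021801992431226247633033321184  # value Decimal128 produced

product = a * b  # exact: Python ints never overflow
```

The exact product matches the test's expected value, confirming that the Decimal128 multiply (not the test's reference computation) is at fault.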
Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/#review34336 --- ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java https://reviews.apache.org/r/18025/#comment64388 I think setRef is only safe for base vectors (that get data from table columns), not intermediate working results. There was a bug there, since those can get re-used during processing of a single vectorized row batch. So, use setVal here unless you know the source vector is a base vector loaded from a table column. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java https://reviews.apache.org/r/18025/#comment64389 use .update() instead of = assignment or you could have a bug. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18025/#comment64392 please add a comment to explain what the method does ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18025/#comment64395 This is the same code block as the previous case. Can you share the case and change the condition to an OR? Up to you... ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64396 I'm not sure this is always EOF. Consider deleting the ", this is EOF" comment. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64397 This can have 1 argument. Please add a comment to explain. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64398 What happens if one of the inputs is a scalar, not a column? ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64399 ERIC TODO: start reviewing here - Eric Hanson On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/18025/ --- (Updated Feb. 12, 2014, 7 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-5759 https://issues.apache.org/jira/browse/HIVE-5759 Repository: hive-git Description --- Implement vectorized support for COALESCE conditional expression Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java f1eef14 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java d0d8597 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java cb23129 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java aa05b19 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 7141d63 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 21fe8ca ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18025/diff/ Testing --- Thanks, Jitendra Pandey
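[Editor's note] The setRef/setVal comment above describes an aliasing hazard: setRef stores a reference to the caller's byte buffer, so if that buffer is an intermediate that the engine re-uses within a batch, the stored value is silently overwritten; setVal copies the bytes out. A hedged Python sketch of the hazard, with bytearrays standing in for the column vector's byte buffers:

```python
# A scratch buffer that an engine re-uses between expression evaluations.
scratch = bytearray(b"first ")

# setRef-style: keep a reference to the caller's buffer (no copy).
by_ref = scratch

# setVal-style: copy the bytes out.
by_val = bytes(scratch)

# The engine now re-uses the scratch buffer for the next intermediate result.
scratch[:] = b"second"

ref_view = bytes(by_ref)   # sees the overwritten data — the bug
val_view = by_val          # still holds the original value
```

This is why the review says setRef is only safe for base vectors loaded from table columns, whose buffers are not recycled mid-batch.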
[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6399: - Assignee: Eric Hanson bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.02.patch Patch with update to Decimal128.multiplyDestructive() to make it use HiveDecimal.multiply internally, plus updated tests. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 started by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work stopped] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 stopped by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 started by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Status: Patch Available (was: In Progress) bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13898550#comment-13898550 ] Eric Hanson commented on HIVE-6399: --- Review board entry: https://reviews.apache.org/r/17972/ bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17769: Generate vectorized plan for decimal expressions.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/#review34088 --- Ship it! The functionality looks good. Please address the minor issues about the comments that I pointed out. No need for me to do another review. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64058 there - their ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64060 Please add comment before method explaining what it does. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64059 loose - lose - Eric Hanson On Feb. 8, 2014, 6:15 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/ --- (Updated Feb. 8, 2014, 6:15 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6333 https://issues.apache.org/jira/browse/HIVE-6333 Repository: hive-git Description --- Generate vectorized plan for decimal expressions. 
Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt 699b7c5 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt 99366ca ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 2aa4152 ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 2e84334 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt 9578d34 ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 6ee9d5f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 1c70387 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f5ab731 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java 4510368 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java 6a7762d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java 14b91e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java 2ba1509 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 65a804d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 5b2a658 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDoubleToDecimal.java 14e30c3 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastLongToDecimal.java 1d4d84d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDecimal.java 41762ed ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToDecimal.java 37e92e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java cac1d80 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java 93052a1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncDoubleToDecimal.java 8b2a6f0 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncLongToDecimal.java 18d1dbb ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpression.java d00d99b ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java 4f59125 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java e4dfcc9 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java 4e2d1d4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 6f9746c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java e794e92 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 4e64d47 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 9a04e81 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqual.java 3479b13 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java edb1bf8 ql/src/java/org/apache/hadoop/hive/ql/udf/generic
[jira] [Created] (HIVE-6399) bug in high-precision Decimal128 multiply
Eric Hanson created HIVE-6399: - Summary: bug in high-precision Decimal128 multiply Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896935#comment-13896935 ] Eric Hanson commented on HIVE-6333: --- I opened bug HIVE-6399 to track the testHighPrecisionDecimal128Multiply failure. It is external to this patch. Generate vectorized plan for decimal expressions. - Key: HIVE-6333 URL: https://issues.apache.org/jira/browse/HIVE-6333 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, HIVE-6333.4.patch, HIVE-6333.5.patch Transform non-vector plan to vectorized plan for supported decimal expressions. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Description: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. was: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Description: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. was: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.01.patch Attached patch with explicit test for this known bug in testKnownPriorErrors. No fix yet. A quick fix would be to use BigDecimal multiply inside Decimal128 multiply. Although this would not perform well, it'd be safe. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897040#comment-13897040 ] Eric Hanson commented on HIVE-6333: --- +1 Generate vectorized plan for decimal expressions. - Key: HIVE-6333 URL: https://issues.apache.org/jira/browse/HIVE-6333 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, HIVE-6333.4.patch, HIVE-6333.5.patch, HIVE-6333.6.patch Transform non-vector plan to vectorized plan for supported decimal expressions. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17769: Generate vectorized plan for decimal expressions.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/#review33942 --- Overall this looks good. Please see my specific comments. I did find one bug (used an Add in place of Subtract in GenericUDFOpMinus), and possibly one design issue related to implicit cast precision and scale. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63766 Please add a comment explaining what castExpressionUdfs is for ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63767 Expand the comment to explain the kind of situations where this is necessary. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63768 Add a comment before the method explaining what it does. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63808 Hive Java coding standard says put blank line before all comments. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63769 Because TypeInfo has decimal precision/scale, the output scale is not always the same as the input scale. E.g. I've seen that decimal(18,2)*decimal(18,2) might have scale=4 or something like that. Might it be better to have integers be cast to decimal(19,0) and floats to, say, decimal(38,18) or something like that, so you never or rarely lose information during the cast, or get a NULL due to overflow? But of course, you would not change the expression result precision/scale. What you have here looks pretty good, but it may be worth more thought. 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java https://reviews.apache.org/r/17769/#comment63816 add comment saying briefly what method does ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java https://reviews.apache.org/r/17769/#comment63823 DecimalColAddDecimalScalar should be subtract ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java https://reviews.apache.org/r/17769/#comment63825 please add brief comment saying what this test checks ql/src/test/queries/clientpositive/vector_decimal_expressions.q https://reviews.apache.org/r/17769/#comment63828 I think we need a JIRA to add unary minus for vectorized decimal, plus a test. ql/src/test/results/clientpositive/vectorization_short_regress.q.out https://reviews.apache.org/r/17769/#comment63837 It looks like some new rows showed up in the output after you changed the test. Is this expected, or does it reveal a correctness issue? - Eric Hanson On Feb. 7, 2014, 2:31 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/ --- (Updated Feb. 7, 2014, 2:31 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6333 https://issues.apache.org/jira/browse/HIVE-6333 Repository: hive-git Description --- Generate vectorized plan for decimal expressions. 
Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt 699b7c5 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt 99366ca ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 2aa4152 ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 2e84334 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt 9578d34 ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 6ee9d5f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 1c70387 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f5ab731 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java 4510368 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java 6a7762d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java 14b91e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java 2ba1509 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 65a804d ql
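[Editor's note] The precision/scale discussion in the review above follows the usual SQL-style typing rule for decimal multiplication, which Hive roughly implements: result precision p1+p2+1 (capped at the 38-digit maximum) and result scale s1+s2. A sketch of that rule, matching the reviewer's observation that decimal(18,2)*decimal(18,2) comes out with scale 4 (the cap value and rule are the standard SQL convention, stated here as an assumption about Hive's behavior):

```python
def multiply_type(p1, s1, p2, s2, max_precision=38):
    """SQL-style result type for decimal(p1,s1) * decimal(p2,s2)."""
    p = min(p1 + p2 + 1, max_precision)
    s = s1 + s2
    return p, s

t = multiply_type(18, 2, 18, 2)  # → (37, 4): precision 37, scale 4
```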
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33747 --- Ship it! Ship It! - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33378 --- Looks good to me. See one comment inline. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/17622/#comment62885 Please add a comment why you are using decimal.* and why it's different than the others. - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Work stopped] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6234 stopped by Eric Hanson. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and also to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will be able to benefit immediately without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work, when vectorization is enabled. Another goal is to go beyond a simple layering of vectorized row batch iterator over the top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even million rows), and then read data from it into vectorized row batches directly. Object creations should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
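[Editor's note] The HIVE-6234 description above proposes reading a chunk of delimited text and filling column-major batches directly, rather than layering a batch iterator over a row iterator. A toy Python sketch of the idea (function and parameter names are hypothetical, not Hive's API; real column vectors would be typed arrays, not lists of strings):

```python
def read_batch(chunk, field_sep="\t", record_sep="\n", batch_size=1024):
    """Parse a chunk of delimited text into column-major lists (a toy row batch)."""
    columns = None
    for line in chunk.split(record_sep)[:batch_size]:
        if not line:
            continue  # skip trailing empty record
        fields = line.split(field_sep)
        if columns is None:
            columns = [[] for _ in fields]
        for col, value in zip(columns, fields):
            col.append(value)
    return columns

batch = read_batch("1\ta\n2\tb\n3\tc\n")  # → [['1', '2', '3'], ['a', 'b', 'c']]
```

The JIRA's further goals — minimizing object creation and caching string-to-type conversions for dates and numbers — would sit on top of a loop like this.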
[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6234: -- Attachment: HIVE-6234.03.patch Non-working code, with some top-down refinement of how to get a batch/line/field. See comments in the code about open questions on the mapping from table columns into the batch. Also need to determine how to get the column types for use by the text reader: even though a field uses a LongColumnVector, it might need to treat the text data as an integer, boolean, date, or timestamp. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab-delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will benefit immediately, without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work when vectorization is enabled. Another goal is to go beyond simply layering a vectorized row batch iterator on top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even a million rows), and then read data from it into vectorized row batches directly. Object creation should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
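The batch-reading scheme described above (fill a byte buffer with many rows, then populate column vectors directly, caching the string-to-type translation for hot values) can be sketched roughly as follows. This is a standalone simplification, not Hive code: LongColumn is a hypothetical stand-in for Hive's LongColumnVector, the record layout (one numeric field per newline-terminated record) and the batch size are assumptions, and a real reader would also avoid the per-field String allocation that this sketch still makes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a vectorized long column (mirrors the idea of
// Hive's LongColumnVector without depending on Hive itself).
class LongColumn {
    final long[] vector;
    LongColumn(int size) { vector = new long[size]; }
}

public class VectorizedTextSketch {
    static final int BATCH_SIZE = 1024;

    // Cache repeated string -> long translations (the idea suggested
    // for dates and numbers), so hot values are parsed only once.
    static final Map<String, Long> parseCache = new HashMap<>();

    static long parseLongCached(String s) {
        return parseCache.computeIfAbsent(s, Long::parseLong);
    }

    // Fill a column vector directly from a byte buffer holding
    // newline-separated records with a single numeric field each.
    // Returns the number of rows filled into the batch.
    static int fillBatch(byte[] chunk, LongColumn col) {
        int row = 0, fieldStart = 0;
        for (int i = 0; i < chunk.length && row < BATCH_SIZE; i++) {
            if (chunk[i] == '\n') {
                String field = new String(chunk, fieldStart, i - fieldStart,
                                          StandardCharsets.UTF_8);
                col.vector[row++] = parseLongCached(field);
                fieldStart = i + 1;
            }
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] chunk = "7\n42\n42\n".getBytes(StandardCharsets.UTF_8);
        LongColumn col = new LongColumn(BATCH_SIZE);
        int n = fillBatch(chunk, col);
        System.out.println(n + " rows, first=" + col.vector[0]);
        // prints: 3 rows, first=7
    }
}
```

The second "42" hits the cache rather than being re-parsed, which is the kind of saving the description anticipates for repeated date and number strings.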
[jira] [Commented] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888279#comment-13888279 ] Eric Hanson commented on HIVE-6234: --- This is just getting started. I need to put this aside for a while (probably at least until the end of Feb.). I parked the latest information here on the JIRA. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg
[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6234: -- Attachment: state-diagram.jpg State diagram for finding line breaks. May be of use for future reference. Not done. Just a working document. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg
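The attached state diagram is not reproduced here, but a minimal two-state scanner along those lines, assuming '\n', '\r', and '\r\n' are the terminators of interest, might look like the following. The SEEN_CR state resolves the '\r' vs. '\r\n' ambiguity one byte late; a real reader would additionally have to carry that state across buffer boundaries when a terminator straddles two chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Two-state scanner that records line-start offsets in a byte buffer,
// treating '\n', '\r', and '\r\n' each as a single line terminator.
public class LineBreakScanner {
    enum State { NORMAL, SEEN_CR }

    static List<Integer> lineStarts(byte[] buf) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);                        // first line starts at offset 0
        State state = State.NORMAL;
        for (int i = 0; i < buf.length; i++) {
            byte b = buf[i];
            if (state == State.SEEN_CR) {
                state = State.NORMAL;
                if (b == '\n') {              // "\r\n": next line starts after '\n'
                    starts.add(i + 1);
                    continue;
                }
                starts.add(i);                // bare '\r': line started at this byte
            }
            if (b == '\r') {
                state = State.SEEN_CR;        // defer: might be "\r\n"
            } else if (b == '\n') {
                starts.add(i + 1);
            }
        }
        if (state == State.SEEN_CR) starts.add(buf.length); // trailing '\r'
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(lineStarts("a\r\nb\nc\rd".getBytes()));
        // prints: [0, 3, 5, 7]
    }
}
```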
[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic
[ https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6257: -- Resolution: Implemented Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk Add more unit tests for high-precision Decimal128 arithmetic Key: HIVE-6257 URL: https://issues.apache.org/jira/browse/HIVE-6257 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6257.02.patch, HIVE-6257.03.patch, HIVE-6257.04.patch Add more unit tests for high-precision Decimal128 arithmetic, with arguments close to or at 38 digit limit. Consider some random stress tests for broader coverage. Coverage is pretty good now (after HIVE-6243) for precision up to about 18. This is to go beyond that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
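The random stress tests suggested above (operands at or near the 38-digit limit, checked for broader coverage) typically take the shape of an oracle comparison. The sketch below illustrates that harness shape only: ToyDecimal is a hypothetical stand-in, not Hive's Decimal128, and BigDecimal serves as the reference oracle.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Random;

// Randomized stress-test pattern for high-precision decimal addition:
// generate operands whose unscaled values are near the 38-digit limit
// and check a simple (unscaled, scale) implementation against
// BigDecimal as the oracle. ToyDecimal is a stand-in for Decimal128.
public class DecimalStressSketch {

    static class ToyDecimal {
        final BigInteger unscaled;   // value = unscaled * 10^-scale
        final int scale;

        ToyDecimal(BigInteger unscaled, int scale) {
            this.unscaled = unscaled;
            this.scale = scale;
        }

        // Align scales, then add the unscaled values.
        ToyDecimal add(ToyDecimal o) {
            int s = Math.max(scale, o.scale);
            BigInteger a = unscaled.multiply(BigInteger.TEN.pow(s - scale));
            BigInteger b = o.unscaled.multiply(BigInteger.TEN.pow(s - o.scale));
            return new ToyDecimal(a.add(b), s);
        }

        BigDecimal toBigDecimal() {
            return new BigDecimal(unscaled, scale);
        }
    }

    // 126 random bits is at most about 38 decimal digits, i.e. operands
    // at or close to the precision limit the tests target.
    static ToyDecimal randomNear38Digits(Random r) {
        BigInteger u = new BigInteger(126, r);
        if (r.nextBoolean()) u = u.negate();
        return new ToyDecimal(u, r.nextInt(10));
    }

    public static void main(String[] args) {
        Random r = new Random(42);
        for (int i = 0; i < 10_000; i++) {
            ToyDecimal x = randomNear38Digits(r);
            ToyDecimal y = randomNear38Digits(r);
            BigDecimal expected = x.toBigDecimal().add(y.toBigDecimal());
            if (expected.compareTo(x.add(y).toBigDecimal()) != 0) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        System.out.println("all random adds matched the oracle");
    }
}
```

A fixed seed keeps failures reproducible; the same pattern extends to multiplication and division, where 38-digit overflow boundaries are the interesting cases.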