[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)

2014-09-02 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118896#comment-14118896
 ] 

Eric Hanson commented on HIVE-7901:
---

Thanks, [~sushanth]. Will you commit this or do you want me to do it? -Eric

 CLONE - pig -useHCatalog with embedded metastore fails to pass command line 
 args to metastore (org.apache.hive.hcatalog version)
 

 Key: HIVE-7901
 URL: https://issues.apache.org/jira/browse/HIVE-7901
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Eric Hanson
 Attachments: hive-7901.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.
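For context, here is a minimal sketch of the idea under discussion (this is not 
the attached patch): a hypothetical helper that copies command-line -D system 
properties into the HiveConf backing an embedded metastore. The class name and 
the property prefixes it filters on are illustrative assumptions.
{code}
import java.util.Map;
import java.util.Properties;

import org.apache.hadoop.hive.conf.HiveConf;

public class EmbeddedMetastoreConfSketch {
  // Copy -D system properties that look like Hive/JDO settings into the
  // HiveConf that will back the embedded metastore.
  static HiveConf confWithCommandLineOverrides() {
    HiveConf conf = new HiveConf();
    Properties sysProps = System.getProperties();
    for (Map.Entry<Object, Object> e : sysProps.entrySet()) {
      String key = e.getKey().toString();
      if (key.startsWith("hive.") || key.startsWith("javax.jdo.option.")) {
        conf.set(key, e.getValue().toString());
      }
    }
    return conf;
  }
}
{code}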





[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)

2014-09-02 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14118909#comment-14118909
 ] 

Eric Hanson commented on HIVE-7901:
---

Okay, thanks

 CLONE - pig -useHCatalog with embedded metastore fails to pass command line 
 args to metastore (org.apache.hive.hcatalog version)
 

 Key: HIVE-7901
 URL: https://issues.apache.org/jira/browse/HIVE-7901
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Eric Hanson
 Attachments: hive-7901.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)

2014-08-29 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-7901:
--

Attachment: hive-7901.01.patch

I modified the original HIVE-6633 patch to put the changes in the right place, 
under apache/hive.  This is a new patch for those changes based directly off 
the current hive trunk.

 CLONE - pig -useHCatalog with embedded metastore fails to pass command line 
 args to metastore (org.apache.hive.hcatalog version)
 

 Key: HIVE-7901
 URL: https://issues.apache.org/jira/browse/HIVE-7901
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Eric Hanson
 Attachments: hive-7901.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)

2014-08-29 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-7901:
--

Status: Patch Available  (was: Open)

 CLONE - pig -useHCatalog with embedded metastore fails to pass command line 
 args to metastore (org.apache.hive.hcatalog version)
 

 Key: HIVE-7901
 URL: https://issues.apache.org/jira/browse/HIVE-7901
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Eric Hanson
 Attachments: hive-7901.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)

2014-08-29 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14115550#comment-14115550
 ] 

Eric Hanson commented on HIVE-7901:
---

[~sushanth], please have a look and +1/commit if you think it's ready. Thanks! 

 CLONE - pig -useHCatalog with embedded metastore fails to pass command line 
 args to metastore (org.apache.hive.hcatalog version)
 

 Key: HIVE-7901
 URL: https://issues.apache.org/jira/browse/HIVE-7901
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.14.0
Reporter: Sushanth Sowmyan
Assignee: Eric Hanson
 Attachments: hive-7901.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-08-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114367#comment-14114367
 ] 

Eric Hanson commented on HIVE-6633:
---

Thanks Sushanth for tracking down the problem. I'll regenerate the patch and 
track that on HIVE-7901.

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Commented] (HIVE-7357) Add vectorized support for BINARY data type

2014-07-17 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14065831#comment-14065831
 ] 

Eric Hanson commented on HIVE-7357:
---

Hi Matt. This looks good overall. Please see my comments on ReviewBoard.

 Add vectorized support for BINARY data type
 ---

 Key: HIVE-7357
 URL: https://issues.apache.org/jira/browse/HIVE-7357
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7357.1.patch








[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-09 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14057078#comment-14057078
 ] 

Eric Hanson commented on HIVE-7262:
---

[~mmccline] put a code review at: https://reviews.apache.org/r/23186/. Matt, if 
you could attach this to your JIRAs in the future, that'd be great.

 Partitioned Table Function (PTF) query fails on ORC table when attempting to 
 vectorize
 --

 Key: HIVE-7262
 URL: https://issues.apache.org/jira/browse/HIVE-7262
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch


 In ptf.q, create the part table with STORED AS ORC and SET 
 hive.vectorized.execution.enabled=true;
 Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during 
 vectorization and throw an exception:
 ERROR vector.VectorizationContext 
 (VectorizationContext.java:getInputColumnIndex(186)) - The column 
 BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
 Jitendra pointed out that the routine that returns the VectorizationContext in 
 Vectorizer.java needs to add virtual columns to the map, too.
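As a self-contained illustration of the fix idea (this is not Hive's actual 
VectorizationContext API, just a sketch of the data structure involved): virtual 
columns have to be registered in the context's name-to-index map alongside the 
regular table columns, otherwise lookups during vectorization fail exactly as in 
the error above.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a vectorization context's column map.
class ColumnMapSketch {
  private final Map<String, Integer> columnMap = new HashMap<>();

  ColumnMapSketch(List<String> tableColumns, List<String> virtualColumns) {
    List<String> all = new ArrayList<>(tableColumns);
    all.addAll(virtualColumns);   // register virtual columns too, e.g. BLOCK__OFFSET__INSIDE__FILE
    for (int i = 0; i < all.size(); i++) {
      columnMap.put(all.get(i), i);
    }
  }

  int getInputColumnIndex(String name) {
    Integer idx = columnMap.get(name);
    if (idx == null) {
      throw new IllegalStateException(
          "The column " + name + " is not in the vectorization context column map.");
    }
    return idx;
  }
}
{code}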





[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize

2014-07-09 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14057085#comment-14057085
 ] 

Eric Hanson commented on HIVE-7262:
---

Matt, can you upload your patch to your ReviewBoard page? I didn't see a View 
Diff button. I see you did include a link above -- sorry I missed that.

 Partitioned Table Function (PTF) query fails on ORC table when attempting to 
 vectorize
 --

 Key: HIVE-7262
 URL: https://issues.apache.org/jira/browse/HIVE-7262
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
 Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch


 In ptf.q, create the part table with STORED AS ORC and SET 
 hive.vectorized.execution.enabled=true;
 Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during 
 vectorization and throw an exception:
 ERROR vector.VectorizationContext 
 (VectorizationContext.java:getInputColumnIndex(186)) - The column 
 BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
 Jitendra pointed out that the routine that returns the VectorizationContext in 
 Vectorizer.java needs to add virtual columns to the map, too.





[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending

2014-06-24 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14042330#comment-14042330
 ] 

Eric Hanson commented on HIVE-7266:
---

Also, I recall an error in the past that looked similar to this, which I think 
was related to incorrect column re-use within batches. The code for that was in 
VectorizationContext.

 Optimized HashTable with vectorized map-joins results in String columns 
 extending
 -

 Key: HIVE-7266
 URL: https://issues.apache.org/jira/browse/HIVE-7266
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Matt McCline
 Attachments: hive-7266-small-test.tgz


 The following query returns different results when both vectorized mapjoin 
 and the new optimized hashtable are enabled.
 {code}
 hive> set hive.vectorized.execution.enabled=false;
 hive> select s_suppkey, n_name from supplier, nation where s_nationkey = 
 n_nationkey limit 25;
 ...
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRAN
 7236869 RUSSIA
 2276869 INDIA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIA
 2616869 FRANCE
 {code}
 But when vectorization is enabled, the results are 
 {code}
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRANIA
 7236869 RUSSIA
 2276869 INDIAA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIAQUE
 2616869 FRANCEAQUE
 {code}
 It works correctly with vectorization when the new optimized map-join 
 hashtable is disabled:
 {code}
 hive> set hive.vectorized.execution.enabled=true; 
 
 hive> set hive.mapjoin.optimized.hashtable=false; 
 
 hive> select s_suppkey, n_name from supplier, nation where s_nationkey = 
 n_nationkey limit 25;
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRAN
 7236869 RUSSIA
 2276869 INDIA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIA
 2616869 FRANCE
 {code}





[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending

2014-06-20 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039509#comment-14039509
 ] 

Eric Hanson commented on HIVE-7266:
---

This looks like it might be related to using setRef() in BytesColumnVector when 
setVal() should be used. That is something to look into.
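For anyone unfamiliar with the distinction, here is a standalone illustration of 
the hazard in plain Java (this is not Hive's BytesColumnVector, and the exact 
corruption pattern in the bug may differ): keeping a reference into a reused 
scratch buffer lets a later row garble an earlier value, while copying the bytes 
keeps it stable.
{code}
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RefVsCopy {
  public static void main(String[] args) {
    byte[] scratch = new byte[16];   // reused across rows, like a shared key buffer
    byte[] refBuffer;                // "setRef" style: remember buffer + length
    int refLength;
    byte[] copied;                   // "setVal" style: copy the bytes out

    // Row 1: value "RUSSIA" written into the scratch buffer
    byte[] v1 = "RUSSIA".getBytes(StandardCharsets.UTF_8);
    System.arraycopy(v1, 0, scratch, 0, v1.length);
    refBuffer = scratch;
    refLength = v1.length;
    copied = Arrays.copyOfRange(scratch, 0, v1.length);

    // Row 2 reuses the same scratch buffer with the shorter value "IRAN"
    byte[] v2 = "IRAN".getBytes(StandardCharsets.UTF_8);
    System.arraycopy(v2, 0, scratch, 0, v2.length);

    // The by-reference view of row 1 is now garbled; the copy is intact.
    System.out.println(new String(refBuffer, 0, refLength, StandardCharsets.UTF_8)); // IRANIA
    System.out.println(new String(copied, StandardCharsets.UTF_8));                  // RUSSIA
  }
}
{code}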

 Optimized HashTable with vectorized map-joins results in String columns 
 extending
 -

 Key: HIVE-7266
 URL: https://issues.apache.org/jira/browse/HIVE-7266
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 0.14.0
Reporter: Gopal V
Assignee: Jitendra Nath Pandey
 Attachments: hive-7266-small-test.tgz


 The following query returns different results when both vectorized mapjoin 
 and the new optimized hashtable are enabled.
 {code}
 hive> set hive.vectorized.execution.enabled=false;
 hive> select s_suppkey, n_name from supplier, nation where s_nationkey = 
 n_nationkey limit 25;
 ...
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRAN
 7236869 RUSSIA
 2276869 INDIA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIA
 2616869 FRANCE
 {code}
 But when vectorization is enabled, the results are 
 {code}
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRANIA
 7236869 RUSSIA
 2276869 INDIAA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIAQUE
 2616869 FRANCEAQUE
 {code}
 It works correctly with vectorization when the new optimized map-join 
 hashtable is disabled:
 {code}
 hive> set hive.vectorized.execution.enabled=true; 
 
 hive> set hive.mapjoin.optimized.hashtable=false; 
 
 hive> select s_suppkey, n_name from supplier, nation where s_nationkey = 
 n_nationkey limit 25;
 316869  JAPAN
 1636869 RUSSIA
 1096869 IRAN
 7236869 RUSSIA
 2276869 INDIA
 8516869 ARGENTINA
 2636869 MOZAMBIQUE
 3836869 ROMANIA
 2616869 FRANCE
 {code}





[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches

2014-05-21 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004882#comment-14004882
 ] 

Eric Hanson commented on HIVE-7105:
---

I agree with Remus. If you do want to get good performance with vectorization 
on the reduce side, you'll need to think carefully about how you can 
efficiently create full VectorizedRowBatches. Single-row or small 
VectorizedRowBatches will not give performance gains. Also, if it is expensive 
to load rows into the batches on the reduce side, that could dominate total 
runtime.
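To make that concrete, here is a generic sketch of the batching pattern (this is 
not Hive's ReduceRecordProcessor; the class and flush policy are illustrative, 
though 1024 rows is the default VectorizedRowBatch size): accumulate incoming 
rows into a full batch before handing anything downstream, instead of emitting 
one-row batches that give no vectorization benefit.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class RowBatcher<T> {
  static final int DEFAULT_BATCH_SIZE = 1024;

  private final int batchSize;
  private final List<T> pending = new ArrayList<>();
  private final Consumer<List<T>> downstream;

  RowBatcher(int batchSize, Consumer<List<T>> downstream) {
    this.batchSize = batchSize;
    this.downstream = downstream;
  }

  void add(T row) {
    pending.add(row);
    if (pending.size() == batchSize) {
      flush();
    }
  }

  void flush() {
    if (!pending.isEmpty()) {
      downstream.accept(new ArrayList<>(pending));  // forward a full batch
      pending.clear();
    }
  }
}
{code}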

 Enable ReduceRecordProcessor to generate VectorizedRowBatches
 -

 Key: HIVE-7105
 URL: https://issues.apache.org/jira/browse/HIVE-7105
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Rajesh Balamohan
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-7105.1.patch


 Currently, ReduceRecordProcessor sends one key,value pair at a time to its 
 operator pipeline.  It would be beneficial to send VectorizedRowBatch to 
 downstream operators. 





[jira] [Created] (HIVE-6918) ALTER TABLE using embedded metastore fails with duplicate key violation in 'dbo.SERDES'

2014-04-15 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6918:
-

 Summary: ALTER TABLE using embedded metastore fails with duplicate 
key violation in 'dbo.SERDES'
 Key: HIVE-6918
 URL: https://issues.apache.org/jira/browse/HIVE-6918
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.11.0
 Environment: hive-0.11.0.1.3.7.0-01272
HDInsight version: 2.1.4.0.661685
Reporter: Eric Hanson


An HDInsight customer is doing some heavy metadata operations using an embedded 
metastore. They get an error with a duplicate key in the metastore table 
'dbo.SERDES'. They have multiple jobs running ALTER TABLE concurrently 
(on different tables, I believe) against the same metastore database, with 
each job having an embedded metastore because they set hive.metastore.uris to 
the empty string.

The script looks like:

set hive.metastore.uris=;
...
CREATE EXTERNAL TABLE IF NOT EXISTS 
InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 (
...
)
PARTITIONED BY (tenant string, d string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
;
ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ...;
... (several more like this);
ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ADD IF NOT 
EXISTS PARTITION (tenant='8dddaf7c-2354-47ae-87a7-b781f14f8c11', d='20140414') 
LOCATION 
'wasb://instancespaceb...@advisor27415020383770839.blob.core.windows.net/v0/tenant=8dddaf7c-2354-47ae-87a7-b781f14f8c11/d=20140414/';
... several more like the above (14 ALTER TABLE statements in a row)
...

Then they get this error:

...
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
NestedThrowablesStackTrace:
java.sql.BatchUpdateException: Violation of PRIMARY KEY constraint 
'PK_serdes_SERDE_ID'. Cannot insert duplicate key in object 'dbo.SERDES'. The 
duplicate key value is (209703).
at 
com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1160)
at 
com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:469)
at 
org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:583)
at 
org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:291)
at 
org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:267)
at 
org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.getValue(RDBMSJoinMapStore.java:656)
at 
org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.putAll(RDBMSJoinMapStore.java:195)
at 
org.datanucleus.store.mapped.mapping.MapMapping.postInsert(MapMapping.java:135)
at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:517)
...






[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951057#comment-13951057
 ] 

Eric Hanson commented on HIVE-6633:
---

[~thejas] Can you commit this to 0.13 please?

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951169#comment-13951169
 ] 

Eric Hanson commented on HIVE-6633:
---

[~rhbutani] Can you approve this to go into 0.13 please?

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38958
---



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71328

please add a comment to explain why we use the sum of all the counts here 
to determine the array size.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71329

Consider for readability/encapsulation having a function to compute offset, 
e.g. 

isNull[decimalOffset(index)] = false;

Please add a comment to explain offset logic.

Does addition of decimal affect any other offsets? I guess not.
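Something along these lines (a hypothetical helper, not code from the patch; the 
assumed layout of longs, then doubles, then byte arrays, then decimals in one 
flat isNull[] array is only for illustration):
{code}
class KeyOffsetsSketch {
  private final int longCount;
  private final int doubleCount;
  private final int byteCount;

  KeyOffsetsSketch(int longCount, int doubleCount, int byteCount) {
    this.longCount = longCount;
    this.doubleCount = doubleCount;
    this.byteCount = byteCount;
  }

  // Index of the i-th decimal key in the flat per-key arrays.
  int decimalOffset(int i) {
    return longCount + doubleCount + byteCount + i;
  }
}
// Call sites then read as: isNull[offsets.decimalOffset(index)] = false;
{code}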



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71330

Timestamp is supposed to be represented as a long (# of nanos since epoch). 
 So why is this using a FilterStringColumnBetween?



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71331

 Again, why the string and not the long 'not between' operator?


- Eric Hanson


On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19718/
 ---
 
 (Updated March 28, 2014, 9:56 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6752
 https://issues.apache.org/jira/browse/HIVE-6752
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized Between and IN expressions don't work with decimal, date types.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
 2229079 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 96e74a9 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 c2240c0 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
  5ebab70 
   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19718/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951533#comment-13951533
 ] 

Eric Hanson commented on HIVE-6752:
---

Please see my comments on review board

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951552#comment-13951552
 ] 

Eric Hanson commented on HIVE-6752:
---

+1

Thanks for the response on review board. I agree that it is reasonable to take 
up the issues raised in separate JIRAs, which are not time-critical at this 
point.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, 
 HIVE-6752.4.patch


 Vectorized Between and IN expressions don't work with decimal, date types.





[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13951603#comment-13951603
 ] 

Eric Hanson commented on HIVE-6633:
---

Sushanth, thanks for getting this into 0.13!

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0, 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.





[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-27 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Affects Version/s: 0.14.0

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.
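To make the failure mode concrete, here is a small standalone simulation of the 
quoted logic (this is not the real GenericOptionsParser, and the argument 
handling is simplified): a -D argument whose text contains no = causes the next 
argument to be glued onto it, which is how -useHCatalog ends up inside the 
token-file placeholder.
{code}
import java.util.ArrayList;
import java.util.List;

public class WindowsArgJoinDemo {
  static List<String> preProcess(String[] args) {
    List<String> newArgs = new ArrayList<>();
    for (int i = 0; i < args.length; i++) {
      if (!args[i].startsWith("-D")) {
        newArgs.add(args[i]);
        continue;
      }
      String prop = args[i];
      if (!prop.contains("=") && i < args.length - 1) {
        prop += "=" + args[++i];   // swallows the following argument
      }
      newArgs.add(prop);
    }
    return newArgs;
  }

  public static void main(String[] args) {
    // The placeholder has no '=', so "-useHCatalog" gets glued onto it.
    System.out.println(preProcess(new String[] {
        "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog", "-execute"}));
    // Prints: [-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog, -execute]
  }
}
{code}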





[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-27 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Fix Version/s: (was: 0.13.0)
   0.14.0

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.





Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38752
---


Looks good overall. Only minor comments.


ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt
https://reviews.apache.org/r/19718/#comment71027

please remove all trailing whitespace in this file



ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt
https://reviews.apache.org/r/19718/#comment71034

add blank after //



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/19718/#comment71038

Couldn't determine common type ...

sounds better



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71053

Change comment. This is not a filter, it is a Boolean-valued expression.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71052

Remove the comment about "This is optimized for lookup of the data type of 
the column" because that doesn't apply here since you're using the standard 
HashSet.

But it is still pretty good :-)



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71057

formatting: j=0 ==> j = 0




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71059

add a blank line before the comment and a space after //



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71062

remove "This is optimized"



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
https://reviews.apache.org/r/19718/#comment71061

see formatting comments for DecimalColumnInList


- Eric Hanson


On March 27, 2014, 7:02 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19718/
 ---
 
 (Updated March 27, 2014, 7:02 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6752
 https://issues.apache.org/jira/browse/HIVE-6752
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized Between and IN expressions don't work with decimal, date types.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 96e74a9 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 c2240c0 
   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19718/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.

2014-03-27 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949765#comment-13949765
 ] 

Eric Hanson commented on HIVE-6752:
---

+1

Conditional on addressing my comments in the code review. All of them are minor.

 Vectorized Between and IN expressions don't work with decimal, date types.
 --

 Key: HIVE-6752
 URL: https://issues.apache.org/jira/browse/HIVE-6752
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6752.1.patch


 Vectorized Between and IN expressions don't work with decimal, date types.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-27 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.





[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-26 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948523#comment-13948523
 ] 

Eric Hanson commented on HIVE-6546:
---

I'm not sure I understand what you mean. Can you elaborate? The placeholder is 
getting substituted or eliminated by the templeton controller job. 

If I run this simple Pig script from WebHCat:

emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump 
emp;

Then I see this in the templeton controller job configuration:

templeton.args   
cmd,/c,call,C:\\apps\\dist\\pig-0.12.0.2.0.7.0-1551/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-execute,emp
 = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump 
emp; 

And I see this in the Pig job configuration for the job spawned by the 
templeton controller job:

pig.cmd.args
-Dmapreduce.job.credentials.binary=/c:/hdfs/nm-local-dir/usercache/ehans/appcache/application_1395867453549_0007/container_1395867453549_0007_01_02/container_tokens
 -execute emp = load 
'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; 




 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.





[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-26 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Attachment: HIVE-6546.03.patch

Uploading patch yet again to try to kick off pre-commit tests.

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.





[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-24 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945778#comment-13945778
 ] 

Eric Hanson commented on HIVE-6546:
---

[~thejas] Can you take a look?

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box Windows setup, do the following from a PowerShell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run.
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string "-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog" gets created 
 in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
 } else {
   if (i < args.length - 1) {
     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
   }
 }
 {code}
 The bug is here:
 {code}
 if (prop != null) {
   // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equal sign, so the
   // else branch is run and appends "=-useHCatalog"
   if (prop.contains("=")) {
     // everything good
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];
     }
   }
   newArgs.add(prop);
 }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an "=" sign in it. Or, preProcessForWindows() itself could be changed.





Re: Review Request 19218: Vectorization: some date expressions throw exception.

2014-03-14 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19218/#review37271
---



ql/src/test/results/clientpositive/vectorized_date_funcs.q.out
https://reviews.apache.org/r/19218/#comment68694

it'd be good to remove trailing white space


- Eric Hanson


On March 14, 2014, 9:06 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19218/
 ---
 
 (Updated March 14, 2014, 9:06 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6649
 https://issues.apache.org/jira/browse/HIVE-6649
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Query:
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 
 throws NPE.
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java
  901005e 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringUnaryUDF.java
  4875d0d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColCol.java
  09f6e47 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColScalar.java
  6578907 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddScalarCol.java
  d1156b6 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColCol.java
  15e995c 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColScalar.java
  05b71ac 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffScalarCol.java
  7c76901 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateString.java
  dd84de3 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldString.java
  011a790 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java
  8418587 
   ql/src/test/queries/clientpositive/vectorized_date_funcs.q 6c9515c 
   ql/src/test/results/clientpositive/vectorized_date_funcs.q.out a9d7dde 
 
 Diff: https://reviews.apache.org/r/19218/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-14 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935621#comment-13935621
 ] 

Eric Hanson commented on HIVE-6649:
---

+1

Please see my minor comments on ReviewBoard

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.<init>(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}





Re: Review Request 19216: Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19216/#review37296
---

Ship it!


Ship It!

- Eric Hanson


On March 14, 2014, 8:41 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19216/
 ---
 
 (Updated March 14, 2014, 8:41 a.m.)
 
 
 Review request for hive, Eric Hanson and Remus Rusanu.
 
 
 Bugs: HIVE-6664
 https://issues.apache.org/jira/browse/HIVE-6664
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Following query can show the difference:
 select var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.
 
 
 Diffs
 -
 
   ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 
   ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 507f798 
 
 Diff: https://reviews.apache.org/r/19216/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935852#comment-13935852
 ] 

Eric Hanson commented on HIVE-6664:
---

+1

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 Following query can show the difference:
 select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.

2014-03-14 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935860#comment-13935860
 ] 

Eric Hanson commented on HIVE-6664:
---

In general, sum/avg/variance aggregate results that involve floating point 
arithmetic in the sum calculation will return different answers depending on 
execution order. This is due to the nature of floating point arithmetic, where it 
is easy to show examples where (a + b) + c != a + (b + c). So it is probably 
not critical that row-mode and vector mode have results that are compatible to 
the last decimal place. However, the change here is simple enough and it makes 
for better compatibility without any serious drawbacks for performance, so I 
think this is fine.
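
For illustration only, a small self-contained Java sketch of both effects (the values are made up, and this is not the Hive code itself): double addition depends on grouping, and accumulating exactly as decimal before converting to double can differ from accumulating doubles up front.

{code}
import java.math.BigDecimal;

public class SumOrderDemo {
  public static void main(String[] args) {
    double a = 0.1, b = 0.2, c = 0.3;
    // Floating point addition is not associative:
    System.out.println((a + b) + c);   // 0.6000000000000001
    System.out.println(a + (b + c));   // 0.6

    // Sum exactly as decimal and convert at flush (vector mode) vs.
    // convert each value to double up front and sum (row mode):
    BigDecimal[] vals = { new BigDecimal("0.1"), new BigDecimal("0.2"), new BigDecimal("0.3") };
    BigDecimal decimalSum = BigDecimal.ZERO;
    double doubleSum = 0.0;
    for (BigDecimal v : vals) {
      decimalSum = decimalSum.add(v);   // exact decimal accumulation
      doubleSum += v.doubleValue();     // lossy double accumulation
    }
    System.out.println(decimalSum.doubleValue());  // 0.6
    System.out.println(doubleSum);                 // 0.6000000000000001
  }
}
{code}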

 Vectorized variance computation differs from row mode computation.
 --

 Key: HIVE-6664
 URL: https://issues.apache.org/jira/browse/HIVE-6664
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6664.1.patch


 Following query can show the difference:
 select  var_samp(ss_sales_price), var_pop(ss_sales_price), 
 stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales.
 The reason for the difference is that row mode converts the decimal value to 
 double upfront to calculate sum of values, when computing variance. But the 
 vector mode performs local aggregate sum as decimal and converts into double 
 only at flush.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.

2014-03-13 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933506#comment-13933506
 ] 

Eric Hanson commented on HIVE-6649:
---

Can you put this up on ReviewBoard if you're ready for a review?

 Vectorization: some date expressions throw exception.
 -

 Key: HIVE-6649
 URL: https://issues.apache.org/jira/browse/HIVE-6649
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6649.1.patch


 Query ran with hive.vectorized.execution.enabled=true:
 {code}
 select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)),
datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)),
datediff(date_add(dt, 2), date_sub(dt, 2))
 from vectortab10korc limit 1;
 {code}
 fails with the following error:
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 datediff(date_add(dt, 2), date_sub(dt, 2))
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
   ... 9 more
 Caused by: java.lang.NullPointerException
   at java.lang.String.checkBounds(String.java:400)
   at java.lang.String.init(String.java:569)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190)
   at 
 org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115)
   ... 13 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-12 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-6633:
-

Assignee: Eric Hanson

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-12 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6633:
-

 Summary: pig -useHCatalog with embedded metastore fails to pass 
command line args to metastore
 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
 Fix For: 0.14.0


This fails because the embedded metastore can't connect to the database because 
the command line -D arguments passed to pig are not getting passed to the 
metastore when the embedded metastore is created. Using hive.metastore.uris set 
to the empty string causes creation of an embedded metastore.

pig -useHCatalog -Dhive.metastore.uris= 
-Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ

The goal is to allow a pig job submitted via WebHCat to specify a metastore to 
use via job arguments. That is not working because it is not possible to pass 
Djavax.jdo.option.ConnectionPassword and other necessary arguments to the 
embedded metastore.
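
Not the eventual patch, just a minimal sketch of the general idea under the assumption that the fix boils down to copying the -D overrides onto the HiveConf before the embedded metastore client is built (the helper name below is made up):

{code}
import java.util.Map;
import java.util.Properties;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class EmbeddedMetastoreConfSketch {
  // Hypothetical helper: copy -D system properties such as javax.jdo.option.*
  // onto the HiveConf so the embedded metastore can see them.
  static HiveConf withCommandLineOverrides(HiveConf conf) {
    Properties sysProps = System.getProperties();
    for (Map.Entry<Object, Object> e : sysProps.entrySet()) {
      String name = e.getKey().toString();
      if (name.startsWith("javax.jdo.option.") || name.startsWith("hive.metastore.")) {
        conf.set(name, e.getValue().toString());
      }
    }
    return conf;
  }

  public static void main(String[] args) throws Exception {
    HiveConf conf = withCommandLineOverrides(new HiveConf());
    // With hive.metastore.uris left empty this creates an embedded metastore,
    // which now sees the JDO connection settings passed with -D.
    HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
    System.out.println(client.getAllDatabases());
    client.close();
  }
}
{code}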




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-12 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6633:
--

Status: Patch Available  (was: Open)

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 19140: pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-12 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19140/
---

Review request for hive.


Bugs: HIVE-6633
https://issues.apache.org/jira/browse/HIVE-6633


Repository: hive-git


Description
---

see JIRA


Diffs
-

  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatLoader.java
 a32149c 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/PigHCatUtil.java
 a01d9e3 

Diff: https://reviews.apache.org/r/19140/diff/


Testing
---


Thanks,

Eric Hanson



[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-03-12 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932149#comment-13932149
 ] 

Eric Hanson commented on HIVE-6633:
---

Code review at https://reviews.apache.org/r/19140/

 pig -useHCatalog with embedded metastore fails to pass command line args to 
 metastore
 -

 Key: HIVE-6633
 URL: https://issues.apache.org/jira/browse/HIVE-6633
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.14.0

 Attachments: HIVE-6633.01.patch


 This fails because the embedded metastore can't connect to the database 
 because the command line -D arguments passed to pig are not getting passed to 
 the metastore when the embedded metastore is created. Using 
 hive.metastore.uris set to the empty string causes creation of an embedded 
 metastore.
 pig -useHCatalog -Dhive.metastore.uris= 
 -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
 The goal is to allow a pig job submitted via WebHCat to specify a metastore 
 to use via job arguments. That is not working because it is not possible to 
 pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
 the embedded metastore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.

2014-03-11 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18972/#review36803
---

Ship it!


Ship It!

- Eric Hanson


On March 10, 2014, 9:51 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18972/
 ---
 
 (Updated March 10, 2014, 9:51 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized cast of decimal to string and timestamp produces incorrect result.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 
   common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 
 34bd9d0 
   common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 
 debc270 
   common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 
 9ac68fe 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java
  2e8c3a4 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java
  df7e1ee 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
  832463d 
   ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 
   ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 
 
 Diff: https://reviews.apache.org/r/18972/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6568) Vectorized cast of decimal to string and timestamp produces incorrect result.

2014-03-11 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930574#comment-13930574
 ] 

Eric Hanson commented on HIVE-6568:
---

+1

 Vectorized cast of decimal to string and timestamp produces incorrect result.
 -

 Key: HIVE-6568
 URL: https://issues.apache.org/jira/browse/HIVE-6568
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6568.1.patch, HIVE-6568.2.patch, HIVE-6568.3.patch


 A decimal value 1.23 with scale 5 is represented in string as 1.23000. This 
 behavior is different from HiveDecimal behavior.
 The difference in cast to timestamp is due to more aggressive rounding in 
 vectorized expression.
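
A quick way to see the two string behaviours side by side, using java.math.BigDecimal as a stand-in for the decimal types involved (this is not the Decimal128/HiveDecimal code itself):

{code}
import java.math.BigDecimal;

public class DecimalToStringDemo {
  public static void main(String[] args) {
    BigDecimal d = new BigDecimal("1.23").setScale(5);          // value 1.23 with scale 5
    System.out.println(d.toPlainString());                       // 1.23000  (scale preserved)
    System.out.println(d.stripTrailingZeros().toPlainString());  // 1.23     (trailing zeros dropped)
  }
}
{code}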



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Attachment: HIVE-6546.03.patch

Upload again to try to kick off pre-commit tests

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch, HIVE-6546.03.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.
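
For anyone reproducing this outside a cluster, the gluing behaviour described above can be simulated in a few lines; this is a simplified stand-in for the quoted GenericOptionsParser logic, not the WebHCat code itself, and the argument values are made up:

{code}
import java.util.ArrayList;
import java.util.List;

public class PreProcessForWindowsDemo {
  // Mimics the quoted logic: a -D argument that contains no '=' swallows the next argument.
  static List<String> preProcess(String[] args) {
    List<String> newArgs = new ArrayList<>();
    for (int i = 0; i < args.length; i++) {
      if (!args[i].startsWith("-D")) {
        newArgs.add(args[i]);
        continue;
      }
      String prop = args[i];
      if (!prop.contains("=") && i < args.length - 1) {
        prop += "=" + args[++i];   // here "-useHCatalog" gets appended
      }
      newArgs.add(prop);
    }
    return newArgs;
  }

  public static void main(String[] args) {
    String[] templetonArgs = { "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog", "-execute", "dump emp;" };
    // Prints: [-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog, -execute, dump emp;]
    System.out.println(preProcess(templetonArgs));
  }
}
{code}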



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.

2014-03-10 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18972/#review36703
---


Overall it looks good. Please see my specific comments.


common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java
https://reviews.apache.org/r/18972/#comment67785

Please add one or more tests with a large integer with trailing zeros, e.g.

1234123000

to make sure that comes out right (no zeros get lopped off).
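
For concreteness, a sketch of the kind of check being asked for, written against java.math.BigDecimal as a stand-in since the Decimal128 test helpers aren't shown in this diff:

{code}
import static org.junit.Assert.assertEquals;

import java.math.BigDecimal;
import org.junit.Test;

public class TestLargeIntegerToString {
  @Test
  public void testTrailingZerosPreserved() {
    // A large integer whose trailing zeros must survive the decimal-to-string conversion.
    BigDecimal d = new BigDecimal("1234123000");
    assertEquals("1234123000", d.toPlainString());
  }
}
{code}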



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java
https://reviews.apache.org/r/18972/#comment67784

Please comment why you're using this logic.


- Eric Hanson


On March 10, 2014, 5:02 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18972/
 ---
 
 (Updated March 10, 2014, 5:02 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized cast of decimal to string and timestamp produces incorrect result.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 
   common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 
 34bd9d0 
   common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 
 debc270 
   common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 
 9ac68fe 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java
  2e8c3a4 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java
  df7e1ee 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
  832463d 
   ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 
   ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 
 
 Diff: https://reviews.apache.org/r/18972/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on

2014-03-07 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18808/#review36558
---

Ship it!


looks good to me!


common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
https://reviews.apache.org/r/18808/#comment67561

Can you open a bug for scaleDownTenDestructive, based on what you found? 
You can make it low priority since it is not getting called in the current code 
paths. But it will be good to have a record of it.


- Eric Hanson


On March 7, 2014, 4:39 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18808/
 ---
 
 (Updated March 7, 2014, 4:39 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6511
 https://issues.apache.org/jira/browse/HIVE-6511
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 
   common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 
 426c03d 
 
 Diff: https://reviews.apache.org/r/18808/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-07 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924234#comment-13924234
 ] 

Eric Hanson commented on HIVE-6511:
---

+1

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, 
 HIVE-6511.4.patch


 select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from 
 vectortab10korc limit 20 generates following result when vectorization is 
 enabled:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776-8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253559   -19895  73
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994718  31070   94
 1408783849655.676758  34576568-26440  -72
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511544   -28088  72
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323068  -27332  60
 NULL  NULLNULLNULL
 {code}
 When vectorization is disabled, result looks like this:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776-8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253558   -19894  74
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994719  31071   95
 1408783849655.676758  34576567-26441  -73
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511545   -28089  71
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323069  -27331  61
 NULL  NULLNULLNULL
 {code}
 This issue is visible only for certain decimal values. In the above example, rows 
 7, 11, 12, and 15 generate different results.
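
One way to see where the off-by-one differences come from, using plain BigDecimal arithmetic rather than the Hive code itself: truncating the fraction before the narrowing cast gives one value, rounding it gives the other, and row 7 above reproduces exactly that pair.

{code}
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalCastDemo {
  public static void main(String[] args) {
    BigDecimal dc = new BigDecimal("-493942492598.691406");            // row 7 above
    long truncated = dc.longValue();                                    // -493942492598 (fraction discarded)
    long rounded = dc.setScale(0, RoundingMode.HALF_UP).longValue();    // -493942492599 (fraction rounded)
    System.out.println((int) truncated);                                // -21253558  (matches row mode)
    System.out.println((int) rounded);                                  // -21253559  (matches vectorized)
  }
}
{code}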
 vectortab10korc table schema:
 {code}
 t         tinyint         from deserializer   
 si        smallint        from deserializer   
 i         int             from deserializer   
 b         bigint          from deserializer   
 f         float           from deserializer   
 d         double          from deserializer   
 dc        decimal(38,18)  from deserializer   
 bo        boolean         from deserializer   
 s         string          from deserializer   
 s2        string          from deserializer   
 ts        timestamp       from deserializer   

 # Detailed Table Information   
 Database: default  
 Owner:xyz  
 CreateTime:   Tue Feb 25 21:54:28 UTC 2014 
 LastAccessTime:   UNKNOWN  
 Protect Mode: None 
 Retention:0
 Location: 
 hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc 
 Table Type:   MANAGED_TABLE
 Table Parameters:  
   COLUMN_STATS_ACCURATE   true
   numFiles1   
   numRows 1   
   rawDataSize 0   
   totalSize   344748  
   transient_lastDdlTime   1393365281  

 # Storage Information  
 SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
 InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat

[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Fix Version/s: 0.13.0
 Assignee: Eric Hanson
   Status: Patch Available  (was: Open)

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.11.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Attachment: HIVE-6546.01.patch

Changed constant placeholder to include = sign

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Attachment: HIVE-6546.02.patch

removed trailing white space

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 18816: WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18816/
---

Review request for hive.


Bugs: HIVE-6546
https://issues.apache.org/jira/browse/HIVE-6546


Repository: hive-git


Description
---

See JIRA


Diffs
-

  
hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java
 482e993 

Diff: https://reviews.apache.org/r/18816/diff/


Testing
---


Thanks,

Eric Hanson



[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921553#comment-13921553
 ] 

Eric Hanson commented on HIVE-6546:
---

Code review at https://reviews.apache.org/r/18816/

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-05 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Attachment: HIVE-6546.03.patch

fix typo

 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, 
 HIVE-6546.03.patch


 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on

2014-03-05 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18808/#review36290
---



common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
https://reviews.apache.org/r/18808/#comment67244

Nice idea to special-case signum==0 and scale==0 cases to speed it up.



common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
https://reviews.apache.org/r/18808/#comment67243

Decimal128.divideDestructive had a bug that we worked around by just 
rewriting it to use HiveDecimal divide.

I am worried that UnsignedInt128.divideDestructive could have been the 
original source of the bug.

That makes me think it might be safer to just use the HiveDecimal code here 
to do the divide by 10**scale.
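
For reference, the safer path suggested above can be very small; a sketch using java.math.BigDecimal as a stand-in for going through HiveDecimal (the rounding behaviour here is illustrative, not a claim about HiveDecimal's):

{code}
import java.math.BigDecimal;

public class ScaleDownSketch {
  // Divide an unscaled value by 10**scale without relying on UnsignedInt128.divideDestructive.
  static long scaleDownToLong(BigDecimal unscaled, int scale) {
    return unscaled.scaleByPowerOfTen(-scale)   // exact shift of the decimal point
                   .longValue();                // fraction discarded
  }

  public static void main(String[] args) {
    // 4619756289662078125 with scale 6 represents 4619756289662.078125
    System.out.println(scaleDownToLong(new BigDecimal("4619756289662078125"), 6));  // 4619756289662
  }
}
{code}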



- Eric Hanson


On March 5, 2014, 9:39 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18808/
 ---
 
 (Updated March 5, 2014, 9:39 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6511
 https://issues.apache.org/jira/browse/HIVE-6511
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 
   common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 
 426c03d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java
  d5f34d5 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java
  df7e1ee 
 
 Diff: https://reviews.apache.org/r/18808/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Created] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-04 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6546:
-

 Summary: WebHCat job submission for pig with -useHCatalog argument 
fails on Windows
 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0, 0.11.0, 0.13.0
 Environment: Windows Azure HDINSIGHT and Windows one-box installations.
Reporter: Eric Hanson






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-04 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Description: 
On a one-box windows setup, do the following from a powershell prompt:

cmd /c curl.exe -s `
  -d user.name=hadoop `
  -d arg=-useHCatalog `
  -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
  -d statusdir=/tmp/webhcat.output01 `
  'http://localhost:50111/templeton/v1/pig' -v

The job fails with error code 7, but it should run. 

I traced this down to the following. In the job configuration for the 
TempletonJobController, we have templeton.args set to

cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
 = load '/data/emp/emp_0.dat'; dump emp;

Notice the = sign before -useHCatalog. I think this should be a comma.

The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in  
org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().

It happens at line 434:
{code}
  } else {
    if (i < args.length - 1) {
      prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
    }
  }
{code}

Bug is here:
{code}
  if (prop != null) {
    if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                // not contain "=", so the else branch runs and appends "=-useHCatalog"
      // everything good
    } else {
      if (i < args.length - 1) {
        prop += "=" + args[++i];
      }
    }
    newArgs.add(prop);
  }
{code}
One possible fix is to change the string constant 
org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
 to have an = sign in it. Or, preProcessForWindows() itself could be changed.


 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: Windows Azure HDINSIGHT and Windows one-box 
 installations.
Reporter: Eric Hanson

 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows

2014-03-04 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6546:
--

Environment: 
HDInsight deploying HDP 1.3:  c:\apps\dist\pig-0.11.0.1.3.2.0-05
Also on Windows HDP 1.3 one-box configuration.

  was:Windows Azure HDINSIGHT and Windows one-box installations.


 WebHCat job submission for pig with -useHCatalog argument fails on Windows
 --

 Key: HIVE-6546
 URL: https://issues.apache.org/jira/browse/HIVE-6546
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.11.0, 0.12.0, 0.13.0
 Environment: HDInsight deploying HDP 1.3:  
 c:\apps\dist\pig-0.11.0.1.3.2.0-05
 Also on Windows HDP 1.3 one-box configuration.
Reporter: Eric Hanson

 On a one-box windows setup, do the following from a powershell prompt:
 cmd /c curl.exe -s `
   -d user.name=hadoop `
   -d arg=-useHCatalog `
   -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; `
   -d statusdir=/tmp/webhcat.output01 `
   'http://localhost:50111/templeton/v1/pig' -v
 The job fails with error code 7, but it should run. 
 I traced this down to the following. In the job configuration for the 
 TempletonJobController, we have templeton.args set to
 cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp
  = load '/data/emp/emp_0.dat'; dump emp;
 Notice the = sign before -useHCatalog. I think this should be a comma.
 The bad string D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created 
 in  org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows().
 It happens at line 434:
 {code}
   } else {
     if (i < args.length - 1) {
       prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
     }
   }
 {code}
 Bug is here:
 {code}
   if (prop != null) {
     if (prop.contains("=")) {   // "-D__WEBHCAT_TOKEN_FILE_LOCATION__" does
                                 // not contain "=", so the else branch runs and appends "=-useHCatalog"
       // everything good
     } else {
       if (i < args.length - 1) {
         prop += "=" + args[++i];
       }
     }
     newArgs.add(prop);
   }
 {code}
 One possible fix is to change the string constant 
 org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER
  to have an = sign in it. Or, preProcessForWindows() itself could be 
 changed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on

2014-03-03 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918290#comment-13918290
 ] 

Eric Hanson commented on HIVE-6511:
---

Can you put this up on ReviewBoard?

 casting from decimal to tinyint,smallint, int and bigint generates different 
 result when vectorization is on
 

 Key: HIVE-6511
 URL: https://issues.apache.org/jira/browse/HIVE-6511
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6511.1.patch


 select dc,cast(dc as int), cast(dc as smallint),cast(dc as tinyint) from 
 vectortab10korc limit 20 generates following result when vectorization is 
 enabled:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776-8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253559   -19895  73
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994718  31070   94
 1408783849655.676758  34576568-26440  -72
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511544   -28088  72
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323068  -27332  60
 NULL  NULLNULLNULL
 {code}
 When vectorization is disabled, result looks like this:
 {code}
 4619756289662.078125  -1628520834 -16770  126
 1553532646710.316406  -1245514442 -2762   54
 3367942487288.360352  688127224   -776-8
 4386447830839.337891  1286221623  12087   55
 -3234165331139.458008 -54957251   27453   61
 -488378613475.326172  1247658269  -16099  29
 -493942492598.691406  -21253558   -19894  74
 3101852523586.039062  886135874   23618   66
 2544105595941.381836  1484956709  -23515  37
 -3997512403067.0625   1102149509  30597   -123
 -1183754978977.589355 1655994719  31071   95
 1408783849655.676758  34576567-26441  -73
 -2993175106993.426758 417098319   27215   79
 3004723551798.100586  -1753555402 -8650   54
 1103792083527.786133  -14511545   -28089  71
 469767055288.485352   1615620024  26552   -72
 -1263700791098.294434 -980406074  12486   -58
 -4244889766496.484375 -1462078048 30112   -96
 -3962729491139.782715 1525323069  -27331  61
 NULL  NULLNULLNULL
 {code}
 This issue is visible only for certain decimal values. In the above example, rows 
 7, 11, 12, and 15 generate different results.
 vectortab10korc table schema:
 {code}
 t         tinyint         from deserializer   
 si        smallint        from deserializer   
 i         int             from deserializer   
 b         bigint          from deserializer   
 f         float           from deserializer   
 d         double          from deserializer   
 dc        decimal(38,18)  from deserializer   
 bo        boolean         from deserializer   
 s         string          from deserializer   
 s2        string          from deserializer   
 ts        timestamp       from deserializer   

 # Detailed Table Information   
 Database: default  
 Owner:xyz  
 CreateTime:   Tue Feb 25 21:54:28 UTC 2014 
 LastAccessTime:   UNKNOWN  
 Protect Mode: None 
 Retention:0
 Location: 
 hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc 
 Table Type:   MANAGED_TABLE
 Table Parameters:  
   COLUMN_STATS_ACCURATE   true
   numFiles1   
   numRows 1   
   rawDataSize 0   
   totalSize   344748  
   transient_lastDdlTime   1393365281  

 # Storage Information  
 SerDe Library:org.apache.hadoop.hive.ql.io.orc.OrcSerde
 InputFormat:  org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  
 OutputFormat

RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang

2014-02-28 Thread Eric Hanson (BIG DATA)
Congratulations Xuefu!

-Original Message-
From: Remus Rusanu [mailto:rem...@microsoft.com] 
Sent: Friday, February 28, 2014 11:43 AM
To: dev@hive.apache.org; u...@hive.apache.org
Cc: Xuefu Zhang
Subject: RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang

Grats!

From: Prasanth Jayachandran pjayachand...@hortonworks.com
Sent: Friday, February 28, 2014 9:11 PM
To: dev@hive.apache.org
Cc: u...@hive.apache.org; Xuefu Zhang
Subject: Re: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang

Congratulations Xuefu!

Thanks
Prasanth Jayachandran

On Feb 28, 2014, at 11:04 AM, Vaibhav Gumashta vgumas...@hortonworks.com 
wrote:

 Congrats Xuefu!


 On Fri, Feb 28, 2014 at 9:20 AM, Prasad Mujumdar pras...@cloudera.com wrote:

   Congratulations Xuefu !!

 thanks
 Prasad



 On Fri, Feb 28, 2014 at 1:20 AM, Carl Steinbach c...@apache.org wrote:

 I am pleased to announce that Xuefu Zhang has been elected to the 
 Hive Project Management Committee. Please join me in congratulating Xuefu!

 Thanks.

 Carl








need advice on debugging into TempletonJobController.java

2014-02-28 Thread Eric Hanson (BIG DATA)
I want to attach a debugger to TempletonJobController.java (code that runs in a 
map job started by templeton service, that in turn will start another job). 
Does anybody know how to make the job wait for a debugger to attach? i.e. what 
file to modify to change the java opts?

Eric

Details of what I tried:

I tried adding it in %hadoop_home%/conf/mapred-site.xml but it didn't work:

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xdebug -Djava.compiler=NONE
      -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -Xmx1024m</value>
  </property>

I also tried this, in:

%hcatalog_home%\etc\webhcat\webhcat-default.xml

Adding:

  <property>
    <name>templeton.controller.mr.child.opts</name>
    <value>-Xdebug -Djava.compiler=NONE
      -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m
      -Djava.net.preferIPv4Stack=true</value>
    <description>Java options to be passed to templeton controller map task.
      The default value of mapreduce child -Xmx (heap memory limit)
      might be close to what is allowed for a map task.
      Even if templeton controller map task does not need much
      memory, the jvm (with -server option?)
      allocates the max memory when it starts. This along with the
      memory used by pig/hive client it starts can end up exceeding
      the max memory configured to be allowed for a map task.
      Use this option to set -Xmx to lower value
    </description>
  </property>

But the job doesn't appear to wait, and I keep seeing this in my job config:

mapred.child.java.opts

-server -Xmx256m -Djava.net.preferIPv4Stack=true




RE: need advice on debugging into TempletonJobController.java

2014-02-28 Thread Eric Hanson (BIG DATA)
Hey, I found the solution. You need to add this to webhcat-site.xml. -Eric


To attach the debugger to the templeton controller MR job started by the 
templeton service, go to %hcatalog_home%\conf\webhcat-site.xml and add the 
following block (copied from etc\webhcat\webhcat-default.xml, and enhanced with 
the -Xdebug/-Xrunjdwp options for debugging).
  <property>
    <name>templeton.controller.mr.child.opts</name>
    <value>-Xdebug -Djava.compiler=NONE
      -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m
      -Djava.net.preferIPv4Stack=true</value>
    <description>Java options to be passed to templeton controller map task.
      The default value of mapreduce child -Xmx (heap memory limit)
      might be close to what is allowed for a map task.
      Even if templeton controller map task does not need much
      memory, the jvm (with -server option?)
      allocates the max memory when it starts. This along with the
      memory used by pig/hive client it starts can end up exceeding
      the max memory configured to be allowed for a map task.
      Use this option to set -Xmx to lower value
    </description>
  </property>





Re: Review Request 18566: Queries fail to Vectorize.

2014-02-27 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18566/#review35682
---


Looks good.

Please add unit tests to exercise the code you changed, or if this code is 
already covered by other tests, please explain in comments on the JIRA.



common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java
https://reviews.apache.org/r/18566/#comment66370

Please add comment saying the purpose of this method


- Eric Hanson


On Feb. 27, 2014, 6:43 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18566/
 ---
 
 (Updated Feb. 27, 2014, 6:43 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6496
 https://issues.apache.org/jira/browse/HIVE-6496
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 1) NPE because row resolver is null.
 2) VectorUDFAdapter doesn't handle decimal.
 3) Decimal cast to boolean, timestamp, string fail because classes are not 
 annotated appropriately.
 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java 09af28a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java
  4de9f9f 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 842994e 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFAdaptor.java 
 3bc9493 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 e6be03f 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 54c665e 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java 
 db4eafa 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTimestamp.java 
 e2529d2 
 
 Diff: https://reviews.apache.org/r/18566/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6496) Queries fail to Vectorize.

2014-02-27 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914962#comment-13914962
 ] 

Eric Hanson commented on HIVE-6496:
---

+1 conditional on addressing my review comments

 Queries fail to Vectorize.
 --

 Key: HIVE-6496
 URL: https://issues.apache.org/jira/browse/HIVE-6496
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6496.1.patch, HIVE-6496.2.patch, HIVE-6496.3.patch


 Following issues are causing many queries to fail to vectorize:
 1) NPE because row resolver is null.
 2) VectorUDFAdapter doesn't handle decimal.
 3) Decimal cast to boolean, timestamp, string fail because classes are not 
 annotated appropriately.
 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


RE: [ANNOUNCE] New Hive Committer - Remus Rusanu

2014-02-26 Thread Eric Hanson (BIG DATA)
Fantastic! Welcome aboard, Remus!

Eric

From: Carl Steinbach [mailto:cwsteinb...@gmail.com]
Sent: Wednesday, February 26, 2014 8:59 AM
To: u...@hive.apache.org; dev@hive.apache.org
Cc: Remus Rusanu
Subject: [ANNOUNCE] New Hive Committer - Remus Rusanu

The Apache Hive PMC has voted to make Remus Rusanu a committer on the Apache 
Hive Project.

Please join me in congratulating Remus!

Thanks.

Carl



Re: Review Request 18184: Vectorized mathematical functions for decimal type.

2014-02-18 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18184/#review34783
---



ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt
https://reviews.apache.org/r/18184/#comment65070

format comment better (blank after //, blank line before first comment line)



ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt
https://reviews.apache.org/r/18184/#comment65066

I think you could speed this up with an array fill operation for 
outputIsNull before the loop, but that is a nice-to-have and not essential.
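
A minimal sketch of the bulk-fill idea (the outputIsNull array and the batch size n 
are assumptions based on the usual vectorized expression templates, not the exact 
generated code under review):

  import java.util.Arrays;

  // Clear the null flags for the first n entries in one bulk operation
  // instead of assigning outputIsNull[i] = false inside the per-row loop.
  static void clearNullFlags(boolean[] outputIsNull, int n) {
    Arrays.fill(outputIsNull, 0, n, false);
  }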



ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt
https://reviews.apache.org/r/18184/#comment65071

remove trailing white space



ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt
https://reviews.apache.org/r/18184/#comment65073

remove trailing white space



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/18184/#comment65132

Please add comment to explain what method does.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
https://reviews.apache.org/r/18184/#comment65136

delete trailing white space



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
https://reviews.apache.org/r/18184/#comment65137

delete trailing white space




ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
https://reviews.apache.org/r/18184/#comment65138

fix comment format



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
https://reviews.apache.org/r/18184/#comment65139

remove trailing white space



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
https://reviews.apache.org/r/18184/#comment65140

remove trailing white space



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java
https://reviews.apache.org/r/18184/#comment65144

please add cases for non-zero values close to 0 like -0.3 and 0.3

for floor and ceiling



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java
https://reviews.apache.org/r/18184/#comment65147

Please add test to negate 0 and make sure you still get 0



ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q
https://reviews.apache.org/r/18184/#comment65148

please remove trailing white space in .q file (several locations)


- Eric Hanson


On Feb. 17, 2014, 9:05 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18184/
 ---
 
 (Updated Feb. 17, 2014, 9:05 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6416
 https://issues.apache.org/jira/browse/HIVE-6416
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Vectorized mathematical functions for decimal type.
 
 
 Diffs
 -
 
   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 1b76fc9 
   common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 2e0f058 
   ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 f69bfc0 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalUtil.java
  589450f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 628f06d 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java 
 1c1bcfe 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCeil.java 
 ceb56bb 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFloor.java 
 a95a263 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNegative.java 
 f355a82 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java 
 5cc8025 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_decimal_math_funcs.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18184/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Commented] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-18 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13904829#comment-13904829
 ] 

Eric Hanson commented on HIVE-6416:
---

Looks good to me.

+1 conditional on addressing my review comments (all of which are minor)

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch


 Vectorized mathematical functions for decimal type.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-17 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Vectorization
Reporter: Eric Hanson
Assignee: Eric Hanson
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, 
 HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.
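
A quick cross-check of the values quoted above with plain BigInteger (a sanity-check 
sketch, not part of any patch):

  import java.math.BigInteger;

  // Recompute the product from the JIRA description with exact integer arithmetic.
  public class Decimal128MultiplyCheck {
    public static void main(String[] args) {
      BigInteger a = new BigInteger("-605044214913338382");
      BigInteger b = new BigInteger("55269579109718297360");
      BigInteger expected = new BigInteger("-33440539101030154945490585226577271520");
      System.out.println(a.multiply(b).equals(expected));  // expected to print true
    }
  }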



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive

2014-02-17 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6452:
-

 Summary: fix bug in UnsignedInt128.multiplyArrays4And4To8 and 
revert temporary fix in Decimal128.multiplyDestructive
 Key: HIVE-6452
 URL: https://issues.apache.org/jira/browse/HIVE-6452
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson


UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply 
failures, one of which appears in TestDecimal128.testKnownPriorErrors.

Fix the bug by finishing the TODO section in 
UnsignedInt128.multiplyArrays4And4To8 in the provided 
multiplyArrays4And4To8-start.patch. Make it fast and make it work with no 
per-operation storage allocations.

Retain the rest of the work (the new tests) in 
multiplyArrays4And4To8-start.patch as much as possible.

Revert the changes to Decimal128.multiplyDestructive so it doesn't use the 
short-term, slow fix based on HiveDecimal. I.e. use the implementation in 
multiplyDestructiveNativeDecimal128.
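
For reference, a conceptual sketch of a 4x4-to-8 limb schoolbook multiply in the 
spirit of BigInteger.multiplyToLen. This assumes little-endian 32-bit limbs and is 
not the actual UnsignedInt128 code:

  // Schoolbook multiply of two little-endian 4-limb unsigned 32-bit numbers
  // into an 8-limb product, with no per-operation allocation when the caller
  // supplies the output array z (length 8).
  static void multiplyArrays4And4To8Sketch(int[] x, int[] y, int[] z) {
    final long MASK = 0xFFFFFFFFL;            // treat int limbs as unsigned
    java.util.Arrays.fill(z, 0, 8, 0);
    for (int i = 0; i < 4; i++) {
      long carry = 0;
      long xi = x[i] & MASK;
      for (int j = 0; j < 4; j++) {
        long sum = xi * (y[j] & MASK) + (z[i + j] & MASK) + carry;
        z[i + j] = (int) sum;                 // low 32 bits of the partial sum
        carry = sum >>> 32;                   // high 32 bits carry into the next limb
      }
      z[i + 4] = (int) carry;                 // final carry of this row
    }
  }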




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive

2014-02-17 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6452:
--

Attachment: multiplyArrays4And4To8-start.patch

 fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in 
 Decimal128.multiplyDestructive
 ---

 Key: HIVE-6452
 URL: https://issues.apache.org/jira/browse/HIVE-6452
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
 Attachments: multiplyArrays4And4To8-start.patch


 UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply 
 failures, one of which appears in TestDecimal128.testKnownPriorErrors.
 Fix the bug by finishing the TODO section in 
 UnsignedInt128.multiplyArrays4And4To8 in the provided 
 multiplyArrays4And4To8-start.patch. Make it fast and make it work with no 
 per-operation storage allocations.
 Retain the rest of the work (the new tests) in 
 multiplyArrays4And4To8-start.patch as much as possible.
 Revert the changes to Decimal128.multiplyDestructive so it doesn't use the 
 short-term, slow fix based on HiveDecimal. I.e. use the implementation in 
 multiplyDestructiveNativeDecimal128.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive

2014-02-17 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6452:
--

Assignee: Jitendra Nath Pandey

 fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in 
 Decimal128.multiplyDestructive
 ---

 Key: HIVE-6452
 URL: https://issues.apache.org/jira/browse/HIVE-6452
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: multiplyArrays4And4To8-start.patch


 UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply 
 failures, one of which appears in TestDecimal128.testKnownPriorErrors.
 Fix the bug by finishing the TODO section in 
 UnsignedInt128.multiplyArrays4And4To8 in the provided 
 multiplyArrays4And4To8-start.patch. Make it fast and make it work with no 
 per-operation storage allocations.
 Retain the rest of the work (the new tests) in 
 multiplyArrays4And4To8-start.patch as much as possible.
 Revert the changes to Decimal128.multiplyDestructive so it doesn't use the 
 short-term, slow fix based on HiveDecimal. I.e. use the implementation in 
 multiplyDestructiveNativeDecimal128.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6435) Allow specification of alternate metastore in WebHCat job

2014-02-15 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6435:
--

Description: Allow a user to specify with their WebHCat Hive and Pig jobs a 
metastore database JDBC connection string. For the job, this overrides the 
default metastore configured for the cluster.  (was: Allow a user to specify 
with their WebHCat jobs a metastore database JDBC connection string. For the 
job, this overrides the default metastore configured for the cluster.)

 Allow specification of alternate metastore in WebHCat job
 -

 Key: HIVE-6435
 URL: https://issues.apache.org/jira/browse/HIVE-6435
 Project: Hive
  Issue Type: Improvement
  Components: CLI, WebHCat
Reporter: Eric Hanson
Assignee: Eric Hanson

 Allow a user to specify with their WebHCat Hive and Pig jobs a metastore 
 database JDBC connection string. For the job, this overrides the default 
 metastore configured for the cluster.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-5759) Implement vectorized support for COALESCE conditional expression

2014-02-14 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901749#comment-13901749
 ] 

Eric Hanson commented on HIVE-5759:
---

+1

Also, the failure in testHighPrecisionDecimal128Multiply is external to this 
patch.

 Implement vectorized support for COALESCE conditional expression
 

 Key: HIVE-5759
 URL: https://issues.apache.org/jira/browse/HIVE-5759
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5759.1.patch, HIVE-5759.2.patch


 Implement full, end-to-end support for COALESCE in vectorized mode, including 
 new VectorExpression class(es), VectorizationContext translation to a 
 VectorExpression, and unit tests for these, as well as end-to-end ad hoc 
 testing. An end-to-end .q test is recommended.
 This is lower priority than IF and CASE but it is still a fairly popular 
 expression.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-14 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-6399:
-

Assignee: Eric Hanson  (was: Remus Rusanu)

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Vectorization
Reporter: Eric Hanson
Assignee: Eric Hanson
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, 
 HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-14 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901995#comment-13901995
 ] 

Eric Hanson commented on HIVE-6399:
---

Remus' patch is technically good. I have a question, though, that I'll raise with 
the PMC about the comment on using the algorithm from BigInteger.multiplyToLen. 
For now I'm going to promote my original patch to get it in so we can get the bug 
failure out of trunk.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Vectorization
Reporter: Eric Hanson
Assignee: Eric Hanson
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, 
 HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-14 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Attachment: HIVE-6399.05.patch

Promoting patch 02 to first position to get committed, now as 05.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Vectorization
Reporter: Eric Hanson
Assignee: Eric Hanson
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, 
 HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6435) Allow specification of alternate metastore in WebHCat job

2014-02-14 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6435:
-

 Summary: Allow specification of alternate metastore in WebHCat job
 Key: HIVE-6435
 URL: https://issues.apache.org/jira/browse/HIVE-6435
 Project: Hive
  Issue Type: Improvement
  Components: CLI, WebHCat
Reporter: Eric Hanson
Assignee: Eric Hanson


Allow a user to specify with their WebHCat jobs a metastore database JDBC 
connection string. For the job, this overrides the default metastore configured 
for the cluster.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job

2014-02-14 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6436:
-

 Summary: Allow specification of one or more additional Windows 
Azure storage accounts in WebHCat job
 Key: HIVE-6436
 URL: https://issues.apache.org/jira/browse/HIVE-6436
 Project: Hive
  Issue Type: Improvement
  Components: CLI, WebHCat
Reporter: Eric Hanson


Allow a user to specify one or more additional Windows Azure storage accounts, 
including account name and key, in a WebHCat Hive job submission. These would 
be in addition to any that were specified in the default cluster configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job

2014-02-14 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-6436:
-

Assignee: Eric Hanson

 Allow specification of one or more additional Windows Azure storage accounts 
 in WebHCat job
 ---

 Key: HIVE-6436
 URL: https://issues.apache.org/jira/browse/HIVE-6436
 Project: Hive
  Issue Type: Improvement
  Components: CLI, WebHCat
Reporter: Eric Hanson
Assignee: Eric Hanson

 Allow a user to specify one or more additional Windows Azure storage 
 accounts, including account name and key, in a WebHCat Hive job submission. 
 These would be in addition to any that were specified in the default cluster 
 configuration.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression

2014-02-13 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18025/#review34370
---



ql/src/test/queries/clientpositive/vector_coalesce.q
https://reviews.apache.org/r/18025/#comment64447

Can you do one with > 3 arguments too? Will that vectorize?



ql/src/test/queries/clientpositive/vector_coalesce.q
https://reviews.apache.org/r/18025/#comment64450

Please also test for smallint and timestamp.



ql/src/test/queries/clientpositive/vector_coalesce.q
https://reviews.apache.org/r/18025/#comment64451

Please also test for expressions as arguments, not just columns.



ql/src/test/queries/clientpositive/vector_coalesce.q
https://reviews.apache.org/r/18025/#comment64448

It is not unusual to use COALESCE like this:

COALESCE(col1, ..., colK, 0)

So if arguments 1..K are NULL, the default value is the constant at the 
end, 0 in this case. Could you please make that work in this patch, or open a 
separate JIRA to do it later?
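
A toy illustration of the semantics being asked for, written over plain arrays 
rather than the real VectorCoalesce classes:

  // COALESCE(col1, ..., colK, 0): return the first non-null argument,
  // falling back to the trailing constant when every column value is null.
  static long coalesceWithDefault(Long[] args, long defaultValue) {
    for (Long v : args) {
      if (v != null) {
        return v;
      }
    }
    return defaultValue;
  }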


- Eric Hanson


On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18025/
 ---
 
 (Updated Feb. 12, 2014, 7 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-5759
 https://issues.apache.org/jira/browse/HIVE-5759
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Implement vectorized support for COALESCE conditional expression
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java 
 f1eef14 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java 
 d0d8597 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java 
 cb23129 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java 
 aa05b19 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 7141d63 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 21fe8ca 
   ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18025/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-12 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Attachment: HIVE-6399.02.patch

Uploading again to trigger precommit tests.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, 
 HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression

2014-02-12 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18025/#review34336
---



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java
https://reviews.apache.org/r/18025/#comment64388

I think setRef is only safe for base vectors (that get data from table 
columns), not intermediate working results. There was a bug there since those can 
get re-used during processing of a single vectorized row batch.

So, use setVal here unless you know the source vector is a base vector 
loaded from a table column.
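
A toy example of the hazard described above (the constructor and method names are 
assumptions based on the BytesColumnVector API of the time, not the patch under 
review):

  import java.nio.charset.StandardCharsets;
  import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;

  public class SetRefVsSetVal {
    public static void main(String[] args) {
      BytesColumnVector outV = new BytesColumnVector();
      outV.initBuffer();                            // backing storage for setVal copies
      byte[] scratch = "hello".getBytes(StandardCharsets.UTF_8);

      outV.setRef(0, scratch, 0, scratch.length);   // element 0 only references scratch
      outV.setVal(1, scratch, 0, scratch.length);   // element 1 gets its own copy

      scratch[0] = 'X';                             // simulate reuse of the scratch buffer
      // element 0 now silently reads back as "Xello"; element 1 still reads "hello"
    }
  }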



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java
https://reviews.apache.org/r/18025/#comment64389

use .update() instead of = assignment or you could have a bug.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/18025/#comment64392

please add comment to explain what method does



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/18025/#comment64395

This is the same code block as the previous case. Can you share the case 
and change the condition to an OR?

Up to you...



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java
https://reviews.apache.org/r/18025/#comment64396

I'm not sure this is always EOF. Consider deleting ", this is EOF".



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java
https://reviews.apache.org/r/18025/#comment64397

This can have > 1 argument. Please add comment to explain.





ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java
https://reviews.apache.org/r/18025/#comment64398

What happens if one of the inputs is a scalar, not a column? 



ql/src/test/queries/clientpositive/vector_coalesce.q
https://reviews.apache.org/r/18025/#comment64399

ERIC TODO: start reviewing here


- Eric Hanson


On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18025/
 ---
 
 (Updated Feb. 12, 2014, 7 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-5759
 https://issues.apache.org/jira/browse/HIVE-5759
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Implement vectorized support for COALESCE conditional expression
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java 
 f1eef14 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java 
 d0d8597 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java 
 cb23129 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java 
 aa05b19 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 7141d63 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 21fe8ca 
   ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION 
   ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18025/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson reassigned HIVE-6399:
-

Assignee: Eric Hanson

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Attachment: HIVE-6399.02.patch

Patch with update to Decimal128.multiplyDestructive() to make it use 
HiveDecimal.multiply internally, plus updated tests.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6399 started by Eric Hanson.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work stopped] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6399 stopped by Eric Hanson.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6399 started by Eric Hanson.

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Status: Patch Available  (was: In Progress)

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-11 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13898550#comment-13898550
 ] 

Eric Hanson commented on HIVE-6399:
---

Review board entry: https://reviews.apache.org/r/17972/

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17769: Generate vectorized plan for decimal expressions.

2014-02-10 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17769/#review34088
---

Ship it!


The functionality looks good. Please address the minor issues about the 
comments that I pointed out. No need for me to do another review.


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment64058

there - their



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment64060

Please add comment before method explaining what it does.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment64059

loose - lose


- Eric Hanson


On Feb. 8, 2014, 6:15 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17769/
 ---
 
 (Updated Feb. 8, 2014, 6:15 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6333
 https://issues.apache.org/jira/browse/HIVE-6333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Generate vectorized plan for decimal expressions.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 
   
 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt
  699b7c5 
   
 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt
  99366ca 
   ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 
 2aa4152 
   ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 
 2e84334 
   
 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt
  9578d34 
   ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 
 6ee9d5f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java
  1c70387 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 f5ab731 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java 
 f513188 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
  4510368 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java
  6a7762d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java
  14b91e1 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java
  2ba1509 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java
  65a804d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java
  5b2a658 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDoubleToDecimal.java
  14e30c3 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastLongToDecimal.java
  1d4d84d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDecimal.java
  41762ed 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToDecimal.java
  37e92e1 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java
  cac1d80 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java
  93052a1 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncDoubleToDecimal.java
  8b2a6f0 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncLongToDecimal.java
  18d1dbb 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpression.java
  d00d99b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java
  e5c3aa4 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java
  a242fef 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 ad96fa5 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java 4f59125 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java e4dfcc9 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java 4e2d1d4 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 6f9746c 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java e794e92 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 4e64d47 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 
 9a04e81 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqual.java 
 3479b13 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java
  edb1bf8 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic

[jira] [Created] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-10 Thread Eric Hanson (JIRA)
Eric Hanson created HIVE-6399:
-

 Summary: bug in high-precision Decimal128 multiply
 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
 Fix For: 0.13.0


For operation -605044214913338382 * 55269579109718297360

expected: -33440539101030154945490585226577271520
but was:   -33440539021801992431226247633033321184





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.

2014-02-10 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896935#comment-13896935
 ] 

Eric Hanson commented on HIVE-6333:
---

I opened bug HIVE-6399 to track the testHighPrecisionDecimal128Multiply 
failure. It is external to this patch.

 Generate vectorized plan for decimal expressions.
 -

 Key: HIVE-6333
 URL: https://issues.apache.org/jira/browse/HIVE-6333
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, 
 HIVE-6333.4.patch, HIVE-6333.5.patch


 Transform non-vector plan to vectorized plan for supported decimal 
 expressions. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-10 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Description: 
For operation -605044214913338382 * 55269579109718297360

expected: -33440539101030154945490585226577271520
but was:   -33440539021801992431226247633033321184

More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
many times, you'll get an occasional failures. This is one example of such a 
failure.

  was:
For operation -605044214913338382 * 55269579109718297360

expected: -33440539101030154945490585226577271520
but was:   -33440539021801992431226247633033321184




 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
 Fix For: 0.13.0


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failures. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-10 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Description: 
For operation -605044214913338382 * 55269579109718297360

expected: -33440539101030154945490585226577271520
but was:   -33440539021801992431226247633033321184

More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
many times, you'll get an occasional failure. This is one example of such a 
failure.

  was:
For operation -605044214913338382 * 55269579109718297360

expected: -33440539101030154945490585226577271520
but was:   -33440539021801992431226247633033321184

More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
many times, you'll get an occasional failures. This is one example of such a 
failure.


 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
 Fix For: 0.13.0


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply

2014-02-10 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6399:
--

Attachment: HIVE-6399.01.patch

Attached patch with explicit test for this known bug in testKnownPriorErrors. 
No fix yet. 

A quick fix would be to use BigDecimal multiply inside Decimal128 multiply. 
Although this would not perform well, it'd be safe.
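
A conceptual sketch of that quick fix, written against plain BigDecimal rather than 
the actual Decimal128 internals (conversion to and from Decimal128 is omitted):

  import java.math.BigDecimal;

  // Multiply exactly with BigDecimal and reject results that would not fit
  // in Decimal128's 38-digit limit; slow, but it avoids the limb-multiply bug.
  static BigDecimal multiplySlowButSafe(BigDecimal left, BigDecimal right) {
    BigDecimal result = left.multiply(right);
    if (result.precision() > 38) {
      throw new ArithmeticException("result exceeds 38 digits");
    }
    return result;
  }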

 bug in high-precision Decimal128 multiply
 -

 Key: HIVE-6399
 URL: https://issues.apache.org/jira/browse/HIVE-6399
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
 Fix For: 0.13.0

 Attachments: HIVE-6399.01.patch


 For operation -605044214913338382 * 55269579109718297360
 expected: -33440539101030154945490585226577271520
 but was:   -33440539021801992431226247633033321184
 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply 
 many times, you'll get an occasional failure. This is one example of such a 
 failure.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.

2014-02-10 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897040#comment-13897040
 ] 

Eric Hanson commented on HIVE-6333:
---

+1

 Generate vectorized plan for decimal expressions.
 -

 Key: HIVE-6333
 URL: https://issues.apache.org/jira/browse/HIVE-6333
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, 
 HIVE-6333.4.patch, HIVE-6333.5.patch, HIVE-6333.6.patch


 Transform non-vector plan to vectorized plan for supported decimal 
 expressions. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


Re: Review Request 17769: Generate vectorized plan for decimal expressions.

2014-02-07 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17769/#review33942
---


Overall this looks good. Please see my specific comments. I did find one bug 
(used an Add in place of Subtract in GenericUDFOpMinus), 
and possibly one design issue related to implicit cast precision and scale.


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment63766

Please add a comment explaining what castExpressionUdfs is for



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment63767

Expand the comment to explain the kind of situations where this is 
necessary.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment63768

Add comment before method explaining what it does.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment63808

Hive Java coding standard says put blank line before all comments.



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
https://reviews.apache.org/r/17769/#comment63769


Because TypeInfo has decimal precision/scale, the output scale is not 
always the same as the input scale. E.g. I've seen that 
decimal(18,2)*decimal(18,2) might have scale=4 or something like that. 

Might it be better to have integers be cast to decimal(19,0) and floats to, 
say, decimal(38,18) or something like that, so you never or rarely lose 
information during the cast, or get a NULL due to overflow? But of course, you 
would not change the expression result precision/scale.

What you have here looks pretty good, but it may be worth more thought.
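
For concreteness, the usual SQL-style result-type rule that gives 
decimal(18,2)*decimal(18,2) a scale of 4 looks roughly like this (a sketch of the 
convention, not the exact Hive implementation):

  // Result type of decimal(p1,s1) * decimal(p2,s2) under the common rule:
  // precision = p1 + p2 + 1 (capped at 38), scale = s1 + s2.
  static int[] decimalMultiplyResultType(int p1, int s1, int p2, int s2) {
    int precision = Math.min(38, p1 + p2 + 1);
    int scale = s1 + s2;
    return new int[] { precision, scale };  // e.g. (18,2)*(18,2) -> {37, 4}
  }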



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java
https://reviews.apache.org/r/17769/#comment63816

add comment saying briefly what method does



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java
https://reviews.apache.org/r/17769/#comment63823

DecimalColAddDecimalScalar should be subtract



ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
https://reviews.apache.org/r/17769/#comment63825

please add brief comment saying what this test checks



ql/src/test/queries/clientpositive/vector_decimal_expressions.q
https://reviews.apache.org/r/17769/#comment63828

I think we need a JIRA to add unary minus for vectorized decimal, plus a 
test.



ql/src/test/results/clientpositive/vectorization_short_regress.q.out
https://reviews.apache.org/r/17769/#comment63837

It looks like some new rows showed up in the output after you changed the 
test. Is this expected, or does it reveal a correctness issue?


- Eric Hanson


On Feb. 7, 2014, 2:31 a.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17769/
 ---
 
 (Updated Feb. 7, 2014, 2:31 a.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Bugs: HIVE-6333
 https://issues.apache.org/jira/browse/HIVE-6333
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Generate vectorized plan for decimal expressions.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 
   
 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt
  699b7c5 
   
 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt
  99366ca 
   ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 
 2aa4152 
   ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 
 2e84334 
   
 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt
  9578d34 
   ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 
 6ee9d5f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java
  1c70387 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
 f5ab731 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java 
 f513188 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
  4510368 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java
  6a7762d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java
  14b91e1 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java
  2ba1509 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java
  65a804d 
   
 ql

Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.

2014-02-05 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17622/#review33747
---

Ship it!


Ship It!

- Eric Hanson


On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17622/
 ---
 
 (Updated Jan. 31, 2014, 10:19 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 VectorExpressionWriter for date and decimal datatypes.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java 
 f513188 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java
  e5c3aa4 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java
  a242fef 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 
 ad96fa5 
   ql/src/test/queries/clientpositive/vectorization_decimal_date.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/vectorization_decimal_date.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/17622/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.

2014-02-03 Thread Eric Hanson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17622/#review33378
---


Looks good to me. See one comment inline.


ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
https://reviews.apache.org/r/17622/#comment62885

Please add a comment explaining why you are using decimal.* and why it's different 
from the others.


- Eric Hanson


On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17622/
 ---
 
 (Updated Jan. 31, 2014, 10:19 p.m.)
 
 
 Review request for hive and Eric Hanson.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 VectorExpressionWriter for date and decimal datatypes.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 
   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 
   ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION 
   ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/17622/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jitendra Pandey
 




[jira] [Work stopped] (HIVE-6234) Implement fast vectorized InputFormat extension for text files

2014-02-03 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-6234 stopped by Eric Hanson.

 Implement fast vectorized InputFormat extension for text files
 --

 Key: HIVE-6234
 URL: https://issues.apache.org/jira/browse/HIVE-6234
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text 
 InputFormat design.docx, Vectorized Text InputFormat design.pdf, 
 state-diagram.jpg


 Implement support for vectorized scan input of text files (plain text with 
 configurable record and field separators). This should work for CSV files, 
 tab delimited files, etc. 
 The goal is to provide high-performance reading of these files using 
 vectorized scans, and also to do it as an extension of existing Hive. Then, 
 if vectorized query is enabled, existing tables based on text files will be 
 able to benefit immediately without the need to use a different input format. 
 After upgrading to new Hive bits that support this, faster, vectorized 
 processing over existing text tables should just work, when vectorization is 
 enabled.
 Another goal is to go beyond a simple layering of vectorized row batch 
 iterator over the top of the existing row iterator. It should be possible to, 
 say, read a chunk of data into a byte buffer (several thousand or even 
 million rows), and then read data from it into vectorized row batches 
 directly. Object creations should be minimized to save allocation time and GC 
 overhead. If it is possible to save CPU for values like dates and numbers by 
 caching the translation from string to the final data type, that should 
 ideally be implemented.
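
For illustration, here is a minimal sketch of the "parse directly from a byte buffer 
into column arrays" idea described above. It is not the Hive API; the class, field, 
and method names (TextBatchReader, fillBatch, and so on) are made up for the example, 
and only a simple "number,string" record layout is handled.

import java.nio.charset.StandardCharsets;

// Sketch: fill columnar arrays directly from a byte buffer of delimited text,
// avoiding per-row object creation. Illustrative only, not the actual Hive classes.
public class TextBatchReader {
  static final int BATCH_SIZE = 1024;

  // Numeric column stored as a primitive array (analogous to a long column vector).
  long[] col0 = new long[BATCH_SIZE];
  // String column stored by reference into the original buffer: offsets and lengths.
  int[] col1Start = new int[BATCH_SIZE];
  int[] col1Length = new int[BATCH_SIZE];
  int size = 0;

  /** Parse up to BATCH_SIZE rows of "number,string\n" records starting at offset. */
  int fillBatch(byte[] buf, int offset) {
    size = 0;
    int pos = offset;
    while (pos < buf.length && size < BATCH_SIZE) {
      // Parse the integer field in place; no String or Long objects are created.
      long v = 0;
      while (pos < buf.length && buf[pos] != ',') {
        v = v * 10 + (buf[pos++] - '0');
      }
      pos++; // skip ','
      int start = pos;
      while (pos < buf.length && buf[pos] != '\n') {
        pos++;
      }
      col0[size] = v;
      col1Start[size] = start;
      col1Length[size] = pos - start;
      size++;
      pos++; // skip '\n'
    }
    return pos; // next unread position in the buffer
  }

  public static void main(String[] args) {
    byte[] data = "1,alpha\n2,beta\n30,gamma\n".getBytes(StandardCharsets.UTF_8);
    TextBatchReader r = new TextBatchReader();
    r.fillBatch(data, 0);
    for (int i = 0; i < r.size; i++) {
      String s = new String(data, r.col1Start[i], r.col1Length[i], StandardCharsets.UTF_8);
      System.out.println(r.col0[i] + " -> " + s);
    }
  }
}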



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files

2014-01-31 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6234:
--

Attachment: HIVE-6234.03.patch

Non-working code, with some top-down refinement of how to get a batch/line/field. See 
the comments in the code for the open questions about the mapping from table columns 
into the batch.

We also need to determine how to get the column types for use by the text reader. 
E.g., even though a field uses a LongColumnVector, the reader might need to treat the 
text data as an integer, Boolean, date, or timestamp.
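
To make the type question concrete, here is a hedged sketch of how a reader might 
dispatch on the declared column type when filling a long-backed column. The enum and 
the parsing choices are assumptions for illustration only, not the final design; 
timestamps would be handled the same way with a different conversion.

import java.time.LocalDate;

// Sketch: several logical types (int, boolean, date) can all be backed by a long[]
// column, so the text reader needs the declared type to know how to interpret the
// bytes of a field. Illustrative only.
public class LongFieldParser {
  enum LogicalType { INT, BOOLEAN, DATE }

  static long parseToLong(String field, LogicalType type) {
    switch (type) {
      case INT:
        return Long.parseLong(field);
      case BOOLEAN:
        // Store booleans as 0/1 in the long column.
        return field.equalsIgnoreCase("true") ? 1L : 0L;
      case DATE:
        // Store dates as days since the epoch.
        return LocalDate.parse(field).toEpochDay();
      default:
        throw new IllegalArgumentException("unsupported type " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(parseToLong("42", LogicalType.INT));          // 42
    System.out.println(parseToLong("true", LogicalType.BOOLEAN));    // 1
    System.out.println(parseToLong("2014-01-31", LogicalType.DATE)); // 16101
  }
}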

 Implement fast vectorized InputFormat extension for text files
 --

 Key: HIVE-6234
 URL: https://issues.apache.org/jira/browse/HIVE-6234
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text 
 InputFormat design.docx, Vectorized Text InputFormat design.pdf


 Implement support for vectorized scan input of text files (plain text with 
 configurable record and field separators). This should work for CSV files, 
 tab delimited files, etc. 
 The goal is to provide high-performance reading of these files using 
 vectorized scans, and also to do it as an extension of existing Hive. Then, 
 if vectorized query is enabled, existing tables based on text files will be 
 able to benefit immediately without the need to use a different input format. 
 After upgrading to new Hive bits that support this, faster, vectorized 
 processing over existing text tables should just work, when vectorization is 
 enabled.
 Another goal is to go beyond a simple layering of vectorized row batch 
 iterator over the top of the existing row iterator. It should be possible to, 
 say, read a chunk of data into a byte buffer (several thousand or even 
 million rows), and then read data from it into vectorized row batches 
 directly. Object creations should be minimized to save allocation time and GC 
 overhead. If it is possible to save CPU for values like dates and numbers by 
 caching the translation from string to the final data type, that should 
 ideally be implemented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HIVE-6234) Implement fast vectorized InputFormat extension for text files

2014-01-31 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888279#comment-13888279
 ] 

Eric Hanson commented on HIVE-6234:
---

This is just getting started. I need to put this aside for a while (probably at 
least until the end of Feb.). I parked the latest information here on the JIRA.

 Implement fast vectorized InputFormat extension for text files
 --

 Key: HIVE-6234
 URL: https://issues.apache.org/jira/browse/HIVE-6234
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text 
 InputFormat design.docx, Vectorized Text InputFormat design.pdf, 
 state-diagram.jpg


 Implement support for vectorized scan input of text files (plain text with 
 configurable record and field separators). This should work for CSV files, 
 tab delimited files, etc. 
 The goal is to provide high-performance reading of these files using 
 vectorized scans, and also to do it as an extension of existing Hive. Then, 
 if vectorized query is enabled, existing tables based on text files will be 
 able to benefit immediately without the need to use a different input format. 
 After upgrading to new Hive bits that support this, faster, vectorized 
 processing over existing text tables should just work, when vectorization is 
 enabled.
 Another goal is to go beyond a simple layering of vectorized row batch 
 iterator over the top of the existing row iterator. It should be possible to, 
 say, read a chunk of data into a byte buffer (several thousand or even 
 million rows), and then read data from it into vectorized row batches 
 directly. Object creations should be minimized to save allocation time and GC 
 overhead. If it is possible to save CPU for values like dates and numbers by 
 caching the translation from string to the final data type, that should 
 ideally be implemented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files

2014-01-31 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6234:
--

Attachment: state-diagram.jpg

State diagram for finding line breaks. May be of use for future reference. Not 
done. Just a working document.
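
Since the diagram is only attached as an image, here is a minimal sketch of the kind 
of state machine it presumably describes: scanning a byte buffer for record boundaries 
while handling both \n and \r\n line endings. This is an assumption about what the 
diagram covers; quoting, escaping, and field separators are omitted.

// Sketch of a line-break-finding state machine over a byte buffer.
// Handles '\n' and '\r\n'; a trailing record without a terminator is not flushed.
public class LineBreakScanner {
  enum State { IN_RECORD, SEEN_CR }

  /** Prints the [start, end) offset of every terminated record in buf. */
  static void findRecords(byte[] buf) {
    State state = State.IN_RECORD;
    int recordStart = 0;
    for (int i = 0; i < buf.length; i++) {
      byte b = buf[i];
      switch (state) {
        case IN_RECORD:
          if (b == '\n') {
            System.out.println("record at [" + recordStart + ", " + i + ")");
            recordStart = i + 1;
          } else if (b == '\r') {
            state = State.SEEN_CR;
          }
          break;
        case SEEN_CR:
          // '\r' ends the record whether or not a '\n' follows.
          System.out.println("record at [" + recordStart + ", " + (i - 1) + ")");
          recordStart = (b == '\n') ? i + 1 : i;
          state = State.IN_RECORD;
          break;
      }
    }
  }

  public static void main(String[] args) {
    findRecords("a,b\r\nc,d\ne,f\r\n".getBytes(java.nio.charset.StandardCharsets.UTF_8));
  }
}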

 Implement fast vectorized InputFormat extension for text files
 --

 Key: HIVE-6234
 URL: https://issues.apache.org/jira/browse/HIVE-6234
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
 Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text 
 InputFormat design.docx, Vectorized Text InputFormat design.pdf, 
 state-diagram.jpg


 Implement support for vectorized scan input of text files (plain text with 
 configurable record and field separators). This should work for CSV files, 
 tab delimited files, etc. 
 The goal is to provide high-performance reading of these files using 
 vectorized scans, and also to do it as an extension of existing Hive. Then, 
 if vectorized query is enabled, existing tables based on text files will be 
 able to benefit immediately without the need to use a different input format. 
 After upgrading to new Hive bits that support this, faster, vectorized 
 processing over existing text tables should just work, when vectorization is 
 enabled.
 Another goal is to go beyond a simple layering of vectorized row batch 
 iterator over the top of the existing row iterator. It should be possible to, 
 say, read a chunk of data into a byte buffer (several thousand or even 
 million rows), and then read data from it into vectorized row batches 
 directly. Object creations should be minimized to save allocation time and GC 
 overhead. If it is possible to save CPU for values like dates and numbers by 
 caching the translation from string to the final data type, that should 
 ideally be implemented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic

2014-01-31 Thread Eric Hanson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Hanson updated HIVE-6257:
--

   Resolution: Implemented
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk

 Add more unit tests for high-precision Decimal128 arithmetic
 

 Key: HIVE-6257
 URL: https://issues.apache.org/jira/browse/HIVE-6257
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.13.0
Reporter: Eric Hanson
Assignee: Eric Hanson
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6257.02.patch, HIVE-6257.03.patch, 
 HIVE-6257.04.patch


 Add more unit tests for high-precision Decimal128 arithmetic, with arguments 
 close to or at the 38-digit limit. Consider some random stress tests for broader 
 coverage. Coverage is pretty good now (after HIVE-6243) for precision up to 
 about 18; this is to go beyond that.
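
As a hedged illustration of the kind of random stress test described above, here is a 
sketch of the reference side only: it generates operands at or near the 38-digit limit 
and computes BigDecimal reference results. The actual test would run the same operands 
through Decimal128 and assert that the results match; the Decimal128 calls are not 
shown because their exact signatures are not given here.

import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.util.Random;

// Sketch of the reference side of a random stress test for decimal arithmetic near
// the 38-digit limit. BigDecimal acts as the oracle; a real test would compare these
// values against the equivalent Decimal128 operations.
public class DecimalStressSketch {
  /** Random positive decimal with exactly `digits` digits and the given scale. */
  static BigDecimal randomDecimal(Random rnd, int digits, int scale) {
    StringBuilder sb = new StringBuilder();
    sb.append((char) ('1' + rnd.nextInt(9)));      // non-zero leading digit
    for (int i = 1; i < digits; i++) {
      sb.append((char) ('0' + rnd.nextInt(10)));
    }
    return new BigDecimal(new BigInteger(sb.toString()), scale);
  }

  public static void main(String[] args) {
    Random rnd = new Random(6257);                 // fixed seed for reproducible failures
    for (int i = 0; i < 5; i++) {
      BigDecimal a = randomDecimal(rnd, 38, 10);   // operand at the 38-digit limit
      BigDecimal b = randomDecimal(rnd, 19, 9);
      BigDecimal sum = a.add(b);
      BigDecimal product = a.multiply(b).setScale(10, RoundingMode.HALF_UP);
      System.out.println(a + " + " + b + " = " + sum);
      System.out.println(a + " * " + b + " = " + product);
      // In the actual test: compute the same operations with Decimal128 and
      // assert equality against these reference values.
    }
  }
}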



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

