[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118896#comment-14118896 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

Thanks, [~sushanth]. Will you commit this or do you want me to do it? -Eric

> CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7901
>                 URL: https://issues.apache.org/jira/browse/HIVE-7901
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.14.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Eric Hanson
>        Attachments: hive-7901.01.patch
>
> This fails because the embedded metastore can't connect to the database: the command-line -D arguments passed to pig are not passed through to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore:
>
>   pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
>
> The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and the other necessary arguments to the embedded metastore.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
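The failure mode in this issue is generic to JVM configuration handling: `-D` arguments become JVM system properties, but a component only sees them if its configuration object explicitly copies them in. A minimal sketch of the pattern, with hypothetical names (this is not Hive's actual HiveConf or metastore code):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Sketch of the failure mode: a config built without consulting JVM system
// properties silently ignores -D overrides passed on the command line.
public class EmbeddedConfigSketch {

    // Hypothetical stand-in for an embedded metastore configuration.
    static Map<String, String> buildConfigIgnoringSystemProps() {
        Map<String, String> conf = new HashMap<>();
        conf.put("javax.jdo.option.ConnectionPassword", "default");
        return conf;
    }

    // The fix pattern: overlay any matching JVM system properties, so a
    // -Djavax.jdo.option.ConnectionPassword=... on the command line wins.
    static Map<String, String> buildConfigWithSystemProps() {
        Map<String, String> conf = buildConfigIgnoringSystemProps();
        Properties sysProps = System.getProperties();
        for (String name : sysProps.stringPropertyNames()) {
            if (conf.containsKey(name)) {
                conf.put(name, sysProps.getProperty(name));
            }
        }
        return conf;
    }
}
```

This illustrates only the mechanism under discussion; the actual patch wires the command-line properties through to the embedded metastore inside Hive.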
[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118909#comment-14118909 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

Okay, thanks.
[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-7901:
------------------------------
    Attachment: hive-7901.01.patch

I modified the original HIVE-6633 patch to put the changes in the right place, under apache/hive. This is a new patch for those changes, based directly on the current hive trunk.
[jira] [Updated] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-7901:
------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-7901) CLONE - pig -useHCatalog with embedded metastore fails to pass command line args to metastore (org.apache.hive.hcatalog version)
[ https://issues.apache.org/jira/browse/HIVE-7901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115550#comment-14115550 ]

Eric Hanson commented on HIVE-7901:
-----------------------------------

[~sushanth], please have a look and +1/commit if you think it's ready. Thanks!
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114367#comment-14114367 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

Thanks, Sushanth, for tracking down the problem. I'll regenerate the patch and track that on HIVE-7901.

> pig -useHCatalog with embedded metastore fails to pass command line args to metastore
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-6633
>                 URL: https://issues.apache.org/jira/browse/HIVE-6633
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>             Fix For: 0.13.0
>        Attachments: HIVE-6633.01.patch
>
> This fails because the embedded metastore can't connect to the database: the command-line -D arguments passed to pig are not passed through to the metastore when the embedded metastore is created. Setting hive.metastore.uris to the empty string causes creation of an embedded metastore:
>
>   pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ
>
> The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass -Djavax.jdo.option.ConnectionPassword and the other necessary arguments to the embedded metastore.
[jira] [Commented] (HIVE-7357) Add vectorized support for BINARY data type
[ https://issues.apache.org/jira/browse/HIVE-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065831#comment-14065831 ]

Eric Hanson commented on HIVE-7357:
-----------------------------------

Hi Matt. This looks good overall. Please see my comments on ReviewBoard.

> Add vectorized support for BINARY data type
> -------------------------------------------
>
>                 Key: HIVE-7357
>                 URL: https://issues.apache.org/jira/browse/HIVE-7357
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>        Attachments: HIVE-7357.1.patch
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057078#comment-14057078 ]

Eric Hanson commented on HIVE-7262:
-----------------------------------

[~mmccline] put a code review at https://reviews.apache.org/r/23186/. Matt, if you could attach this to your JIRAs in the future, that'd be great.

> Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-7262
>                 URL: https://issues.apache.org/jira/browse/HIVE-7262
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>        Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
>
> In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true. Queries then fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and fail with an exception:
>
>   ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
>
> Jitendra pointed to the routine that returns the VectorizationContext in Vectorize.java as needing to add virtual columns to the map, too.
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057085#comment-14057085 ]

Eric Hanson commented on HIVE-7262:
-----------------------------------

Matt, can you upload your patch to your ReviewBoard page? I didn't see a View Diff button. I see you did include a link above -- sorry I missed that.
[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending
[ https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042330#comment-14042330 ]

Eric Hanson commented on HIVE-7266:
-----------------------------------

Also, I recall a past error that looked similar to this, which I think was related to incorrect column re-use within batches. The code for that was in VectorizationContext.

> Optimized HashTable with vectorized map-joins results in String columns extending
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-7266
>                 URL: https://issues.apache.org/jira/browse/HIVE-7266
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez, Vectorization
>    Affects Versions: 0.14.0
>            Reporter: Gopal V
>            Assignee: Matt McCline
>        Attachments: hive-7266-small-test.tgz
>
> The following query returns different results when both vectorized mapjoin and the new optimized hashtable are enabled.
>
> {code}
> hive> set hive.vectorized.execution.enabled=false;
> hive> select s_suppkey, n_name from supplier, nation where s_nationkey = n_nationkey limit 25;
> ...
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRAN
> 7236869  RUSSIA
> 2276869  INDIA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIA
> 2616869  FRANCE
> {code}
>
> But when vectorization is enabled, the results are
>
> {code}
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRANIA
> 7236869  RUSSIA
> 2276869  INDIAA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIAQUE
> 2616869  FRANCEAQUE
> {code}
>
> It works correctly with vectorization when the new optimized map-join hashtable is disabled:
>
> {code}
> hive> set hive.vectorized.execution.enabled=true;
> hive> set hive.mapjoin.optimized.hashtable=false;
> hive> select s_suppkey, n_name from supplier, nation where s_nationkey = n_nationkey limit 25;
> 316869   JAPAN
> 1636869  RUSSIA
> 1096869  IRAN
> 7236869  RUSSIA
> 2276869  INDIA
> 8516869  ARGENTINA
> 2636869  MOZAMBIQUE
> 3836869  ROMANIA
> 2616869  FRANCE
> {code}
[jira] [Commented] (HIVE-7266) Optimized HashTable with vectorized map-joins results in String columns extending
[ https://issues.apache.org/jira/browse/HIVE-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039509#comment-14039509 ]

Eric Hanson commented on HIVE-7266:
-----------------------------------

This looks like it might be related to using setRef() in BytesColumnVector when setVal() should be used. That is something to look into.
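The setRef()/setVal() distinction suspected in the comment above can be illustrated outside Hive. setRef() stores a reference into a caller-owned buffer, so if the caller reuses that buffer for the next (shorter) value, an earlier row silently changes, and its stale length makes the short value appear extended, much like the ROMANIAQUE/FRANCEAQUE output. A toy sketch, not the real BytesColumnVector:

```java
import java.util.Arrays;

// Toy column vector with setRef (alias the caller's buffer) vs
// setVal (copy the bytes), mimicking BytesColumnVector's two APIs.
public class ByteColSketch {
    byte[][] vector = new byte[16][];
    int[] start = new int[16];
    int[] length = new int[16];

    void setRef(int row, byte[] buf, int off, int len) {
        vector[row] = buf;            // alias: unsafe if buf is reused later
        start[row] = off;
        length[row] = len;
    }

    void setVal(int row, byte[] buf, int off, int len) {
        vector[row] = Arrays.copyOfRange(buf, off, off + len); // private copy
        start[row] = 0;
        length[row] = len;
    }

    String get(int row) {
        return new String(vector[row], start[row], length[row]);
    }
}
```

Writing "IRAN" over a reused scratch buffer that previously held "RUSSIA" leaves an aliased row reading six bytes: "IRANIA" — the exact extension pattern reported in this issue.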
[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches
[ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004882#comment-14004882 ]

Eric Hanson commented on HIVE-7105:
-----------------------------------

I agree with Remus. If you want good performance with vectorization on the reduce side, you'll need to think carefully about how to efficiently create full VectorizedRowBatches. Single-row or small VectorizedRowBatches will not give performance gains. Also, if it is expensive to load rows into the batches on the reduce side, that cost could dominate total runtime.

> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
>                 Key: HIVE-7105
>                 URL: https://issues.apache.org/jira/browse/HIVE-7105
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Rajesh Balamohan
>            Assignee: Jitendra Nath Pandey
>        Attachments: HIVE-7105.1.patch
>
> Currently, ReduceRecordProcessor sends one key/value pair at a time to its operator pipeline. It would be beneficial to send a VectorizedRowBatch to downstream operators.
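The batching point above can be sketched in miniature: a batched operator pays its per-call overhead once per batch of rows instead of once per row, and both paths must produce identical results. This is a toy illustration with hypothetical method names, not Hive's actual Operator API:

```java
// Toy illustration of why full batches matter: the row-at-a-time path makes
// one method call per row, while the batched path touches its input in a
// tight inner loop once per 1024-row batch (Hive's default batch size).
public class BatchSketch {
    static final int BATCH_SIZE = 1024;

    static long addOne(long r) { return r; }   // stand-in per-row operator call

    static long sumPerRow(long[] rows) {
        long total = 0;
        for (long r : rows) {
            total += addOne(r);                // call overhead paid per row
        }
        return total;
    }

    static long sumBatched(long[] rows) {
        long total = 0;
        for (int i = 0; i < rows.length; i += BATCH_SIZE) {
            int end = Math.min(i + BATCH_SIZE, rows.length);
            for (int j = i; j < end; j++) {
                total += rows[j];              // tight loop, amortized overhead
            }
        }
        return total;
    }
}
```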
[jira] [Created] (HIVE-6918) ALTER TABLE using embedded metastore fails with duplicate key violation in 'dbo.SERDES'
Eric Hanson created HIVE-6918:
---------------------------------

             Summary: ALTER TABLE using embedded metastore fails with duplicate key violation in 'dbo.SERDES'
                 Key: HIVE-6918
                 URL: https://issues.apache.org/jira/browse/HIVE-6918
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.11.0
         Environment: hive-0.11.0.1.3.7.0-01272; HDInsight version: 2.1.4.0.661685
            Reporter: Eric Hanson

An HDInsight customer is doing some heavy metadata operations using an embedded metastore, and gets a duplicate-key error on the metastore table 'dbo.SERDES'. They have multiple jobs running ALTER TABLE concurrently (on different tables, I believe) against the same metastore database, with each job using an embedded metastore because they set hive.metastore.uris to the empty string. The script looks like:

set hive.metastore.uris=;
...
CREATE EXTERNAL TABLE IF NOT EXISTS InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 (
  ...
)
PARTITIONED BY (tenant string, d string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ...;
... (several more like this);
ALTER TABLE InstanceSpaceData_828c53de_ad24_928e_3db3_948cf821a3e0 ADD IF NOT EXISTS PARTITION (tenant='8dddaf7c-2354-47ae-87a7-b781f14f8c11', d='20140414') LOCATION 'wasb://instancespaceb...@advisor27415020383770839.blob.core.windows.net/v0/tenant=8dddaf7c-2354-47ae-87a7-b781f14f8c11/d=20140414/';
... several more like the above (14 ALTER TABLE statements in a row) ...

Then they get this error:

...
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
NestedThrowablesStackTrace:
java.sql.BatchUpdateException: Violation of PRIMARY KEY constraint 'PK_serdes_SERDE_ID'. Cannot insert duplicate key in object 'dbo.SERDES'. The duplicate key value is (209703).
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeBatch(SQLServerPreparedStatement.java:1160)
at com.jolbox.bonecp.StatementHandle.executeBatch(StatementHandle.java:469)
at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:583)
at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:291)
at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:267)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.getValue(RDBMSJoinMapStore.java:656)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinMapStore.putAll(RDBMSJoinMapStore.java:195)
at org.datanucleus.store.mapped.mapping.MapMapping.postInsert(MapMapping.java:135)
at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:517)
...
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951057#comment-13951057 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

[~thejas] Can you commit this to 0.13 please?
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951169#comment-13951169 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

[~rhbutani] Can you approve this to go into 0.13 please?
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19718/#review38958
-----------------------------------------------------------


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71328

    Please add a comment to explain why we use the sum of all the counts here to determine the array size.


ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
https://reviews.apache.org/r/19718/#comment71329

    Consider, for readability/encapsulation, having a function to compute the offset, e.g.

        isNull[decimalOffset(index)] = false;

    Please add a comment to explain the offset logic. Does the addition of decimal affect any other offsets? I guess not.


ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71330

    Timestamp is supposed to be represented as a long (number of nanos since the epoch). So why is this using a FilterStringColumnBetween?


ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java
https://reviews.apache.org/r/19718/#comment71331

    Again, why the string and not the long not-between operator?


- Eric Hanson


On March 28, 2014, 9:56 p.m., Jitendra Pandey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19718/
> -----------------------------------------------------------
> 
> (Updated March 28, 2014, 9:56 p.m.)
> 
> Review request for hive and Eric Hanson.
> 
> Bugs: HIVE-6752
>     https://issues.apache.org/jira/browse/HIVE-6752
> 
> Repository: hive-git
> 
> Description
> -------
> 
> Vectorized Between and IN expressions don't work with decimal, date types.
> 
> Diffs
> -----
> 
>   ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 
>   ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 2229079 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 5ebab70 
>   ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION 
>   ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/19718/diff/
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jitendra Pandey
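The offset-helper suggestion in the review above can be sketched as follows. In a flat layout where one isNull array covers all key columns grouped by type, each type's slots begin at the sum of the preceding type counts. Field and method names here are hypothetical, not the actual VectorHashKeyWrapper members:

```java
// Sketch of the flat-layout indexing the review comments suggest factoring
// into helpers: one isNull array holds longs, then doubles, then strings,
// then decimals, so each type's slots start at the sum of earlier counts.
public class KeyWrapperLayoutSketch {
    final int longCount, doubleCount, stringCount, decimalCount;
    final boolean[] isNull;

    KeyWrapperLayoutSketch(int longs, int doubles, int strings, int decimals) {
        longCount = longs;
        doubleCount = doubles;
        stringCount = strings;
        decimalCount = decimals;
        // array sized by the sum of all the counts, per the first comment
        isNull = new boolean[longs + doubles + strings + decimals];
    }

    int longOffset(int i)    { return i; }
    int doubleOffset(int i)  { return longCount + i; }
    int stringOffset(int i)  { return longCount + doubleCount + i; }
    int decimalOffset(int i) { return longCount + doubleCount + stringCount + i; }
}
```

With helpers like these, a caller writes isNull[decimalOffset(index)] = false and adding a new key type only changes the offsets of the types that come after it.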
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951533#comment-13951533 ]

Eric Hanson commented on HIVE-6752:
-----------------------------------

Please see my comments on review board.

> Vectorized Between and IN expressions don't work with decimal, date types.
> --------------------------------------------------------------------------
>
>                 Key: HIVE-6752
>                 URL: https://issues.apache.org/jira/browse/HIVE-6752
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>        Attachments: HIVE-6752.1.patch, HIVE-6752.2.patch, HIVE-6752.3.patch, HIVE-6752.4.patch
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951552#comment-13951552 ]

Eric Hanson commented on HIVE-6752:
-----------------------------------

+1. Thanks for the response on review board. I agree that it is reasonable to take up the issues raised in separate JIRAs; they are not time-critical at this point.
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951603#comment-13951603 ]

Eric Hanson commented on HIVE-6633:
-----------------------------------

Sushanth, thanks for getting this into 0.13!
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-6546:
------------------------------
    Affects Version/s: 0.14.0

> WebHCat job submission for pig with -useHCatalog argument fails on Windows
> --------------------------------------------------------------------------
>
>                 Key: HIVE-6546
>                 URL: https://issues.apache.org/jira/browse/HIVE-6546
>             Project: Hive
>          Issue Type: Bug
>          Components: WebHCat
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
>         Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05. Also on Windows HDP 1.3 one-box configuration.
>            Reporter: Eric Hanson
>            Assignee: Eric Hanson
>             Fix For: 0.14.0
>        Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch
>
> On a one-box Windows setup, do the following from a PowerShell prompt:
>
> {code}
> cmd /c curl.exe -s `
>   -d user.name=hadoop `
>   -d arg=-useHCatalog `
>   -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
>   -d statusdir=/tmp/webhcat.output01 `
>   'http://localhost:50111/templeton/v1/pig' -v
> {code}
>
> The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, templeton.args is set to
>
>   cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;
>
> Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:
>
> {code}
> } else {
>   if (i < args.length - 1) {
>     prop += "=" + args[++i];   // RIGHT HERE! at iterations i = 37, 38
>   }
> }
> {code}
>
> The bug is here:
>
> {code}
> if (prop != null) {
>   if (prop.contains("=")) {
>     // everything good
>   } else {
>     // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain "=", so this
>     // branch runs and appends "=-useHCatalog"
>     if (i < args.length - 1) {
>       prop += "=" + args[++i];
>     }
>   }
>   newArgs.add(prop);
> }
> {code}
>
> One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed.
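The proposed placeholder fix can be sketched by mimicking the quoted merging rule. The method below is a stand-in for, not a call into, Hadoop's GenericOptionsParser; it reproduces only the branch under discussion:

```java
import java.util.ArrayList;
import java.util.List;

// Mimics the quoted preProcessForWindows() rule: a -D argument that lacks
// '=' swallows the following argument as its value. Not the real Hadoop
// code, just the behavior described in the bug report.
public class WindowsArgSketch {
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop = null;
            if (args[i].equals("-D")) {
                if (i < args.length - 1) {
                    prop = args[++i];          // value follows as its own token
                }
            } else if (args[i].startsWith("-D")) {
                prop = args[i];                // -Dname or -Dname=value form
            }
            if (prop != null) {
                if (!prop.contains("=") && i < args.length - 1) {
                    prop += "=" + args[++i];   // the fusing step from the report
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }
}
```

Under this rule, a bare -D__WEBHCAT_TOKEN_FILE_LOCATION__ fuses with the following -useHCatalog, while a placeholder that already carries an = sign passes through untouched, which is why changing TOKEN_FILE_ARG_PLACEHOLDER would avoid the bug.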
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Fix Version/s: (was: 0.13.0) 0.14.0 WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch On a one-box Windows setup, do the following from a PowerShell prompt: cmd /c curl.exe -s ` -d user.name=hadoop ` -d arg=-useHCatalog ` -d execute=emp = load '/data/emp/emp_0.dat'; dump emp; ` -d statusdir=/tmp/webhcat.output01 ` 'http://localhost:50111/templeton/v1/pig' -v The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp; Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434: {code} } else { if (i < args.length - 1) { prop += "=" + args[++i]; // RIGHT HERE!
at iterations i = 37, 38 } } {code} Bug is here: {code} if (prop != null) { if (prop.contains("=")) { // everything good } else { // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain equal, so else branch is run and appends =-useHCatalog if (i < args.length - 1) { prop += "=" + args[++i]; } } newArgs.add(prop); } {code} One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. -- This message was sent by Atlassian JIRA (v6.2#6252)
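The -D handling described above is easy to reproduce in isolation. Below is a minimal sketch, a simplified re-implementation for illustration only (it is not the actual Hadoop GenericOptionsParser source), showing how a -D option that lacks an = sign swallows the following, unrelated argument:

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessForWindowsSketch {
    // Simplified sketch of the Windows -D pre-processing logic discussed above.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("-D")) {
                String prop = args[i];
                if (!prop.contains("=") && i < args.length - 1) {
                    // A -D option with no '=' absorbs the NEXT argument: this is
                    // how -D__WEBHCAT_TOKEN_FILE_LOCATION__ (which has no '=')
                    // swallows the unrelated -useHCatalog flag.
                    prop += "=" + args[++i];
                }
                newArgs.add(prop);
            } else {
                newArgs.add(args[i]);
            }
        }
        return newArgs;
    }

    public static void main(String[] argv) {
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog", "-execute"}));
        // prints [-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog, -execute]
    }
}
```

This also shows why adding an = sign to the TOKEN_FILE_ARG_PLACEHOLDER constant would sidestep the problem: a placeholder that already contains '=' takes the first branch and never consumes the next argument.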
Re: Review Request 19718: Vectorized Between and IN expressions don't work with decimal, date types.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/#review38752 --- Looks good overall. Only minor comments. ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71027 please remove all trailing whitespace in this file ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt https://reviews.apache.org/r/19718/#comment71034 add a blank after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/19718/#comment71038 "Couldn't determine common type ..." sounds better ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71053 Change comment. This is not a filter, it is a Boolean-valued expression. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71052 Remove the comment "This is optimized for lookup of the data type of the column." because that doesn't apply here since you're using the standard HashSet.
But it is still pretty good :-) ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71057 formatting: "j=0" should be "j = 0" ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71059 add a blank line before the comment and a space after // ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71062 remove "This is optimized" ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java https://reviews.apache.org/r/19718/#comment71061 see formatting comments for DecimalColumnInList - Eric Hanson On March 27, 2014, 7:02 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19718/ --- (Updated March 27, 2014, 7:02 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6752 https://issues.apache.org/jira/browse/HIVE-6752 Repository: hive-git Description --- Vectorized Between and IN expressions don't work with decimal, date types.
Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 44b0c59 ql/src/gen/vectorization/ExpressionTemplates/FilterDecimalColumnBetween.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 96e74a9 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterDecimalColumnInList.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/IDecimalInExpr.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c2240c0 ql/src/test/queries/clientpositive/vector_between_in.q PRE-CREATION ql/src/test/results/clientpositive/vector_between_in.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19718/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6752) Vectorized Between and IN expressions don't work with decimal, date types.
[ https://issues.apache.org/jira/browse/HIVE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13949765#comment-13949765 ] Eric Hanson commented on HIVE-6752: --- +1 Conditional on addressing my comments in the code review. All of them are minor. Vectorized Between and IN expressions don't work with decimal, date types. -- Key: HIVE-6752 URL: https://issues.apache.org/jira/browse/HIVE-6752 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6752.1.patch Vectorized Between and IN expressions don't work with decimal, date types. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13948523#comment-13948523 ] Eric Hanson commented on HIVE-6546: --- I'm not sure I understand what you mean. Can you elaborate? The placeholder is getting substituted or eliminated by the templeton controller job. If I run this simple Pig script from WebHCat: emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; Then I see this in the templeton controller job configuration: templeton.args cmd,/c,call,C:\\apps\\dist\\pig-0.12.0.2.0.7.0-1551/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-execute,emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; And I see this in the Pig job configuration for the job spawned by the templeton controller job: pig.cmd.args -Dmapreduce.job.credentials.binary=/c:/hdfs/nm-local-dir/usercache/ehans/appcache/application_1395867453549_0007/container_1395867453549_0007_01_02/container_tokens -execute emp = load 'wasbs://eha...@ehans7.blob.core.windows.net/data/emp_0.dat'; dump emp; WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch Uploading patch yet again to try to kick off pre-commit tests. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13945778#comment-13945778 ] Eric Hanson commented on HIVE-6546: --- [~thejas] Can you take a look? WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19218: Vectorization: some date expressions throw exception.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19218/#review37271 --- ql/src/test/results/clientpositive/vectorized_date_funcs.q.out https://reviews.apache.org/r/19218/#comment68694 it'd be good to remove trailing white space - Eric Hanson On March 14, 2014, 9:06 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19218/ --- (Updated March 14, 2014, 9:06 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6649 https://issues.apache.org/jira/browse/HIVE-6649 Repository: hive-git Description --- Query: select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; throws NPE. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java 901005e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringUnaryUDF.java 4875d0d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColCol.java 09f6e47 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddColScalar.java 6578907 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateAddScalarCol.java d1156b6 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColCol.java 15e995c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffColScalar.java 05b71ac ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateDiffScalarCol.java 7c76901 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFDateString.java dd84de3 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldString.java 011a790 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java 8418587 
ql/src/test/queries/clientpositive/vectorized_date_funcs.q 6c9515c ql/src/test/results/clientpositive/vectorized_date_funcs.q.out a9d7dde Diff: https://reviews.apache.org/r/19218/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935621#comment-13935621 ] Eric Hanson commented on HIVE-6649: --- +1 Please see my minor comments on ReviewBoard Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.<init>(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19216: Vectorized variance computation differs from row mode computation.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19216/#review37296 --- Ship it! Ship It! - Eric Hanson On March 14, 2014, 8:41 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19216/ --- (Updated March 14, 2014, 8:41 a.m.) Review request for hive, Eric Hanson and Remus Rusanu. Bugs: HIVE-6664 https://issues.apache.org/jira/browse/HIVE-6664 Repository: hive-git Description --- Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. Diffs - ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out 507f798 Diff: https://reviews.apache.org/r/19216/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935852#comment-13935852 ] Eric Hanson commented on HIVE-6664: --- +1 Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935860#comment-13935860 ] Eric Hanson commented on HIVE-6664: --- In general, sum/avg/variance aggregate results that involve floating point arithmetic in the sum calculation will return different answers depending on execution order. This is due to the nature of floating point arithmetic, where it is easy to show examples where (a + b) + c != a + (b + c). So it is probably not critical that row-mode and vector mode have results that are compatible to the last decimal place. However, the change here is simple enough and it makes for better compatibility without any serious drawbacks for performance, so I think this is fine. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
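The non-associativity of floating-point addition mentioned in the comment above is easy to demonstrate. A minimal Java illustration (not part of the patch, just an example of (a + b) + c differing from a + (b + c)):

```java
public class FloatingPointAssociativity {
    public static void main(String[] args) {
        // 1.0 is smaller than half an ulp of 1e17, so b + c rounds back to
        // -1e17, while (a + b) cancels exactly before c is added.
        double a = 1e17, b = -1e17, c = 1.0;
        System.out.println((a + b) + c); // 1.0
        System.out.println(a + (b + c)); // 0.0
    }
}
```

The same effect explains why a sum accumulated per-batch in vectorized mode can differ in the last digits from a row-at-a-time sum: the operands are grouped differently.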
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933506#comment-13933506 ] Eric Hanson commented on HIVE-6649: --- Can you put this up on ReviewBoard if you're ready for a review? Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6633: - Assignee: Eric Hanson pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
Eric Hanson created HIVE-6633: - Summary: pig -useHCatalog with embedded metastore fails to pass command line args to metastore Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Fix For: 0.14.0 This fails because the embedded metastore can't connect to the database because the command line -D arguments passed to pig are not getting passed to the metastore when the embedded metastore is created. Using hive.metastore.uris set to the empty string causes creation of an embedded metastore. pig -useHCatalog -Dhive.metastore.uris= -Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ The goal is to allow a pig job submitted via WebHCat to specify a metastore to use via job arguments. That is not working because it is not possible to pass Djavax.jdo.option.ConnectionPassword and other necessary arguments to the embedded metastore. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6633: -- Status: Patch Available (was: Open) pig -useHCatalog with embedded metastore fails to pass command line args to metastore - Key: HIVE-6633 URL: https://issues.apache.org/jira/browse/HIVE-6633 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0 Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.14.0 Attachments: HIVE-6633.01.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19140: pig -useHCatalog with embedded metastore fails to pass command line args to metastore
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19140/ --- Review request for hive. Bugs: HIVE-6633 https://issues.apache.org/jira/browse/HIVE-6633 Repository: hive-git Description --- see JIRA Diffs - hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/HCatLoader.java a32149c hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hcatalog/pig/PigHCatUtil.java a01d9e3 Diff: https://reviews.apache.org/r/19140/diff/ Testing --- Thanks, Eric Hanson
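For context on the HIVE-6633 review above, here is a hedged sketch of the general idea: let -D overrides from the job configuration reach the configuration used when the embedded metastore is created. All class and method names below are illustrative assumptions, not the actual HCatLoader/PigHCatUtil patch; plain maps stand in for the real Hadoop/Hive configuration objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetastoreConfForwardSketch {
    // Hypothetical helper (illustrative only): copy metastore-relevant -D
    // overrides from the job configuration into the settings that would be
    // used to instantiate the embedded metastore client.
    static Map<String, String> forwardOverrides(Map<String, String> jobConf,
                                                Map<String, String> hiveConf) {
        Map<String, String> merged = new LinkedHashMap<>(hiveConf);
        for (Map.Entry<String, String> e : jobConf.entrySet()) {
            // Forward only keys the metastore consumes, e.g. the JDO
            // connection settings named in the JIRA description.
            if (e.getKey().startsWith("javax.jdo.option.")
                    || e.getKey().startsWith("hive.metastore.")) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new LinkedHashMap<>();
        jobConf.put("hive.metastore.uris", "");                        // empty => embedded metastore
        jobConf.put("javax.jdo.option.ConnectionPassword", "AzureSQLDBXYZ");
        jobConf.put("mapreduce.job.name", "pig-script");               // unrelated, not forwarded
        System.out.println(forwardOverrides(jobConf, new LinkedHashMap<>()));
    }
}
```

With this kind of forwarding in place, the WebHCat-submitted pig job's -Dhive.metastore.uris= and -Djavax.jdo.option.ConnectionPassword=... arguments would be visible to the embedded metastore, which is the behavior the JIRA asks for.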
[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore
[ https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932149#comment-13932149 ] Eric Hanson commented on HIVE-6633: --- Code review at https://reviews.apache.org/r/19140/
Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/#review36803 --- Ship it! Ship It! - Eric Hanson On March 10, 2014, 9:51 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/ --- (Updated March 10, 2014, 9:51 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- Vectorized cast of decimal to string and timestamp produces incorrect result. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 34bd9d0 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java debc270 common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 9ac68fe ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 2e8c3a4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 832463d ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 Diff: https://reviews.apache.org/r/18972/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6568) Vectorized cast of decimal to string and timestamp produces incorrect result.
[ https://issues.apache.org/jira/browse/HIVE-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930574#comment-13930574 ] Eric Hanson commented on HIVE-6568: --- +1 Vectorized cast of decimal to string and timestamp produces incorrect result. - Key: HIVE-6568 URL: https://issues.apache.org/jira/browse/HIVE-6568 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.13.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6568.1.patch, HIVE-6568.2.patch, HIVE-6568.3.patch A decimal value 1.23 with scale 5 is represented in string as 1.23000. This behavior is different from HiveDecimal behavior. The difference in cast to timestamp is due to more aggressive rounding in vectorized expression. -- This message was sent by Atlassian JIRA (v6.2#6252)
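The scale-versus-rendering point in the HIVE-6568 description can be illustrated with the JDK's BigDecimal (used here only as a stand-in; Hive's code paths use HiveDecimal and Decimal128): the same unscaled value prints with trailing zeros once the scale is pinned at 5, which is the behavior the vectorized cast exhibited.

```java
import java.math.BigDecimal;

public class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("1.23");
        // Natural rendering keeps the minimal scale of 2.
        System.out.println(d.toPlainString());             // 1.23
        // Forcing scale 5 pads with trailing zeros.
        System.out.println(d.setScale(5).toPlainString()); // 1.23000
    }
}
```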
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch Upload again to try to kick off pre-commit tests WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6546.01.patch, HIVE-6546.02.patch, HIVE-6546.03.patch, HIVE-6546.03.patch

On a one-box Windows setup, do the following from a PowerShell prompt:

cmd /c curl.exe -s `
 -d user.name=hadoop `
 -d arg=-useHCatalog `
 -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
 -d statusdir=/tmp/webhcat.output01 `
 'http://localhost:50111/templeton/v1/pig' -v

The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to

cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;

Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:

{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}

Bug is here:

{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain an equals sign,
    // so the else branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}

One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed.
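The quoted preProcessForWindows() logic can be reduced to a standalone sketch (simplified; not the actual Hadoop source): a -D argument that carries no '=' greedily consumes the NEXT argument as its value, which is exactly how -D__WEBHCAT_TOKEN_FILE_LOCATION__ swallows -useHCatalog, and why baking an '=' into the placeholder constant fixes it.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessSketch {
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            String prop;
            if (args[i].startsWith("-D")) {
                prop = args[i];
            } else {
                newArgs.add(args[i]); // non -D args pass through unchanged
                continue;
            }
            if (prop.contains("=")) {
                // value already attached: everything good
            } else if (i < args.length - 1) {
                prop += "=" + args[++i]; // RIGHT HERE: next arg becomes the value
            }
            newArgs.add(prop);
        }
        return newArgs;
    }

    public static void main(String[] args) {
        // The placeholder has no '=', so -useHCatalog is swallowed:
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__", "-useHCatalog" }));
        // With an '=' baked into the placeholder, both args survive:
        System.out.println(preProcess(new String[] {
                "-D__WEBHCAT_TOKEN_FILE_LOCATION__=x", "-useHCatalog" }));
    }
}
```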
Re: Review Request 18972: Vectorized cast of decimal to string and timestamp produces incorrect result.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/#review36703 --- Overall it looks good. Please see my specific comments. common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java https://reviews.apache.org/r/18972/#comment67785 Please add one or more tests with a large integer with trailing zeros, e.g. 1234123000, to make sure that comes out right (no zeros get lopped off). ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java https://reviews.apache.org/r/18972/#comment67784 Please comment why you're using this logic. - Eric Hanson On March 10, 2014, 5:02 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18972/ --- (Updated March 10, 2014, 5:02 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- Vectorized cast of decimal to string and timestamp produces incorrect result. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 9d25620 common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java 34bd9d0 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java debc270 common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java 9ac68fe ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 2e8c3a4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 832463d ql/src/test/queries/clientpositive/vector_decimal_expressions.q 38934d2 ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 629f5d5 Diff: https://reviews.apache.org/r/18972/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/#review36558 --- Ship it! looks good to me! common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67561 Can you open a bug for scaleDownTenDestructive, based on what you found? You can make it low priority since it is not getting called in the current code paths. But it will be good to have a record of it. - Eric Hanson On March 7, 2014, 4:39 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/ --- (Updated March 7, 2014, 4:39 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6511 https://issues.apache.org/jira/browse/HIVE-6511 Repository: hive-git Description --- Casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 426c03d Diff: https://reviews.apache.org/r/18808/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924234#comment-13924234 ] Eric Hanson commented on HIVE-6511: --- +1 casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on Key: HIVE-6511 URL: https://issues.apache.org/jira/browse/HIVE-6511 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6511.1.patch, HIVE-6511.2.patch, HIVE-6511.3.patch, HIVE-6511.4.patch

select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:

{code}
4619756289662.078125    -1628520834  -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352      688127224    -776    -8
4386447830839.337891     1286221623   12087    55
-3234165331139.458008     -54957251   27453    61
-488378613475.326172     1247658269  -16099    29
-493942492598.691406      -21253559  -19895    73
3101852523586.039062      886135874   23618    66
2544105595941.381836     1484956709  -23515    37
-3997512403067.0625      1102149509   30597  -123
-1183754978977.589355    1655994718   31070    94
1408783849655.676758       34576568  -26440   -72
-2993175106993.426758     417098319   27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133      -14511544  -28088    72
469767055288.485352      1615620024   26552   -72
-1263700791098.294434    -980406074   12486   -58
-4244889766496.484375   -1462078048   30112   -96
-3962729491139.782715    1525323068  -27332    60
NULL                           NULL    NULL  NULL
{code}

When vectorization is disabled, the result looks like this:

{code}
4619756289662.078125    -1628520834  -16770   126
1553532646710.316406    -1245514442   -2762    54
3367942487288.360352      688127224    -776    -8
4386447830839.337891     1286221623   12087    55
-3234165331139.458008     -54957251   27453    61
-488378613475.326172     1247658269  -16099    29
-493942492598.691406      -21253558  -19894    74
3101852523586.039062      886135874   23618    66
2544105595941.381836     1484956709  -23515    37
-3997512403067.0625      1102149509   30597  -123
-1183754978977.589355    1655994719   31071    95
1408783849655.676758       34576567  -26441   -73
-2993175106993.426758     417098319   27215    79
3004723551798.100586    -1753555402   -8650    54
1103792083527.786133      -14511545  -28089    71
469767055288.485352      1615620024   26552   -72
-1263700791098.294434    -980406074   12486   -58
-4244889766496.484375   -1462078048   30112   -96
-3962729491139.782715    1525323069  -27331    61
NULL                           NULL    NULL  NULL
{code}

This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, and 15 generate different results.

vectortab10korc table schema:

{code}
t   tinyint          from deserializer
si  smallint         from deserializer
i   int              from deserializer
b   bigint           from deserializer
f   float            from deserializer
d   double           from deserializer
dc  decimal(38,18)   from deserializer
bo  boolean          from deserializer
s   string           from deserializer
s2  string           from deserializer
ts  timestamp        from deserializer

# Detailed Table Information
Database:       default
Owner:          xyz
CreateTime:     Tue Feb 25 21:54:28 UTC 2014
LastAccessTime: UNKNOWN
Protect Mode:   None
Retention:      0
Location:       hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:     MANAGED_TABLE
Table Parameters:
  COLUMN_STATS_ACCURATE  true
  numFiles               1
  numRows                1
  rawDataSize            0
  totalSize              344748
  transient_lastDdlTime  1393365281

# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
{code}
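For the rows that disagree, the non-vectorized numbers match plain Java narrowing semantics: discard the fraction (truncate toward zero), then keep the low-order bits at each narrower width. A sketch over row 7's value, which reproduces the non-vectorized output exactly and suggests the vectorized path was rounding before narrowing (this diagnosis is an inference from the output, not from the patch):

```java
import java.math.BigDecimal;

public class NarrowingDemo {
    public static void main(String[] args) {
        // Row 7 of the query output: dc = -493942492598.691406
        BigDecimal dc = new BigDecimal("-493942492598.691406");
        long asLong = dc.longValue();    // fraction discarded: -493942492598
        int asInt = (int) asLong;        // low 32 bits: -21253558
        short asShort = (short) asLong;  // low 16 bits: -19894
        byte asByte = (byte) asLong;     // low 8 bits:  74
        System.out.println(asInt + "\t" + asShort + "\t" + asByte);
    }
}
```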
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Fix Version/s: 0.13.0 Assignee: Eric Hanson Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.01.patch Changed constant placeholder to include = sign
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.02.patch removed trailing white space
Review Request 18816: WebHCat job submission for pig with -useHCatalog argument fails on Windows
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18816/ --- Review request for hive. Bugs: HIVE-6546 https://issues.apache.org/jira/browse/HIVE-6546 Repository: hive-git Description --- See JIRA Diffs - hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java 482e993 Diff: https://reviews.apache.org/r/18816/diff/ Testing --- Thanks, Eric Hanson
[jira] [Commented] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13921553#comment-13921553 ] Eric Hanson commented on HIVE-6546: --- Code review at https://reviews.apache.org/r/18816/
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Attachment: HIVE-6546.03.patch fix typo
Re: Review Request 18808: Casting from decimal to tinyint, smallint, int and bigint generates different result when vectorization is on
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/#review36290 --- common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67244 Nice idea to special-case signum==0 and scale==0 cases to speed it up. common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java https://reviews.apache.org/r/18808/#comment67243 Decimal128.divideDestructive had a bug that we worked around by just rewriting it to use HiveDecimal divide. I am worried that UnsignedInt128.divideDestructive could have been the original source of the bug. That makes me think it might be safer to just use the HiveDecimal code here to do the divide by 10**scale. - Eric Hanson On March 5, 2014, 9:39 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18808/ --- (Updated March 5, 2014, 9:39 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6511 https://issues.apache.org/jira/browse/HIVE-6511 Repository: hive-git Description --- Casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on. Diffs - common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java a5d7399 common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 426c03d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java d5f34d5 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java df7e1ee Diff: https://reviews.apache.org/r/18808/diff/ Testing --- Thanks, Jitendra Pandey
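For comparison with the divide-by-10**scale under discussion, the JDK's BigDecimal expresses the same scale-down as an exact decimal-point shift plus an explicit rounding step; a minimal illustration (not the Decimal128/UnsignedInt128 code under review):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ScaleDownDemo {
    public static void main(String[] args) {
        BigDecimal unscaled = new BigDecimal("123456");
        // Dividing by 10**3 is exactly a 3-place point shift; no division loop needed.
        BigDecimal shifted = unscaled.movePointLeft(3);                // 123.456
        // Truncating back to an integral value, rounding toward zero.
        BigDecimal truncated = shifted.setScale(0, RoundingMode.DOWN); // 123
        System.out.println(shifted + " " + truncated);
    }
}
```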
[jira] [Created] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
Eric Hanson created HIVE-6546: - Summary: WebHCat job submission for pig with -useHCatalog argument fails on Windows Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.11.0, 0.13.0 Environment: Windows Azure HDINSIGHT and Windows one-box installations. Reporter: Eric Hanson
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Description: added the full repro steps and root-cause analysis (the same text quoted in the other HIVE-6546 messages in this thread) Reporter: Eric Hanson
[jira] [Updated] (HIVE-6546) WebHCat job submission for pig with -useHCatalog argument fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6546: -- Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. was:Windows Azure HDINSIGHT and Windows one-box installations. WebHCat job submission for pig with -useHCatalog argument fails on Windows -- Key: HIVE-6546 URL: https://issues.apache.org/jira/browse/HIVE-6546 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0 Environment: HDInsight deploying HDP 1.3: c:\apps\dist\pig-0.11.0.1.3.2.0-05 Also on Windows HDP 1.3 one-box configuration. Reporter: Eric Hanson

On a one-box Windows setup, do the following from a PowerShell prompt:

cmd /c curl.exe -s `
  -d user.name=hadoop `
  -d arg=-useHCatalog `
  -d execute="emp = load '/data/emp/emp_0.dat'; dump emp;" `
  -d statusdir=/tmp/webhcat.output01 `
  'http://localhost:50111/templeton/v1/pig' -v

The job fails with error code 7, but it should run. I traced this down to the following. In the job configuration for the TempletonJobController, we have templeton.args set to

cmd,/c,call,C:\\hadooppig-0.11.0.1.3.0.0-0846/bin/pig.cmd,-D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog,-execute,emp = load '/data/emp/emp_0.dat'; dump emp;

Notice the = sign before -useHCatalog. I think this should be a comma. The bad string -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog gets created in org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(). It happens at line 434:

{code}
} else {
  if (i < args.length - 1) {
    prop += "=" + args[++i]; // RIGHT HERE! at iterations i = 37, 38
  }
}
{code}

The bug is here:

{code}
if (prop != null) {
  if (prop.contains("=")) {
    // everything good
  } else {
    // -D__WEBHCAT_TOKEN_FILE_LOCATION__ does not contain "=", so the else
    // branch runs and appends "=-useHCatalog"
    if (i < args.length - 1) {
      prop += "=" + args[++i];
    }
  }
  newArgs.add(prop);
}
{code}

One possible fix is to change the string constant org.apache.hcatalog.templeton.tool.TempletonControllerJob.TOKEN_FILE_ARG_PLACEHOLDER to have an = sign in it. Or, preProcessForWindows() itself could be changed. -- This message was sent by Atlassian JIRA (v6.2#6252)
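The consume-next-argument behavior described in this report can be sketched as follows. This is an illustrative reconstruction of the -D handling, not the actual org.apache.hadoop.util.GenericOptionsParser source (the class and argument names here are simplified); it shows why a placeholder that already contains an = sign would be left alone.

```java
import java.util.ArrayList;
import java.util.List;

public class PreProcessSketch {
    // Simplified sketch of the -D handling described above: a bare "-Dkey"
    // with no "=" swallows the NEXT argument as its value, which is how
    // "-useHCatalog" gets fused into -D__WEBHCAT_TOKEN_FILE_LOCATION__=-useHCatalog.
    static List<String> preProcess(String[] args) {
        List<String> newArgs = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-D")) {
                newArgs.add(args[i]);
                continue;
            }
            String prop = args[i];
            if (!prop.contains("=") && i < args.length - 1) {
                prop += "=" + args[++i]; // the problematic fusion
            }
            newArgs.add(prop);
        }
        return newArgs;
    }

    public static void main(String[] a) {
        // Placeholder without "=": -useHCatalog is swallowed.
        System.out.println(preProcess(new String[]{"-D__TOKEN__", "-useHCatalog"}));
        // → [-D__TOKEN__=-useHCatalog]
        // With an "=" already present, -useHCatalog survives as its own argument.
        System.out.println(preProcess(new String[]{"-D__TOKEN__=dummy", "-useHCatalog"}));
        // → [-D__TOKEN__=dummy, -useHCatalog]
    }
}
```

The second case is the essence of the proposed TOKEN_FILE_ARG_PLACEHOLDER fix: a placeholder that already contains = never triggers the fusing branch.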
[jira] [Commented] (HIVE-6511) casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918290#comment-13918290 ] Eric Hanson commented on HIVE-6511: --- Can you put this up on ReviewBoard? casting from decimal to tinyint,smallint, int and bigint generates different result when vectorization is on Key: HIVE-6511 URL: https://issues.apache.org/jira/browse/HIVE-6511 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6511.1.patch

select dc, cast(dc as int), cast(dc as smallint), cast(dc as tinyint) from vectortab10korc limit 20

generates the following result when vectorization is enabled:

{code}
4619756289662.078125   -1628520834  -16770  126
1553532646710.316406   -1245514442  -2762   54
3367942487288.360352   688127224    -776    -8
4386447830839.337891   1286221623   12087   55
-3234165331139.458008  -54957251    27453   61
-488378613475.326172   1247658269   -16099  29
-493942492598.691406   -21253559    -19895  73
3101852523586.039062   886135874    23618   66
2544105595941.381836   1484956709   -23515  37
-3997512403067.0625    1102149509   30597   -123
-1183754978977.589355  1655994718   31070   94
1408783849655.676758   34576568     -26440  -72
-2993175106993.426758  417098319    27215   79
3004723551798.100586   -1753555402  -8650   54
1103792083527.786133   -14511544    -28088  72
469767055288.485352    1615620024   26552   -72
-1263700791098.294434  -980406074   12486   -58
-4244889766496.484375  -1462078048  30112   -96
-3962729491139.782715  1525323068   -27332  60
NULL                   NULL         NULL    NULL
{code}

When vectorization is disabled, the result looks like this:

{code}
4619756289662.078125   -1628520834  -16770  126
1553532646710.316406   -1245514442  -2762   54
3367942487288.360352   688127224    -776    -8
4386447830839.337891   1286221623   12087   55
-3234165331139.458008  -54957251    27453   61
-488378613475.326172   1247658269   -16099  29
-493942492598.691406   -21253558    -19894  74
3101852523586.039062   886135874    23618   66
2544105595941.381836   1484956709   -23515  37
-3997512403067.0625    1102149509   30597   -123
-1183754978977.589355  1655994719   31071   95
1408783849655.676758   34576567     -26441  -73
-2993175106993.426758  417098319    27215   79
3004723551798.100586   -1753555402  -8650   54
1103792083527.786133   -14511545    -28089  71
469767055288.485352    1615620024   26552   -72
-1263700791098.294434  -980406074   12486   -58
-4244889766496.484375  -1462078048  30112   -96
-3962729491139.782715  1525323069   -27331  61
NULL                   NULL         NULL    NULL
{code}

This issue is visible only for certain decimal values. In the above example, rows 7, 11, 12, and 15 generate different results. vectortab10korc table schema:

{code}
t    tinyint         from deserializer
s    smallint        from deserializer
i    int             from deserializer
b    bigint          from deserializer
f    float           from deserializer
d    double          from deserializer
dc   decimal(38,18)  from deserializer
bo   boolean         from deserializer
s    string          from deserializer
s2   string          from deserializer
ts   timestamp       from deserializer

# Detailed Table Information
Database:        default
Owner:           xyz
CreateTime:      Tue Feb 25 21:54:28 UTC 2014
LastAccessTime:  UNKNOWN
Protect Mode:    None
Retention:       0
Location:        hdfs://host1.domain.com:8020/apps/hive/warehouse/vectortab10korc
Table Type:      MANAGED_TABLE
Table Parameters:
  COLUMN_STATS_ACCURATE  true
  numFiles               1
  numRows                1
  rawDataSize            0
  totalSize              344748
  transient_lastDdlTime  1393365281

# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat
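For reference, the non-vectorized row values are consistent with plain Java narrowing conversions applied to the decimal's integral part. The sketch below assumes the row-mode cast truncates the decimal toward zero to a long and then narrows to int/short/byte (keeping the low 32/16/8 bits); on that assumption it reproduces row 1 of the reference output exactly.

```java
import java.math.BigDecimal;

public class DecimalNarrowingDemo {
    public static void main(String[] args) {
        // Row 1 from the non-vectorized output above. Narrowing keeps the
        // low 32/16/8 bits of the truncated value, so large decimals wrap
        // around in two's complement.
        BigDecimal dc = new BigDecimal("4619756289662.078125");
        long truncated = dc.longValue();      // 4619756289662
        int asInt = (int) truncated;          // -1628520834
        short asSmallint = (short) truncated; // -16770
        byte asTinyint = (byte) truncated;    // 126
        System.out.println(asInt + " " + asSmallint + " " + asTinyint);
        // → -1628520834 -16770 126
    }
}
```

The off-by-one discrepancies in rows 7, 11, 12, and 15 are then plausibly a difference in how the vectorized path converts the decimal before truncation, rather than in the narrowing itself.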
RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang
Congratulations Xuefu! -Original Message- From: Remus Rusanu [mailto:rem...@microsoft.com] Sent: Friday, February 28, 2014 11:43 AM To: dev@hive.apache.org; u...@hive.apache.org Cc: Xuefu Zhang Subject: RE: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang Grats! From: Prasanth Jayachandran pjayachand...@hortonworks.com Sent: Friday, February 28, 2014 9:11 PM To: dev@hive.apache.org Cc: u...@hive.apache.org; Xuefu Zhang Subject: Re: [ANNOUNCE] New Hive PMC Member - Xuefu Zhang Congratulations Xuefu! Thanks Prasanth Jayachandran On Feb 28, 2014, at 11:04 AM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congrats Xuefu! On Fri, Feb 28, 2014 at 9:20 AM, Prasad Mujumdar pras...@cloudera.com wrote: Congratulations Xuefu !! thanks Prasad On Fri, Feb 28, 2014 at 1:20 AM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Xuefu Zhang has been elected to the Hive Project Management Committee. Please join me in congratulating Xuefu! Thanks. Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
need advice on debugging into TempletonJobController.java
I want to attach a debugger to TempletonJobController.java (code that runs in a map job started by the templeton service, which in turn starts another job). Does anybody know how to make the job wait for a debugger to attach, i.e., which file to modify to change the Java opts? Eric

Details of what I tried: I tried adding it in %hadoop_home%/conf/mapred-site.xml but it didn't work:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -Xmx1024m</value>
</property>

I also tried this, in %hcatalog_home%\etc\webhcat\webhcat-default.xml, adding:

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

But the job doesn't appear to wait, and I keep seeing this in my job config: mapred.child.java.opts -server -Xmx256m -Djava.net.preferIPv4Stack=true
RE: need advice on debugging into TempletonJobController.java
Hey, I found the solution. You need to add this to webhcat-site.xml. -Eric

To attach the debugger to the templeton controller MR job started by the templeton service, go to %hcatalog_home%\conf\webhcat-site.xml and add the following block (copied from etc\webhcat\webhcat-default.xml, and enhanced with the highlighted options for debugging):

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

-Original Message- From: Eric Hanson (BIG DATA) [mailto:eric.n.han...@microsoft.com] Sent: Friday, February 28, 2014 12:06 PM To: dev@hive.apache.org Subject: need advice on debugging into TempletonJobController.java

I want to attach a debugger to TempletonJobController.java (code that runs in a map job started by the templeton service, which in turn starts another job). Does anybody know how to make the job wait for a debugger to attach, i.e., which file to modify to change the Java opts? Eric

Details of what I tried: I tried adding it in %hadoop_home%/conf/mapred-site.xml but it didn't work:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -Xmx1024m</value>
</property>

I also tried this, in %hcatalog_home%\etc\webhcat\webhcat-default.xml, adding:

<property>
  <name>templeton.controller.mr.child.opts</name>
  <value>-Xdebug -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,address=8004,server=y,suspend=y -server -Xmx256m -Djava.net.preferIPv4Stack=true</value>
  <description>Java options to be passed to the templeton controller map task. The default value of the MapReduce child -Xmx (heap memory limit) might be close to what is allowed for a map task. Even if the templeton controller map task does not need much memory, the JVM (with the -server option?) allocates the max memory when it starts. This, along with the memory used by the pig/hive client it starts, can end up exceeding the max memory configured to be allowed for a map task. Use this option to set -Xmx to a lower value.</description>
</property>

But the job doesn't appear to wait, and I keep seeing this in my job config: mapred.child.java.opts -server -Xmx256m -Djava.net.preferIPv4Stack=true
Re: Review Request 18566: Queries fail to Vectorize.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18566/#review35682 --- Looks good. Please add unit tests to exercise the code you changed, or if this code is already covered by other tests, please explain in comments on the JIRA. common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java https://reviews.apache.org/r/18566/#comment66370 Please add comment saying the purpose of this method - Eric Hanson On Feb. 27, 2014, 6:43 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18566/ --- (Updated Feb. 27, 2014, 6:43 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6496 https://issues.apache.org/jira/browse/HIVE-6496 Repository: hive-git Description --- 1) NPE because row resolver is null. 2) VectorUDFAdapter doesn't handle decimal. 3) Decimal cast to boolean, timestamp, string fail because classes are not annotated appropriately. 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated. Diffs - common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java 09af28a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 4de9f9f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 842994e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/udf/VectorUDFAdaptor.java 3bc9493 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java e6be03f ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java 54c665e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMod.java db4eafa ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTimestamp.java e2529d2 Diff: https://reviews.apache.org/r/18566/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6496) Queries fail to Vectorize.
[ https://issues.apache.org/jira/browse/HIVE-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914962#comment-13914962 ] Eric Hanson commented on HIVE-6496: --- +1 conditional on addressing my review comments Queries fail to Vectorize. -- Key: HIVE-6496 URL: https://issues.apache.org/jira/browse/HIVE-6496 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6496.1.patch, HIVE-6496.2.patch, HIVE-6496.3.patch Following issues are causing many queries to fail to vectorize: 1) NPE because row resolver is null. 2) VectorUDFAdapter doesn't handle decimal. 3) Decimal cast to boolean, timestamp, string fail because classes are not annotated appropriately. 4) Decimal modulo fails to vectorize because GenericUDFOPMod is not annotated. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
RE: [ANNOUNCE] New Hive Committer - Remus Rusanu
Fantastic! Welcome aboard, Remus! Eric From: Carl Steinbach [mailto:cwsteinb...@gmail.com] Sent: Wednesday, February 26, 2014 8:59 AM To: u...@hive.apache.org; dev@hive.apache.org Cc: Remus Rusanu Subject: [ANNOUNCE] New Hive Committer - Remus Rusanu The Apache Hive PMC has voted to make Remus Rusanu a committer on the Apache Hive Project. Please join me in congratulating Remus! Thanks. Carl
Re: Review Request 18184: Vectorized mathematical functions for decimal type.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18184/#review34783 --- ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65070 format comment better (blank after //, blank line before first comment line) ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65066 I think you could speed this up with an array fill operation for outputIsNull before the loop, but that is a nice-to-have and not essential. ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65071 remove trailing white space ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt https://reviews.apache.org/r/18184/#comment65073 remove trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18184/#comment65132 Please add comment to explain what method does. 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65136 delete trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65137 delete trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65138 fix comment format ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65139 remove trailing white space ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java https://reviews.apache.org/r/18184/#comment65140 remove trailing white space ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java https://reviews.apache.org/r/18184/#comment65144 please add cases for non-zero values close to 0 like -0.3 and 0.3 for floor and ceiling ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java https://reviews.apache.org/r/18184/#comment65147 Please add test to negate 0 and make sure you still get 0 ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q https://reviews.apache.org/r/18184/#comment65148 please remove trailing white space in .q file (several locations) - Eric Hanson On Feb. 17, 2014, 9:05 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18184/ --- (Updated Feb. 17, 2014, 9:05 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6416 https://issues.apache.org/jira/browse/HIVE-6416 Repository: hive-git Description --- Vectorized mathematical functions for decimal type. 
Diffs - ant/src/org/apache/hadoop/hive/ant/GenVectorCode.java 1b76fc9 common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java 2e0f058 ql/src/gen/vectorization/ExpressionTemplates/DecimalColumnUnaryFunc.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f69bfc0 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/DecimalUtil.java 589450f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncRoundWithNumDigitsDecimalToDecimal.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSign.java 628f06d ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFAbs.java 1c1bcfe ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCeil.java ceb56bb ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFloor.java a95a263 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNegative.java f355a82 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRound.java 5cc8025 ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestDecimalUtil.java PRE-CREATION ql/src/test/queries/clientpositive/vector_decimal_math_funcs.q PRE-CREATION ql/src/test/results/clientpositive/vector_decimal_math_funcs.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18184/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13904829#comment-13904829 ] Eric Hanson commented on HIVE-6416: --- Looks good to me. +1 conditional on addressing my review comments (all of which are minor) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
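The failing product reported in this issue can be reproduced independently with java.math.BigInteger, which is a handy reference oracle when debugging fixed-width multiply routines like Decimal128:

```java
import java.math.BigInteger;

public class Decimal128MultiplyCheck {
    public static void main(String[] args) {
        // Reference check of the product from the bug report.
        BigInteger a = new BigInteger("-605044214913338382");
        BigInteger b = new BigInteger("55269579109718297360");
        BigInteger expected =
            new BigInteger("-33440539101030154945490585226577271520");
        System.out.println(a.multiply(b).equals(expected)); // → true
    }
}
```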
[jira] [Created] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
Eric Hanson created HIVE-6452: - Summary: fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
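For illustration, the kind of routine multiplyArrays4And4To8 implements — multiplying two little-endian arrays of 32-bit limbs into a double-width product with no per-operation allocation beyond the result — can be sketched as below. This is illustrative code, not the Hive implementation; it cross-checks against BigInteger the way the new tests in the starter patch could.

```java
import java.math.BigInteger;

public class LimbMultiplySketch {
    // Schoolbook multiply of two little-endian arrays of 32-bit limbs
    // (stored in ints, treated as unsigned) into an (n+m)-limb product.
    // Each partial product plus accumulator plus carry fits in an
    // unsigned 64-bit value, so a long holds it bit-exactly.
    static int[] multiply(int[] a, int[] b) {
        int[] r = new int[a.length + b.length];
        for (int i = 0; i < a.length; i++) {
            long ai = a[i] & 0xFFFFFFFFL;
            long carry = 0;
            for (int j = 0; j < b.length; j++) {
                long t = ai * (b[j] & 0xFFFFFFFFL)
                       + (r[i + j] & 0xFFFFFFFFL) + carry;
                r[i + j] = (int) t;   // low 32 bits
                carry = t >>> 32;     // high 32 bits
            }
            r[i + b.length] = (int) carry;
        }
        return r;
    }

    // Convert a little-endian limb array to a BigInteger for cross-checking.
    static BigInteger toBig(int[] limbs) {
        BigInteger v = BigInteger.ZERO;
        for (int i = limbs.length - 1; i >= 0; i--) {
            v = v.shiftLeft(32).or(BigInteger.valueOf(limbs[i] & 0xFFFFFFFFL));
        }
        return v;
    }

    public static void main(String[] args) {
        int[] a = {0xFFFFFFFF, 0x12345678, 0x9ABCDEF0, 0x0FEDCBA9};
        int[] b = {0x87654321, 0xFFFFFFFF, 0x00000001, 0xDEADBEEF};
        int[] p = multiply(a, b);
        System.out.println(toBig(p).equals(toBig(a).multiply(toBig(b)))); // → true
    }
}
```

Randomized comparison against BigInteger over many iterations is an effective way to flush out the rare carry-propagation failures this JIRA describes.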
[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
[ https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6452: -- Attachment: multiplyArrays4And4To8-start.patch fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive --- Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Attachments: multiplyArrays4And4To8-start.patch UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6452) fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive
[ https://issues.apache.org/jira/browse/HIVE-6452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6452: -- Assignee: Jitendra Nath Pandey fix bug in UnsignedInt128.multiplyArrays4And4To8 and revert temporary fix in Decimal128.multiplyDestructive --- Key: HIVE-6452 URL: https://issues.apache.org/jira/browse/HIVE-6452 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Attachments: multiplyArrays4And4To8-start.patch UnsignedInt128.multiplyArrays4And4To8 has a bug that causes rare multiply failures, one of which appears in TestDecimal128.testKnownPriorErrors. Fix the bug by finishing the TODO section in UnsignedInt128.multiplyArrays4And4To8 in the provided multiplyArrays4And4To8-start.patch. Make it fast and make it work with no per-operation storage allocations. Retain the rest of the work (the new tests) in multiplyArrays4And4To8-start.patch as much as possible. Revert the changes to Decimal128.multiplyDestructive so it doesn't use the short-term, slow fix based on HiveDecimal. I.e. use the implementation in multiplyDestructiveNativeDecimal128. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6435) Allow specification of alternate metastore in WebHCat job
[ https://issues.apache.org/jira/browse/HIVE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6435: -- Description: Allow a user to specify with their WebHCat Hive and Pig jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. (was: Allow a user to specify with their WebHCat jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster.) Allow specification of alternate metastore in WebHCat job - Key: HIVE-6435 URL: https://issues.apache.org/jira/browse/HIVE-6435 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify with their WebHCat Hive and Pig jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5759) Implement vectorized support for COALESCE conditional expression
[ https://issues.apache.org/jira/browse/HIVE-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901749#comment-13901749 ] Eric Hanson commented on HIVE-5759: --- +1 Also, the failure in testHighPrecisionDecimal128Multiply is external to this patch. Implement vectorized support for COALESCE conditional expression Key: HIVE-5759 URL: https://issues.apache.org/jira/browse/HIVE-5759 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Attachments: HIVE-5759.1.patch, HIVE-5759.2.patch Implement full, end-to-end support for COALESCE in vectorized mode, including new VectorExpression class(es), VectorizationContext translation to a VectorExpression, and unit tests for these, as well as end-to-end ad hoc testing. An end-to-end .q test is recommended. This is lower priority than IF and CASE but it is still a fairly popular expression. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
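A column-at-a-time COALESCE of the kind this issue asks for can be sketched as follows. The names and layout here are illustrative only — loosely modeled on Hive's column vectors with their isNull masks, not the actual VectorCoalesce or LongColumnVector classes.

```java
public class CoalesceSketch {
    // Minimal columnar batch column: values plus a null mask (illustrative,
    // loosely modeled on a Hive LongColumnVector).
    static final class Column {
        final long[] vector;
        final boolean[] isNull;
        Column(long[] v, boolean[] n) { vector = v; isNull = n; }
    }

    // COALESCE over a batch: each output row takes the first non-null input;
    // the output row is null only if every input is null at that row.
    static Column coalesce(int n, Column... inputs) {
        long[] out = new long[n];
        boolean[] outNull = new boolean[n];
        for (int row = 0; row < n; row++) {
            outNull[row] = true;
            for (Column c : inputs) {
                if (!c.isNull[row]) {
                    out[row] = c.vector[row];
                    outNull[row] = false;
                    break;
                }
            }
        }
        return new Column(out, outNull);
    }

    public static void main(String[] args) {
        Column a = new Column(new long[]{1, 0, 0}, new boolean[]{false, true, true});
        Column b = new Column(new long[]{9, 7, 0}, new boolean[]{false, false, true});
        Column r = coalesce(3, a, b);
        System.out.println(r.vector[0] + " " + r.vector[1] + " " + r.isNull[2]);
        // → 1 7 true
    }
}
```

The COALESCE(col1, ..., colK, 0) pattern mentioned in the review would simply add a constant column as the last input, guaranteeing a non-null fallback.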
[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6399: - Assignee: Eric Hanson (was: Remus Rusanu) bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13901995#comment-13901995 ] Eric Hanson commented on HIVE-6399: --- Remus' patch is technically good. I have a question I'll raise with the PMC though about the comment about using the algorithm from BigInteger.multiplyToLen. For now I'm going to promote my original patch to get it in so we can get the bug failure out of trunk. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.05.patch Promoting patch 02 to first position to get committed, now as 05. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Components: Query Processor, Vectorization Reporter: Eric Hanson Assignee: Eric Hanson Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch, HIVE-6399.05.patch, HIVE-6399.3.patch, HIVE-6399.4.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6435) Allow specification of alternate metastore in WebHCat job
Eric Hanson created HIVE-6435: - Summary: Allow specification of alternate metastore in WebHCat job Key: HIVE-6435 URL: https://issues.apache.org/jira/browse/HIVE-6435 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify with their WebHCat jobs a metastore database JDBC connection string. For the job, this overrides the default metastore configured for the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job
Eric Hanson created HIVE-6436: - Summary: Allow specification of one or more additional Windows Azure storage accounts in WebHCat job Key: HIVE-6436 URL: https://issues.apache.org/jira/browse/HIVE-6436 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Allow a user to specify one or more additional Windows Azure storage accounts, including account name and key, in a WebHCat Hive job submission. These would be in addition to any that were specified in the default cluster configuration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HIVE-6436) Allow specification of one or more additional Windows Azure storage accounts in WebHCat job
[ https://issues.apache.org/jira/browse/HIVE-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6436: - Assignee: Eric Hanson Allow specification of one or more additional Windows Azure storage accounts in WebHCat job --- Key: HIVE-6436 URL: https://issues.apache.org/jira/browse/HIVE-6436 Project: Hive Issue Type: Improvement Components: CLI, WebHCat Reporter: Eric Hanson Assignee: Eric Hanson Allow a user to specify one or more additional Windows Azure storage accounts, including account name and key, in a WebHCat Hive job submission. These would be in addition to any that were specified in the default cluster configuration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/#review34370 --- ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64447 Can you do one with 3 arguments too? Will that vectorize? ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64450 Please also test for smallint and timestamp. ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64451 Please also test for expressions as arguments, not just columns. ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64448 It is not unusual to use COALESCE like this: COALESCE(col1, ..., colK, 0) So if arguments 1..K are NULL, the default value is the constant at the end, 0 in this case. Could you please make that work in this patch, or open a separate JIRA to do it later? - Eric Hanson On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/ --- (Updated Feb. 12, 2014, 7 p.m.) Review request for hive and Eric Hanson. 
Bugs: HIVE-5759 https://issues.apache.org/jira/browse/HIVE-5759 Repository: hive-git Description --- Implement vectorized support for COALESCE conditional expression Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java f1eef14 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java d0d8597 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java cb23129 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java aa05b19 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 7141d63 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 21fe8ca ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18025/diff/ Testing --- Thanks, Jitendra Pandey
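[Editor's note] The review above asks for the common COALESCE(col1, ..., colK, 0) pattern: return the first non-NULL argument, so a trailing constant acts as a default when all columns are NULL. A minimal Python sketch of that row-wise semantics (illustration only, not Hive's vectorized implementation):

```python
def coalesce(*args):
    """Return the first argument that is not None (SQL NULL), else None."""
    for a in args:
        if a is not None:
            return a
    return None

# A trailing constant acts as a default when all column values are NULL:
rows = [(None, None), (None, 7), (3, None)]
results = [coalesce(c1, c2, 0) for c1, c2 in rows]  # → [0, 7, 3]
```

The vectorized version applies the same rule column-wise over a batch, which is why the reviewer asks how scalar (constant) arguments are handled.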
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.02.patch Uploading again to trigger precommit tests. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
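[Editor's note] The reported failure can be cross-checked independently of Decimal128: Python integers are arbitrary precision, so the exact reference product is one multiplication away. A quick sketch:

```python
# Operands and values from the HIVE-6399 report.
a = -605044214913338382
b = 55269579109718297360

expected = -33440539101030154945490585226577271520  # value the JUnit test expects
buggy    = -33440539021801992431226247633033321184  # value Decimal128 produced

product = a * b  # exact: Python ints never overflow
```

The exact product matches the test's expected value, confirming that the Decimal128 multiply (not the test's reference computation) is at fault.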
Re: Review Request 18025: Implement vectorized support for COALESCE conditional expression
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18025/#review34336 --- ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java https://reviews.apache.org/r/18025/#comment64388 I think setRef is only safe for base vectors (that get data from table columns), not intermediate working results. There was a bug there, since those can get re-used during processing of a single vectorized row batch. So, use setVal here unless you know the source vector is a base vector loaded from a table column. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java https://reviews.apache.org/r/18025/#comment64389 use .update() instead of = assignment or you could have a bug. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18025/#comment64392 please add a comment to explain what the method does ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/18025/#comment64395 This is the same code block as the previous case. Can you share the case and change the condition to an OR? Up to you... ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64396 I'm not sure this is always EOF. Consider deleting the ", this is EOF" comment. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64397 This can have 1 argument. Please add a comment to explain. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java https://reviews.apache.org/r/18025/#comment64398 What happens if one of the inputs is a scalar, not a column? ql/src/test/queries/clientpositive/vector_coalesce.q https://reviews.apache.org/r/18025/#comment64399 ERIC TODO: start reviewing here - Eric Hanson On Feb. 12, 2014, 7 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/18025/ --- (Updated Feb. 12, 2014, 7 p.m.) Review request for hive and Eric Hanson. Bugs: HIVE-5759 https://issues.apache.org/jira/browse/HIVE-5759 Repository: hive-git Description --- Implement vectorized support for COALESCE conditional expression Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/vector/BytesColumnVector.java f1eef14 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 0a8811f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DecimalColumnVector.java d0d8597 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/DoubleColumnVector.java cb23129 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/LongColumnVector.java aa05b19 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 7141d63 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 21fe8ca ql/src/test/queries/clientpositive/vector_coalesce.q PRE-CREATION ql/src/test/results/clientpositive/vector_coalesce.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18025/diff/ Testing --- Thanks, Jitendra Pandey
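[Editor's note] The setRef/setVal comment above describes an aliasing hazard: setRef stores a reference to the caller's byte buffer, so if that buffer is an intermediate that the engine re-uses within a batch, the stored value is silently overwritten; setVal copies the bytes out. A hedged Python sketch of the hazard, with bytearrays standing in for the column vector's byte buffers:

```python
# A scratch buffer that an engine re-uses between expression evaluations.
scratch = bytearray(b"first ")

# setRef-style: keep a reference to the caller's buffer (no copy).
by_ref = scratch

# setVal-style: copy the bytes out.
by_val = bytes(scratch)

# The engine now re-uses the scratch buffer for the next intermediate result.
scratch[:] = b"second"

ref_view = bytes(by_ref)   # sees the overwritten data — the bug
val_view = by_val          # still holds the original value
```

This is why the review says setRef is only safe for base vectors loaded from table columns, whose buffers are not recycled mid-batch.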
[jira] [Assigned] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson reassigned HIVE-6399: - Assignee: Eric Hanson bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.02.patch Patch with update to Decimal128.multiplyDestructive() to make it use HiveDecimal.multiply internally, plus updated tests. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 started by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work stopped] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 stopped by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work started] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6399 started by Eric Hanson. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Status: Patch Available (was: In Progress) bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13898550#comment-13898550 ] Eric Hanson commented on HIVE-6399: --- Review board entry: https://reviews.apache.org/r/17972/ bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch, HIVE-6399.02.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17769: Generate vectorized plan for decimal expressions.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/#review34088 --- Ship it! The functionality looks good. Please address the minor issues about the comments that I pointed out. No need for me to do another review. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64058 there - their ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64060 Please add comment before method explaining what it does. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment64059 loose - lose - Eric Hanson On Feb. 8, 2014, 6:15 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/ --- (Updated Feb. 8, 2014, 6:15 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6333 https://issues.apache.org/jira/browse/HIVE-6333 Repository: hive-git Description --- Generate vectorized plan for decimal expressions. 
Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt 699b7c5 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt 99366ca ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 2aa4152 ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 2e84334 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt 9578d34 ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 6ee9d5f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 1c70387 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f5ab731 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java 4510368 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java 6a7762d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java 14b91e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java 2ba1509 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 65a804d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToString.java 5b2a658 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDoubleToDecimal.java 14e30c3 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastLongToDecimal.java 1d4d84d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDecimal.java 41762ed ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToDecimal.java 37e92e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ConstantVectorExpression.java cac1d80 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java 93052a1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncDoubleToDecimal.java 8b2a6f0 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FuncLongToDecimal.java 18d1dbb ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpression.java d00d99b ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java 4f59125 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java e4dfcc9 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java 4e2d1d4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java 6f9746c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java e794e92 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 4e64d47 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 9a04e81 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqual.java 3479b13 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java edb1bf8 ql/src/java/org/apache/hadoop/hive/ql/udf/generic
[jira] [Created] (HIVE-6399) bug in high-precision Decimal128 multiply
Eric Hanson created HIVE-6399: - Summary: bug in high-precision Decimal128 multiply Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896935#comment-13896935 ] Eric Hanson commented on HIVE-6333: --- I opened bug HIVE-6399 to track the testHighPrecisionDecimal128Multiply failure. It is external to this patch. Generate vectorized plan for decimal expressions. - Key: HIVE-6333 URL: https://issues.apache.org/jira/browse/HIVE-6333 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, HIVE-6333.4.patch, HIVE-6333.5.patch Transform non-vector plan to vectorized plan for supported decimal expressions. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Description: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. was: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Description: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. was: For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failures. This is one example of such a failure. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6399) bug in high-precision Decimal128 multiply
[ https://issues.apache.org/jira/browse/HIVE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6399: -- Attachment: HIVE-6399.01.patch Attached patch with explicit test for this known bug in testKnownPriorErrors. No fix yet. A quick fix would be to use BigDecimal multiply inside Decimal128 multiply. Although this would not perform well, it'd be safe. bug in high-precision Decimal128 multiply - Key: HIVE-6399 URL: https://issues.apache.org/jira/browse/HIVE-6399 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Fix For: 0.13.0 Attachments: HIVE-6399.01.patch For operation -605044214913338382 * 55269579109718297360 expected: -33440539101030154945490585226577271520 but was: -33440539021801992431226247633033321184 More generally, if you run TestDecimal128.testHighPrecisionDecimal128Multiply many times, you'll get an occasional failure. This is one example of such a failure. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13897040#comment-13897040 ] Eric Hanson commented on HIVE-6333: --- +1 Generate vectorized plan for decimal expressions. - Key: HIVE-6333 URL: https://issues.apache.org/jira/browse/HIVE-6333 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6333.1.patch, HIVE-6333.2.patch, HIVE-6333.3.patch, HIVE-6333.4.patch, HIVE-6333.5.patch, HIVE-6333.6.patch Transform non-vector plan to vectorized plan for supported decimal expressions. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17769: Generate vectorized plan for decimal expressions.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/#review33942 --- Overall this looks good. Please see my specific comments. I did find one bug (used an Add in place of Subtract in GenericUDFOpMinus), and possibly one design issue related to implicit cast precision and scale. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63766 Please add a comment explaining what castExpressionUdfs is for ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63767 Expand the comment to explain the kind of situations where this is necessary. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63768 Add a comment before the method explaining what it does. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63808 Hive Java coding standard says put blank line before all comments. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java https://reviews.apache.org/r/17769/#comment63769 Because TypeInfo has decimal precision/scale, the output scale is not always the same as the input scale. E.g. I've seen that decimal(18,2)*decimal(18,2) might have scale=4 or something like that. Might it be better to have integers be cast to decimal(19,0) and floats to, say, decimal(38,18) or something like that, so you never or rarely lose information during the cast, or get a NULL due to overflow? But of course, you would not change the expression result precision/scale. What you have here looks pretty good, but it may be worth more thought. 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java https://reviews.apache.org/r/17769/#comment63816 add comment saying briefly what method does ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPMinus.java https://reviews.apache.org/r/17769/#comment63823 DecimalColAddDecimalScalar should be subtract ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java https://reviews.apache.org/r/17769/#comment63825 please add brief comment saying what this test checks ql/src/test/queries/clientpositive/vector_decimal_expressions.q https://reviews.apache.org/r/17769/#comment63828 I think we need a JIRA to add unary minus for vectorized decimal, plus a test. ql/src/test/results/clientpositive/vectorization_short_regress.q.out https://reviews.apache.org/r/17769/#comment63837 It looks like some new rows showed up in the output after you changed the test. Is this expected, or does it reveal a correctness issue? - Eric Hanson On Feb. 7, 2014, 2:31 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17769/ --- (Updated Feb. 7, 2014, 2:31 a.m.) Review request for hive and Eric Hanson. Bugs: HIVE-6333 https://issues.apache.org/jira/browse/HIVE-6333 Repository: hive-git Description --- Generate vectorized plan for decimal expressions. 
Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 29c5168 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumnDecimal.txt 699b7c5 ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticScalarDecimal.txt 99366ca ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideColumnDecimal.txt 2aa4152 ql/src/gen/vectorization/ExpressionTemplates/ColumnDivideScalarDecimal.txt 2e84334 ql/src/gen/vectorization/ExpressionTemplates/ScalarArithmeticColumnDecimal.txt 9578d34 ql/src/gen/vectorization/ExpressionTemplates/ScalarDivideColumnDecimal.txt 6ee9d5f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java 1c70387 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java f5ab731 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java 4510368 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToBoolean.java 6a7762d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDecimal.java 14b91e1 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToDouble.java 2ba1509 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 65a804d ql
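[Editor's note] The precision/scale discussion in the review above follows the usual SQL-style typing rule for decimal multiplication, which Hive roughly implements: result precision p1+p2+1 (capped at the 38-digit maximum) and result scale s1+s2. A sketch of that rule, matching the reviewer's observation that decimal(18,2)*decimal(18,2) comes out with scale 4 (the cap value and rule are the standard SQL convention, stated here as an assumption about Hive's behavior):

```python
def multiply_type(p1, s1, p2, s2, max_precision=38):
    """SQL-style result type for decimal(p1,s1) * decimal(p2,s2)."""
    p = min(p1 + p2 + 1, max_precision)
    s = s1 + s2
    return p, s

t = multiply_type(18, 2, 18, 2)  # → (37, 4): precision 37, scale 4
```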
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33747 --- Ship it! Ship It! - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33378 --- Looks good to me. See one comment inline. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/17622/#comment62885 Please add a comment why you are using decimal.* and why it's different than the others. - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Work stopped] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6234 stopped by Eric Hanson. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and also to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will be able to benefit immediately without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work, when vectorization is enabled. Another goal is to go beyond a simple layering of vectorized row batch iterator over the top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even million rows), and then read data from it into vectorized row batches directly. Object creations should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
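[Editor's note] The HIVE-6234 description above proposes reading a chunk of delimited text and filling column-major batches directly, rather than layering a batch iterator over a row iterator. A toy Python sketch of the idea (function and parameter names are hypothetical, not Hive's API; real column vectors would be typed arrays, not lists of strings):

```python
def read_batch(chunk, field_sep="\t", record_sep="\n", batch_size=1024):
    """Parse a chunk of delimited text into column-major lists (a toy row batch)."""
    columns = None
    for line in chunk.split(record_sep)[:batch_size]:
        if not line:
            continue  # skip trailing empty record
        fields = line.split(field_sep)
        if columns is None:
            columns = [[] for _ in fields]
        for col, value in zip(columns, fields):
            col.append(value)
    return columns

batch = read_batch("1\ta\n2\tb\n3\tc\n")  # → [['1', '2', '3'], ['a', 'b', 'c']]
```

The JIRA's further goals — minimizing object creation and caching string-to-type conversions for dates and numbers — would sit on top of a loop like this.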
[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6234: -- Attachment: HIVE-6234.03.patch Non-working code, with some top-down refinement of how to get a batch/line/field. See comments in the code about open questions on the mapping from table columns into the batch. Also need to determine how to get the column types for use by the text reader: even though a field uses a LongColumnVector, it might need to treat the text data as an integer, boolean, date, or timestamp. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab-delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will benefit immediately, without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work when vectorization is enabled. Another goal is to go beyond simply layering a vectorized row batch iterator on top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even a million rows), and then read data from it into vectorized row batches directly. Object creation should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
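The batch-reading scheme described above (fill a byte buffer with many rows, then populate column vectors directly, caching the string-to-type translation for hot values) can be sketched roughly as follows. This is a standalone simplification, not Hive code: LongColumn is a hypothetical stand-in for Hive's LongColumnVector, the record layout (one numeric field per newline-terminated record) and the batch size are assumptions, and a real reader would also avoid the per-field String allocation that this sketch still makes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a vectorized long column (mirrors the idea of
// Hive's LongColumnVector without depending on Hive itself).
class LongColumn {
    final long[] vector;
    LongColumn(int size) { vector = new long[size]; }
}

public class VectorizedTextSketch {
    static final int BATCH_SIZE = 1024;

    // Cache repeated string -> long translations (the idea suggested
    // for dates and numbers), so hot values are parsed only once.
    static final Map<String, Long> parseCache = new HashMap<>();

    static long parseLongCached(String s) {
        return parseCache.computeIfAbsent(s, Long::parseLong);
    }

    // Fill a column vector directly from a byte buffer holding
    // newline-separated records with a single numeric field each.
    // Returns the number of rows filled into the batch.
    static int fillBatch(byte[] chunk, LongColumn col) {
        int row = 0, fieldStart = 0;
        for (int i = 0; i < chunk.length && row < BATCH_SIZE; i++) {
            if (chunk[i] == '\n') {
                String field = new String(chunk, fieldStart, i - fieldStart,
                                          StandardCharsets.UTF_8);
                col.vector[row++] = parseLongCached(field);
                fieldStart = i + 1;
            }
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] chunk = "7\n42\n42\n".getBytes(StandardCharsets.UTF_8);
        LongColumn col = new LongColumn(BATCH_SIZE);
        int n = fillBatch(chunk, col);
        System.out.println(n + " rows, first=" + col.vector[0]);
        // prints: 3 rows, first=7
    }
}
```

The second "42" hits the cache rather than being re-parsed, which is the kind of saving the description anticipates for repeated date and number strings.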
[jira] [Commented] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888279#comment-13888279 ] Eric Hanson commented on HIVE-6234: --- This is just getting started. I need to put this aside for a while (probably at least until the end of Feb.). I parked the latest information here on the JIRA. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg
[jira] [Updated] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6234: -- Attachment: state-diagram.jpg State diagram for finding line breaks. May be of use for future reference. Not done. Just a working document. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg
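The attached state diagram is not reproduced here, but a minimal two-state scanner along those lines, assuming '\n', '\r', and '\r\n' are the terminators of interest, might look like the following. The SEEN_CR state resolves the '\r' vs. '\r\n' ambiguity one byte late; a real reader would additionally have to carry that state across buffer boundaries when a terminator straddles two chunks.

```java
import java.util.ArrayList;
import java.util.List;

// Two-state scanner that records line-start offsets in a byte buffer,
// treating '\n', '\r', and '\r\n' each as a single line terminator.
public class LineBreakScanner {
    enum State { NORMAL, SEEN_CR }

    static List<Integer> lineStarts(byte[] buf) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);                        // first line starts at offset 0
        State state = State.NORMAL;
        for (int i = 0; i < buf.length; i++) {
            byte b = buf[i];
            if (state == State.SEEN_CR) {
                state = State.NORMAL;
                if (b == '\n') {              // "\r\n": next line starts after '\n'
                    starts.add(i + 1);
                    continue;
                }
                starts.add(i);                // bare '\r': line started at this byte
            }
            if (b == '\r') {
                state = State.SEEN_CR;        // defer: might be "\r\n"
            } else if (b == '\n') {
                starts.add(i + 1);
            }
        }
        if (state == State.SEEN_CR) starts.add(buf.length); // trailing '\r'
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(lineStarts("a\r\nb\nc\rd".getBytes()));
        // prints: [0, 3, 5, 7]
    }
}
```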
[jira] [Updated] (HIVE-6257) Add more unit tests for high-precision Decimal128 arithmetic
[ https://issues.apache.org/jira/browse/HIVE-6257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-6257: -- Resolution: Implemented Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk Add more unit tests for high-precision Decimal128 arithmetic Key: HIVE-6257 URL: https://issues.apache.org/jira/browse/HIVE-6257 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Eric Hanson Assignee: Eric Hanson Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6257.02.patch, HIVE-6257.03.patch, HIVE-6257.04.patch Add more unit tests for high-precision Decimal128 arithmetic, with arguments close to or at 38 digit limit. Consider some random stress tests for broader coverage. Coverage is pretty good now (after HIVE-6243) for precision up to about 18. This is to go beyond that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
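The random stress tests suggested above (operands at or near the 38-digit limit, checked for broader coverage) typically take the shape of an oracle comparison. The sketch below illustrates that harness shape only: ToyDecimal is a hypothetical stand-in, not Hive's Decimal128, and BigDecimal serves as the reference oracle.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Random;

// Randomized stress-test pattern for high-precision decimal addition:
// generate operands whose unscaled values are near the 38-digit limit
// and check a simple (unscaled, scale) implementation against
// BigDecimal as the oracle. ToyDecimal is a stand-in for Decimal128.
public class DecimalStressSketch {

    static class ToyDecimal {
        final BigInteger unscaled;   // value = unscaled * 10^-scale
        final int scale;

        ToyDecimal(BigInteger unscaled, int scale) {
            this.unscaled = unscaled;
            this.scale = scale;
        }

        // Align scales, then add the unscaled values.
        ToyDecimal add(ToyDecimal o) {
            int s = Math.max(scale, o.scale);
            BigInteger a = unscaled.multiply(BigInteger.TEN.pow(s - scale));
            BigInteger b = o.unscaled.multiply(BigInteger.TEN.pow(s - o.scale));
            return new ToyDecimal(a.add(b), s);
        }

        BigDecimal toBigDecimal() {
            return new BigDecimal(unscaled, scale);
        }
    }

    // 126 random bits is at most about 38 decimal digits, i.e. operands
    // at or close to the precision limit the tests target.
    static ToyDecimal randomNear38Digits(Random r) {
        BigInteger u = new BigInteger(126, r);
        if (r.nextBoolean()) u = u.negate();
        return new ToyDecimal(u, r.nextInt(10));
    }

    public static void main(String[] args) {
        Random r = new Random(42);
        for (int i = 0; i < 10_000; i++) {
            ToyDecimal x = randomNear38Digits(r);
            ToyDecimal y = randomNear38Digits(r);
            BigDecimal expected = x.toBigDecimal().add(y.toBigDecimal());
            if (expected.compareTo(x.add(y).toBigDecimal()) != 0) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        System.out.println("all random adds matched the oracle");
    }
}
```

A fixed seed keeps failures reproducible; the same pattern extends to multiplication and division, where 38-digit overflow boundaries are the interesting cases.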