from:"Kevin Wilfong \(JIRA\)"

[jira] [Commented] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.

2014-04-23 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978736#comment-13978736
 ] 

Kevin Wilfong commented on HIVE-2621:
-

It's been a while, but the definition you posted looks correct.

> Allow multiple group bys with the same input data and spray keys to be run on 
> the same reducer.
> ---
>
> Key: HIVE-2621
> URL: https://issues.apache.org/jira/browse/HIVE-2621
> Project: Hive
> Issue Type: New Feature
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Fix For: 0.9.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.2.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.3.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.4.patch, HIVE-2621.1.patch.txt
>
>
> Currently, when a user runs a query, such as a multi-insert, where each 
> insertion subclause consists of a simple query followed by a group by, the 
> group bys for each clause are run on a separate reducer.  This requires 
> writing the data for each group by clause to an intermediate file, and then 
> reading it back.  This uses a significant amount of the total CPU consumed by 
> the query for an otherwise simple query.
> If the subclauses are grouped by their distinct expressions and group by 
> keys, with all of the group by expressions for a group of subclauses run on a 
> single reducer, this would reduce the amount of reading/writing to 
> intermediate files for some queries.
> To do this, for each group of subclauses, in the mapper we would execute 
> the filters for each subclause 'or'd together (provided each subclause has a 
> filter) followed by a reduce sink.  In the reducer, the child operators would 
> be each subclause's filter followed by the group by and any subsequent 
> operations.
> Note that this would require turning off map aggregation, so we would need to 
> make using this type of plan configurable.
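The plan described above can be sketched as a small simulation. This is only a toy model under stated assumptions, not Hive code: the rows, the subclause names ("small", "large"), and the helper functions are all hypothetical, and count(*) stands in for whatever aggregation each subclause performs. The mapper forwards raw rows rather than partial aggregates, which is consistent with the note that map aggregation would need to be turned off.

```python
from collections import defaultdict

# Hypothetical input rows; "key" plays the role of the shared spray key.
rows = [
    {"key": "a", "value": 1},
    {"key": "a", "value": 5},
    {"key": "b", "value": 7},
    {"key": "b", "value": 2},
    {"key": "c", "value": 9},
]

# Hypothetical subclauses: each has its own filter and groups by "key".
subclauses = {
    "small": lambda r: r["value"] < 3,
    "large": lambda r: r["value"] > 6,
}


def mapper(rows, filters):
    """Emit a row once if ANY subclause filter accepts it (filters OR'd)."""
    for row in rows:
        if any(f(row) for f in filters):
            yield row["key"], row


def reducer(shuffled, subclauses):
    """Re-apply each subclause's own filter, then count rows per group."""
    results = {name: defaultdict(int) for name in subclauses}
    for key, row in shuffled:
        for name, keep in subclauses.items():
            if keep(row):
                results[name][key] += 1
    return {name: dict(counts) for name, counts in results.items()}


def separate(rows, keep):
    """Baseline: one filter plus group by, as if run as its own job."""
    counts = defaultdict(int)
    for row in rows:
        if keep(row):
            counts[row["key"]] += 1
    return dict(counts)


# One shuffle, sorted by the spray key, feeds all subclauses.
shuffled = sorted(mapper(rows, subclauses.values()), key=lambda kv: kv[0])
single_reducer = reducer(shuffled, subclauses)
separate_jobs = {name: separate(rows, f) for name, f in subclauses.items()}
```

In this sketch the row with value 5 matches neither filter, so it is dropped in the mapper and never shuffled, while both group bys are answered from the same reducer input.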



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-03 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917817#comment-13917817
 ] 

Kevin Wilfong commented on HIVE-4975:
-

The goal of this is just feature parity with other file formats, e.g. RC file.  
AFAIK, no formats in Hive handle reordering of columns, or swapping the names 
of columns (I'm assuming that's what you're worried about with regards to 
changing the name of a column).
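A toy model may help show what this kind of parity covers and what it leaves out. It assumes columns are resolved purely by position against the current table schema. The function name and the data are hypothetical, and this is not the ORC reader's actual code.

```python
def read_column(stored_row, table_columns, name):
    """Return the value of `name`, or None when the file predates the column."""
    index = table_columns.index(name)
    if index >= len(stored_row):
        # The column was added after the file was written: read it as NULL
        # instead of indexing past the end of the stored row.
        return None
    return stored_row[index]


old_row = ["x", "y", "z"]            # written when the table had (a, b, c)
new_schema = ["a", "b", "c", "d"]    # after a column d is appended

added = read_column(old_row, new_schema, "d")      # None
existing = read_column(old_row, new_schema, "b")   # "y"

# Reordering is not handled by positional resolution: the data does not move,
# so asking for "a" under a reordered schema returns what was written as b.
reordered = ["b", "a", "c", "d"]
misread = read_column(old_row, reordered, "a")     # "y"
```

The last lookup illustrates the limitation mentioned above: appending a column can be read back as NULL, but swapping or reordering names silently returns the wrong data in a positional scheme.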

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Assignee: Kevin Wilfong
> Priority: Critical
> Labels: orcfile
> Fix For: 0.13.0
>
> Attachments: HIVE-4975.1.patch.txt
>
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at

[jira] [Comment Edited] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-03 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13917817#comment-13917817
 ] 

Kevin Wilfong edited comment on HIVE-4975 at 3/3/14 8:03 AM:
-

The goal of this patch is just feature parity with other file formats, e.g. RC 
file.  AFAIK, no formats in Hive handle reordering of columns, or swapping the 
names of columns (I'm assuming that's what you're worried about with regards to 
changing the name of a column).


was (Author: kevinwilfong):
The goal of this is just feature parity with other file formats, e.g. RC file.  
AFAIK, no formats in Hive handle reordering of columns, or swapping the names 
of columns (I'm assuming that's what you're worried about with regards to 
changing the name of a column).

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Assignee: Kevin Wilfong
> Priority: Critical
> Labels: orcfile
> Fix For: 0.13.0
>
> Attachments: HIVE-4975.1.patch.txt
>
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.ha

[jira] [Updated] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-01 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4975:


Fix Version/s: 0.13.0
Status: Patch Available (was: Open)

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Assignee: Kevin Wilfong
> Priority: Critical
> Labels: orcfile
> Fix For: 0.13.0
>
> Attachments: HIVE-4975.1.patch.txt
>
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluati

[jira] [Updated] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-01 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4975:


Attachment: HIVE-4975.1.patch.txt

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Assignee: Kevin Wilfong
> Priority: Critical
> Labels: orcfile
> Attachments: HIVE-4975.1.patch.txt
>
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> d
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.

[jira] [Assigned] (HIVE-4975) Reading orc file throws exception after adding new column

2014-03-01 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong reassigned HIVE-4975:
---

Assignee: Kevin Wilfong

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Assignee: Kevin Wilfong
> Priority: Critical
> Labels: orcfile
> Attachments: HIVE-4975.1.patch.txt
>
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> d
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.proc

[jira] [Commented] (HIVE-4975) Reading orc file throws exception after adding new column

2014-02-18 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904371#comment-13904371
 ] 

Kevin Wilfong commented on HIVE-4975:
-

This is probably the fix:
https://github.com/facebook/hive-dwrf/commit/9c283feb00d94971d278e4f7deca9a929f9524b5

> Reading orc file throws exception after adding new column
> -
>
> Key: HIVE-4975
> URL: https://issues.apache.org/jira/browse/HIVE-4975
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 0.11.0
> Environment: hive 0.11.0 hadoop 1.0.0
> Reporter: cyril liao
> Priority: Critical
> Labels: orcfile
>
> ORC file reads fail after a column is added to the table.
> Create a table with three columns (a string, b string, c string).
> Add a new column after c by executing "ALTER TABLE table ADD COLUMNS (d 
> string)".
> Execute the HiveQL "select d from table"; the following exception is thrown:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row [Error getting row data with 
> exception java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row [Error getting row data with exception 
> java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector.getStructFieldData(OrcStruct.java:206)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldData(UnionStructObjectInspector.java:128)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:371)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:236)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:222)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:665)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:671)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
>   ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> d
>   at 
>

[jira] [Commented] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-06-05 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676104#comment-13676104
 ] 

Kevin Wilfong commented on HIVE-4324:
-

Sorry for the delay, Owen.  Are you concerned that there will be applications 
outside of Hive calling methods in OrcFile.java?

If so, I can add the backward-compatible method.

> ORC Turn off dictionary encoding when number of distinct keys is greater than 
> threshold
> ---
>
> Key: HIVE-4324
> URL: https://issues.apache.org/jira/browse/HIVE-4324
> Project: Hive
> Issue Type: Sub-task
> Components: File Formats
> Affects Versions: 0.11.0
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Attachments: HIVE-4324.1.patch.txt
>
>
> Add a configurable threshold so that if the number of distinct values in a 
> string column is greater than that fraction of non-null values, dictionary 
> encoding is turned off.
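The check described above can be sketched as follows. This is a minimal illustration assuming the threshold is a fraction between 0 and 1. The function name and the 0.8 value used in the example are hypothetical and are not taken from the patch.

```python
def use_dictionary_encoding(values, threshold):
    """Decide whether to keep dictionary encoding for one string column.

    Dictionary encoding is turned off when the number of distinct values
    is greater than `threshold` times the number of non-null values.
    """
    non_null = [v for v in values if v is not None]
    distinct = set(non_null)
    return len(distinct) <= threshold * len(non_null)


# Mostly repeated values: 2 distinct out of 4 non-null, so the dictionary pays off.
repetitive = use_dictionary_encoding(["a", "a", "b", None, "a"], 0.8)

# Every value is unique: 4 distinct out of 4 non-null exceeds the threshold.
unique = use_dictionary_encoding(["a", "b", "c", "d"], 0.8)
```

Nulls are excluded from both counts, matching the description's reference to the fraction of non-null values.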

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (HIVE-4565) TestCliDriver and TestParse fail with non Sun Java

2013-05-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong reassigned HIVE-4565:
---

Assignee: (was: Kevin Wilfong)

> TestCliDriver and TestParse fail with non Sun Java
> --
>
> Key: HIVE-4565
> URL: https://issues.apache.org/jira/browse/HIVE-4565
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.11.0
> Environment: RedHat x86 IBM Java 6
>Reporter: Renata Ghisloti Duarte de Souza
>Priority: Minor
> Fix For: 0.11.0
>
>
> While executing Hive's unit tests, two test cases have different outputs with 
> Sun Java and non-Sun Java (such as IBM):
> TestCliDriver and TestParse.
> The differences are mainly due to the use of HashMaps in the creation of the 
> Logical Plan in the analyzeInternal method. Sun Java presents the elements of a 
> HashMap in one order, and non-Sun Java in a different order.
> Both outputs are correct, and don't affect the final query result.  I propose 
> this patch attached to make Hive unit tests compliant with all JVMs.
> The patch adds the output files and a change to ql/build.xml.
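
The ordering issue itself can be illustrated with a small sketch. Note that the patch described above takes a different route (it adds output files and changes ql/build.xml); the class and method below are hypothetical and only show one common way to make output independent of HashMap iteration order, by sorting the keys before emitting them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustration only: makes map output independent of HashMap iteration order. */
final class StableMapOrderSketch {

    /** Returns the keys in sorted order, regardless of how the input map iterates. */
    static List<String> stableKeys(Map<String, Integer> plan) {
        // HashMap makes no guarantee about iteration order, so copy into a TreeMap first.
        return new ArrayList<String>(new TreeMap<String, Integer>(plan).keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> aliases = new HashMap<String, Integer>();
        aliases.put("src", 1);
        aliases.put("dest", 2);
        aliases.put("tmp", 3);
        // Sorted keys are the same on any JVM.
        System.out.println(stableKeys(aliases));
    }
}
```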


[jira] [Assigned] (HIVE-4565) TestCliDriver and TestParse fail with non Sun Java

2013-05-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong reassigned HIVE-4565:
---

Assignee: Kevin Wilfong

> TestCliDriver and TestParse fail with non Sun Java
> --
>
> Key: HIVE-4565
> URL: https://issues.apache.org/jira/browse/HIVE-4565
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.11.0
> Environment: RedHat x86 IBM Java 6
>Reporter: Renata Ghisloti Duarte de Souza
>Assignee: Kevin Wilfong
>Priority: Minor
> Fix For: 0.11.0
>
>
> While executing Hive's unit tests, two test cases have different outputs with 
> Sun Java and non-Sun Java (such as IBM):
> TestCliDriver and TestParse.
> The differences are mainly due to the use of HashMaps in the creation of the 
> Logical Plan in the analyzeInternal method. Sun Java presents the elements of a 
> HashMap in one order, and non-Sun Java in a different order.
> Both outputs are correct, and don't affect the final query result.  I propose 
> this patch attached to make Hive unit tests compliant with all JVMs.
> The patch adds the output files and a change to ql/build.xml.


[jira] [Commented] (HIVE-4494) ORC map columns get class cast exception in some context

2013-05-08 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13652214#comment-13652214
 ] 

Kevin Wilfong commented on HIVE-4494:
-

+1

Go ahead and commit if tests pass.

> ORC map columns get class cast exception in some context
> 
>
> Key: HIVE-4494
> URL: https://issues.apache.org/jira/browse/HIVE-4494
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4494.D10653.1.patch, HIVE-4494.D10653.2.patch
>
>
> Setting up the test case like:
> {quote}
> create table map_text (
>   name string,
>   m map
> ) row format delimited
> fields terminated by '|'
> collection items terminated by ','
> map keys terminated by ':';
> create table map_orc (
>   name string,
>   m map
> ) stored as orc;
> cat map.txt
> name1|key11:value11,key12:value12,key13:value13
> name2|key21:value21,key22:value22,key23:value23
> name3|key31:value31,key32:value32,key33:value33
> load data local   inpath 'map.txt' into table map_text;
> insert overwrite table map_orc select * from map_text;
> {quote}


[jira] [Updated] (HIVE-4430) Semantic analysis fails in presence of certain literals in on clause

2013-04-26 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4430:


Status: Patch Available  (was: Open)

> Semantic analysis fails in presence of certain literals in on clause
> 
>
> Key: HIVE-4430
> URL: https://issues.apache.org/jira/browse/HIVE-4430
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
>Priority: Minor
> Attachments: HIVE-4430.HIVE-4430.HIVE-4430.HIVE-4430.D10587.1.patch
>
>
> When users include a bigint literal (a number suffixed with 'L') in the 
> conditions in the on clause, the query will fail with, e.g.
> FAILED: SemanticException 0L encountered with 0 children
> I haven't tried it yet, but I suspect the same is true for other, less commonly 
> used literals.


[jira] [Created] (HIVE-4430) Semantic analysis fails in presence of certain literals in on clause

2013-04-26 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4430:
---

 Summary: Semantic analysis fails in presence of certain literals 
in on clause
 Key: HIVE-4430
 URL: https://issues.apache.org/jira/browse/HIVE-4430
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor


When users include a bigint literal (a number suffixed with 'L') in the 
conditions in the on clause, the query will fail with, e.g.

FAILED: SemanticException 0L encountered with 0 children

I haven't tried it yet, but I suspect the same is true for other, less commonly 
used literals.


[jira] [Commented] (HIVE-4340) ORC should provide raw data size

2013-04-25 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641967#comment-13641967
 ] 

Kevin Wilfong commented on HIVE-4340:
-

Sorry, I hadn't tested the patch after refreshing it; it wasn't ready for 
review.

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4340.1.patch.txt, HIVE-4340.2.patch.txt, 
> HIVE-4340.3.patch.txt
>
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.
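
A minimal sketch of the bookkeeping described above, assuming the writer is told the raw size of each row as it is added. The class and method names are hypothetical and are not WriterImpl's actual API; how a row's raw size is computed is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these names are not Hive's WriterImpl API. */
final class RawSizeTrackerSketch {
    private long currentStripeRawSize = 0;
    private long lastRowRawSize = 0;
    private final List<Long> stripeRawSizes = new ArrayList<Long>();

    /** Records the raw size of one row and adds it to the running stripe total. */
    void addRow(long rowRawSize) {
        lastRowRawSize = rowRawSize;
        currentStripeRawSize += rowRawSize;
    }

    /** Closes the current stripe and stores its aggregated raw size. */
    void flushStripe() {
        stripeRawSizes.add(currentStripeRawSize);
        currentStripeRawSize = 0;
    }

    /** Raw size of the most recent row, for a caller such as FileSinkOperator. */
    long getLastRowRawSize() {
        return lastRowRawSize;
    }

    /** One entry per flushed stripe, in write order. */
    List<Long> getStripeRawSizes() {
        return stripeRawSizes;
    }
}
```

Keeping the per-row value alongside the per-stripe total lets the caller read the size of each row without the writer having to retain every row's size.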


[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-25 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Attachment: HIVE-4340.3.patch.txt

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4340.1.patch.txt, HIVE-4340.2.patch.txt, 
> HIVE-4340.3.patch.txt
>
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.


[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Attachment: HIVE-4340.2.patch.txt

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4340.1.patch.txt, HIVE-4340.2.patch.txt
>
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Status: Patch Available  (was: Open)

Refreshed.

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt
>
>
> Column truncation allows users to remove data for columns that are no longer 
> useful.
> This is done by removing the data for the column and setting the length of 
> the column data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with lengths of zero as empty; they 
> are treated as if the column doesn't exist in the data, and a null is returned 
> for every value of that column in every row. This is the same thing that 
> happens when more columns are selected than exist in the file.
> A new command was added to the CLI
> TRUNCATE TABLE ... PARTITION ... COLUMNS ...
> This launches a map only job where each mapper rewrites a single file without 
> the unnecessary column data and the adjusted headers. It does not 
> uncompress/deserialize the data so it is much faster than rewriting the data 
> with NULLs.
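
The read-side rule described above (a zero-length column reads as NULL, just like a column that is not in the file) can be sketched as follows. This is a simplified illustration with hypothetical names and an in-memory layout; it is not RCFile's actual reader code.

```java
/** Hypothetical sketch of the read-side rule; not RCFile's actual reader code. */
final class TruncatedColumnSketch {

    /**
     * columnValues[i] holds the decoded values of column i and columnLengths[i]
     * the stored length recorded for it in the header.
     * Returns null when the column was truncated (length 0) or is not in the file at all.
     */
    static String readValue(String[][] columnValues, int[] columnLengths, int column, int row) {
        if (column >= columnLengths.length || columnLengths[column] == 0) {
            return null;
        }
        return columnValues[column][row];
    }
}
```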


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt
>
>
> Column truncation allows users to remove data for columns that are no longer 
> useful.
> This is done by removing the data for the column and setting the length of 
> the column data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with lengths of zero as empty; they 
> are treated as if the column doesn't exist in the data, and a null is returned 
> for every value of that column in every row. This is the same thing that 
> happens when more columns are selected than exist in the file.
> A new command was added to the CLI
> TRUNCATE TABLE ... PARTITION ... COLUMNS ...
> This launches a map only job where each mapper rewrites a single file without 
> the unnecessary column data and the adjusted headers. It does not 
> uncompress/deserialize the data so it is much faster than rewriting the data 
> with NULLs.


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.7.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt, HIVE-4005.6.patch.txt, HIVE-4005.7.patch.txt
>
>
> Column truncation allows users to remove data for columns that are no longer 
> useful.
> This is done by removing the data for the column and setting the length of 
> the column data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with lengths of zero as empty; they 
> are treated as if the column doesn't exist in the data, and a null is returned 
> for every value of that column in every row. This is the same thing that 
> happens when more columns are selected than exist in the file.
> A new command was added to the CLI
> TRUNCATE TABLE ... PARTITION ... COLUMNS ...
> This launches a map only job where each mapper rewrites a single file without 
> the unnecessary column data and the adjusted headers. It does not 
> uncompress/deserialize the data so it is much faster than rewriting the data 
> with NULLs.


[jira] [Commented] (HIVE-4221) Stripe-level merge for ORC files

2013-04-24 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641208#comment-13641208
 ] 

Kevin Wilfong commented on HIVE-4221:
-

Comments on Phabricator

> Stripe-level merge for ORC files
> 
>
> Key: HIVE-4221
> URL: https://issues.apache.org/jira/browse/HIVE-4221
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Samuel Yuan
>Assignee: Samuel Yuan
> Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by 
> reading/writing stripes without decompressing/recompressing them. This will 
> be similar to the RC file merge, except that footers will have to be updated 
> with the stripe positions in the new file.
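
The footer update mentioned above amounts to recomputing where each copied stripe starts in the merged file. A minimal sketch, assuming stripes are copied verbatim and in order; the names, and the `headerLength` parameter standing for whatever precedes the first stripe, are hypothetical and not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the ORC merge code from the patch. */
final class StripeMergeSketch {

    /**
     * Given the byte lengths of stripes copied verbatim, in order, into a new file,
     * returns the offset at which each stripe starts in that file.
     */
    static List<Long> newStripeOffsets(long headerLength, List<Long> stripeLengths) {
        List<Long> offsets = new ArrayList<Long>();
        long position = headerLength;
        for (long length : stripeLengths) {
            offsets.add(position);
            position += length;
        }
        return offsets;
    }
}
```

Because only the offsets change, the stripe bytes themselves never need to be decompressed or recompressed.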


[jira] [Updated] (HIVE-4221) Stripe-level merge for ORC files

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4221:


Status: Open  (was: Patch Available)

> Stripe-level merge for ORC files
> 
>
> Key: HIVE-4221
> URL: https://issues.apache.org/jira/browse/HIVE-4221
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Samuel Yuan
>Assignee: Samuel Yuan
> Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by 
> reading/writing stripes without decompressing/recompressing them. This will 
> be similar to the RC file merge, except that footers will have to be updated 
> with the stripe positions in the new file.


[jira] [Updated] (HIVE-4005) Column truncation

2013-04-24 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4005:


Attachment: HIVE-4005.6.patch.txt

> Column truncation
> -
>
> Key: HIVE-4005
> URL: https://issues.apache.org/jira/browse/HIVE-4005
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4005.1.patch.txt, HIVE-4005.2.patch.txt, 
> HIVE-4005.3.patch.txt, HIVE-4005.4.patch.txt, HIVE-4005.5.patch.txt, 
> HIVE-4005.6.patch.txt
>
>
> Column truncation allows users to remove data for columns that are no longer 
> useful.
> This is done by removing the data for the column and setting the length of 
> the column data and related lengths to 0 in the RC file header.
> RC file was fixed to recognize columns with lengths of zero as empty; they 
> are treated as if the column doesn't exist in the data, and a null is returned 
> for every value of that column in every row. This is the same thing that 
> happens when more columns are selected than exist in the file.
> A new command was added to the CLI
> TRUNCATE TABLE ... PARTITION ... COLUMNS ...
> This launches a map only job where each mapper rewrites a single file without 
> the unnecessary column data and the adjusted headers. It does not 
> uncompress/deserialize the data so it is much faster than rewriting the data 
> with NULLs.


[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Attachment: HIVE-4340.1.patch.txt

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4340.1.patch.txt
>
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.


[jira] [Updated] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4340:


Status: Patch Available  (was: Open)

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4340.1.patch.txt
>
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.


[jira] [Commented] (HIVE-4340) ORC should provide raw data size

2013-04-19 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13636677#comment-13636677
 ] 

Kevin Wilfong commented on HIVE-4340:
-

https://reviews.facebook.net/D10179

> ORC should provide raw data size
> 
>
> Key: HIVE-4340
> URL: https://issues.apache.org/jira/browse/HIVE-4340
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
>
> ORC's SerDe currently does nothing, and hence does not calculate a raw data 
> size.  WriterImpl, however, has enough information to provide one.
> WriterImpl should compute a raw data size for each row, aggregate them per 
> stripe and record it in the stripe information, as RC currently does in its 
> key header, and allow the FileSinkOperator access to the size per row.
> FileSinkOperator should be able to get the raw data size from either the 
> SerDe or the RecordWriter when the RecordWriter can provide it.


[jira] [Commented] (HIVE-4344) CREATE VIEW fails when redundant casts are rewritten

2013-04-16 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633290#comment-13633290
 ] 

Kevin Wilfong commented on HIVE-4344:
-

+1

> CREATE VIEW fails when redundant casts are rewritten
> 
>
> Key: HIVE-4344
> URL: https://issues.apache.org/jira/browse/HIVE-4344
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Samuel Yuan
>Assignee: Samuel Yuan
> Attachments: HIVE-4344.HIVE-4344.HIVE-4344.HIVE-4344.D10221.1.patch
>
>
> e.g. create view v as select cast(key as string) from src;
> The rewriter tries to replace both cast(key as string) and key as 
> `src`.`key`, because cast(key as string) is a no-op.
> There may be other cases like this one.
> See HIVE-2439 for context.


[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630814#comment-13630814
 ] 

Kevin Wilfong commented on HIVE-4318:
-

I'm just really surprised that a couple of null checks increase the amount of 
time by ~5%, especially given that we do maybe 4 null checks in the 
FileSinkOperator's process method alone.

Of course, I can't argue with facts, so if you could try such a patch once it's 
available and post your results, I'd really appreciate it.

> OperatorHooks hit performance even when not used
> 
>
> Key: HIVE-4318
> URL: https://issues.apache.org/jira/browse/HIVE-4318
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
> Environment: Ubuntu LXC (64 bit)
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch, 
> HIVE-4318.patch.pam.txt
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when 
> they are not being used.
> For a count(1) query tested with & without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have 
> to forward the row - the more operators there are, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
> HiveException {
>return;
>  }
>  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
> tag);
> -preProcessCounter();
> -enterOperatorHooks(opHookContext);
> +//preProcessCounter();
> +//enterOperatorHooks(opHookContext);
>  processOp(row, tag);
> -exitOperatorHooks(opHookContext);
> -postProcessCounter();
> +//exitOperatorHooks(opHookContext);
> +//postProcessCounter();
>}
> {code}


[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-12 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630637#comment-13630637
 ] 

Kevin Wilfong commented on HIVE-4318:
-

It's not clear to me that we can't cut down the cost added by operator hooks 
when there are no operator hooks present to the point where it does not 
significantly affect performance.

Pam, could you provide Gunther a patch which sets the list of operator hooks to 
null rather than the empty list, and initializes the OperatorHookContext in the 
calls to enterOperatorHooks and exitOperatorHooks after the check whether the 
list is null?  This should limit the impact of operator hooks to two method 
calls and two null checks.  We could even put the check 
this.operatorHooks == null around the method calls themselves, in case the Java 
compiler isn't inlining it for some reason.

If, after that, they still introduce a substantial amount of overhead, there's 
not much more we can do, and I'd be OK with removing operator hooks.
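
A rough sketch of the null-check idea, using the variant where the check wraps the hook calls themselves. The types here are simplified, hypothetical stand-ins rather than Hive's Operator, OperatorHook, or OperatorHookContext classes, and the counters from the original process method are left out.

```java
import java.util.List;

/** Hypothetical sketch of the proposal; simplified stand-ins, not Hive's Operator class. */
final class HookedOperatorSketch {

    /** Stand-in for an operator hook with enter and exit callbacks. */
    interface Hook {
        void enter(Object row, int tag);
        void exit(Object row, int tag);
    }

    // null (rather than an empty list) means "no hooks registered".
    private final List<Hook> operatorHooks;
    int rowsProcessed = 0;

    HookedOperatorSketch(List<Hook> hooks) {
        this.operatorHooks = (hooks == null || hooks.isEmpty()) ? null : hooks;
    }

    void process(Object row, int tag) {
        if (operatorHooks == null) {
            // Fast path: a single null check and no per-row context allocation.
            processOp(row, tag);
            return;
        }
        for (Hook h : operatorHooks) {
            h.enter(row, tag);
        }
        processOp(row, tag);
        for (Hook h : operatorHooks) {
            h.exit(row, tag);
        }
    }

    private void processOp(Object row, int tag) {
        rowsProcessed++;
    }
}
```

Whether this removes the measured overhead would still need to be confirmed by rerunning the benchmark from the issue description.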

> OperatorHooks hit performance even when not used
> 
>
> Key: HIVE-4318
> URL: https://issues.apache.org/jira/browse/HIVE-4318
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
> Environment: Ubuntu LXC (64 bit)
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4318.1.patch, HIVE-4318.2.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when 
> they are not being used.
> For a count(1) query tested with & without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have 
> to forward the row - the more operators there are, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
> HiveException {
>return;
>  }
>  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
> tag);
> -preProcessCounter();
> -enterOperatorHooks(opHookContext);
> +//preProcessCounter();
> +//enterOperatorHooks(opHookContext);
>  processOp(row, tag);
> -exitOperatorHooks(opHookContext);
> -postProcessCounter();
> +//exitOperatorHooks(opHookContext);
> +//postProcessCounter();
>}
> {code}


[jira] [Created] (HIVE-4340) ORC should provide raw data size

2013-04-11 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4340:
---

 Summary: ORC should provide raw data size
 Key: HIVE-4340
 URL: https://issues.apache.org/jira/browse/HIVE-4340
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


ORC's SerDe currently does nothing, and hence does not calculate a raw data 
size.  WriterImpl, however, has enough information to provide one.

WriterImpl should compute a raw data size for each row, aggregate them per 
stripe and record it in the stripe information, as RC currently does in its key 
header, and allow the FileSinkOperator access to the size per row.

FileSinkOperator should be able to get the raw data size from either the SerDe 
or the RecordWriter when the RecordWriter can provide it.
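
A rough sketch of the bookkeeping described above, not the actual WriterImpl 
code.  The class and method names are hypothetical, and the per-row size is 
assumed to be computed by the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: tracks raw data size per row and per stripe. */
final class RawSizeTracker {
    private long currentStripeRawSize = 0; // raw bytes of rows in the open stripe
    private long lastRowRawSize = 0;       // size a FileSinkOperator-like caller could read
    private final List<Long> stripeRawSizes = new ArrayList<>();

    /** Record one row; rawSize is assumed to be computed by the writer. */
    void addRow(long rawSize) {
        lastRowRawSize = rawSize;
        currentStripeRawSize += rawSize;
    }

    /** Close the open stripe and remember its total, as stripe metadata would. */
    void flushStripe() {
        stripeRawSizes.add(currentStripeRawSize);
        currentStripeRawSize = 0;
    }

    long getLastRowRawSize() {
        return lastRowRawSize;
    }

    List<Long> getStripeRawSizes() {
        return stripeRawSizes;
    }
}
```

With this shape, the per-stripe totals and the per-row size come from the same 
counter, so the writer only has to compute each row's size once.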


[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-11 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629247#comment-13629247
 ] 

Kevin Wilfong commented on HIVE-4318:
-

It looks like the performance cost of those methods was recognized shortly 
after they were added:
https://issues.apache.org/jira/browse/HIVE-768

It's possible the config hive.task.progress isn't as effective at countering it 
as expected.

Can you make sure it is set to false in your run?

> OperatorHooks hit performance even when not used
> 
>
> Key: HIVE-4318
> URL: https://issues.apache.org/jira/browse/HIVE-4318
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
> Environment: Ubuntu LXC (64 bit)
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4318.1.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when 
> they are not being used.
> A count(1) query was tested with & without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have 
> to forward the row - the more operators there are, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
> HiveException {
>return;
>  }
>  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
> tag);
> -preProcessCounter();
> -enterOperatorHooks(opHookContext);
> +//preProcessCounter();
> +//enterOperatorHooks(opHookContext);
>  processOp(row, tag);
> -exitOperatorHooks(opHookContext);
> -postProcessCounter();
> +//exitOperatorHooks(opHookContext);
> +//postProcessCounter();
>}
> {code}


[jira] [Commented] (HIVE-4318) OperatorHooks hit performance even when not used

2013-04-11 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629242#comment-13629242
 ] 

Kevin Wilfong commented on HIVE-4318:
-

The preProcessCounter and postProcessCounter calls weren't added for operator 
hooks; they were preexisting, and it doesn't look like they were modified by 
the diff either.
http://svn.apache.org/viewvc/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java?r1=1404933&r2=1438825

Could you try your experiment again, removing just those two method calls and 
not enterOperatorHooks and exitOperatorHooks?  The hook methods look really 
cheap if there are no operator hooks (they just check if the list is null and 
return).

I just want to make sure that, if anything is removed, we're not removing too 
much.
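
To illustrate why the hook calls should be cheap when nothing is registered, 
here is a minimal sketch of the null-check pattern described above.  It is not 
Hive's Operator class; the class, interface and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an operator; not Hive's Operator class. */
final class HookedOperatorSketch {
    /** Hypothetical hook interface. */
    interface Hook {
        void enter(Object row);
    }

    private List<Hook> hooks = null; // stays null until a hook is registered
    private long rowsProcessed = 0;

    void addHook(Hook hook) {
        if (hooks == null) {
            hooks = new ArrayList<>();
        }
        hooks.add(hook);
    }

    /** With no hooks registered this is a single null check and a return. */
    private void enterHooks(Object row) {
        if (hooks == null) {
            return;
        }
        for (Hook hook : hooks) {
            hook.enter(row);
        }
    }

    void process(Object row) {
        enterHooks(row);
        rowsProcessed++; // placeholder for the real per-row work
    }

    long getRowsProcessed() {
        return rowsProcessed;
    }
}
```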

> OperatorHooks hit performance even when not used
> 
>
> Key: HIVE-4318
> URL: https://issues.apache.org/jira/browse/HIVE-4318
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
> Environment: Ubuntu LXC (64 bit)
>Reporter: Gopal V
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4318.1.patch
>
>
> Operator Hooks inserted into Operator.java cause a performance hit even when 
> they are not being used.
> A count(1) query was tested with & without the operator hook calls.
> {code:title=with}
> 2013-04-09 07:33:58,920 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 84.07 sec
> Total MapReduce CPU Time Spent: 1 minutes 24 seconds 70 msec
> OK
> 28800991
> Time taken: 40.407 seconds, Fetched: 1 row(s)
> {code}
> {code:title=without}
> 2013-04-09 07:33:02,355 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 68.48 sec
> ...
> Total MapReduce CPU Time Spent: 1 minutes 8 seconds 480 msec
> OK
> 28800991
> Time taken: 35.907 seconds, Fetched: 1 row(s)
> {code}
> The effect is multiplied by the number of operators in the pipeline that have 
> to forward the row - the more operators there are, the slower the query.
> The modification made to test this was 
> {code:title=Operator.java}
> --- ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> +++ ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
> @@ -526,16 +526,16 @@ public void process(Object row, int tag) throws 
> HiveException {
>return;
>  }
>  OperatorHookContext opHookContext = new OperatorHookContext(this, row, 
> tag);
> -preProcessCounter();
> -enterOperatorHooks(opHookContext);
> +//preProcessCounter();
> +//enterOperatorHooks(opHookContext);
>  processOp(row, tag);
> -exitOperatorHooks(opHookContext);
> -postProcessCounter();
> +//exitOperatorHooks(opHookContext);
> +//postProcessCounter();
>}
> {code}


[jira] [Updated] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4336:


Attachment: HIVE-4336.3.patch.txt

> Selecting from a view, and another view that also selects from that view fails
> --
>
> Key: HIVE-4336
> URL: https://issues.apache.org/jira/browse/HIVE-4336
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4336.1.patch.txt, HIVE-4336.2.patch.txt, 
> HIVE-4336.3.patch.txt
>
>
> E.g. the following query fails with an NPE
> CREATE VIEW test_view1 AS SELECT * FROM src;
> CREATE VIEW test_view2 AS SELECT * FROM test_view1;
> SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Updated] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4336:


Attachment: HIVE-4336.2.patch.txt

> Selecting from a view, and another view that also selects from that view fails
> --
>
> Key: HIVE-4336
> URL: https://issues.apache.org/jira/browse/HIVE-4336
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4336.1.patch.txt, HIVE-4336.2.patch.txt
>
>
> E.g. the following query fails with an NPE
> CREATE VIEW test_view1 AS SELECT * FROM src;
> CREATE VIEW test_view2 AS SELECT * FROM test_view1;
> SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Updated] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4336:


Status: Patch Available  (was: Open)

> Selecting from a view, and another view that also selects from that view fails
> --
>
> Key: HIVE-4336
> URL: https://issues.apache.org/jira/browse/HIVE-4336
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4336.1.patch.txt
>
>
> E.g. the following query fails with an NPE
> CREATE VIEW test_view1 AS SELECT * FROM src;
> CREATE VIEW test_view2 AS SELECT * FROM test_view1;
> SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Updated] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4336:


Attachment: HIVE-4336.1.patch.txt

> Selecting from a view, and another view that also selects from that view fails
> --
>
> Key: HIVE-4336
> URL: https://issues.apache.org/jira/browse/HIVE-4336
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4336.1.patch.txt
>
>
> E.g. the following query fails with an NPE
> CREATE VIEW test_view1 AS SELECT * FROM src;
> CREATE VIEW test_view2 AS SELECT * FROM test_view1;
> SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Commented] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628188#comment-13628188
 ] 

Kevin Wilfong commented on HIVE-4336:
-

https://reviews.facebook.net/D10125

> Selecting from a view, and another view that also selects from that view fails
> --
>
> Key: HIVE-4336
> URL: https://issues.apache.org/jira/browse/HIVE-4336
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4336.1.patch.txt
>
>
> E.g. the following query fails with an NPE
> CREATE VIEW test_view1 AS SELECT * FROM src;
> CREATE VIEW test_view2 AS SELECT * FROM test_view1;
> SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Created] (HIVE-4336) Selecting from a view, and another view that also selects from that view fails

2013-04-10 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4336:
---

 Summary: Selecting from a view, and another view that also selects 
from that view fails
 Key: HIVE-4336
 URL: https://issues.apache.org/jira/browse/HIVE-4336
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


E.g. the following query fails with an NPE

CREATE VIEW test_view1 AS SELECT * FROM src;

CREATE VIEW test_view2 AS SELECT * FROM test_view1;

SELECT COUNT(*) FROM test_view1 a JOIN test_view2 b ON a.key = b.key;


[jira] [Updated] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4324:


Status: Patch Available  (was: Open)

> ORC Turn off dictionary encoding when number of distinct keys is greater than 
> threshold
> ---
>
> Key: HIVE-4324
> URL: https://issues.apache.org/jira/browse/HIVE-4324
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4324.1.patch.txt
>
>
> Add a configurable threshold so that if the number of distinct values in a 
> string column is greater than that fraction of non-null values, dictionary 
> encoding is turned off.


[jira] [Updated] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-04-10 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4324:


Attachment: HIVE-4324.1.patch.txt

> ORC Turn off dictionary encoding when number of distinct keys is greater than 
> threshold
> ---
>
> Key: HIVE-4324
> URL: https://issues.apache.org/jira/browse/HIVE-4324
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4324.1.patch.txt
>
>
> Add a configurable threshold so that if the number of distinct values in a 
> string column is greater than that fraction of non-null values, dictionary 
> encoding is turned off.


[jira] [Commented] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-04-10 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627978#comment-13627978
 ] 

Kevin Wilfong commented on HIVE-4324:
-

https://reviews.facebook.net/D10113

> ORC Turn off dictionary encoding when number of distinct keys is greater than 
> threshold
> ---
>
> Key: HIVE-4324
> URL: https://issues.apache.org/jira/browse/HIVE-4324
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4324.1.patch.txt
>
>
> Add a configurable threshold so that if the number of distinct values in a 
> string column is greater than that fraction of non-null values, dictionary 
> encoding is turned off.


[jira] [Created] (HIVE-4324) ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

2013-04-09 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4324:
---

 Summary: ORC Turn off dictionary encoding when number of distinct 
keys is greater than threshold
 Key: HIVE-4324
 URL: https://issues.apache.org/jira/browse/HIVE-4324
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


Add a configurable threshold so that if the number of distinct values in a 
string column is greater than that fraction of non-null values, dictionary 
encoding is turned off.
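
A minimal sketch of the check being proposed, assuming the threshold is a 
fraction between 0 and 1.  The class, method and parameter names are 
hypothetical and are not taken from the patch.

```java
/** Hypothetical sketch of the threshold check; not the code in the patch. */
final class DictionaryThresholdSketch {
    /**
     * Returns true if dictionary encoding should stay on, i.e. the ratio of
     * distinct values to non-null values does not exceed the threshold.
     */
    static boolean useDictionary(long distinctValues, long nonNullValues, double threshold) {
        if (nonNullValues == 0) {
            return true; // assumption: with nothing to encode, keep the default
        }
        double ratio = (double) distinctValues / nonNullValues;
        return ratio <= threshold;
    }
}
```

For example, with a threshold of 0.8 a column with 50 distinct values out of 
100 non-null values keeps dictionary encoding, while one with 90 distinct 
values out of 100 does not.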


[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-04-09 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4189:


Attachment: HIVE-4189.2.patch.txt

> ORC fails with String column that ends in lots of nulls
> ---
>
> Key: HIVE-4189
> URL: https://issues.apache.org/jira/browse/HIVE-4189
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4189.1.patch.txt, HIVE-4189.2.patch.txt
>
>
> When ORC attempts to write out a string column that ends in enough nulls to 
> span an index stride, StringTreeWriter's writeStripe method will get an 
> exception from TreeWriter's writeStripe method
> Column has wrong number of index entries found: x expected: y
> This is caused by rowIndexValueCount having multiple entries equal to the 
> number of non-null rows in the column, combined with the fact that 
> StringTreeWriter has special logic for constructing its index.


[jira] [Updated] (HIVE-4199) ORC writer doesn't handle non-UTF8 encoded Text properly

2013-04-09 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4199:


Status: Open  (was: Patch Available)

> ORC writer doesn't handle non-UTF8 encoded Text properly
> 
>
> Key: HIVE-4199
> URL: https://issues.apache.org/jira/browse/HIVE-4199
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Samuel Yuan
>Assignee: Samuel Yuan
>Priority: Minor
> Attachments: HIVE-4199.HIVE-4199.HIVE-4199.D9501.1.patch, 
> HIVE-4199.HIVE-4199.HIVE-4199.D9501.2.patch, 
> HIVE-4199.HIVE-4199.HIVE-4199.D9501.3.patch, 
> HIVE-4199.HIVE-4199.HIVE-4199.D9501.4.patch
>
>
> StringTreeWriter currently converts fields stored as Text objects into 
> Strings. This can lose information (see 
> http://en.wikipedia.org/wiki/Replacement_character#Replacement_character), 
> and is also unnecessary since the dictionary stores Text objects.
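
The information loss described above can be reproduced with the JDK alone.  The 
sketch below uses a plain byte array in place of a Hadoop Text object (an 
assumption made to keep it self-contained); the class and method names are 
hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: shows that bytes -> String -> bytes is lossy for invalid UTF-8. */
final class Utf8RoundTripSketch {
    /** Decode the bytes as UTF-8 into a String, then encode the String back to bytes. */
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] invalid = {(byte) 0xFF}; // 0xFF never appears in valid UTF-8
        byte[] result = roundTrip(invalid);
        // The single invalid byte is replaced by U+FFFD, which encodes as 3 bytes.
        System.out.println("in=" + invalid.length + " byte(s), out=" + result.length + " byte(s)");
    }
}
```

Valid UTF-8 input survives the round trip unchanged; only malformed sequences 
are replaced, which is why keeping the original bytes avoids the problem.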


[jira] [Updated] (HIVE-4178) ORC fails with files with different numbers of columns

2013-04-08 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4178:


Status: Patch Available  (was: Open)

> ORC fails with files with different numbers of columns
> --
>
> Key: HIVE-4178
> URL: https://issues.apache.org/jira/browse/HIVE-4178
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4178.1.patch.txt
>
>
> When CombineHiveInputFormat is used, it's possible that two files with 
> different numbers of columns can be included in the same split, in which case 
> Hive will fail at one of several points with an 
> ArrayIndexOutOfBoundsException.
> This can happen when a partition contains empty files or two partitions are 
> read with different numbers of columns.


[jira] [Commented] (HIVE-4178) ORC fails with files with different numbers of columns

2013-04-08 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625943#comment-13625943
 ] 

Kevin Wilfong commented on HIVE-4178:
-

Responded to comments on Phabricator.

> ORC fails with files with different numbers of columns
> --
>
> Key: HIVE-4178
> URL: https://issues.apache.org/jira/browse/HIVE-4178
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4178.1.patch.txt
>
>
> When CombineHiveInputFormat is used, it's possible that two files with 
> different numbers of columns can be included in the same split, in which case 
> Hive will fail at one of several points with an 
> ArrayIndexOutOfBoundsException.
> This can happen when a partition contains empty files or two partitions are 
> read with different numbers of columns.


[jira] [Created] (HIVE-4312) Make ORC SerDe support replace columns

2013-04-08 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4312:
---

 Summary: Make ORC SerDe support replace columns
 Key: HIVE-4312
 URL: https://issues.apache.org/jira/browse/HIVE-4312
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong


In the alterTable method of DDLTask.java there is an explicit list of SerDes 
which support the replace columns command.  ORC should support this, at least 
for partitioned tables, maybe not unpartitioned tables.

This may be as simple as adding it to that list, but I suspect some significant 
changes will be needed to make this work with the CombineHiveInputFormat (e.g. 
where splits are combined and one split has a column stored as a string and in 
the other it is stored as an int).


[jira] [Updated] (HIVE-4151) HiveProfiler NPE with ScriptOperator

2013-04-05 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4151:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Pam.

> HiveProfiler NPE with ScriptOperator
> 
>
> Key: HIVE-4151
> URL: https://issues.apache.org/jira/browse/HIVE-4151
> Project: Hive
>  Issue Type: Bug
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Fix For: 0.11.0
>
> Attachments: HIVE-4151.patch.0.txt, HIVE-4151.patch.1.txt
>
>



[jira] [Commented] (HIVE-4151) HiveProfiler NPE with ScriptOperator

2013-04-04 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622966#comment-13622966
 ] 

Kevin Wilfong commented on HIVE-4151:
-

+1

> HiveProfiler NPE with ScriptOperator
> 
>
> Key: HIVE-4151
> URL: https://issues.apache.org/jira/browse/HIVE-4151
> Project: Hive
>  Issue Type: Bug
>Reporter: Pamela Vagata
>Assignee: Pamela Vagata
>Priority: Minor
> Attachments: HIVE-4151.patch.0.txt, HIVE-4151.patch.1.txt
>
>



[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-28 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4235:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Tim.

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Fix For: 0.11.0
>
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses an inefficient way to check if the table exists.
> It uses Hive.java's getTablesByPattern(...) to check if the table exists. This 
> involves a regular expression and eventually a database join, which is very 
> inefficient. It can increase database lock time and hurt db performance if a 
> lot of such commands hit the database.
> The suggested approach is to use getTable(...) since we know tablename already
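
The difference between the two approaches can be sketched with an in-memory 
stand-in for the metastore.  Everything here is hypothetical (the class, its 
methods and the map-backed catalog); it only contrasts a pattern scan over all 
table names with a direct lookup by name and does not call Hive's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical in-memory catalog; not Hive's metastore client. */
final class TableLookupSketch {
    private final Map<String, String> tables = new HashMap<>();

    void create(String name) {
        tables.put(name, name);
    }

    /** Pattern-based check: compiles a regex and scans every table name. */
    boolean existsByPattern(String name) {
        Pattern pattern = Pattern.compile(Pattern.quote(name));
        List<String> matches = new ArrayList<>();
        for (String candidate : tables.keySet()) {
            if (pattern.matcher(candidate).matches()) {
                matches.add(candidate);
            }
        }
        return !matches.isEmpty();
    }

    /** Direct check: a single keyed lookup, possible because the name is known. */
    boolean existsByName(String name) {
        return tables.containsKey(name);
    }
}
```

Both methods give the same answer; the direct lookup simply avoids touching 
every entry.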


[jira] [Commented] (HIVE-4244) Make string dictionaries adaptive in ORC

2013-03-28 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616547#comment-13616547
 ] 

Kevin Wilfong commented on HIVE-4244:
-

Some initial thoughts based on some experiments.

Dictionary encoding seems to be less effective than just Zlib at compressing 
values if the number of distinct values is > ~80% of the total number of 
values.  This number can be configurable.  The dictionary is still smaller in 
memory, so we may be able to get away with making the switch when writing the 
stripe, writing out the data directly at that point.  This should be comparable 
in performance to the dictionary index conversion that is already done.

Also, if the uncompressed (but encoded) size of the dictionary + index (data 
stream) is greater than the uncompressed size of the original data, the 
compressed data tends to be larger as well, despite the sorting.  This will be 
more expensive to figure out, as we don't know the size of the index until it 
has been run-length encoded.
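
A rough sketch of the second comparison, under simplifying assumptions: the 
index is costed at a fixed 4 bytes per entry instead of being run-length 
encoded, and length streams are ignored.  The class and method names are 
hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch comparing uncompressed dictionary vs. direct sizes. */
final class DictionarySizeSketch {
    // Assumption: fixed-width index entries; a real writer would run-length encode them.
    static final int BYTES_PER_INDEX_ENTRY = 4;

    /** Bytes needed to store every value as-is. */
    static long directSize(List<String> values) {
        long total = 0;
        for (String value : values) {
            total += value.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Bytes needed for the distinct values plus one index entry per row. */
    static long dictionarySize(List<String> values) {
        Set<String> distinct = new HashSet<>(values);
        long dictionaryBytes = 0;
        for (String value : distinct) {
            dictionaryBytes += value.getBytes(StandardCharsets.UTF_8).length;
        }
        return dictionaryBytes + (long) values.size() * BYTES_PER_INDEX_ENTRY;
    }

    static boolean preferDirect(List<String> values) {
        return dictionarySize(values) > directSize(values);
    }
}
```

Short, mostly distinct strings tip the comparison toward direct encoding, while 
long repeated strings favor the dictionary.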

> Make string dictionaries adaptive in ORC
> 
>
> Key: HIVE-4244
> URL: https://issues.apache.org/jira/browse/HIVE-4244
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Kevin Wilfong
>
> The ORC writer should adaptively switch between dictionary and direct 
> encoding. I'd propose looking at the first 100,000 values in each column and 
> deciding whether there is sufficient loading in the dictionary to use 
> dictionary encoding.


[jira] [Assigned] (HIVE-4244) Make string dictionaries adaptive in ORC

2013-03-28 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong reassigned HIVE-4244:
---

Assignee: Kevin Wilfong  (was: Owen O'Malley)

> Make string dictionaries adaptive in ORC
> 
>
> Key: HIVE-4244
> URL: https://issues.apache.org/jira/browse/HIVE-4244
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Kevin Wilfong
>
> The ORC writer should adaptively switch between dictionary and direct 
> encoding. I'd propose looking at the first 100,000 values in each column and 
> deciding whether there is sufficient loading in the dictionary to use 
> dictionary encoding.


[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists

2013-03-26 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614742#comment-13614742
 ] 

Kevin Wilfong commented on HIVE-4235:
-

+1

> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
> 
>
> Key: HIVE-4235
> URL: https://issues.apache.org/jira/browse/HIVE-4235
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC, Query Processor, SQL
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-4235.patch.1
>
>
> CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists.
> It uses Hive.java's getTablesByPattern(...) to check if the table exists. This 
> involves a regular expression and eventually a database join, which is very 
> inefficient. It can increase database lock time and hurt db performance if a 
> lot of such commands hit the database.
> The suggested approach is to use getTable(...) since we know tablename already


[jira] [Commented] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases

2013-03-22 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611242#comment-13611242
 ] 

Kevin Wilfong commented on HIVE-4159:
-

Thanks Ashutosh, I can take care of it.

> RetryingHMSHandler doesn't retry in enough cases
> 
>
> Key: HIVE-4159
> URL: https://issues.apache.org/jira/browse/HIVE-4159
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4159.1.patch.txt
>
>
> HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in 
> MetaExceptions.  This caused the RetryingHMSHandler to not retry on these 
> exceptions.
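
One possible way to make a retry decision robust to wrapping is to walk the 
cause chain instead of checking only the outermost exception.  The sketch below 
uses hypothetical stand-in exception classes rather than the real JDOException 
and MetaException, and it is not necessarily what the attached patch does.

```java
/** Hypothetical sketch: decide whether to retry by inspecting wrapped causes. */
final class RetryOnCauseSketch {
    /** Stand-in for a retryable datastore exception. */
    static class JdoLikeException extends RuntimeException {
        JdoLikeException(String message) {
            super(message);
        }
    }

    /** Stand-in for the wrapper exception that hides the original cause. */
    static class MetaLikeException extends Exception {
        MetaLikeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Returns true if the throwable or any of its causes is retryable. */
    static boolean shouldRetry(Throwable thrown) {
        for (Throwable current = thrown; current != null; current = current.getCause()) {
            if (current instanceof JdoLikeException) {
                return true;
            }
        }
        return false;
    }
}
```

Checking only `thrown instanceof JdoLikeException` would miss the wrapped case, 
which matches the behavior described in the report.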


[jira] [Commented] (HIVE-4156) need to add protobuf classes to hive-exec.jar

2013-03-22 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611046#comment-13611046
 ] 

Kevin Wilfong commented on HIVE-4156:
-

+1

> need to add protobuf classes to hive-exec.jar
> -
>
> Key: HIVE-4156
> URL: https://issues.apache.org/jira/browse/HIVE-4156
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4156.D9375.1.patch
>
>
> In some queries, the tasks fail when they can't find classes from the 
> protobuf library.


[jira] [Updated] (HIVE-4092) Store complete names of tables in column access analyzer

2013-03-22 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4092:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Sam.

> Store complete names of tables in column access analyzer
> 
>
> Key: HIVE-4092
> URL: https://issues.apache.org/jira/browse/HIVE-4092
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Samuel Yuan
>Assignee: Samuel Yuan
>Priority: Trivial
> Fix For: 0.11.0
>
> Attachments: HIVE-4092.HIVE-4092.HIVE-4092.D8985.1.patch
>
>
> Right now the db name is not being stored. We should store the complete name, 
> which includes the db name, as the table access analyzer does.


[jira] [Commented] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609598#comment-13609598
 ] 

Kevin Wilfong commented on HIVE-4217:
-

+1

> Fix show_create_table_*.q test failures
> ---
>
> Key: HIVE-4217
> URL: https://issues.apache.org/jira/browse/HIVE-4217
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-4217.1.patch.txt, HIVE-4217.2.patch.txt
>
>



[jira] [Commented] (HIVE-4217) Fix show_create_table_*.q test failures

2013-03-21 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609542#comment-13609542
 ] 

Kevin Wilfong commented on HIVE-4217:
-

Can you remove the entry in eclipse-templates/.classpath for the stringtemplate 
jar as well?

> Fix show_create_table_*.q test failures
> ---
>
> Key: HIVE-4217
> URL: https://issues.apache.org/jira/browse/HIVE-4217
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-4217.1.patch.txt
>
>



[jira] [Updated] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-21 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4188:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Prasad.

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
> Fix For: 0.11.0
>
> Attachments: HIVE-4188-1.patch, HIVE-4188-2.patch
>
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-20 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608276#comment-13608276
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Running tests.

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
> Attachments: HIVE-4188-1.patch, HIVE-4188-2.patch
>
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Updated] (HIVE-4015) Add ORC file to the grammar as a file format

2013-03-20 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4015:


   Resolution: Fixed
Fix Version/s: 0.11.0
   Status: Resolved  (was: Patch Available)

Committed, thanks Gunther.

> Add ORC file to the grammar as a file format
> 
>
> Key: HIVE-4015
> URL: https://issues.apache.org/jira/browse/HIVE-4015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Gunther Hagleitner
> Fix For: 0.11.0
>
> Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch, 
> HIVE-4015.4.patch, HIVE-4015.5.patch
>
>
> It would be much more convenient for users if we enable them to use ORC as a 
> file format in the HQL grammar. 


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-20 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608108#comment-13608108
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Could you update the patch? eclipse-templates/.classpath has been updated.

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
> Attachments: HIVE-4188-1.patch
>
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Assigned] (HIVE-4207) Implement "ALTER PARTITIONPROPERTIES" to allow partition level properties to be changed

2013-03-20 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong reassigned HIVE-4207:
---

Assignee: Dheeraj Kumar Singh

> Implement "ALTER PARTITIONPROPERTIES" to allow partition level properties to 
> be changed
> ---
>
> Key: HIVE-4207
> URL: https://issues.apache.org/jira/browse/HIVE-4207
> Project: Hive
>  Issue Type: New Feature
>Reporter: Dheeraj Kumar Singh
>Assignee: Dheeraj Kumar Singh
>Priority: Minor
>
> What we want is something like this:
> ALTER TABLE  PARTITION  SET PARTITIONPROPERTIES 
> 


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-20 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607854#comment-13607854
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Thanks +1

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
> Attachments: HIVE-4188-1.patch
>
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Commented] (HIVE-4015) Add ORC file to the grammar as a file format

2013-03-19 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607081#comment-13607081
 ] 

Kevin Wilfong commented on HIVE-4015:
-

Yes +1

> Add ORC file to the grammar as a file format
> 
>
> Key: HIVE-4015
> URL: https://issues.apache.org/jira/browse/HIVE-4015
> Project: Hive
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>Assignee: Gunther Hagleitner
> Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch, 
> HIVE-4015.4.patch, HIVE-4015.5.patch
>
>
> It would be much more convenient for users if we enable them to use ORC as a 
> file format in the HQL grammar. 


[jira] [Updated] (HIVE-4154) NPE reading column of empty string from ORC file

2013-03-18 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4154:


Attachment: HIVE-4154.2.patch.txt

> NPE reading column of empty string from ORC file
> 
>
> Key: HIVE-4154
> URL: https://issues.apache.org/jira/browse/HIVE-4154
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4154.1.patch.txt, HIVE-4154.2.patch.txt
>
>
> If a String column contains only empty strings, a null pointer exception is 
> thrown from the RecordReaderImpl for ORC.


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604006#comment-13604006
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Could you attach the patch to the JIRA and mark it Patch Available if it's 
ready for review?

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Updated] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils

2013-03-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4138:


Status: Open  (was: Patch Available)

I'm getting a failure from TestOrcStruct.

> ORC's union object inspector returns a type name that isn't parseable by 
> TypeInfoUtils
> --
>
> Key: HIVE-4138
> URL: https://issues.apache.org/jira/browse/HIVE-4138
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: h-4138.patch, HIVE-4138.D9219.1.patch
>
>
> Currently the typename returned by ORC's union object inspector isn't 
> parseable by TypeInfoUtils. The format needs to be union.


[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603905#comment-13603905
 ] 

Kevin Wilfong commented on HIVE-4145:
-

+1 thanks for fixing the other Eclipse issues

> Create hcatalog stub directory and add it to the build
> --
>
> Key: HIVE-4145
> URL: https://issues.apache.org/jira/browse/HIVE-4145
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-4145.1.patch.txt, HIVE-4145.2.patch.txt, 
> HIVE-4145.3.patch.txt
>
>
> Alan has requested that we create a directory for hcatalog and give the 
> HCatalog submodule committers karma on it.


[jira] [Updated] (HIVE-4162) disable TestBeeLineDriver

2013-03-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4162:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Thejas.

> disable TestBeeLineDriver
> -
>
> Key: HIVE-4162
> URL: https://issues.apache.org/jira/browse/HIVE-4162
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.11.0
>
> Attachments: HIVE-4162.1.patch
>
>
> See HIVE-4161. We should disable the TestBeeLineDriver test cases. In its 
> current state, it was not supposed to be enabled by default.


[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4189:


Status: Patch Available  (was: Open)

> ORC fails with String column that ends in lots of nulls
> ---
>
> Key: HIVE-4189
> URL: https://issues.apache.org/jira/browse/HIVE-4189
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4189.1.patch.txt
>
>
> When ORC attempts to write out a string column that ends in enough nulls to 
> span an index stride, StringTreeWriter's writeStripe method will get an 
> exception from TreeWriter's writeStripe method
> Column has wrong number of index entries found: x expected: y
> This is caused by rowIndexValueCount having multiple entries equal to the 
> number of non-null rows in the column, combined with the fact that 
> StringTreeWriter has special logic for constructing its index.


[jira] [Commented] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603819#comment-13603819
 ] 

Kevin Wilfong commented on HIVE-4189:
-

https://reviews.facebook.net/D9465

> ORC fails with String column that ends in lots of nulls
> ---
>
> Key: HIVE-4189
> URL: https://issues.apache.org/jira/browse/HIVE-4189
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4189.1.patch.txt
>
>
> When ORC attempts to write out a string column that ends in enough nulls to 
> span an index stride, StringTreeWriter's writeStripe method will get an 
> exception from TreeWriter's writeStripe method
> Column has wrong number of index entries found: x expected: y
> This is caused by rowIndexValueCount having multiple entries equal to the 
> number of non-null rows in the column, combined with the fact that 
> StringTreeWriter has special logic for constructing its index.


[jira] [Updated] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4189:


Attachment: HIVE-4189.1.patch.txt

> ORC fails with String column that ends in lots of nulls
> ---
>
> Key: HIVE-4189
> URL: https://issues.apache.org/jira/browse/HIVE-4189
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4189.1.patch.txt
>
>
> When ORC attempts to write out a string column that ends in enough nulls to 
> span an index stride, StringTreeWriter's writeStripe method will get an 
> exception from TreeWriter's writeStripe method
> Column has wrong number of index entries found: x expected: y
> This is caused by rowIndexValueCount having multiple entries equal to the 
> number of non-null rows in the column, combined with the fact that 
> StringTreeWriter has special logic for constructing its index.


[jira] [Created] (HIVE-4189) ORC fails with String column that ends in lots of nulls

2013-03-15 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4189:
---

 Summary: ORC fails with String column that ends in lots of nulls
 Key: HIVE-4189
 URL: https://issues.apache.org/jira/browse/HIVE-4189
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


When ORC attempts to write out a string column that ends in enough nulls to 
span an index stride, StringTreeWriter's writeStripe method will get an 
exception from TreeWriter's writeStripe method

Column has wrong number of index entries found: x expected: y

This is caused by rowIndexValueCount having multiple entries equal to the 
number of non-null rows in the column, combined with the fact that 
StringTreeWriter has special logic for constructing its index.


[jira] [Commented] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603797#comment-13603797
 ] 

Kevin Wilfong commented on HIVE-4188:
-

Thanks Prasad

> TestJdbcDriver2.testDescribeTable failing consistently
> --
>
> Key: HIVE-4188
> URL: https://issues.apache.org/jira/browse/HIVE-4188
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Prasad Mujumdar
>
> Running in Linux on a clean checkout after running ant very-clean package, 
> the test TestJdbcDriver2.testDescribeTable fails consistently with 
> Column name 'under_col' not found expected: but was:<# col_name >
> junit.framework.ComparisonFailure: Column name 'under_col' not found 
> expected: but was:<# col_name >
> at junit.framework.Assert.assertEquals(Assert.java:81)
> at 
> org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:154)
> at junit.framework.TestCase.runBare(TestCase.java:127)
> at junit.framework.TestResult$1.protect(TestResult.java:106)
> at junit.framework.TestResult.runProtected(TestResult.java:124)
> at junit.framework.TestResult.run(TestResult.java:109)
> at junit.framework.TestCase.run(TestCase.java:118)
> at junit.framework.TestSuite.runTest(TestSuite.java:208)
> at junit.framework.TestSuite.run(TestSuite.java:203)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
> at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


[jira] [Created] (HIVE-4188) TestJdbcDriver2.testDescribeTable failing consistently

2013-03-15 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4188:
---

 Summary: TestJdbcDriver2.testDescribeTable failing consistently
 Key: HIVE-4188
 URL: https://issues.apache.org/jira/browse/HIVE-4188
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.11.0
Reporter: Kevin Wilfong


Running in Linux on a clean checkout after running ant very-clean package, the 
test TestJdbcDriver2.testDescribeTable fails consistently with 

Column name 'under_col' not found expected: but was:<# col_name >

junit.framework.ComparisonFailure: Column name 'under_col' not found 
expected: but was:<# col_name >
at junit.framework.Assert.assertEquals(Assert.java:81)
at 
org.apache.hive.jdbc.TestJdbcDriver2.testDescribeTable(TestJdbcDriver2.java:815)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:154)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)

[jira] [Commented] (HIVE-4145) Create hcatalog stub directory and add it to the build

2013-03-15 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603754#comment-13603754
 ] 

Kevin Wilfong commented on HIVE-4145:
-

Could you add an entry to eclipse-templates/.classpath as well for hcatalog?

> Create hcatalog stub directory and add it to the build
> --
>
> Key: HIVE-4145
> URL: https://issues.apache.org/jira/browse/HIVE-4145
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
> Attachments: HIVE-4145.1.patch.txt
>
>
> Alan has requested that we create a directory for hcatalog and give the 
> HCatalog submodule committers karma on it.

[jira] [Commented] (HIVE-4162) disable TestBeeLineDriver

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603039#comment-13603039
 ] 

Kevin Wilfong commented on HIVE-4162:
-

Yep, I'm running the tests.

> disable TestBeeLineDriver
> -
>
> Key: HIVE-4162
> URL: https://issues.apache.org/jira/browse/HIVE-4162
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.11.0
>
> Attachments: HIVE-4162.1.patch
>
>
> See HIVE-4161. We should disable the TestBeeLineDriver test cases. In its 
> current state, it was not supposed to be enabled by default.

[jira] [Commented] (HIVE-4138) ORC's union object inspector returns a type name that isn't parseable by TypeInfoUtils

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603018#comment-13603018
 ] 

Kevin Wilfong commented on HIVE-4138:
-

+1 on the update.

> ORC's union object inspector returns a type name that isn't parseable by 
> TypeInfoUtils
> --
>
> Key: HIVE-4138
> URL: https://issues.apache.org/jira/browse/HIVE-4138
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: h-4138.patch, HIVE-4138.D9219.1.patch
>
>
> Currently the typename returned by ORC's union object inspector isn't 
> parseable by TypeInfoUtils. The format needs to be union.

[jira] [Updated] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4176:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Namit and Ashutosh.

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-4176.1.patch.txt, HIVE-4176.2.patch.txt
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Updated] (HIVE-4178) ORC fails with files with different numbers of columns

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4178:


Status: Patch Available  (was: Open)

> ORC fails with files with different numbers of columns
> --
>
> Key: HIVE-4178
> URL: https://issues.apache.org/jira/browse/HIVE-4178
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4178.1.patch.txt
>
>
> When CombineHiveInputFormat is used, it's possible that two files with 
> different numbers of columns can be included in the same split, in which case 
> Hive will fail at one of several points with an 
> ArrayIndexOutOfBoundsException.
> This can happen when a partition contains empty files or two partitions are 
> read with different numbers of columns.

[jira] [Updated] (HIVE-4178) ORC fails with files with different numbers of columns

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4178:


Attachment: HIVE-4178.1.patch.txt

> ORC fails with files with different numbers of columns
> --
>
> Key: HIVE-4178
> URL: https://issues.apache.org/jira/browse/HIVE-4178
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4178.1.patch.txt
>
>
> When CombineHiveInputFormat is used, it's possible that two files with 
> different numbers of columns can be included in the same split, in which case 
> Hive will fail at one of several points with an 
> ArrayIndexOutOfBoundsException.
> This can happen when a partition contains empty files or two partitions are 
> read with different numbers of columns.

[jira] [Commented] (HIVE-4178) ORC fails with files with different numbers of columns

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602865#comment-13602865
 ] 

Kevin Wilfong commented on HIVE-4178:
-

https://reviews.facebook.net/D9423

> ORC fails with files with different numbers of columns
> --
>
> Key: HIVE-4178
> URL: https://issues.apache.org/jira/browse/HIVE-4178
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4178.1.patch.txt
>
>
> When CombineHiveInputFormat is used, it's possible that two files with 
> different numbers of columns can be included in the same split, in which case 
> Hive will fail at one of several points with an 
> ArrayIndexOutOfBoundsException.
> This can happen when a partition contains empty files or two partitions are 
> read with different numbers of columns.

[jira] [Created] (HIVE-4178) ORC fails with files with different numbers of columns

2013-03-14 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4178:
---

 Summary: ORC fails with files with different numbers of columns
 Key: HIVE-4178
 URL: https://issues.apache.org/jira/browse/HIVE-4178
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


When CombineHiveInputFormat is used, it's possible that two files with 
different numbers of columns can be included in the same split, in which case 
Hive will fail at one of several points with an ArrayIndexOutOfBoundsException.

This can happen when a partition contains empty files or two partitions are 
read with different numbers of columns.
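
To illustrate the failure mode described above, here is a minimal sketch of
one defensive approach: conforming each row to the column count the reader
expects before indexing into it. This is not taken from the attached patch;
the class name ColumnPadSketch and the method conformRow are hypothetical,
and plain Object[] rows stand in for whatever row representation is used.

```java
public class ColumnPadSketch {
    /**
     * Returns a row with exactly expectedColumns entries. Columns missing
     * from the file are left as null, and extra columns are dropped, so
     * later code can index by schema position without going out of bounds.
     */
    public static Object[] conformRow(Object[] fileRow, int expectedColumns) {
        if (expectedColumns < 0) {
            throw new IllegalArgumentException("expectedColumns must be >= 0");
        }
        Object[] out = new Object[expectedColumns];
        // Copy only the columns that exist in both the file row and the schema.
        int shared = Math.min(fileRow.length, expectedColumns);
        System.arraycopy(fileRow, 0, out, 0, shared);
        return out;
    }
}
```

A row from an empty file (zero columns) would come back as all nulls rather
than triggering an ArrayIndexOutOfBoundsException further down.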

[jira] [Updated] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4176:


Attachment: HIVE-4176.2.patch.txt

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-4176.1.patch.txt, HIVE-4176.2.patch.txt
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Updated] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4176:


Status: Patch Available  (was: Open)

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-4176.1.patch.txt
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Commented] (HIVE-4156) need to add protobuf classes to hive-exec.jar

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602452#comment-13602452
 ] 

Kevin Wilfong commented on HIVE-4156:
-

Is the Snappy jar needed as well for the SnappyCodec?

> need to add protobuf classes to hive-exec.jar
> -
>
> Key: HIVE-4156
> URL: https://issues.apache.org/jira/browse/HIVE-4156
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4156.D9375.1.patch
>
>
> In some queries, the tasks fail when they can't find classes from the 
> protobuf library.

[jira] [Updated] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4176:


Attachment: HIVE-4176.1.patch.txt

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-4176.1.patch.txt
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Commented] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602433#comment-13602433
 ] 

Kevin Wilfong commented on HIVE-4176:
-

https://reviews.facebook.net/D9405

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Created] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4176:
---

 Summary: disable TestBeeLineDriver in ptest util
 Key: HIVE-4176
 URL: https://issues.apache.org/jira/browse/HIVE-4176
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Commented] (HIVE-4176) disable TestBeeLineDriver in ptest util

2013-03-14 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602430#comment-13602430
 ] 

Kevin Wilfong commented on HIVE-4176:
-

See HIVE-4162 and HIVE-4161

> disable TestBeeLineDriver in ptest util
> ---
>
> Key: HIVE-4176
> URL: https://issues.apache.org/jira/browse/HIVE-4176
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
>
> The test is disabled for ant test, so it should be disabled for ptest as well.

[jira] [Commented] (HIVE-4162) disable TestBeeLineDriver

2013-03-13 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601842#comment-13601842
 ] 

Kevin Wilfong commented on HIVE-4162:
-

Could you mark it as Patch Available if the patch is complete?

+1 if so

> disable TestBeeLineDriver
> -
>
> Key: HIVE-4162
> URL: https://issues.apache.org/jira/browse/HIVE-4162
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.11.0
>
> Attachments: HIVE-4162.1.patch
>
>
> See HIVE-4161. We should disable the TestBeeLineDriver test cases. In its 
> current state, it was not supposed to be enabled by default.

[jira] [Commented] (HIVE-2935) Implement HiveServer2

2013-03-13 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601614#comment-13601614
 ] 

Kevin Wilfong commented on HIVE-2935:
-

Thanks Thejas.

> Implement HiveServer2
> -
>
> Key: HIVE-2935
> URL: https://issues.apache.org/jira/browse/HIVE-2935
> Project: Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
>  Labels: HiveServer2
> Fix For: 0.11.0
>
> Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, 
> HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, 
> HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, 
> HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, 
> HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, 
> HIVE-2935-7.patch.tar.gz, HIVE-2935-7.testerrs.patch, 
> HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, 
> HS2-with-thrift-patch-rebased.patch
>
>


[jira] [Commented] (HIVE-2935) Implement HiveServer2

2013-03-13 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601549#comment-13601549
 ] 

Kevin Wilfong commented on HIVE-2935:
-

After this patch was committed, the tests have been taking an incredible amount 
of time.  It looks like TestBeeLineDriver runs the majority of the tests run by 
TestCliDriver again.  Is this necessary?  Could a larger number of tests be 
excluded, or could the included tests be specified explicitly rather than the 
excluded ones?

It looks like the Jenkins tests may be timing out because of this as well. 
https://builds.apache.org/job/Hive-trunk-h0.21/2013/console

At the very least, could the ptest framework be fixed to parallelize these 
tests?

> Implement HiveServer2
> -
>
> Key: HIVE-2935
> URL: https://issues.apache.org/jira/browse/HIVE-2935
> Project: Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
>  Labels: HiveServer2
> Fix For: 0.11.0
>
> Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, 
> HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, 
> HIVE-2935.3.patch.gz, HIVE-2935-4.changed-files-only.patch, 
> HIVE-2935-4.nothrift.patch, HIVE-2935-4.patch, HIVE-2935-5.beeline.patch, 
> HIVE-2935-5.core-hs2.patch, HIVE-2935-5.thrift-gen.patch, 
> HIVE-2935-7.patch.tar.gz, HIVE-2935-7.testerrs.patch, 
> HIVE-2935.fix.unsecuredoAs.patch, HS2-changed-files-only.patch, 
> HS2-with-thrift-patch-rebased.patch
>
>


[jira] [Updated] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases

2013-03-12 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4159:


Status: Patch Available  (was: Open)

> RetryingHMSHandler doesn't retry in enough cases
> 
>
> Key: HIVE-4159
> URL: https://issues.apache.org/jira/browse/HIVE-4159
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4159.1.patch.txt
>
>
> HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in 
> MetaExceptions.  This caused the RetryingHMSHandler to not retry on these 
> exceptions.

[jira] [Updated] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases

2013-03-12 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4159:


Attachment: HIVE-4159.1.patch.txt

> RetryingHMSHandler doesn't retry in enough cases
> 
>
> Key: HIVE-4159
> URL: https://issues.apache.org/jira/browse/HIVE-4159
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4159.1.patch.txt
>
>
> HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in 
> MetaExceptions.  This caused the RetryingHMSHandler to not retry on these 
> exceptions.

[jira] [Commented] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases

2013-03-12 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600734#comment-13600734
 ] 

Kevin Wilfong commented on HIVE-4159:
-

https://reviews.facebook.net/D9357

> RetryingHMSHandler doesn't retry in enough cases
> 
>
> Key: HIVE-4159
> URL: https://issues.apache.org/jira/browse/HIVE-4159
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4159.1.patch.txt
>
>
> HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in 
> MetaExceptions.  This caused the RetryingHMSHandler to not retry on these 
> exceptions.

[jira] [Created] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases

2013-03-12 Thread Kevin Wilfong (JIRA)

Kevin Wilfong created HIVE-4159:
---

 Summary: RetryingHMSHandler doesn't retry in enough cases
 Key: HIVE-4159
 URL: https://issues.apache.org/jira/browse/HIVE-4159
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in 
MetaExceptions.  This caused the RetryingHMSHandler to not retry on these 
exceptions.
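
As a rough illustration of the problem, the sketch below shows a retry loop
that inspects the cause chain, so a retryable exception is still recognized
after it has been wrapped. It is not the attached patch and does not use the
real Hive or JDO classes: StoreException and WrapperException are
hypothetical stand-ins for JDOException and MetaException, and RetrySketch,
isRetryable, and callWithRetry are made-up names.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Hypothetical stand-in for a datastore-level failure such as a JDOException.
    public static class StoreException extends RuntimeException {
        public StoreException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for a wrapper exception such as MetaException.
    public static class WrapperException extends Exception {
        public WrapperException(String msg, Throwable cause) { super(msg, cause); }
    }

    // True if t, or anything in its cause chain, is a StoreException.
    public static boolean isRetryable(Throwable t) {
        int depth = 0;
        while (t != null && depth < 16) { // bound guards against cyclic cause chains
            if (t instanceof StoreException) {
                return true;
            }
            t = t.getCause();
            depth++;
        }
        return false;
    }

    // Runs the action up to maxAttempts times, retrying only retryable failures.
    public static <T> T callWithRetry(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isRetryable(e)) {
                    throw e; // not a datastore failure: fail immediately
                }
                last = e; // remember it and try again
            }
        }
        throw last; // every attempt failed with a retryable exception
    }
}
```

A check that only tests the top-level exception type would miss the wrapped
case, which is the behavior described in this issue.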

[jira] [Updated] (HIVE-4157) ORC runs out of heap when writing

2013-03-12 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4157:


Attachment: HIVE-4157.1.patch.txt

> ORC runs out of heap when writing
> -
>
> Key: HIVE-4157
> URL: https://issues.apache.org/jira/browse/HIVE-4157
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4157.1.patch.txt
>
>
> The OutStream class used by the ORC file format seems to aggressively 
> allocate memory for ByteBuffers and doesn't seem too eager to give it back.
> This causes issues with heap space, particularly when wide tables or dynamic 
> partitions are involved.
> As a first step to resolving this problem, the OutStream class can be 
> modified to lazily allocate memory, and more actively make it available for 
> garbage collection.
> Follow ups could include checking the amount of free memory as part of 
> determining if a spill is needed.
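
The first step proposed above (allocate lazily, release eagerly) can be
sketched roughly as follows. This is only an illustration of the idea, not
the OutStream code or the attached patch: LazyBufferSketch and its methods
are hypothetical, and a ByteArrayOutputStream stands in for wherever the
flushed bytes would actually go.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class LazyBufferSketch {
    private final int capacity;
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    private ByteBuffer current; // stays null until the first write

    public LazyBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    public void write(byte b) {
        if (current == null) {
            // Allocate only when there is actually data to hold.
            current = ByteBuffer.allocate(capacity);
        }
        current.put(b);
        if (!current.hasRemaining()) {
            flush();
        }
    }

    public void flush() {
        if (current == null) {
            return; // nothing buffered, nothing to do
        }
        sink.write(current.array(), 0, current.position());
        // Drop the reference so the buffer becomes eligible for garbage collection.
        current = null;
    }

    public boolean isBufferAllocated() {
        return current != null;
    }

    public byte[] flushedBytes() {
        return sink.toByteArray();
    }
}
```

With many streams open at once (one per column, per partition), streams that
have written nothing hold no buffer at all under this scheme.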

[jira] [Updated] (HIVE-4157) ORC runs out of heap when writing

2013-03-12 Thread Kevin Wilfong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-4157:


Status: Patch Available  (was: Open)

> ORC runs out of heap when writing
> -
>
> Key: HIVE-4157
> URL: https://issues.apache.org/jira/browse/HIVE-4157
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Attachments: HIVE-4157.1.patch.txt
>
>
> The OutStream class used by the ORC file format seems to aggressively 
> allocate memory for ByteBuffers and doesn't seem too eager to give it back.
> This causes issues with heap space, particularly when wide tables or dynamic 
> partitions are involved.
> As a first step to resolving this problem, the OutStream class can be 
> modified to lazily allocate memory, and more actively make it available for 
> garbage collection.
> Follow ups could include checking the amount of free memory as part of 
> determining if a spill is needed.

[jira] [Commented] (HIVE-4157) ORC runs out of heap when writing

2013-03-12 Thread Kevin Wilfong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600621#comment-13600621
 ] 

Kevin Wilfong commented on HIVE-4157:
-

https://reviews.facebook.net/D9351

> ORC runs out of heap when writing
> -
>
> Key: HIVE-4157
> URL: https://issues.apache.org/jira/browse/HIVE-4157
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.11.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
>
> The OutStream class used by the ORC file format seems to aggressively 
> allocate memory for ByteBuffers and doesn't seem too eager to give it back.
> This causes issues with heap space, particularly when wide tables or dynamic 
> partitions are involved.
> As a first step to resolving this problem, the OutStream class can be 
> modified to lazily allocate memory, and more actively make it available for 
> garbage collection.
> Follow ups could include checking the amount of free memory as part of 
> determining if a spill is needed.
