[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12883: --- Attachment: HIVE-12883.03.patch > Support basic stats and column stats in table properties in HBaseStore > -- > > Key: HIVE-12883 > URL: https://issues.apache.org/jira/browse/HIVE-12883 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, > HIVE-12883.03.patch > > > Need to add support for HBase store too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9774) Print yarn application id to console [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9774: - Attachment: HIVE-9774.1-spark.patch The patch uses {{SparkContext::applicationId}}, which is the YARN app ID when Spark runs on YARN. It prints "Running with YARN application" instead of "Starting application" because a user can submit multiple jobs to one Spark session -- we're not starting an application for each job. > Print yarn application id to console [Spark Branch] > --- > > Key: HIVE-9774 > URL: https://issues.apache.org/jira/browse/HIVE-9774 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Brock Noland >Assignee: Rui Li > Attachments: HIVE-9774.1-spark.patch > > > Oozie would like to use beeline to capture the yarn application id of apps so > that if a workflow is canceled, the job can be cancelled. When running under > MR we print the job id but under spark we do not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12771) HiveServer2 HiveTemplate running HiveStatement.execute throw java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dailidong resolved HIVE-12771. -- Resolution: Not A Problem > HiveServer2 HiveTemplate running HiveStatement.execute throw > java.lang.NullPointerException > --- > > Key: HIVE-12771 > URL: https://issues.apache.org/jira/browse/HIVE-12771 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 > Environment: apache hive 1.2.1 > jdk7 >Reporter: dailidong >Assignee: Vaibhav Gumashta > > When I use HiveTemplate to query a Hive table, my program throws an > exception; the details are as follows: > java.sql.SQLException: java.lang.NullPointerException > at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:311) > at > org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392) > at > org.datanucleus.store.rdbms.datasource.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) > at > org.datanucleus.store.rdbms.datasource.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) > at > com.xxx.analysis.hadoop.xxx.hive.utils.HiveTemplate.executeQuery(HiveTemplate.java:56) > Any help will be appreciated! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104221#comment-15104221 ] Ashutosh Chauhan commented on HIVE-12863: - +1 pending tests > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch, > HIVE-12863.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104177#comment-15104177 ] Vaibhav Gumashta commented on HIVE-12049: - Patch v2 has changes for FileSinkOp. > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
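The mechanism described above -- tasks writing pre-serialized row batches that HiveServer2 streams without re-encoding -- can be sketched in miniature. A hedged Python sketch, using a length-prefixed encoding as a stand-in for the Thrift serialization; `encode_batch`, `stream_batches`, and `decode_batch` are illustrative names, not Hive classes:

```python
import struct

def encode_batch(rows):
    # FileSinkOperator analogue: serialize a whole batch of rows into one
    # length-prefixed blob (the "value blob" the description stores in a
    # SequenceFile), so the server never touches individual rows again.
    out = bytearray()
    for row in rows:
        data = "\t".join(row).encode("utf-8")
        out += struct.pack(">I", len(data)) + data
    return bytes(out)

def stream_batches(blobs):
    # FetchTask analogue: forward each stored blob verbatim -- no per-row
    # deserialization or translation on the server side.
    yield from blobs

def decode_batch(blob):
    # Client-side (*DBC driver) analogue: only the client decodes rows
    # back out of the blob when building its ResultSet.
    rows, off = [], 0
    while off < len(blob):
        (n,) = struct.unpack_from(">I", blob, off)
        off += 4
        rows.append(blob[off:off + n].decode("utf-8").split("\t"))
        off += n
    return rows
```

In this model the server-side cost per fetch is a byte copy per batch regardless of row count, which is the saving the description claims for moderate-to-high concurrency.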
[jira] [Updated] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12049: Attachment: HIVE-12049.2.patch > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.2.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In a moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12777) Add capability to restore session
[ https://issues.apache.org/jira/browse/HIVE-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104171#comment-15104171 ] Xuefu Zhang commented on HIVE-12777: Sorry, but I still don't understand what functionality we are adding here. We should have a clear description of the problem we are trying to solve, the new functionality we are adding, and the approach we are taking. For bigger feature additions, we might even need a functional and design doc. The doc or JIRA description is supposed to help code review, not the other way around. Here it seems I have to understand the code in order to understand the feature. > Add capability to restore session > - > > Key: HIVE-12777 > URL: https://issues.apache.org/jira/browse/HIVE-12777 > Project: Hive > Issue Type: Improvement >Reporter: Rajat Khandelwal >Assignee: Rajat Khandelwal > Attachments: HIVE-12777.04.patch, HIVE-12777.08.patch, > HIVE-12777.09.patch, HIVE-12777.11.patch, HIVE-12777.12.patch, > HIVE-12777.13.patch > > > Extensions using Hive session handles should be able to restore the hive > session from the handle. > Apache Lens depends on a fork of hive and that fork has such a capability. > Relevant commit: > https://github.com/InMobi/hive/commit/931fe9116161a18952c082c14223ad6745fefe00#diff-0acb35f7cab7492f522b0c40ce3ce1be -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104138#comment-15104138 ] Xuefu Zhang commented on HIVE-12736: Hi [~chengxiang li], Sorry for being late in reviewing this. The patch looks good, but patch #2 has a change in ReduceSinkOperator. Is that intentional? It seems changing the return value from "false" to "true" (inherited from Operator class). Secondly, can we incorporate the test case provided in the JIRA description? Let's forget about it if it's too hard. Thanks. > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? > {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? 
> explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} > explain 2: > {code} > set hive.execution.en
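For question 1 in the report above, the semantics the reporter expects from the map join can be modeled as a toy broadcast-hash join in Python, with the unioned subquery as the small side. This sketches the expected result only; it is not Hive's map-join implementation:

```python
def map_side_join(big, small):
    # Broadcast-hash join: build a hash table on the small side's join
    # key, then probe it for every row of the big side (inner join).
    table = {}
    for row in small:
        table.setdefault(row[0], []).append(row)
    return [b + s for b in big for s in table.get(b[0], ())]

staff = [(1, "jone", 22, 1), (2, "lucy", 21, 1), (3, "hmm", 22, 2),
         (4, "james", 24, 3), (5, "xiaoliu", 23, 3)]
trade = [(1, "201510210908"), (2, "201509080234"), (2, "201509080235")]

# Small side: select id,date_ from trade union all select id,"test" from trade.
small = trade + [(i, "test") for i, _ in trade]
result = map_side_join(staff, small)
# Under these semantics the (id, "test") branch of the union matches too,
# so rows like (1, "jone", 22, 1, 1, "test") appear -- exactly the rows
# missing from the Hive-on-Spark output quoted above.
```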
[jira] [Updated] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-12736: --- Description: {code} select * from staff; 1 jone22 1 2 lucy21 1 3 hmm 22 2 4 james 24 3 5 xiaoliu 23 3 select id,date_ from trade union all select id,"test" from trade ; 1 201510210908 2 201509080234 2 201509080235 1 test 2 test 2 test set hive.execution.engine=spark; set spark.master=local; select /*+mapjoin(t)*/ * from staff s join (select id,date_ from trade union all select id,"test" from trade ) t on s.id=t.id; 1 jone22 1 1 201510210908 2 lucy21 1 2 201509080234 2 lucy21 1 2 201509080235 set hive.execution.engine=mr; select /*+mapjoin(t)*/ * from staff s join (select id,date_ from trade union all select id,"test" from trade ) t on s.id=t.id; FAILED: SemanticException [Error 10227]: Not all clauses are supported with mapjoin hint. Please remove mapjoin hint. {code} I have two questions 1.Why result of hive on spark not include the following record? {code} 1 jone22 1 1 test 2 lucy21 1 2 test 2 lucy21 1 2 test {code} 2.Why there are two different ways of dealing same query? 
explain 1: {code} set hive.execution.engine=spark; set spark.master=local; explain select id,date_ from trade union all select id,"test" from trade; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Spark DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 Vertices: Map 1 Map Operator Tree: TableScan alias: trade Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: int), date_ (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Map 2 Map Operator Tree: TableScan alias: trade Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: int), 'test' (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 12 Data size: 96 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink {code} explain 2: {code} set hive.execution.engine=spark; set spark.master=local; explain select /*+mapjoin(t)*/ * from staff s join (select id,date_ from trade union all select id,"test" from trade ) t on s.id=t.id; OK STAGE DEPENDENCIES: Stage-2 is a root stage Stage-1 depends on stages: Stage-2 Stage-0 depends on stages: Stage-1 STAGE PLANS: 
Stage: Stage-2 Spark DagName: jonezhang_20151222191716_be7eac84-b5b6-4478-b88f-9f59e2b1b1a8:3 Vertices: Map 1 Map Operator Tree: TableScan alias: trade Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 3 Data size: 24 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: int), date_ (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 3 Data size: 24 Basic stats: COMPLETE Column stats: NONE Spark HashTable Sink Operator
[jira] [Assigned] (HIVE-9774) Print yarn application id to console [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li reassigned HIVE-9774: Assignee: Rui Li (was: Chinna Rao Lalam) > Print yarn application id to console [Spark Branch] > --- > > Key: HIVE-9774 > URL: https://issues.apache.org/jira/browse/HIVE-9774 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Brock Noland >Assignee: Rui Li > > Oozie would like to use beeline to capture the yarn application id of apps so > that if a workflow is canceled, the job can be cancelled. When running under > MR we print the job id but under spark we do not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9774) Print yarn application id to console [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104091#comment-15104091 ] Rui Li commented on HIVE-9774: -- OK, assigned this to me. > Print yarn application id to console [Spark Branch] > --- > > Key: HIVE-9774 > URL: https://issues.apache.org/jira/browse/HIVE-9774 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Brock Noland >Assignee: Rui Li > > Oozie would like to use beeline to capture the yarn application id of apps so > that if a workflow is canceled, the job can be cancelled. When running under > MR we print the job id but under spark we do not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104080#comment-15104080 ] Chengxiang Li commented on HIVE-12736: -- [~xuefuz], would you help to review this patch? > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch > > > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > I have two questions > 1.Why result of hive on spark not include the following record? > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > 2.Why there are two different ways of dealing same query? 
> explain 1: > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > explain 2: > set hive.execution.engine=spark; > set spark.master=local; > explain > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > OK > STAGE 
DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-2 > Spark > DagName: jonezh
[jira] [Commented] (HIVE-12551) Fix several kryo exceptions in branch-1
[ https://issues.apache.org/jira/browse/HIVE-12551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104068#comment-15104068 ] Feng Yuan commented on HIVE-12551: -- Can you take a look at this, please? [~xuefuz], [~serganch], [~pchag] > Fix several kryo exceptions in branch-1 > --- > > Key: HIVE-12551 > URL: https://issues.apache.org/jira/browse/HIVE-12551 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: serialization > Fix For: 1.3.0 > > Attachments: HIVE-12551.1.patch, test case.zip > > > HIVE-11519, HIVE-12174 and the following exception are all caused by > unregistered classes or serializers. HIVE-12175 should have fixed these > issues for the master branch. > {code} > Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: > java.lang.NullPointerException > Serialization trace: > chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) > expr (org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor) > childExpressions > (org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterStringColumnBetween) > conditionEvaluator > (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator) > childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator) > aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > 
org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:367) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:276) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > at > 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672) > at > org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1087) > at > org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:976) > at > org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:990) > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:426) > ... 27 more > Caused by: java.lang.NullPointerException > at java.util.Arrays$ArrayList.size(Arrays.java:3818) > at java.util.AbstractList.add(Abs
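The root cause named in the description above -- classes crossing the wire without a registered serializer -- can be illustrated with a toy registry. This mimics only the failure mode; it is not Kryo's real API, and all names here are invented for illustration:

```python
class SerializerRegistry:
    """Toy stand-in for a Kryo-style registry: every type that is
    (de)serialized must have a serializer registered up front."""

    def __init__(self):
        self._by_name = {}

    def register(self, cls, encode, decode):
        self._by_name[cls.__name__] = (encode, decode)

    def dumps(self, obj):
        name = type(obj).__name__
        if name not in self._by_name:
            raise KeyError("no serializer registered for " + name)
        encode, _ = self._by_name[name]
        return name, encode(obj)

    def loads(self, payload):
        name, data = payload
        if name not in self._by_name:
            # Analogue of the KryoException above: the reading side
            # cannot reconstruct a type it never registered.
            raise KeyError("no serializer registered for " + name)
        _, decode = self._by_name[name]
        return decode(data)
```

A fix like the one described (HIVE-12175) amounts to making sure every plan class reachable from MapWork has a `register(...)` call before deserialization runs.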
[jira] [Commented] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104021#comment-15104021 ] Hive QA commented on HIVE-12863: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782792/HIVE-12863.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10024 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ctas org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge_incompat1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_bround org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_nvl org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_reduce3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_16 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_5 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_not org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_pushdown org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import 
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6657/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6657/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6657/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 23 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12782792 - PreCommit-HIVE-TRUNK-Build > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch, > HIVE-12863.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103987#comment-15103987 ] Hive QA commented on HIVE-12883:
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782788/HIVE-12883.02.patch
{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 10023 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.allWithStats
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions
org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.binaryPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.binaryTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.booleanPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.booleanTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doublePartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doubleTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreCached.booleanTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.partitionStatistics
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6656/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6656/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6656/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12782788 - PreCommit-HIVE-TRUNK-Build
> Support basic stats and column stats in table properties in HBaseStore
> --
>
> Key: HIVE-12883
> URL: https://issues.apache.org/jira/browse/HIVE-12883
> Project: Hive
> Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch
>
>
> Need to add support for HBase store too.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12863: --- Attachment: HIVE-12863.03.patch > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch, > HIVE-12863.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12863: --- Attachment: (was: HIVE-12883.02.patch) > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch, > HIVE-12863.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12863) fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union
[ https://issues.apache.org/jira/browse/HIVE-12863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12863: --- Attachment: HIVE-12883.02.patch > fix test failure for TestMiniTezCliDriver.testCliDriver_tez_union > - > > Key: HIVE-12863 > URL: https://issues.apache.org/jira/browse/HIVE-12863 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12863.01.patch, HIVE-12863.02.patch, > HIVE-12883.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12883: --- Attachment: HIVE-12883.02.patch > Support basic stats and column stats in table properties in HBaseStore > -- > > Key: HIVE-12883 > URL: https://issues.apache.org/jira/browse/HIVE-12883 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch > > > Need to add support for HBase store too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.
[ https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103907#comment-15103907 ] Eugene Koifman commented on HIVE-10632:
---
resolveTable() in the Initiator returns null because the temp table has already been dropped by then.
> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
> Issue Type: Bug
> Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> The compaction process will clean up entries in TXNS,
> COMPLETED_TXN_COMPONENTS, and TXN_COMPONENTS. If the table/partition is dropped
> before compaction is complete, there will be data left in these tables. We need
> to investigate whether there are other situations where this may happen and
> address them.
> see HIVE-10595 for additional info
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
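The failure mode described above — the Initiator looking up a table that has since been dropped, getting null back, and leaving TXN_COMPONENTS rows behind — can be sketched as a defensive null check over the compaction candidates. This is a hypothetical illustration, not Hive's actual code: `resolveTable`, `CompactionCandidate`, and the in-memory metastore map are stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Initiator-side hazard: a compaction candidate
// whose table was dropped resolves to null and must be cleaned up rather
// than silently skipped (which would leak TXN_COMPONENTS-style entries).
public class InitiatorSketch {
    static class CompactionCandidate {
        final String tableName;
        CompactionCandidate(String tableName) { this.tableName = tableName; }
    }

    // Stand-in for the metastore; a dropped table simply has no entry.
    final Map<String, Object> metastore = new HashMap<>();

    Object resolveTable(CompactionCandidate c) {
        // Returns null when the table was dropped before compaction ran.
        return metastore.get(c.tableName);
    }

    /** Returns the names of candidates whose table no longer exists,
     *  i.e. the entries whose txn bookkeeping would otherwise linger. */
    List<String> findOrphanedCandidates(List<CompactionCandidate> candidates) {
        List<String> orphaned = new ArrayList<>();
        for (CompactionCandidate c : candidates) {
            if (resolveTable(c) == null) {
                orphaned.add(c.tableName); // table dropped: clean up, don't compact
            }
        }
        return orphaned;
    }

    public static void main(String[] args) {
        InitiatorSketch s = new InitiatorSketch();
        s.metastore.put("t1", new Object());       // t1 still exists
        List<CompactionCandidate> cands = new ArrayList<>();
        cands.add(new CompactionCandidate("t1"));
        cands.add(new CompactionCandidate("tmp")); // tmp was dropped
        System.out.println(s.findOrphanedCandidates(cands)); // prints "[tmp]"
    }
}
```

The point of the sketch is that the null result is a signal to delete the orphaned bookkeeping rows, not merely a reason to skip the candidate.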
[jira] [Commented] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103640#comment-15103640 ] Hive QA commented on HIVE-12883:
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782753/HIVE-12883.01.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 10008 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-orc_merge6.q-vector_outer_join0.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.allWithStats
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions
org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.binaryPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.binaryTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.booleanPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.booleanTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doublePartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doubleTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreCached.booleanTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.partitionStatistics
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6655/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6655/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6655/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12782753 - PreCommit-HIVE-TRUNK-Build
> Support basic stats and column stats in table properties in HBaseStore
> --
>
> Key: HIVE-12883
> URL: https://issues.apache.org/jira/browse/HIVE-12883
> Project: Hive
> Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12883.01.patch
>
>
> Need to add support for HBase store too.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)