[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650154#comment-14650154 ] Hive QA commented on HIVE-11304: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748291/HIVE-11304.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4783/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4783/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4783/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4783/succeeded/TestCompareCliDriver
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748291 - PreCommit-HIVE-TRUNK-Build Migrate to Log4j2 from Log4j 1.x Key: HIVE-11304 URL: https://issues.apache.org/jira/browse/HIVE-11304 Project: Hive Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11304.2.patch, HIVE-11304.3.patch, HIVE-11304.4.patch, HIVE-11304.patch Log4j2 has some great benefits and can help Hive significantly. Some notable features include: 1) Performance (parameterized logging, performance when logging is disabled, etc.). More details can be found here: https://logging.apache.org/log4j/2.x/performance.html 2) RoutingAppender - route logs to different log files based on MDC context (useful for HS2, LLAP, etc.) 3) Asynchronous logging. This is an umbrella jira to track changes related to the Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650060#comment-14650060 ] Aihua Xu commented on HIVE-10755: - [~viraj] and [~mithun] How is everything? The attached patch should fix both HIVE-5193 and HIVE-10720. Can we get it submitted? Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-10755.patch Add support for column pruning for column-oriented table access, which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.3.0, the patch posted by Viraj didn't work, probably due to some jar reference issue. That seems to have been fixed, and the patch works in 2.0.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650059#comment-14650059 ] Hive QA commented on HIVE-11317: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748236/HIVE-11317.patch {color:red}ERROR:{color} -1 due to 146 failed/errored test(s), 9280 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_case_with_row_sequence
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_serde_regex
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_drop_partition
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_drop_table
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_move_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_add_partition_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_addpart1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_change_col_dup_col
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_change_col_nonexist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_rename_partition_failure
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_rename_partition_failure2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_table_wrong_regex
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_altern1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_corrupt
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi6
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_cannot_create_default_role
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_caseinsensitivity
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_create_role_no_admin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_drop_admin_role
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_drop_role_no_admin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_8
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_group
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_allpriv
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_dup
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_fail1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_fail_nogrant
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_invalid_priv_v2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_priv_current_role_neg
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_create
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_drop
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_revoke_table_fail1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_revoke_table_fail2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_case
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant2
[jira] [Updated] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11087: -- Attachment: HIVE-11087.patch DbTxnManager exceptions should include txnid Key: HIVE-11087 URL: https://issues.apache.org/jira/browse/HIVE-11087 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11087.patch must include txnid in the exception so that user visible error can be correlated with log file info -- This message was sent by Atlassian JIRA (v6.3.4#6332)
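The intent of the patch is that user-visible errors carry the transaction id so they can be correlated with the server logs. As a hedged illustration only (the class name, message format, and accessor below are hypothetical, not the actual DbTxnManager code), an exception carrying the txnid might look like:

```java
// Hypothetical sketch: embed the transaction id in the exception message so a
// user-facing error can be grepped for in the metastore/HS2 logs. Class name
// and message format are illustrative, not Hive's actual code.
public class TxnAbortedException extends Exception {
    private final long txnId;

    public TxnAbortedException(long txnId, String reason) {
        super("Transaction txnid:" + txnId + " aborted: " + reason);
        this.txnId = txnId;
    }

    public long getTxnId() {
        return txnId;
    }
}
```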
[jira] [Resolved] (HIVE-11423) Ship hive-storage-api along with hive-exec jar to all Tasks
[ https://issues.apache.org/jira/browse/HIVE-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved HIVE-11423. Resolution: Duplicate Ship hive-storage-api along with hive-exec jar to all Tasks --- Key: HIVE-11423 URL: https://issues.apache.org/jira/browse/HIVE-11423 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 2.0.0 Reporter: Gopal V Priority: Blocker After moving critical classes into hive-storage-api, those classes are needed for queries to execute successfully. Currently, all queries fail with ClassNotFound exceptions on a large cluster.
{code}
Caused by: java.lang.NoClassDefFoundError: Lorg/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch;
	at java.lang.Class.getDeclaredFields0(Native Method)
	at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
	at java.lang.Class.getDeclaredFields(Class.java:1916)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:150)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.init(FieldSerializer.java:109)
	... 57 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 62 more
{code}
Temporary workaround added to hiverc: {{add jar ./dist/hive/lib/hive-storage-api-2.0.0-SNAPSHOT.jar;}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11395) Enhance Explain annotation for the JSON metadata collection
[ https://issues.apache.org/jira/browse/HIVE-11395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-11395: -- Assignee: Gopal V Enhance Explain annotation for the JSON metadata collection --- Key: HIVE-11395 URL: https://issues.apache.org/jira/browse/HIVE-11395 Project: Hive Issue Type: Bug Components: Hive Reporter: Gopal V Assignee: Gopal V ExplainTask cannot collect information that is not visible during explain extended level. Need a new marker to mark the field as collected by the explain formatted JSON structures, but not as part of the regular explain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11304: - Attachment: HIVE-11304.4.patch Fixed test failures. The issue was with iterating over file appenders and printing the file location in the qfile test. The old API for iterating over existing appenders does not work anymore. [~gopalv] Can you take a look at the new patch? Migrate to Log4j2 from Log4j 1.x Key: HIVE-11304 URL: https://issues.apache.org/jira/browse/HIVE-11304 Project: Hive Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11304.2.patch, HIVE-11304.3.patch, HIVE-11304.4.patch, HIVE-11304.patch Log4j2 has some great benefits and can help Hive significantly. Some notable features include: 1) Performance (parameterized logging, performance when logging is disabled, etc.). More details can be found here: https://logging.apache.org/log4j/2.x/performance.html 2) RoutingAppender - route logs to different log files based on MDC context (useful for HS2, LLAP, etc.) 3) Asynchronous logging. This is an umbrella jira to track changes related to the Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
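The first benefit listed in the description, parameterized logging, comes from deferring message construction until the logger knows the level is enabled. A minimal self-contained stand-in (this mimics the `logger.debug("... {}", arg)` pattern; it is NOT the Log4j2 API itself) illustrates why the disabled case is nearly free:

```java
// Minimal stand-in for parameterized logging: the format arguments are never
// stitched into a string unless the level is actually enabled, unlike eager
// string concatenation ("processed " + n + " rows") which always pays the cost.
public class LazyLogger {
    private final boolean debugEnabled;
    int formats = 0; // counts how many messages were actually built

    public LazyLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    public void debug(String pattern, Object arg) {
        if (!debugEnabled) {
            return; // bail out before doing any string work
        }
        formats++;
        System.out.println(pattern.replace("{}", String.valueOf(arg)));
    }
}
```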
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: (was: HIVE-10166.1.patch) Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11424) Improve HivePreFilteringRule performance
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11424: --- Attachment: HIVE-11424.patch Improve HivePreFilteringRule performance Key: HIVE-11424 URL: https://issues.apache.org/jira/browse/HIVE-11424 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11424.patch 1) Remove early bail out condition. 2) Create IN clause instead of OR tree (when possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
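The second change, creating an IN clause instead of an OR tree, can be sketched as follows. This is an illustrative simplification only (plain strings instead of Hive's expression trees; the class and method names are hypothetical, not HivePreFilteringRule's actual code):

```java
import java.util.List;

// Illustrative sketch: collapse a chain of equality predicates on one column,
// e.g. c=1 OR c=2 OR c=3, into a single IN clause, which avoids evaluating a
// deep OR tree node by node.
public class OrToIn {
    public static String rewrite(String column, List<String> values) {
        if (values.size() < 2) {
            return column + " = " + values.get(0); // nothing to collapse
        }
        return column + " IN (" + String.join(", ", values) + ")";
    }
}
```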
[jira] [Commented] (HIVE-11401) Predicate push down does not work with Parquet when partitions are in the expression
[ https://issues.apache.org/jira/browse/HIVE-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649299#comment-14649299 ] Sergio Peña commented on HIVE-11401: The tests are not related to this patch. I ran them on my local system, and they work correctly. Predicate push down does not work with Parquet when partitions are in the expression Key: HIVE-11401 URL: https://issues.apache.org/jira/browse/HIVE-11401 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11401.1.patch, HIVE-11401.2.patch When filtering Parquet tables using a partition column, the query fails saying the column does not exist:
{noformat}
hive> create table part1 (id int, content string) partitioned by (p string) stored as parquet;
hive> alter table part1 add partition (p='p1');
hive> insert into table part1 partition (p='p1') values (1, 'a'), (2, 'b');
hive> select id from part1 where p='p1';
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Column [p] was not found in schema!
Time taken: 0.151 seconds
{noformat}
It is correct that the partition column is not part of the Parquet schema. So, the fix should be to remove such expressions from the Parquet PPD. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
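The fix described above (drop partition-column predicates before pushing the filter to Parquet) can be sketched in a hedged, self-contained form. This models predicates as "column op value" strings purely for illustration; Hive's real code works on ExprNodeDesc trees, and the class name here is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the described fix: before handing a filter to the
// Parquet reader, keep only the conjuncts whose column exists in the file
// schema. Partition columns live in the metastore, not in the Parquet footer,
// so pushing them down triggers "Column [p] was not found in schema!".
public class ParquetPpdPruner {
    public static List<String> prune(List<String> conjuncts, Set<String> fileSchemaColumns) {
        List<String> pushable = new ArrayList<>();
        for (String conjunct : conjuncts) {
            String column = conjunct.split(" ")[0]; // "p = 'p1'" -> "p"
            if (fileSchemaColumns.contains(column)) {
                pushable.add(conjunct); // safe to push to the Parquet reader
            }
        }
        return pushable;
    }
}
```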
[jira] [Commented] (HIVE-11380) NPE when FileSinkOperator is not initialized
[ https://issues.apache.org/jira/browse/HIVE-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649298#comment-14649298 ] Sergio Peña commented on HIVE-11380: +1 The patch is simple. Thanks [~ychena] NPE when FileSinkOperator is not initialized Key: HIVE-11380 URL: https://issues.apache.org/jira/browse/HIVE-11380 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11380.1.patch When FileSinkOperator's initializeOp is not called (which may happen when an operator before FileSinkOperator failed in initializeOp), FileSinkOperator will throw an NPE at close time. The stacktrace:
{noformat}
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:523)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:952)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:199)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:519)
	... 18 more
{noformat}
This Exception is misleading and often distracts users from finding the real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
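The shape of the fix, as described, is to make close tolerate an operator that was never initialized instead of dereferencing null state. An illustrative guard only (not the actual FileSinkOperator code; the class and field names are invented for the sketch):

```java
// Illustrative guard: when initialization never ran, the bucket-file state is
// still null, so close should be a no-op instead of throwing an NPE that masks
// the original initialization failure.
public class SinkSketch {
    Object[] bucketFiles; // stays null when initialize() was never called

    void initialize() {
        bucketFiles = new Object[1];
    }

    boolean close() {
        if (bucketFiles == null) {
            return false; // nothing was opened, nothing to flush
        }
        // ... flush and close each bucket file here ...
        return true;
    }
}
```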
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649184#comment-14649184 ] Nicholas Brenwald commented on HIVE-11410: -- Hi, Thanks for taking a look at this so quickly. I confirm we are using branch-1.1 (distributed as part of CDH 5.4.4). For example, the hive cli jar is named hive-cli-1.1.0-cdh5.4.4.jar. When we run 'hive' on the command line, we see the following printed message showing that hive-common-1.1.0 is being used.
{code}
Logging initialized using configuration in jar:file:/cloudera/parcel-repo/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/hive-common-1.1.0-cdh5.4.4.jar!/hive-log4j.properties
{code}
And the explain plan we see is as follows:
{code}
hive> EXPLAIN SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1) t3 ON t1.c2=t3.c2;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-5 depends on stages: Stage-1
  Stage-4 depends on stages: Stage-5
  Stage-0 depends on stages: Stage-4

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: c1 (type: string), c2 (type: int)
              outputColumnNames: c1, c2
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: max(c2)
                keys: c1 (type: string)
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col1 (type: int)
      Reduce Operator Tree:
        Group By Operator
          aggregations: max(VALUE._col0)
          keys: KEY._col0 (type: string)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
          Filter Operator
            predicate: _col1 is not null (type: boolean)
            Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1
          TableScan
            alias: t1
            filterExpr: c2 is not null (type: boolean)
            Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: c2 is not null (type: boolean)
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 c2 (type: int)
                  1 _col1 (type: int)

  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0 c2 (type: int)
                1 _col1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: _col0 (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: true
                  Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}
Join with subquery containing a group by incorrectly returns no results
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: HIVE-10166.1.patch Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.1.patch, HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649288#comment-14649288 ] Hive QA commented on HIVE-10975: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748130/HIVE-10975.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9275 tests executed *Failed tests:*
{noformat}
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_partitioned
org.apache.hadoop.hive.ql.io.parquet.TestParquetRowGroupFilter.testRowGroupFilterTakeEffect
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4773/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4773/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4773/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748130 - PreCommit-HIVE-TRUNK-Build Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649293#comment-14649293 ] Sergio Peña commented on HIVE-10975: The patch looks good to me, but those 2 parquet tests are failing. Something might have changed from 1.7 to 1.8 that is causing those failures. Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: HIVE-10166.1.patch Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649270#comment-14649270 ] Xuefu Zhang commented on HIVE-10166: Since this is a clean merge, containing fix for HIVE-11423, I'm going to get this in first and create a followup jira to investigate and fix the two test failures. [~csun], would you mind reviewing the patch? Thanks. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649416#comment-14649416 ] Hive QA commented on HIVE-11397: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748131/HIVE-11397.1.patch {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9276 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4774/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4774/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4774/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748131 - PreCommit-HIVE-TRUNK-Build Parse Hive OR clauses as they are written into the AST -- Key: HIVE-11397 URL: https://issues.apache.org/jira/browse/HIVE-11397 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11397.1.patch, HIVE-11397.patch When parsing A OR B OR C, Hive converts it into (C OR B) OR A instead of turning it into A OR (B OR C)
{code}
GenericUDFOPOr or = new GenericUDFOPOr();
List<ExprNodeDesc> expressions = new ArrayList<ExprNodeDesc>(2);
expressions.add(previous);
expressions.add(current);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
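The fold order is the crux of the issue above: wrapping the accumulated result with each new operand produces a deep tree in one direction, rather than a tree matching the order the query was written in. A self-contained sketch (plain strings standing in for Hive's ExprNodeDesc trees; this is not Hive code) shows the two shapes:

```java
import java.util.List;

// Sketch of the two fold orders: folding left-to-right, wrapping the
// accumulated result each time, yields a left-deep tree ((A OR B) OR C);
// folding from the right yields the right-deep shape (A OR (B OR C)).
public class OrFold {
    static String leftDeep(List<String> terms) {
        String acc = terms.get(0);
        for (int i = 1; i < terms.size(); i++) {
            acc = "(" + acc + " OR " + terms.get(i) + ")";
        }
        return acc;
    }

    static String rightDeep(List<String> terms) {
        String acc = terms.get(terms.size() - 1);
        for (int i = terms.size() - 2; i >= 0; i--) {
            acc = "(" + terms.get(i) + " OR " + acc + ")";
        }
        return acc;
    }
}
```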
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649302#comment-14649302 ] Alan Gates commented on HIVE-10166: --- All of the metastore generated files have been regenerated, but it doesn't look like you've changed the interface. What version of thrift did you use to generate this? We should be careful switching thrift versions. A more general question, why is spark dev still going on in the branch given that it's been merged? It's much easier to track and review changes when they come a patch at a time instead of in merges with 8M files (I know 95% of this is generated code, but still). Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649467#comment-14649467 ] Vaibhav Gumashta edited comment on HIVE-11408 at 7/31/15 4:55 PM: -- Patch for 1.0, 1.1, 0.14. cc [~thejas] was (Author: vgumashta): Patch for 1.0, 1.1, 0.14. Has been fixed in 1.2 via HIVE-10329. HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11408.1.patch I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11408: Attachment: HIVE-11408.1.patch Patch for 1.0, 1.1, 0.14. Has been fixed in 1.2 via HIVE-10329. HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11408.1.patch I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11408: Affects Version/s: 1.1.1 0.13.0 0.13.1 1.0.0 HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11409) CBO: Calcite Operator To Hive Operator (Calcite Return Path): add SEL before UNION
[ https://issues.apache.org/jira/browse/HIVE-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649301#comment-14649301 ] Jesus Camacho Rodriguez commented on HIVE-11409: +1 CBO: Calcite Operator To Hive Operator (Calcite Return Path): add SEL before UNION -- Key: HIVE-11409 URL: https://issues.apache.org/jira/browse/HIVE-11409 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11409.01.patch, HIVE-11409.02.patch Three purposes: (1) to ensure that the data type of a non-primary branch (the 1st branch is the primary branch) of the union can be cast to that of the primary branch; (2) to make the UnionProcessor optimizer work; (3) if the SEL is redundant, it will be removed by the IdentityProjectRemover optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649330#comment-14649330 ] Xuefu Zhang commented on HIVE-10166: 1. The code generation is due to changes in queryplan.thrift. There is no version change. 2. There are still big features happening on Spark, so a branch facilitates the process, especially the precommit test run. The same code standard is applied regardless. Feel free to review each individual JIRA if you wish. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11422) Join a ACID table with non-ACID table fail with MR
[ https://issues.apache.org/jira/browse/HIVE-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11422: -- Component/s: Transactions Query Processor Join a ACID table with non-ACID table fail with MR -- Key: HIVE-11422 URL: https://issues.apache.org/jira/browse/HIVE-11422 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 1.3.0 Reporter: Daniel Dai Fix For: 1.3.0, 2.0.0 The following script fails in MR mode: {code} CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) CLUSTERED BY (k1) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES('transactional'='true'); INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I'); CREATE TABLE orc_table (k1 INT, f1 STRING) CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE orc_table VALUES (1, 'x'); SET hive.execution.engine=mr; SET hive.auto.convert.join=false; SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; SELECT t1.*, t2.* FROM orc_table t1 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1; {code} Stack: {code} Error: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas(AcidUtils.java:368) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1211) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1129) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249) ... 9 more {code} The script passes in the 1.2.0 release, however. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11424) Improve HivePreFilteringRule performance
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649530#comment-14649530 ] Hive QA commented on HIVE-11424: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748180/HIVE-11424.patch {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9278 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionall_unbalancedppd org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4776/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4776/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4776/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748180 - PreCommit-HIVE-TRUNK-Build Improve HivePreFilteringRule performance Key: HIVE-11424 URL: https://issues.apache.org/jira/browse/HIVE-11424 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11424.patch 1) Remove early bail out condition. 2) Create IN clause instead of OR tree (when possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
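The second change in HIVE-11424 (building an IN clause instead of an OR tree) can be illustrated with a toy rewrite. The class and method names below are invented for illustration; this is not the Calcite/Hive rule itself, which operates on RexNode trees rather than strings.

```java
import java.util.Arrays;
import java.util.List;

public class OrToInSketch {
    // Hypothetical rewrite: "col = v1 OR col = v2 OR ..." over a single
    // column collapses into one "col IN (v1, v2, ...)" predicate, which is
    // cheaper to construct and traverse than a deep binary OR tree.
    static String rewrite(String col, List<String> values) {
        if (values.size() == 1) {
            return col + " = " + values.get(0);
        }
        return col + " IN (" + String.join(", ", values) + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("k1", Arrays.asList("1", "2", "3")));
    }
}
```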
[jira] [Updated] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11425: -- Attachment: HIVE-11425.patch [~owen.omalley],[~prasanth_j] could you review submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError:
[jira] [Updated] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11405: - Attachment: HIVE-11405.1.patch Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.1.patch, HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) since the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in numRows - branch1Rows for branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
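The short-circuit described in the quoted comment can be sketched numerically. This is a hypothetical model of the row-count bookkeeping only, not the actual StatsRulesProcFactory code; the method name and selectivity representation are assumptions.

```java
public class OrStatsSketch {
    // Hypothetical OR estimate with the proposed short-circuit: rows that
    // match branch i are excluded before branch i+1 is evaluated (the
    // recursion receives numRows - branchRows), so the remaining count
    // shrinks monotonically and the recursion can terminate early.
    static long estimateOr(long numRows, double[] branchSelectivity) {
        long remaining = numRows;
        long matched = 0;
        for (double sel : branchSelectivity) {
            if (remaining <= 0) break; // early termination
            long branchRows = (long) (remaining * sel);
            matched += branchRows;
            remaining -= branchRows;
        }
        return matched;
    }

    public static void main(String[] args) {
        // 1000 rows, three equality branches at 10% selectivity each:
        // 100 + 90 + 81 = 271, and the total can never exceed numRows --
        // unlike the unguarded recursion, where rows are never reduced.
        System.out.println(estimateOr(1000, new double[]{0.1, 0.1, 0.1}));
    }
}
```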
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649506#comment-14649506 ] Chao Sun commented on HIVE-10166: - [~alangates], I regenerated the files with Thrift 0.9.2, which is the version specified in the pom.xml. I think previously the files were generated with Thrift 0.9.0. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11425: -- Description: submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/common/type/HiveDecimal at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.clinit(PrimitiveObjectInspectorUtils.java:234) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:341) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:331) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:392) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:305) at
[jira] [Commented] (HIVE-8954) StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request
[ https://issues.apache.org/jira/browse/HIVE-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649546#comment-14649546 ] Thejas M Nair commented on HIVE-8954: - [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml) {code} <property> <name>hive.security.authorization.enabled</name> <value>false</value> </property> <property> <name>hive.security.authorization.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value> </property> {code} StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request -- Key: HIVE-8954 URL: https://issues.apache.org/jira/browse/HIVE-8954 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.14.0 Environment: centos 6.5 Reporter: LINTE With hive.security.metastore.authorization.manager set to org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider, it seems that on a read request, write permissions are checked on HDFS by the metastore. Sample: bash# hive hive (default)> use database; OK Time taken: 0.747 seconds hive (database)> SELECT * FROM table LIMIT 10; FAILED: HiveException java.security.AccessControlException: action WRITE not permitted on path hdfs://cluster/hive_warehouse/database.db/table for user myuser -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649567#comment-14649567 ] Prasanth Jayachandran commented on HIVE-11425: -- +1 submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/common/type/HiveDecimal
[jira] [Updated] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11406: --- Assignee: Matt McCline (was: Gopal V) Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
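The asymmetry HIVE-11406 describes can be sketched as follows. These helpers are illustrative stand-ins, not the actual `StringExpr` methods: `compare()` must scan bytes to determine ordering, while an `equals()` can reject on a length mismatch before touching memory at all.

```java
public class StringEqualsSketch {
    // Illustrative compare(): ordering depends on content, so bytes must
    // be scanned even when the lengths already rule out equality.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Illustrative equals(): a length mismatch disproves equality
    // immediately, skipping the memory-comparison loop entirely --
    // the smaller, tighter loop the issue asks for.
    static boolean equalsBytes(byte[] a, byte[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] x = "hive".getBytes();
        byte[] y = "hivemetastore".getBytes();
        System.out.println(equalsBytes(x, y));      // false without a scan
        System.out.println(compareBytes(x, y) < 0); // scanned the prefix
    }
}
```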
[jira] [Commented] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649703#comment-14649703 ] Hive QA commented on HIVE-11425: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748205/HIVE-11425.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9277 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4777/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4777/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4777/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748205 - PreCommit-HIVE-TRUNK-Build submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. 
This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at
[jira] [Updated] (HIVE-11426) lineage3.q fails with -Phadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-11426: --- Attachment: HIVE-11426.1.patch lineage3.q fails with -Phadoop-1 Key: HIVE-11426 URL: https://issues.apache.org/jira/browse/HIVE-11426 Project: Hive Issue Type: Bug Components: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: HIVE-11426.1.patch Some queries in lineage3.q emit different results with -Phadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11380) NPE when FileSinkOperator is not initialized
[ https://issues.apache.org/jira/browse/HIVE-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649771#comment-14649771 ] Yongzhi Chen commented on HIVE-11380: - Thanks [~spena] for reviewing it. NPE when FileSinkOperator is not initialized Key: HIVE-11380 URL: https://issues.apache.org/jira/browse/HIVE-11380 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11380.1.patch When FileSinkOperator's initializeOp is not called (which may happen when an operator before FileSinkOperator initializeOp failed), FileSinkOperator will throw NPE at close time. The stacktrace: {noformat} org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:523) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:952) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:199) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:519) ... 18 more {noformat} This Exception is misleading and often distracts users from finding real issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
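The guard described above can be sketched as follows; this is a simplified model of the defensive pattern (skip close-time work when initialization never ran), with hypothetical class and field names, not Hive's actual FileSinkOperator:

```java
// Hypothetical sketch: remember whether initializeOp() ever ran, and make
// close-time cleanup a no-op when it did not, instead of dereferencing
// state (here fsPaths) that was never allocated.
class FileSinkSketch {
    private Object[] fsPaths;    // normally allocated in initializeOp()
    private boolean initialized; // guards close-time work

    void initializeOp() {
        fsPaths = new Object[1];
        initialized = true;
    }

    // Returns true when cleanup actually ran; false when it was skipped
    // because initialization never happened (e.g. an upstream op failed).
    boolean closeOp() {
        if (!initialized || fsPaths == null) {
            return false; // nothing to flush; avoid the NPE
        }
        // ... createBucketFiles(), flush writers, etc. ...
        return true;
    }
}
```

With such a guard, closing an operator whose initialization failed degrades to a silent no-op rather than a misleading NullPointerException.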
[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649764#comment-14649764 ] Gopal V commented on HIVE-11406: [~mmccline]: LGTM - +1 Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
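The proposed {{StringExpr::equals}} can be sketched roughly as below; this illustrates the length-first short circuit the issue asks for, and is not the actual patch code:

```java
// Hypothetical sketch of a short-circuiting byte-range equality check.
// Unlike compare() == 0, it bails out immediately when lengths differ,
// never entering the memory-comparison loop for strings that cannot match.
final class StringExprSketch {
    static boolean equal(byte[] a, int aStart, int aLen,
                         byte[] b, int bStart, int bLen) {
        if (aLen != bLen) {
            return false; // differing lengths can never be equal
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) {
                return false;
            }
        }
        return true;
    }
}
```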
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649737#comment-14649737 ] Chao Sun commented on HIVE-10166: - LGTM +1 Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11317: -- Attachment: HIVE-11317.patch ACID: Improve transaction Abort logic due to timeout Key: HIVE-11317 URL: https://issues.apache.org/jira/browse/HIVE-11317 Project: Hive Issue Type: Bug Components: Metastore, Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Labels: triage Attachments: HIVE-11317.patch The logic to abort transactions that have stopped heartbeating is in TxnHandler.timeOutTxns(), which is only called when DbTxnManager.getValidTxns() is called. So if there are a lot of txns that need to be timed out and there are no SQL clients talking to the system, nothing aborts the dead transactions, and thus compaction can't clean them up, so garbage accumulates in the system. Also, the streaming API doesn't call DbTxnManager at all. Need to move this logic into Initiator (or some other metastore-side thread). Also, make sure it is broken up into multiple small(er) transactions against the metastore DB. Also move timeOutLocks() there as well. See about adding a TXNS.COMMENT field which can be used for "Auto aborted due to timeout", for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
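The "multiple small(er) transactions" point above can be illustrated with a batching helper; the names here are hypothetical and not from the attached patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split the set of timed-out txn ids into fixed-size
// batches so a metastore-side reaper thread can abort each batch in its own
// short DB transaction, instead of one huge transaction over every dead txn.
class TxnBatchSketch {
    static List<List<Long>> batches(List<Long> timedOut, int batchSize) {
        List<List<Long>> out = new ArrayList<>();
        for (int i = 0; i < timedOut.size(); i += batchSize) {
            int end = Math.min(i + batchSize, timedOut.size());
            out.add(new ArrayList<>(timedOut.subList(i, end)));
        }
        return out;
    }
}
```

Small batches keep each metastore DB transaction short, so a backlog of thousands of dead txns cannot hold locks for the whole cleanup pass.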
[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-11415: -- Assignee: Matt McCline Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw a StackOverflowError in vectorization:
{code}
Exception in thread "main" java.lang.StackOverflowError
at java.lang.Class.getAnnotation(Class.java:3415)
at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
{code}
Sample query:
{code}
explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... ..
{code}
Repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11412) StackOverFlow in SemanticAnalyzer for huge filters (~5000)
[ https://issues.apache.org/jira/browse/HIVE-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-11412: -- Assignee: Hari Sankar Sivarama Subramaniyan StackOverFlow in SemanticAnalyzer for huge filters (~5000) -- Key: HIVE-11412 URL: https://issues.apache.org/jira/browse/HIVE-11412 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Hari Sankar Sivarama Subramaniyan Queries with ~5000 filter conditions fails in SemanticAnalysis Stack trace: {code} Exception in thread main java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:366) at java.util.HashMap.getEntry(HashMap.java:466) at java.util.HashMap.containsKey(HashMap.java:453) at org.apache.commons.collections.map.AbstractMapDecorator.containsKey(AbstractMapDecorator.java:83) at org.apache.hadoop.conf.Configuration.isDeprecated(Configuration.java:558) at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:605) at org.apache.hadoop.conf.Configuration.get(Configuration.java:885) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:907) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1308) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2641) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11132) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) {code} Query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) or (t=7 and si=8) or (t=7 and si=8) ... {code} Repeat the filter around 5000 times. 
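The usual remedy for this kind of StackOverflowError is to replace the recursive tree walk with an explicit stack; a minimal sketch of the technique follows (hypothetical Node type, not Hive's actual processPositionAlias or AST classes):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: walk the tree with an explicit, heap-allocated stack
// instead of recursion, so a left-deep chain thousands of nodes tall cannot
// blow the JVM call stack the way the recursive walk in the trace does.
class IterativeWalkSketch {
    static final class Node {
        final List<Node> children;
        Node(List<Node> children) { this.children = children; }
    }

    static int countNodes(Node root) {
        int n = 0;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node cur = stack.pop();
            n++;
            for (Node child : cur.children) {
                stack.push(child); // children queued, not recursed into
            }
        }
        return n;
    }
}
```

A 100,000-node left-deep chain is handled without growing the call stack at all, where the equivalent recursive walk would overflow long before ~5000 levels of filter nesting.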
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11429: Affects Version/s: 1.2.1 0.14.0 1.0.0 1.2.0 Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-8954) StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request
[ https://issues.apache.org/jira/browse/HIVE-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649546#comment-14649546 ] Thejas M Nair edited comment on HIVE-8954 at 7/31/15 5:56 PM: -- [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml)
{code}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
{code}
Looks like this happens only when StorageBasedAuthorization is enabled at compile time. The recommended place for enabling StorageBasedAuthorization is in the Hive metastore. [see SBA metastore instructions|https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server] Setting this for compile time is redundant and not something I would recommend. I would recommend compile-time authorization being enabled only if you want to use fine-grained authorization such as SQL Standards based authorization or Apache Ranger. was (Author: thejas): [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml)
{code}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
{code}
StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request -- Key: HIVE-8954 URL: https://issues.apache.org/jira/browse/HIVE-8954 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.14.0 Environment: centos 6.5 Reporter: LINTE With hive.security.metastore.authorization.manager set to org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.
It seems that on a read request, write permissions are checked on HDFS by the metastore. Sample:
{noformat}
bash# hive
hive (default)> use database;
OK
Time taken: 0.747 seconds
hive (database)> SELECT * FROM table LIMIT 10;
FAILED: HiveException java.security.AccessControlException: action WRITE not permitted on path hdfs://cluster/hive_warehouse/database.db/table for user myuser
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649845#comment-14649845 ] Hive QA commented on HIVE-11405: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748211/HIVE-11405.1.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9279 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_17 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4778/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748211 - PreCommit-HIVE-TRUNK-Build Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.1.patch, HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) since the for the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in the numRows - branch1Rows for the branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
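The short-circuit quoted above can be put in arithmetic form; this is a simplified model of the OR estimate with hypothetical names, not the actual StatsRulesProcFactory code:

```java
// Hypothetical sketch of OR selectivity with the short-circuit: rows already
// matched by the first branch cannot be matched again, so the second branch
// is evaluated against (numRows - branch1Rows) rather than the full numRows.
// This also keeps the combined estimate from exceeding numRows.
class OrStatsSketch {
    static long estimateOr(long numRows, double sel1, double sel2) {
        long branch1 = (long) (numRows * sel1);
        long branch2 = (long) ((numRows - branch1) * sel2);
        return branch1 + branch2;
    }
}
```

Without the subtraction, a chain of OR branches never reduces the row count passed into recursion, which is exactly why the AND-style recursion protection fails to terminate early.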
[jira] [Commented] (HIVE-11412) StackOverFlow in SemanticAnalyzer for huge filters (~5000)
[ https://issues.apache.org/jira/browse/HIVE-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649857#comment-14649857 ] Gunther Hagleitner commented on HIVE-11412: --- [~mmokhtar]/[~t3rmin4t0r] StackOverFlow in SemanticAnalyzer for huge filters (~5000) -- Key: HIVE-11412 URL: https://issues.apache.org/jira/browse/HIVE-11412 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Hari Sankar Sivarama Subramaniyan Queries with ~5000 filter conditions fails in SemanticAnalysis Stack trace: {code} Exception in thread main java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:366) at java.util.HashMap.getEntry(HashMap.java:466) at java.util.HashMap.containsKey(HashMap.java:453) at org.apache.commons.collections.map.AbstractMapDecorator.containsKey(AbstractMapDecorator.java:83) at org.apache.hadoop.conf.Configuration.isDeprecated(Configuration.java:558) at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:605) at org.apache.hadoop.conf.Configuration.get(Configuration.java:885) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:907) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1308) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2641) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11132) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) {code} Query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) or (t=7 and si=8) or (t=7 and si=8) ... {code} Repeat the filter around 5000 times. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11398: --- Summary: Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list (was: Parse wide OR and wide AND trees as a flat ANY/ALL list) Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list - Key: HIVE-11398 URL: https://issues.apache.org/jira/browse/HIVE-11398 Project: Hive Issue Type: New Feature Components: Logical Optimizer, UDF Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Deep trees of AND/OR are hard to traverse particularly when they are merely the same structure in nested form as a version of the operator that takes an arbitrary number of args. One potential way to convert the DFS searches into a simpler BFS search is to introduce a new Operator pair named ALL and ANY. ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A) The SemanticAnalyser would be responsible for generating these operators and this would mean that the depth and complexity of traversals for the simplest case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
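The nested-to-flat rewrite the issue describes can be sketched as follows, using hypothetical expression classes rather than Hive's actual ExprNodeDesc tree:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collapse a left-deep chain such as
// AND(AND(AND(AND(E, D), C), B), A) into one flat ALL(A..E)-style operand
// list, so later passes do a single linear scan instead of a deep descent.
class FlattenSketch {
    interface Expr {}
    static final class BinOp implements Expr {
        final String op; final Expr left, right;
        BinOp(String op, Expr l, Expr r) { this.op = op; left = l; right = r; }
    }
    static final class Leaf implements Expr {
        final String name;
        Leaf(String n) { name = n; }
    }

    static List<Expr> flatten(Expr e, String op) {
        List<Expr> out = new ArrayList<>();
        collect(e, op, out);
        return out;
    }

    private static void collect(Expr e, String op, List<Expr> out) {
        if (e instanceof BinOp && ((BinOp) e).op.equals(op)) {
            collect(((BinOp) e).left, op, out);  // same operator: keep merging
            collect(((BinOp) e).right, op, out);
        } else {
            out.add(e); // different operator or leaf: becomes one operand
        }
    }
}
```

An ANY/ALL node holding this flat list would be what the SemanticAnalyzer emits, making the simplest wide AND/OR case trivially shallow.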
[jira] [Commented] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649925#comment-14649925 ] Vaibhav Gumashta commented on HIVE-11429: - cc [~thejas] Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11429.1.patch This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649926#comment-14649926 ] Gopal V commented on HIVE-11415: The right fix for this is to go ahead and take a ~8000 OR tree and turn it into a balanced tree ~14 levels deep. Failing to convert the tree to vectorization would be a bad idea in general, because this error can be progressively bypassed by running Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep filters (left deep) throws StackOverflowException in vectorization {code} Exception in thread main java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and 
si=8) ... .. {code} repeat the filter for few thousand times for reproduction of the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
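Rebalancing a flat ~8000-way OR operand list into a ~14-level tree, as Gopal suggests above, can be sketched like this (hypothetical Expr classes, not Hive's):

```java
import java.util.List;

// Hypothetical sketch: split the operand list in half at every level, so an
// n-way OR ends up ceil(log2(n)) OR-nodes deep instead of n levels deep --
// shallow enough for the recursive vectorizer to traverse safely.
class BalanceSketch {
    interface Expr {}
    static final class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }
    static final class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    static Expr balance(List<Expr> children) {
        if (children.size() == 1) {
            return children.get(0);
        }
        int mid = children.size() / 2;
        return new Or(balance(children.subList(0, mid)),
                      balance(children.subList(mid, children.size())));
    }

    // Depth counting both Or nodes and the final leaf level.
    static int depth(Expr e) {
        if (e instanceof Or) {
            Or o = (Or) e;
            return 1 + Math.max(depth(o.left), depth(o.right));
        }
        return 1;
    }
}
```

For 8192 operands the balanced tree is 13 OR levels plus the leaf level, matching the ~14 levels quoted in the comment.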
[jira] [Updated] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11429: Attachment: HIVE-11429.1.patch Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11429.1.patch This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11426) lineage3.q fails with -Phadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649959#comment-14649959 ] Hive QA commented on HIVE-11426: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748233/HIVE-11426.1.patch {color:green}SUCCESS:{color} +1 9278 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4779/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4779/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4779/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12748233 - PreCommit-HIVE-TRUNK-Build lineage3.q fails with -Phadoop-1 Key: HIVE-11426 URL: https://issues.apache.org/jira/browse/HIVE-11426 Project: Hive Issue Type: Bug Components: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11426.1.patch Some queries in lineage3.q emit different results with -Phadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10975: Attachment: HIVE-10975.patch Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648855#comment-14648855 ] Matt McCline commented on HIVE-11410: - [~nbrenwald] please attach your EXPLAIN plan for the query. And, confirm you are using branch-1.1 Join with subquery containing a group by incorrectly returns no results --- Key: HIVE-11410 URL: https://issues.apache.org/jira/browse/HIVE-11410 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Nicholas Brenwald Assignee: Matt McCline Priority: Minor Attachments: hive-site.xml Start by creating a table *t* with columns *c1* and *c2* and populate with 1 row of data. For example create table *t* from an existing table which contains at least 1 row of data by running: {code} create table t as select 'abc' as c1, 0 as c2 from Y limit 1; {code} Table *t* looks like the following: ||c1||c2|| |abc|0| Running the following query then returns zero results. {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2 {code} However, we expected to see the following: ||c1|| |abc| The problem seems to relate to the fact that in the subquery, we group by column *c1*, but this is not subsequently used in the join condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648853#comment-14648853 ] Matt McCline commented on HIVE-11410: - Right, postgres produces: {code} mmccline=# SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; c1 - abc (1 row) {code} And, Hive branch-1.1 produces the right result: {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; abc {code} Here is the EXPLAIN plan: {code} EXPLAIN SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: t2 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: c1 (type: string), c2 (type: int) outputColumnNames: c1, c2 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: max(c2) keys: c1 (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: int) Reduce Operator Tree: Group By Operator aggregations: max(VALUE._col0) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator expressions: _col1 (type: int) outputColumnNames: _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Filter Operator predicate: _col1 is not null (type: boolean) 
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator expressions: _col1 (type: int) outputColumnNames: _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE TableScan alias: t1 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: c2 is not null (type: boolean) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: c2 (type: int) sort order: + Map-reduce partition columns: c2 (type: int) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE value expressions: c1 (type: string) Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 c2 (type: int) 1 _col1 (type: int) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648854#comment-14648854 ] Matt McCline commented on HIVE-11410:
By the way, thank you for the clear repro description.

Join with subquery containing a group by incorrectly returns no results
---
Key: HIVE-11410
URL: https://issues.apache.org/jira/browse/HIVE-11410
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.1.0
Reporter: Nicholas Brenwald
Assignee: Matt McCline
Priority: Minor
Attachments: hive-site.xml

Start by creating a table *t* with columns *c1* and *c2* and populating it with 1 row of data. For example, create table *t* from an existing table which contains at least 1 row of data by running:
{code}
create table t as select 'abc' as c1, 0 as c2 from Y limit 1;
{code}
Table *t* looks like the following:
||c1||c2||
|abc|0|
Running the following query then returns zero results.
{code}
SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2
{code}
However, we expected to see the following:
||c1||
|abc|
The problem seems to relate to the fact that in the subquery we group by column *c1*, but this is not subsequently used in the join condition.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
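For reference, the semantics the repro expects can be sketched in plain Python (a toy evaluation over assumed in-memory rows, not Hive code): grouping *t* by *c1* with MAX(*c2*), then joining back on *c2*, should keep the single row rather than return nothing.

```python
# Toy evaluation of the repro's expected semantics (not Hive code).
t = [("abc", 0)]  # table t: one row, c1='abc', c2=0

# Subquery t3: SELECT c1, MAX(c2) AS c2 FROM t GROUP BY c1
groups = {}
for c1, c2 in t:
    groups[c1] = max(c2, groups.get(c1, c2))
t3 = sorted(groups.items())

# Outer query: SELECT t1.c1 FROM t t1 JOIN t3 ON t1.c2 = t3.c2
result = [c1 for c1, c2 in t for _, max_c2 in t3 if c2 == max_c2]
# Expected: ['abc'] -- the buggy plan returns no rows instead.
```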
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648836#comment-14648836 ] Ferdinand Xu commented on HIVE-10975: - Thanks [~spena] for this information. Patch is updated and please help me review it. Thank you! Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks
[ https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648857#comment-14648857 ] Hive QA commented on HIVE-11413: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748049/HIVE-11413.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9276 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4770/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4770/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4770/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748049 - PreCommit-HIVE-TRUNK-Build Error in detecting availability of HiveSemanticAnalyzerHooks Key: HIVE-11413 URL: https://issues.apache.org/jira/browse/HIVE-11413 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 2.0.0 Reporter: Raajay Viswanathan Assignee: Raajay Viswanathan Priority: Trivial Labels: newbie Fix For: 2.0.0 Attachments: HIVE-11413.patch In {{compile(String, Boolean)}} function in {{Driver.java}}, the list of available {{HiveSemanticAnalyzerHook}} (_saHooks_) are obtained using the {{getHooks}} method. This method always returns a {{List}} of hooks. However, while checking for availability of hooks, the current version of the code uses a comparison of _saHooks_ with NULL. 
This is incorrect, as the segment of code designed to call the pre- and post-analyze functions gets executed even when the list is empty. The comparison should be changed to {{saHooks.size() > 0}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
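The distinction the report describes, a method that always returns a list being checked against null instead of against emptiness, can be illustrated with a small Python sketch (hypothetical names, not Hive's actual Driver code):

```python
def get_hooks(conf):
    """Like getHooks: always returns a list, possibly empty -- never None."""
    return conf.get("hooks", [])

def run_with_null_check(conf):
    """Buggy variant: a non-None check passes even for an empty list."""
    calls = []
    hooks = get_hooks(conf)
    if hooks is not None:        # always true -- the bug
        calls.append("preAnalyze")
        calls.append("postAnalyze")
    return calls

def run_with_size_check(conf):
    """Fixed variant: the hook path runs only when hooks actually exist."""
    calls = []
    hooks = get_hooks(conf)
    if len(hooks) > 0:           # the proposed saHooks.size() > 0 check
        calls.append("preAnalyze")
        calls.append("postAnalyze")
    return calls
```

With no hooks configured, the null-check variant still executes the pre/post analyze path; the size-check variant skips it.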
[jira] [Updated] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11397:
---
Attachment: HIVE-11397.1.patch

Parse Hive OR clauses as they are written into the AST
--
Key: HIVE-11397
URL: https://issues.apache.org/jira/browse/HIVE-11397
Project: Hive
Issue Type: Bug
Components: Logical Optimizer
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
Attachments: HIVE-11397.1.patch, HIVE-11397.patch

When parsing A OR B OR C, Hive converts it into (C OR B) OR A instead of turning it into A OR (B OR C):
{code}
GenericUDFOPOr or = new GenericUDFOPOr();
List<ExprNodeDesc> expressions = new ArrayList<ExprNodeDesc>(2);
expressions.add(previous);
expressions.add(current);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
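The two tree shapes can be sketched with a toy fold over AST tuples. This is a plain-Python illustration; the iteration order that yields the reported `(C OR B) OR A` shape is inferred from the ticket's description, not from Hive's actual parser code.

```python
def or_node(left, right):
    """A toy OR AST node."""
    return ("OR", left, right)

def fold_reported(terms):
    """Reported behaviour: accumulate from the right, putting the
    accumulated expression first -- A OR B OR C becomes ((C OR B) OR A)."""
    previous = terms[-1]
    for current in reversed(terms[:-1]):
        # mirrors: expressions.add(previous); expressions.add(current)
        previous = or_node(previous, current)
    return previous

def fold_as_written(terms):
    """Right-associative fold preserving written order:
    A OR B OR C becomes (A OR (B OR C))."""
    result = terms[-1]
    for term in reversed(terms[:-1]):
        result = or_node(term, result)
    return result
```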
[jira] [Resolved] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
[ https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HIVE-10410. Resolution: Fixed [~rich williams], thanks a lot for reporting the issue, and the verification. I marked the issue fixed for now. Apparent race condition in HiveServer2 causing intermittent query failures -- Key: HIVE-10410 URL: https://issues.apache.org/jira/browse/HIVE-10410 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Environment: CDH 5.3.3 CentOS 6.4 Reporter: Richard Williams Attachments: HIVE-10410.1.patch On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger odd Thrift exceptions with messages such as Read a negative frame size (-2147418110)! or out of sequence response in HiveServer2's connections to the metastore. For certain metastore calls (for example, showDatabases), these Thrift exceptions are converted to MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient from retrying these calls and thus causes the failure to bubble out to the JDBC client. Note that as far as we can tell, this issue appears to only affect queries that are submitted with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean all JDBC queries), and it appears to only manifest when HiveServer2 is using the new HTTP transport mechanism. When both these conditions hold, we are able to fairly reliably reproduce the issue by spawning about 100 simple, concurrent hive queries (we have been using show databases), two or three of which typically fail. However, when either of these conditions do not hold, we are no longer able to reproduce the issue. Some example stack traces from the HiveServer2 logs: {noformat} 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException Read a negative frame size (-2147418110)! 
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
at org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
at
[jira] [Commented] (HIVE-11432) Hive macro give same result for different arguments
[ https://issues.apache.org/jira/browse/HIVE-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650119#comment-14650119 ] Pengcheng Xiong commented on HIVE-11432:
[~mendax], I was facing a similar issue here. Do you mind if I assign this JIRA to myself? Thanks.

Hive macro give same result for different arguments
---
Key: HIVE-11432
URL: https://issues.apache.org/jira/browse/HIVE-11432
Project: Hive
Issue Type: Bug
Reporter: Jay Pandya

If you use a Hive macro more than once while processing the same row, Hive returns the same result for all invocations even if the arguments are different. Example:

CREATE TABLE macro_testing(a int, b int, c int)

select * from macro_testing;
1 2 3
4 5 6
7 8 9
10 11 12

create temporary macro math_square(x int) x*x;

select math_square(a), b, math_square(c) from macro_testing;
9 2 9
36 5 36
81 8 81
144 11 144
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
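One way a macro can return the square of *c* for both call sites is if every invocation in a row shares a single argument slot in one compiled expression, so the last binding wins. A minimal Python sketch of that failure mode follows; it is a hypothetical illustration, not Hive's actual macro implementation.

```python
class SharedSlotMacro:
    """math_square(x) = x * x, but every call site shares one argument slot."""
    def __init__(self):
        self.x = None  # single shared slot: the bug

    def bind(self, value):
        self.x = value

    def eval(self):
        return self.x * self.x

row = {"a": 1, "b": 2, "c": 3}

# Buggy evaluation: both call sites bind into the same instance before
# either is evaluated, so the last binding (c=3) wins for both.
m = SharedSlotMacro()
m.bind(row["a"])
m.bind(row["c"])
buggy = (m.eval(), row["b"], m.eval())  # (9, 2, 9), matching the report

# Correct evaluation: each call site has its own binding.
correct = (row["a"] * row["a"], row["b"], row["c"] * row["c"])  # (1, 2, 9)
```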
[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650143#comment-14650143 ] Hive QA commented on HIVE-11087: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748282/HIVE-11087.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4782/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4782/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4782/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4782/succeeded/TestJdbcWithMiniHS2, remoteFile=/home/hiveptest/54.80.40.35-hiveptest-0/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.80.40.35, getInstance()=0]: 'Address 54.80.40.35 maps to ec2-54-80-40-35.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list
./
TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml
hive.log
[jira] [Resolved] (HIVE-11045) ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0
[ https://issues.apache.org/jira/browse/HIVE-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-11045. --- Resolution: Not A Problem As noted, this has already been fixed. ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0 Key: HIVE-11045 URL: https://issues.apache.org/jira/browse/HIVE-11045 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Environment: Hive 1.2.0, HDP 2.2, Hadoop 2.6, Tez 0.7.0 Reporter: Soundararajan Velu TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292) ... 
16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {key:{_col0:6417306,_col1:{0:{_col0:2014-08-01 02:14:02}}},value:{_col0:2014-08-01 02:14:02,_col1:20140801,_col2:sc_jarvis_b2c,_col3:action_override,_col4:WITHIN_GRACE_PERIOD,_col5:policy_override}} at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:413) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:381) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:206) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016) at
[jira] [Commented] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid in
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649973#comment-14649973 ] Zheng Shao commented on HIVE-5457: -- We hit this problem also in our Hive 0.11 HMS. The problem continues to be there (and fails a lot of workflows) until we restart the metaserver. Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping --- Key: HIVE-5457 URL: https://issues.apache.org/jira/browse/HIVE-5457 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Lenni Kuff Priority: Critical Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I update my app to serialize all calls to getTable() this problem is resolved. Stack Trace: {code} Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. 
at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307)
at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407)
at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46)
at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440)
at org.datanucleus.sco.backed.List.size(List.java:557)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872)
at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy6.getTable(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
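The reporter's workaround of serializing all getTable() calls amounts to guarding a non-thread-safe client behind one lock. A minimal Python sketch of that pattern (hypothetical client, not the actual HiveMetaStoreClient API):

```python
import threading

class FragileClient:
    """Stand-in for a client whose state breaks under overlapping calls."""
    def __init__(self):
        self._in_flight = 0
        self.errors = 0

    def get_table(self, name):
        self._in_flight += 1
        if self._in_flight > 1:  # an overlapping call corrupts state
            self.errors += 1
        self._in_flight -= 1
        return name

class SerializedClient:
    """The workaround: one lock so only one call is ever in flight."""
    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def get_table(self, name):
        with self._lock:
            return self._inner.get_table(name)
```

Serializing trades throughput for correctness; the underlying fix belongs in the client (or in using one client instance per thread).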
[jira] [Updated] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11430:
---
Description:
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.

was:
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in .

Followup HIVE-10166: investigate and fix the two test failures
--
Key: HIVE-11430
URL: https://issues.apache.org/jira/browse/HIVE-11430
Project: Hive
Issue Type: Bug
Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang

{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11046) Filesystem Closed Exception
[ https://issues.apache.org/jira/browse/HIVE-11046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-11046. --- Resolution: Not A Problem As noted, this works with the released version. Filesystem Closed Exception --- Key: HIVE-11046 URL: https://issues.apache.org/jira/browse/HIVE-11046 Project: Hive Issue Type: Bug Components: Hive, Tez Affects Versions: 0.7.0, 1.2.0 Environment: Hive 1.2.0, Tez0.7.0, HDP2.2, Hadoop 2.6 Reporter: Soundararajan Velu TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:290)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
... 14 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:795)
at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:629)
at java.io.FilterInputStream.close(FilterInputStream.java:181)
at org.apache.hadoop.io.compress.DecompressorStream.close(DecompressorStream.java:205)
at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:282)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doClose(HiveRecordReader.java:50)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:104)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:170)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:138)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
... 16 more
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
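A common cause of "Filesystem closed" errors like this is Hadoop's FileSystem cache: two tasks in the same JVM receive the same cached instance, and one closing it invalidates it for the other. A minimal Python sketch of the shared-handle pitfall (hypothetical cache, not the Hadoop API):

```python
_cache = {}

class Handle:
    """Stand-in for a cached FileSystem instance."""
    def __init__(self):
        self.closed = False

    def read(self):
        if self.closed:
            raise IOError("Filesystem closed")
        return "data"

    def close(self):
        self.closed = True

def get_filesystem(uri):
    # Hands out one shared instance per URI, like a per-URI cache would.
    return _cache.setdefault(uri, Handle())

task_a = get_filesystem("hdfs://nn")
task_b = get_filesystem("hdfs://nn")
task_a.close()  # task A finishes and closes "its" filesystem...
# ...and task B's next read now fails with "Filesystem closed".
```

Commonly cited mitigations are disabling the cache (e.g. via `fs.hdfs.impl.disable.cache`) or giving each consumer its own uncached instance, at the cost of more open connections.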
[jira] [Updated] (HIVE-11427) Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079
[ https://issues.apache.org/jira/browse/HIVE-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grisha Trubetskoy updated HIVE-11427: - Description: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE’ I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. was: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. 
Error encountered near token 'TOK_TMP_FILE' {code} The reason for this seems to be https://github.com/apache/hive/commit/05a2aff71c2682e01331cd333189ce7802233a75#diff-f2040374293a91cbcc6594ee571b20e4L1425, specifically this line: https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L1787, which changed like this in the aforementioned commit: {code} -location = wh.getDatabasePath(db.getDatabase(newTable.getDbName())); +location = wh.getDatabasePath(db.getDatabase(names[0])); {code} So before, the database of the new table was used, and now the database of the table from the select is used, as I understand it. NB: This was all inferred from just reading the code; I have not verified it. Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079 Key: HIVE-11427 URL: https://issues.apache.org/jira/browse/HIVE-11427 Project: Hive Issue Type: Bug Reporter: Grisha Trubetskoy If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11431) Vectorization: select * Left Semi Join projections NPE
[ https://issues.apache.org/jira/browse/HIVE-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11431: --- Attachment: left-semi-bug.sql Vectorization: select * Left Semi Join projections NPE -- Key: HIVE-11431 URL: https://issues.apache.org/jira/browse/HIVE-11431 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 1.2.1 Reporter: Gopal V Assignee: Matt McCline Attachments: left-semi-bug.sql The select * is meant to apply only to the leftmost table, not the rightmost - the unprojected d from tmp1 triggers this NPE. {code} select * from tmp2 left semi join tmp1 where c1 = id and c0 = q; {code} {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.io.Text.set(Text.java:225) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow$StringExtractorByValue.extract(VectorExtractRow.java:472) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:732) at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:136) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
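The semantics at issue can be sketched outside of Hive's vectorized code path: a left semi join returns only left-side rows (and only left-side columns) that have at least one key match on the right, so select * must never project right-side columns such as the unprojected d above. This is an editor's illustration with hypothetical class and method names, not Hive's implementation:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Editor's sketch of left-semi-join semantics: only left-side rows and
// columns survive; the right side acts purely as an existence filter.
class LeftSemiJoinSketch {
    // rows are modeled as maps from column name to value
    public static List<Map<String, Object>> leftSemiJoin(
            List<Map<String, Object>> left, String leftKey,
            List<Map<String, Object>> right, String rightKey) {
        Set<Object> rightKeys = new HashSet<>();
        for (Map<String, Object> r : right) {
            rightKeys.add(r.get(rightKey));
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            if (rightKeys.contains(l.get(leftKey))) {
                out.add(l); // project ONLY left-side columns
            }
        }
        return out;
    }
}
```

Under these semantics there is no right-side column to extract, which is why a vectorized extractor reaching for one dereferences a null column vector.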
[jira] [Updated] (HIVE-11427) Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079
[ https://issues.apache.org/jira/browse/HIVE-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grisha Trubetskoy updated HIVE-11427: - Description: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions using CREATE TABLE AS SELECT from a table in the default database, the following happens: {code} use default; create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( // and then something incorrect happens in getExtTmpPathRelTo() {code} In any event - the bug is that the location chosen for the temporary storage is not in the same place as the target table. It should be the same as the target table (/user/hive/warehouse/grisha.db in the above example) because this is where presumably the user running the query would have write permissions to. was: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus.
A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079 Key: HIVE-11427 URL: https://issues.apache.org/jira/browse/HIVE-11427 Project: Hive Issue Type: Bug Reporter: Grisha Trubetskoy If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions using CREATE TABLE AS SELECT from a table in the default database, the following happens: {code} use default; create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( // and then something incorrect happens in getExtTmpPathRelTo() {code} In any event - the bug is that the location chosen for the temporary storage is not in the same place as the target table. It should be the same as the target table (/user/hive/warehouse/grisha.db in the above example) because this is where presumably the user running the query would have write permissions to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
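The expected behavior the report describes - staging under the target table's database directory rather than under the warehouse root - can be contrasted in a small sketch. Method names and the staging-directory naming below are illustrative assumptions, not Hive's actual Context.getExternalTmpPath()/getExtTmpPathRelTo() logic:

```java
// Editor's sketch: the CTAS temporary folder should be created relative
// to the *target* table's database location, where the user is known to
// have write permission, not under the warehouse root (default database).
// Names and the ".tmp-staging_" prefix are hypothetical.
class TmpPathSketch {
    // buggy behavior: staging under the warehouse root
    public static String externalTmpPath(String warehouseRoot, String queryId) {
        return warehouseRoot + "/.tmp-staging_" + queryId;
    }

    // intended behavior: staging relative to the target database directory
    public static String extTmpPathRelTo(String targetDbLocation, String queryId) {
        return targetDbLocation + "/.tmp-staging_" + queryId;
    }
}
```

With the first form, a user who can write only to /user/hive/warehouse/grisha.db still needs write access to the warehouse root, which is exactly the failure shown above.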
[jira] [Updated] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid inde
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-5457: - Attachment: HIVE-5457.workaround.patch This patch kills the metastore on such an error. It's *not* a proper fix, but it is going to be used in production in our environment to make sure that the failing MetastoreServer does not continue to fail our workflows. Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping --- Key: HIVE-5457 URL: https://issues.apache.org/jira/browse/HIVE-5457 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Lenni Kuff Priority: Critical Attachments: HIVE-5457.workaround.patch Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I updated my app to serialize all calls to getTable(), this problem was resolved. Stack Trace: {code} Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping.
at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307) at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407) at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257) at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46) at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440) at org.datanucleus.sco.backed.List.size(List.java:557) at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029) at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007) at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017) at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872) at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111) at $Proxy6.getTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
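The reporter notes that serializing all getTable() calls from the application side made the problem disappear. That client-side mitigation can be sketched as a wrapper that funnels every call through one lock; the MetastoreClient interface here is hypothetical, not Hive's actual client API, and this is a workaround sketch, not a fix for the underlying DataNucleus race:

```java
// Editor's sketch of the reporter's mitigation: never let two threads
// enter the non-thread-safe getTable() path concurrently.
class SerializingClientSketch {
    // hypothetical stand-in for the real metastore client
    public interface MetastoreClient {
        Object getTable(String db, String table);
    }

    private final MetastoreClient delegate;
    private final Object lock = new Object();

    public SerializingClientSketch(MetastoreClient delegate) {
        this.delegate = delegate;
    }

    public Object getTable(String db, String table) {
        synchronized (lock) { // one call at a time
            return delegate.getTable(db, table);
        }
    }
}
```

This trades throughput for correctness; the attached HIVE-5457.workaround.patch instead takes the server-side approach of killing the metastore so it cannot keep serving corrupted state.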
[jira] [Updated] (HIVE-7517) RecordIdentifier overrides equals() but not hashCode()
[ https://issues.apache.org/jira/browse/HIVE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7517: - Component/s: Transactions RecordIdentifier overrides equals() but not hashCode() -- Key: HIVE-7517 URL: https://issues.apache.org/jira/browse/HIVE-7517 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
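For context on why the bug in the title matters: Java's contract requires that objects equal under equals() return the same hashCode(), or hash-based collections silently miss them. A minimal stand-in illustrating the contract (this mirrors a (transactionId, rowId) pair but is not Hive's actual RecordIdentifier):

```java
// Editor's illustration of the equals()/hashCode() contract.
// Hypothetical minimal record id; not Hive's RecordIdentifier.
class RecordIdSketch {
    final long txnId;
    final long rowId;

    RecordIdSketch(long txnId, long rowId) {
        this.txnId = txnId;
        this.rowId = rowId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RecordIdSketch)) return false;
        RecordIdSketch other = (RecordIdSketch) o;
        return txnId == other.txnId && rowId == other.rowId;
    }

    // Without this override, two equal instances would usually land in
    // different hash buckets, so HashSet.contains()/HashMap.get() fail.
    @Override
    public int hashCode() {
        return 31 * Long.hashCode(txnId) + Long.hashCode(rowId);
    }
}
```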
[jira] [Updated] (HIVE-8323) Ensure transactional tbl property can only be set on tables using AcidOutputFormat
[ https://issues.apache.org/jira/browse/HIVE-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8323: - Component/s: (was: Metastore) Transactions Ensure transactional tbl property can only be set on tables using AcidOutputFormat Key: HIVE-8323 URL: https://issues.apache.org/jira/browse/HIVE-8323 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11353) Map env does not reflect in the Local Map Join
[ https://issues.apache.org/jira/browse/HIVE-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648921#comment-14648921 ] Ryu Kobayashi commented on HIVE-11353: -- Please let me know if there are any problems in the code. Map env does not reflect in the Local Map Join -- Key: HIVE-11353 URL: https://issues.apache.org/jira/browse/HIVE-11353 Project: Hive Issue Type: Bug Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Attachments: HIVE-11353.1.patch mapreduce.map.env is not reflected when the Local Map Join is run. A sample query follows: {code} hive> set mapreduce.map.env=AAA=111,BBB=222,CCC=333; hive> select reflect('java.lang.System', 'getenv', 'CCC') as CCC, a.AAA, b.BBB from ( SELECT reflect('java.lang.System', 'getenv', 'AAA') as AAA from foo ) a join ( select reflect('java.lang.System', 'getenv', 'BBB') as BBB from foo ) b limit 1; Warning: Map Join MAPJOIN[10][bigTable=?] in task 'Stage-3:MAPRED' is a cross product Query ID = root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13 Total jobs = 1 15/07/16 01:36:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Execution log at: /tmp/root/root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13.log 2015-07-16 01:36:47 Starting to launch local task to process map join; maximum memory = 477102080 2015-07-16 01:36:48 Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable 2015-07-16 01:36:48 Uploaded 1 File to: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (282 bytes) 2015-07-16 01:36:48 End of local task; Time Taken: 0.934 sec.
Execution completed successfully MapredLocal task succeeded Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_1436962851556_0015, Tracking URL = http://hadoop27:8088/proxy/application_1436962851556_0015/ Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1436962851556_0015 Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0 2015-07-16 01:36:56,488 Stage-3 map = 0%, reduce = 0% 2015-07-16 01:37:01,656 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec MapReduce Total cumulative CPU time: 1 seconds 280 msec Ended Job = job_1436962851556_0015 MapReduce Jobs Launched: Stage-Stage-3: Map: 1 Cumulative CPU: 1.28 sec HDFS Read: 5428 HDFS Write: 13 SUCCESS Total MapReduce CPU Time Spent: 1 seconds 280 msec OK 333 null 222 Time taken: 19.562 seconds, Fetched: 1 row(s) {code} The attached patch includes code taken from Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
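The null in the output above corresponds to the subquery whose operator ran in the local map-join task, where mapreduce.map.env was not applied. The property value has the form NAME=VALUE,NAME=VALUE,...; a launcher for the local task would need to parse it and feed the result to the child process environment. A sketch of that parsing, in the spirit of Hadoop's env handling but not the actual attached patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Editor's sketch: parse a mapreduce.map.env-style value
// ("AAA=111,BBB=222,CCC=333") into a map that a local-task launcher
// could hand to ProcessBuilder.environment(). Not the HIVE-11353 patch.
class MapEnvSketch {
    public static Map<String, String> parseEnv(String spec) {
        Map<String, String> env = new LinkedHashMap<>();
        if (spec == null || spec.isEmpty()) return env;
        for (String pair : spec.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skip malformed entries with no name or no '='
                env.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return env;
    }
}
```

Note this simple comma split cannot represent values containing commas; Hadoop's own handling has the same caveat for this property format.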
[jira] [Assigned] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Mingyang reassigned HIVE-7292: - Assignee: Li Mingyang (was: Xuefu Zhang) Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Components: Spark Reporter: Xuefu Zhang Assignee: Li Mingyang Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5 Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backends. Secondly, providing such an alternative further increases Hive's adoption, as it exposes Spark users to a viable, feature-rich, de facto standard SQL tool on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtasks. A design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648958#comment-14648958 ] Hive QA commented on HIVE-11405: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748096/HIVE-11405.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9277 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4771/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748096 - PreCommit-HIVE-TRUNK-Build Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) 
since the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in numRows - branch1Rows for branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
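The suggested short-circuit can be sketched numerically: each OR branch is evaluated over the rows not already matched by earlier branches, so deeply nested ORs converge instead of repeatedly evaluating the full row count. This is an editor's sketch with hypothetical per-branch selectivities as input, not the StatsRulesProcFactory code:

```java
// Editor's sketch of OR stats estimation with the proposed short-circuit:
// branch i+1 sees numRows minus the rows matched by branches 1..i,
// and recursion can terminate early once no rows remain.
class OrStatsSketch {
    // selectivity[i] = fraction of *remaining* rows matched by branch i
    public static long estimateOrRows(long numRows, double[] selectivity) {
        long remaining = numRows;
        long matched = 0;
        for (double s : selectivity) {
            long branchRows = (long) (remaining * s);
            matched += branchRows;
            remaining -= branchRows;   // pass numRows - branch1Rows to branch-2
            if (remaining <= 0) break; // early termination for the recursion
        }
        return matched;
    }
}
```

The estimate is monotonically bounded by numRows, which is the property the unpatched recursion lost for nested ORs.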
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649066#comment-14649066 ] Hive QA commented on HIVE-10166: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748099/HIVE-10166.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4772/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4772/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4772/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748099 - PreCommit-HIVE-TRUNK-Build Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)