[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)

2016-08-15 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422294#comment-15422294
 ] 

Alexander Behm commented on HIVE-6590:
--

A related and severe issue is that ALTER TABLE DROP PARTITION may drop 
partitions that were not specified in the ALTER statement. This could lead to 
accidental data loss.

Reproduction:
{code}
CREATE TABLE broken (c int) PARTITIONED BY (b1 BOOLEAN, s STRING, b2 BOOLEAN, i 
INT);

-- Insert a few variants of 'false' partition-key values.
INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=false,i=0) VALUES(1);
INSERT INTO TABLE broken PARTITION(b1=FALSE,s='a',b2=false,i=0) VALUES(3);
INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=False,i=0) VALUES(5);
INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=FalsE,i=0) VALUES(7);

-- Insert a few variants of 'true' partition-key values.
INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=true,i=0) VALUES(2);
INSERT INTO TABLE broken PARTITION(b1=TRUE,s='a',b2=true,i=0) VALUES(4);
INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=True,i=0) VALUES(6);
INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=TruE,i=0) VALUES(8);

-- Insert a few variants of mixed 'true'/'false' partition-key values.
INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=true,i=0) VALUES(100);
INSERT INTO TABLE broken PARTITION(b1=FALSE,s='a',b2=TRUE,i=0) VALUES(1000);
INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=false,i=0) VALUES(1);
INSERT INTO TABLE broken PARTITION(b1=tRUe,s='a',b2=fALSe,i=0) VALUES(10);

-- Very broken partition drop.
hive> ALTER TABLE broken DROP PARTITION(b1=true,s='a',b2=true,i=0);
Dropped the partition b1=false/s=a/b2=false/i=0
Dropped the partition b1=false/s=a/b2=False/i=0
Dropped the partition b1=false/s=a/b2=FalsE/i=0
Dropped the partition b1=FALSE/s=a/b2=false/i=0
Dropped the partition b1=false/s=a/b2=true/i=0
Dropped the partition b1=FALSE/s=a/b2=TRUE/i=0
Dropped the partition b1=true/s=a/b2=false/i=0
Dropped the partition b1=tRUe/s=a/b2=fALSe/i=0
Dropped the partition b1=true/s=a/b2=true/i=0
Dropped the partition b1=true/s=a/b2=True/i=0
Dropped the partition b1=true/s=a/b2=TruE/i=0
Dropped the partition b1=TRUE/s=a/b2=true/i=0
OK
Time taken: 1.387 seconds
{code}
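For context, a minimal Java sketch of the kind of canonicalization that would 
keep every spelling of a boolean literal on a single HDFS path and make DROP 
PARTITION match exactly one partition (hypothetical helper, not Hive's actual 
code):
{code}
// Hypothetical helper: canonicalize a boolean partition-key literal before it
// is used to build a partition path or to match existing partitions.
public final class BooleanPartitionUtil {
  private BooleanPartitionUtil() {}

  public static String canonicalBoolean(String literal) {
    // "FalsE", "FALSE", "false" all normalize to "false"; anything else is
    // rejected instead of being passed through via toString().
    if ("true".equalsIgnoreCase(literal)) {
      return "true";
    }
    if ("false".equalsIgnoreCase(literal)) {
      return "false";
    }
    throw new IllegalArgumentException("Not a boolean literal: " + literal);
  }

  public static void main(String[] args) {
    System.out.println(canonicalBoolean("FalsE")); // prints "false"
    System.out.println(canonicalBoolean("TRUE"));  // prints "true"
  }
}
{code}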


> Hive does not work properly with boolean partition columns (wrong results and 
> inserts to incorrect HDFS path)
> -
>
> Key: HIVE-6590
> URL: https://issues.apache.org/jira/browse/HIVE-6590
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Metastore
>Affects Versions: 0.10.0
>Reporter: Lenni Kuff
>
> Hive does not work properly with boolean partition columns. Queries return 
> wrong results and also insert to incorrect HDFS paths.
> {code}
> create table bool_table(int_col int) partitioned by(bool_col boolean);
> # This works, creating 3 unique partitions!
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE);
> ALTER TABLE bool_table ADD PARTITION (bool_col=false);
> ALTER TABLE bool_table ADD PARTITION (bool_col=False);
> {code}
> The first problem is that Hive cannot filter on a bool partition key column. 
> "select * from bool_part" returns the correct results, but if you apply a 
> filter on the bool partition key column hive won't return any results.
> The second problem is that Hive seems to just call "toString()" on the 
> boolean literal value. This means you can end up with multiple partitions 
> (FALSE, false, FaLSE, etc.) mapping to the same logical value 'false'. For 
> example, you can add three partitions in Hive for the same logical value 
> "false" by doing:
> ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> 
> /test-warehouse/bool_table/bool_col=FALSE/
> ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> 
> /test-warehouse/bool_table/bool_col=false/
> ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> 
> /test-warehouse/bool_table/bool_col=False/





[jira] [Commented] (HIVE-14405) Have tests log to the console along with hive.log

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422218#comment-15422218
 ] 

Hive QA commented on HIVE-14405:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823801/HIVE-14405.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10472 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/892/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/892/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-892/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823801 - PreCommit-HIVE-MASTER-Build

> Have tests log to the console along with hive.log
> -
>
> Key: HIVE-14405
> URL: https://issues.apache.org/jira/browse/HIVE-14405
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14405.01.patch, HIVE-14405.02.patch
>
>
> When running tests from the IDE (not itests), logs end up going to hive.log - 
> making it difficult to debug tests.





[jira] [Updated] (HIVE-14542) VirtualColumn::equals() should use object equality

2016-08-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14542:
---
Description: 
The VirtualColumn() constructor is private and is only called to initialize 5 
static objects.

!virtual-columns.png!

There's no reason for VirtualColumn::equals() to do a deep type inspection for 
each access of a complex type like ROW__ID.


{code}
  else if(vc.equals(VirtualColumn.ROWID)) {
if(ctx.getIoCxt().getRecordIdentifier() == null) {
  vcValues[i] = null;
}
{code}
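Since the constructor is private and only five static instances exist, 
reference equality is sufficient; a minimal sketch of the change (assumed 
shape, not the actual patch):
{code}
// Hypothetical: with a private constructor and a fixed set of static
// singletons, identity comparison replaces the deep type inspection.
@Override
public boolean equals(Object o) {
  return this == o;
}
{code}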


  was:
The VirtualColumn() constructor is private and is only called to initialize 5 
static objects.

!virtual-columns.png!

There's no reason for VirtualColumn::equals() to do a deep type inspection for 
each access of a complex type like ROW__ID.






> VirtualColumn::equals() should use object equality
> --
>
> Key: HIVE-14542
> URL: https://issues.apache.org/jira/browse/HIVE-14542
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gopal V
>Priority: Minor
> Attachments: virtual-columns.png
>
>
> The VirtualColumn() constructor is private and is only called to initialize 5 
> static objects.
> !virtual-columns.png!
> There's no reason for VirtualColumn::equals() to do a deep type inspection 
> for each access of a complex type like ROW__ID.
> {code}
>   else if(vc.equals(VirtualColumn.ROWID)) {
> if(ctx.getIoCxt().getRecordIdentifier() == null) {
>   vcValues[i] = null;
> }
> {code}





[jira] [Updated] (HIVE-14542) VirtualColumn::equals() should use object equality

2016-08-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14542:
---
Attachment: virtual-columns.png

> VirtualColumn::equals() should use object equality
> --
>
> Key: HIVE-14542
> URL: https://issues.apache.org/jira/browse/HIVE-14542
> Project: Hive
>  Issue Type: Improvement
>Reporter: Gopal V
>Priority: Minor
> Attachments: virtual-columns.png
>
>
> The VirtualColumn() constructor is private and is only called to initialize 5 
> static objects.
> !virtual-columns.png!
> There's no reason for VirtualColumn::equals() to do a deep type inspection 
> for each access of a complex type like ROW__ID.





[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2016-08-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422200#comment-15422200
 ] 

Rui Li commented on HIVE-14412:
---

[~xuefuz], thanks for your comments. In {{TimestampColumnVector}}, we store the 
time and nanos of each timestamp, and we read/write them in 
{{TimestampTreeReader}} and {{TimestampTreeWriter}} accordingly. I guess we 
can't maintain compatibility here, since we're adding a new field.
Another possible solution is to treat HiveTimestamp as a totally new data type 
instead of extending Timestamp. What do you think?
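As a rough illustration of that second option, a minimal sketch of a 
standalone timezone-aware type that carries the zone with the value instead of 
extending java.sql.Timestamp (hypothetical class, not the patch):
{code}
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical HiveTimestamp: a new data type rather than a Timestamp subclass.
public final class HiveTimestamp {
  private final long epochSeconds;
  private final int nanos;
  private final ZoneId zone; // the extra field that breaks Timestamp compatibility

  public HiveTimestamp(long epochSeconds, int nanos, ZoneId zone) {
    this.epochSeconds = epochSeconds;
    this.nanos = nanos;
    this.zone = zone;
  }

  // Printing is unambiguous because the zone travels with the value.
  @Override
  public String toString() {
    return ZonedDateTime
        .ofInstant(Instant.ofEpochSecond(epochSeconds, nanos), zone)
        .toString();
  }
}
{code}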

> Add a timezone-aware timestamp
> --
>
> Key: HIVE-14412
> URL: https://issues.apache.org/jira/browse/HIVE-14412
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, 
> HIVE-14412.1.patch
>
>
> Java's Timestamp stores the time elapsed since the epoch. While it's by 
> itself unambiguous, ambiguity comes when we parse a string into timestamp, or 
> convert a timestamp to string, causing problems like HIVE-14305.
> To solve the issue, I think we should make timestamp aware of timezone.





[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default

2016-08-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422123#comment-15422123
 ] 

Ashutosh Chauhan commented on HIVE-12656:
-

+1

> Turn hive.compute.query.using.stats on by default
> -
>
> Key: HIVE-12656
> URL: https://issues.apache.org/jira/browse/HIVE-12656
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12656.01.patch, HIVE-12656.02.patch, 
> HIVE-12656.03.patch, HIVE-12656.04.patch, HIVE-12656.05.patch
>
>
> We now have hive.compute.query.using.stats=false by default. We plan to turn 
> it on by default so that we can have better performance. We can also set it 
> to false in some test cases to maintain the original purpose of those tests.
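For reference, a sketch of the kind of query the setting affects (illustrative 
table name; assumes up-to-date statistics in the metastore):
{code}
set hive.compute.query.using.stats=true;
-- Simple aggregates such as count(*) can then be answered from metastore
-- statistics without launching a job:
select count(*) from some_table;
{code}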





[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422117#comment-15422117
 ] 

Hive QA commented on HIVE-14463:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823785/HIVE-14463.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10472 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/891/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/891/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-891/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823785 - PreCommit-HIVE-MASTER-Build

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
> 

[jira] [Commented] (HIVE-14503) Remove explicit order by in qfiles for union tests

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422037#comment-15422037
 ] 

Hive QA commented on HIVE-14503:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823782/HIVE-14503.1.patch

{color:green}SUCCESS:{color} +1 due to 35 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 10472 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_view]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[unionDistinct_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[union_type_chk]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union23]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union32]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union34]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_10]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_11]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_12]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_13]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_14]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_15]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_16]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_17]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_18]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_19]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_1]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_20]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_21]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_22]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_23]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_24]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_25]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_2]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_3]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_4]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_5]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_6]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_6_subq]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_7]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_8]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_9]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_script]
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_view]
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/890/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/890/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-890/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 40 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823782 - PreCommit-HIVE-MASTER-Build

> Remove explicit order by in qfiles for union tests
> --
>
> Key: HIVE-14503
> URL: https://issues.apache.org/jira/browse/HIVE-14503
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14503.1.patch
>
>
> Identify qfiles with explicit order by and replace them with 
> SORT_QUERY_RESULTS
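A sketch of the intended qfile change (illustrative query, not from the patch):
{code}
-- Before: the qfile forces a total order just to get stable output.
select key, value from src union all select key, value from src order by key;

-- After: the test harness sorts the output instead.
-- SORT_QUERY_RESULTS
select key, value from src union all select key, value from src;
{code}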





[jira] [Updated] (HIVE-14405) Have tests log to the console along with hive.log

2016-08-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14405:
--
Attachment: HIVE-14405.02.patch

Updated patch which defaults the console output to INFO logging.
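For reference, a minimal log4j2.properties fragment of the kind of setup 
described (a sketch under the assumption that the tests use Log4j2 properties 
configuration; not the patch itself):
{code}
# Console appender at INFO, in addition to the existing hive.log file appender.
appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n

rootLogger.level = info
rootLogger.appenderRef.console.ref = console
{code}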

> Have tests log to the console along with hive.log
> -
>
> Key: HIVE-14405
> URL: https://issues.apache.org/jira/browse/HIVE-14405
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14405.01.patch, HIVE-14405.02.patch
>
>
> When running tests from the IDE (not itests), logs end up going to hive.log - 
> making it difficult to debug tests.





[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Subramanyam Pattipaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421970#comment-15421970
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

[~pxiong], can you please make the following extra changes:

1. Check that the configs mapred.input.dir.recursive and 
hive.mapred.supports.subdirectories are enabled, and then ignore directories 
once you reach a depth equal to the number of partition columns.
2. If you find any files at unexpected locations, check for a config (I can't 
remember the config name): if it is set, produce an error for each such path 
and move ahead; otherwise, fail the operation. A sketch of this rule follows 
below.
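A rough sketch of the traversal rule suggested above (hypothetical code with 
made-up names; the real change would live in Hive's MSCK path checker):
{code}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical: stop treating directories as partitions once the depth equals
// the number of partition columns, and gate unexpected files on a config flag.
public class MsckDepthCheckSketch {
  static final int NUM_PARTITION_COLS = 2;      // e.g. (ds, hr)
  static final boolean IGNORE_BAD_PATHS = true; // stand-in for the unnamed config

  static void scan(Path dir, int depth) throws IOException {
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path p : entries) {
        if (Files.isDirectory(p)) {
          if (depth < NUM_PARTITION_COLS) {
            scan(p, depth + 1); // still within the partition-column levels
          }
          // At or below the leaf partition level, subdirectories are data
          // (mapred.input.dir.recursive + hive.mapred.supports.subdirectories),
          // so they are ignored rather than flagged.
        } else if (depth < NUM_PARTITION_COLS) {
          // A file above the leaf partition level is unexpected.
          if (IGNORE_BAD_PATHS) {
            System.err.println("Unexpected file: " + p);
          } else {
            throw new IOException("Unexpected file: " + p);
          }
        }
      }
    }
  }
}
{code}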

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.





[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default

2016-08-15 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421967#comment-15421967
 ] 

Pengcheng Xiong commented on HIVE-12656:


[~ashutoshc], I believe the failed tests are related to golden file updates. 
Could you please take a look? Thanks.

> Turn hive.compute.query.using.stats on by default
> -
>
> Key: HIVE-12656
> URL: https://issues.apache.org/jira/browse/HIVE-12656
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12656.01.patch, HIVE-12656.02.patch, 
> HIVE-12656.03.patch, HIVE-12656.04.patch, HIVE-12656.05.patch
>
>
> We now have hive.compute.query.using.stats=false by default. We plan to turn 
> it on by default so that we can have better performance. We can also set it 
> to false in some test cases to maintain the original purpose of those tests.





[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421962#comment-15421962
 ] 

Ashutosh Chauhan commented on HIVE-14463:
-

+1

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStre

[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421924#comment-15421924
 ] 

Hive QA commented on HIVE-14418:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823749/HIVE-14418.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10442 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_sortmerge_join_7.q-cbo_windowing.q-scriptfile1.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynamic_partition_pruning.q-vector_char_mapjoin1.q-unionDistinct_2.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[mapjoin_mapjoin]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/889/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/889/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-889/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823749 - PreCommit-HIVE-MASTER-Build

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.
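A minimal sketch of validation that treats an empty or null assignment as 
"restore the default" instead of type-checking it (hypothetical code, not the 
patch):
{code}
import java.util.Properties;

// Hypothetical: an empty value unsets the property rather than failing the
// FLOAT type check.
public class ConfUnsetSketch {
  static void set(Properties conf, String key, String value) {
    if (value == null || value.isEmpty() || value.equals("null")) {
      conf.remove(key); // reverts to the built-in default
      return;
    }
    Float.parseFloat(value); // type validation only for real values
    conf.setProperty(key, value);
  }
}
{code}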





[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14463:
-
Status: Patch Available  (was: Open)

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBuf

[jira] [Comment Edited] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421904#comment-15421904
 ] 

Hari Sankar Sivarama Subramaniyan edited comment on HIVE-14463 at 8/15/16 11:55 
PM:


The root cause of this problem is similar to HIVE-14424. The query is failing 
because of:
{code}
java.lang.RuntimeException: Error applying authorization policy on hive 
configuration: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassNotFoundException: 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
{code}
The reason the test used to hang is that consumer.receive() is a blocking 
call. Part of the issue, i.e. the hang, has been fixed by HIVE-14520 by 
converting the blocking calls to non-blocking calls. The other issue, i.e. the 
actual error, is fixed by the attached patch.
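For illustration, a sketch of the blocking vs. timed receive pattern described 
above (javax.jms API; names are hypothetical, not the test code):
{code}
import javax.jms.Message;
import javax.jms.MessageConsumer;

public class ReceiveSketch {
  static Message fetch(MessageConsumer consumer) throws Exception {
    // consumer.receive() blocks indefinitely if the sending side failed,
    // which is how the test used to hang. A timeout turns the hang into an
    // observable test failure instead:
    Message msg = consumer.receive(30_000L); // wait at most 30 seconds
    if (msg == null) {
      throw new AssertionError("No message received within 30s");
    }
    return msg;
  }
}
{code}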


was (Author: hsubramaniyan):
Looks like a problem similar to HIVE-14424. The query is failing and the 
consumer.receive() is a blocking call. Part of the issue, i.e. hang has been 
fixed by HIVE-14520 by converting the blocking to non-blocking calls. The other 
issue, i.e. the actual error is fixed by the patch attached.

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.re

[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14463:
-
Attachment: HIVE-14463.1.patch

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedI

[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14463:
-
Attachment: HIVE-14463.1.patch

cc [~ashutoshc] for review.

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.active

[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14463:
-
Attachment: (was: HIVE-14463.1.patch)

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14463.1.patch
>
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.T

[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421904#comment-15421904
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-14463:
--

Looks like a problem similar to HIVE-14424. The query is failing, and 
consumer.receive() is a blocking call. Part of the issue, i.e. the hang, has 
been fixed by HIVE-14520 by converting the blocking calls to non-blocking 
ones. The other issue, i.e. the actual error, is fixed by the attached patch.

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUN

[jira] [Assigned] (HIVE-14463) hcatalog server extensions test cases getting stuck

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan reassigned HIVE-14463:


Assignee: Hari Sankar Sivarama Subramaniyan

> hcatalog server extensions test cases getting stuck
> ---
>
> Key: HIVE-14463
> URL: https://issues.apache.org/jira/browse/HIVE-14463
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> The module is getting stuck in tests and not coming out for as long as 2 
> days. 
> Specifically, TestMsgBusConnection is the test case which has this problem. I 
> ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode):
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition 
> [0x000117b74000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "InactivityMonitor Async Task: 
> java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty 
> queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition 
> [0x00011786b000]
>java.lang.Thread.State: TIMED_WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x00078166f0b8> (a 
> java.util.concurrent.SynchronousQueue$TransferStack)
>   at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>   at 
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
>   at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942)
>   at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 
> tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561)
>   at java.io.DataInputStream.readInt(DataInputStream.java:387)
>   at 
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>   at 
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>   at java.lang.Thread.run(Thread.java:745)
> "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 
> tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.read(SocketInputStream.java:152)
>   at java.net.SocketInputStream.read(SocketInputStream.java:122)
>   at 
> org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBuffe

[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation

2016-08-15 Thread Abdullah Yousufi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421901#comment-15421901
 ] 

Abdullah Yousufi commented on HIVE-14165:
-

So I did try the listFiles() optimization locally and modified Hive to call the 
function on the root directory of a partitioned table. While this does give a 
speedup for a select * query on a partitioned table, the approach does not 
really extend to queries that do partition elimination, since in those cases it 
makes sense to pass in just the relevant partitions, as Hive currently does.

I'm thinking it might make sense to remove the following list call in Hive in 
the case of S3 partitioned tables, since the listing for the split computation 
is going to happen later anyway in Hadoop's FileInputFormat.java.

FetchOperator.java#getNextPath()
{code}
if (fs.exists(currPath)) {
  // Lists every file under the current path just to verify that at least one
  // non-empty file exists -- on S3 this adds a full round of list calls per
  // path before FileInputFormat repeats the listing for split computation.
  for (FileStatus fStat : listStatusUnderPath(fs, currPath)) {
    if (fStat.getLen() > 0) {
      return true;
    }
  }
}
{code}

My question is whether it sounds good to remove this check. 
FileInputFormat.java#getSplits() may raise errors if a partition directory does 
not contain any files, but is there a better way to handle that?
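
For reference, a rough sketch of the single recursive enumeration that the 
HADOOP-13208 optimizations speed up, as opposed to a listing per partition 
directory (illustrative only; the table path is hypothetical):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListFilesSketch {
  public static void main(String[] args) throws Exception {
    Path tableRoot = new Path("s3a://bucket/warehouse/tbl");
    FileSystem fs = tableRoot.getFileSystem(new Configuration());
    // One recursive enumeration of the whole table tree; on S3 this maps to
    // flat paged LIST requests instead of a listing per partition directory.
    RemoteIterator<LocatedFileStatus> files = fs.listFiles(tableRoot, true);
    while (files.hasNext()) {
      LocatedFileStatus stat = files.next();
      if (stat.getLen() > 0) {
        System.out.println(stat.getPath());
      }
    }
  }
}
{code}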

> Enable faster S3 Split Computation
> --
>
> Key: HIVE-14165
> URL: https://issues.apache.org/jira/browse/HIVE-14165
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
>
> Split size computation may be improved by the optimizations for listFiles() 
> in HADOOP-13208



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests

2016-08-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14503:
-
Attachment: HIVE-14503.1.patch

> Remove explicit order by in qfiles for union tests
> --
>
> Key: HIVE-14503
> URL: https://issues.apache.org/jira/browse/HIVE-14503
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14503.1.patch
>
>
> Identify qfiles with explicit order by and replace them with 
> SORT_QUERY_RESULTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests

2016-08-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14503:
-
Status: Patch Available  (was: Open)

> Remove explicit order by in qfiles for union tests
> --
>
> Key: HIVE-14503
> URL: https://issues.apache.org/jira/browse/HIVE-14503
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14503.1.patch
>
>
> Identify qfiles with explicit order by and replace them with 
> SORT_QUERY_RESULTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421857#comment-15421857
 ] 

Prasanth Jayachandran edited comment on HIVE-14527 at 8/15/16 11:11 PM:


Thinking about it, do we need to run these tests in CliDriver? With MiniLlap 
these tests are really fast (~18 mins). I don't think schema evolution is tied 
to the execution engine, so it should be safe to drop these tests from 
CliDriver. [~mmccline] Thoughts?


was (Author: prasanth_j):
Thinking about it do we need to run these tests in CliDriver. With MiniLlap 
these tests are really fast (~18 mins). I don't think schema evolution is tied 
to the execution engine. So should be safe to drop these tests from CliDriver. 
[~mmccline] Thoughts?

> Schema evolution tests are not running in TestCliDriver
> ---
>
> Key: HIVE-14527
> URL: https://issues.apache.org/jira/browse/HIVE-14527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch
>
>
> HIVE-14376 broke something that causes schema evolution tests to be excluded 
> from the TestCliDriver test suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421857#comment-15421857
 ] 

Prasanth Jayachandran commented on HIVE-14527:
--

Thinking about it, do we need to run these tests in CliDriver? With MiniLlap 
these tests are really fast (~18 mins). I don't think schema evolution is tied 
to the execution engine, so it should be safe to drop these tests from 
CliDriver. [~mmccline] Thoughts?

> Schema evolution tests are not running in TestCliDriver
> ---
>
> Key: HIVE-14527
> URL: https://issues.apache.org/jira/browse/HIVE-14527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch
>
>
> HIVE-14376 broke something that causes schema evolution tests to be excluded 
> from the TestCliDriver test suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14480) ORC ETLSplitStrategy should use thread pool when computing splits

2016-08-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14480:
-
   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch [~rajesh.balamohan]! Committed patch to master and 
branch-2.1

> ORC ETLSplitStrategy should use thread pool when computing splits
> -
>
> Key: HIVE-14480
> URL: https://issues.apache.org/jira/browse/HIVE-14480
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14480.1.patch, HIVE-14480.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13563) Hive Streaming does not honor orc.compress.size and orc.stripe.size table properties

2016-08-15 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421784#comment-15421784
 ] 

Wei Zheng commented on HIVE-13563:
--

[~leftylev] Wiki has been updated.
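
For readers, the settings in question are table-level ORC properties; an 
illustrative DDL follows (hypothetical table, not an excerpt from the wiki):

{code}
-- Hive Streaming requires a bucketed, transactional table; orc.compress.size
-- and orc.stripe.size are the per-table settings the fix makes effective.
CREATE TABLE orc_tuned (id INT, msg STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES (
  'transactional'='true',
  'orc.compress.size'='8192',
  'orc.stripe.size'='67108864'
);
{code}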

> Hive Streaming does not honor orc.compress.size and orc.stripe.size table 
> properties
> 
>
> Key: HIVE-13563
> URL: https://issues.apache.org/jira/browse/HIVE-13563
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>  Labels: TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13563.1.patch, HIVE-13563.2.patch, 
> HIVE-13563.3.patch, HIVE-13563.4.patch, HIVE-13563.branch-1.patch
>
>
> According to the doc:
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-HiveQLSyntax
> One should be able to specify tblproperties for many ORC options.
> But the settings for orc.compress.size and orc.stripe.size don't take effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421780#comment-15421780
 ] 

Hive QA commented on HIVE-14290:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823745/HIVE-14290.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10472 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/888/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/888/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-888/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823745 - PreCommit-HIVE-MASTER-Build

> Refactor HIVE-14054 to use Collections#newSetFromMap
> 
>
> Key: HIVE-14290
> URL: https://issues.apache.org/jira/browse/HIVE-14290
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Peter Slawski
>Assignee: Peter Slawski
>Priority: Trivial
> Attachments: HIVE-14290.1.patch, HIVE-14290.1.patch, 
> HIVE-14290.1.patch
>
>
> There is a minor refactor that can be made to HiveMetaStoreChecker so that it 
> cleanly creates and uses a set that is backed by a Map implementation. In 
> this case, the underlying Map implementation is ConcurrentHashMap. This 
> refactor will help prevent issues such as the one reported in HIVE-14054.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421777#comment-15421777
 ] 

Sergey Shelukhin commented on HIVE-14511:
-

Ignoring the files makes sense in this case.

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request

2016-08-15 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421768#comment-15421768
 ] 

Wei Zheng commented on HIVE-13354:
--

[~leftylev] Wiki has been updated:
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
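
For readers, a sketch of what per-table and per-request compaction settings 
look like (illustrative table name and values; the compactor/compactorthreshold 
property prefixes are the convention this feature introduced):

{code}
-- Per-table defaults, overriding the warehouse-wide settings:
ALTER TABLE acid_tbl SET TBLPROPERTIES (
  'compactorthreshold.hive.compactor.delta.num.threshold'='4',
  'compactor.mapreduce.map.memory.mb'='2048'
);

-- Per-request override when triggering a compaction manually:
ALTER TABLE acid_tbl COMPACT 'major'
  WITH OVERWRITE TBLPROPERTIES ('compactor.mapreduce.map.memory.mb'='3072');
{code}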

> Add ability to specify Compaction options per table and per request
> ---
>
> Key: HIVE-13354
> URL: https://issues.apache.org/jira/browse/HIVE-13354
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13354.1.patch, 
> HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, 
> HIVE-13354.3.patch, HIVE-13354.branch-1.patch
>
>
> Currently there are a few options that determine when automatic compaction 
> is triggered. They are specified once for the whole warehouse.
> This doesn't make sense - some tables may be more important and need to be 
> compacted more often. We should allow specifying these on a per-table basis.
> Also, compaction is an MR job launched from within the metastore. There is 
> currently no way to control job parameters (like memory, for example) except 
> to specify them in hive-site.xml for the metastore, which makes them 
> site-wide.
> We should add a way to specify these per table (perhaps even per compaction, 
> if launched via ALTER TABLE).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Subramanyam Pattipaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421763#comment-15421763
 ] 

Subramanyam Pattipaka edited comment on HIVE-14511 at 8/15/16 10:07 PM:


[~sershe], even if we introduce another, more flexible command to cater to this 
scenario, what if the user's data changes in terms of directory structure? Why 
should the user have to recreate all tables again? Why not make repair table 
flexible as well (with this patch), so that the configs 
mapred.input.dir.recursive and hive.mapred.supports.subdirectories are honored 
and the relevant partitions are added? Furthermore, having two commands may be 
confusing.

I don't mean to add a file like a=1/00_0 here. I mean only to ignore such files 
and list them in the error log if a config is enabled, so that users can act on 
them. Error is better than debug; that way, all configurations would surface 
these details. For example, if we have the following files

tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt
tbldir/a=2/b=1/c=1/file3.txt

and we are trying to create a partitioned table with partitions on a and b with 
root directory tbldir, the ERROR log would say it is ignoring the file 
tbldir/a=1/file1.txt due to incorrect structure if the ignore config is set; 
otherwise, the operation fails.

We add only one partition, with values (2, 1).

msck is still strict, and the ask here is to support the configs 
mapred.input.dir.recursive and hive.mapred.supports.subdirectories.



was (Author: pattipaka):
[~sershe], Even if we introduce another command to be flexible to cater this 
scenario, what if the user data has changed in terms of directory structure. 
Why does the user has to recreate all tables again? Why not repair table is 
also flexible (with this patch) such that configs mapred.input.dir.recursive 
and hive.mapred.supports.subdirectories are supported add relevant partitions. 
Further having two commands may be confusing. 

I don't mean to add file here  a=1/00_0 f. I mean only to ignore these and 
list them in error log if a config is enabled such that users can act on them. 
Error is better instead of debug. This way, all configurations would give these 
details. For example if we have following files

tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt

and we are trying to create partitioned table with partitions on a and b with 
root directory tbldir 

Here ERROR log would say ignoring file tbldir/a=1/file1.txt due to incorrect 
structure if ignore config is set. Otherwise, operation is failed.

We add only one partition with values (2, 1).

msck is still restrict and the ask here is to support configs 
mapred.input.dir.recursive and hive.mapred.supports.subdirectories.


> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14505:
-
Assignee: Vaibhav Gumashta  (was: Hari Sankar Sivarama Subramaniyan)

>  Analyze 
> org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching 
> failure
> 
>
> Key: HIVE-14505
> URL: https://issues.apache.org/jira/browse/HIVE-14505
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14505.1.patch
>
>
> Flaky test failure. Fails ~50% of the time locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421766#comment-15421766
 ] 

Prasanth Jayachandran commented on HIVE-14527:
--

I verified that schema evolution tests ran in TestCliDriver in the last run. 
Other failures are unrelated. [~sseth] Can you please take a look?

> Schema evolution tests are not running in TestCliDriver
> ---
>
> Key: HIVE-14527
> URL: https://issues.apache.org/jira/browse/HIVE-14527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch
>
>
> HIVE-14376 broke something that causes schema evolution tests to be excluded 
> from the TestCliDriver test suite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Subramanyam Pattipaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421763#comment-15421763
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

[~sershe], even if we introduce another, more flexible command to cater to this 
scenario, what if the user's data changes in terms of directory structure? Why 
should the user have to recreate all tables again? Why not make repair table 
flexible as well (with this patch), so that the configs 
mapred.input.dir.recursive and hive.mapred.supports.subdirectories are honored 
and the relevant partitions are added? Furthermore, having two commands may be 
confusing.

I don't mean to add a file like a=1/00_0 here. I mean only to ignore such files 
and list them in the error log if a config is enabled, so that users can act on 
them. Error is better than debug; that way, all configurations would surface 
these details. For example, if we have the following files

tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt

and we are trying to create a partitioned table with partitions on a and b with 
root directory tbldir, the ERROR log would say it is ignoring the file 
tbldir/a=1/file1.txt due to incorrect structure if the ignore config is set; 
otherwise, the operation fails.

We add only one partition, with values (2, 1).

msck is still strict, and the ask here is to support the configs 
mapred.input.dir.recursive and hive.mapred.supports.subdirectories.
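
A minimal sketch of the requested behavior, assuming the patch honors these 
configs (the table name is hypothetical, and the path-validation config is 
illustrative of what the patch would define):

{code}
-- Honor data files in subdirectories below the partition depth (a=.../b=...).
set mapred.input.dir.recursive=true;
set hive.mapred.supports.subdirectories=true;

-- Illustrative: skip files at the wrong depth (e.g. tbldir/a=1/file1.txt)
-- and report them in the ERROR log instead of failing the whole repair.
set hive.msck.path.validation=ignore;

msck repair table tbldir_table;
{code}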


> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421717#comment-15421717
 ] 

Sergey Shelukhin commented on HIVE-14511:
-

Shouldn't the table schema inform the correct partition directory structure? 
So, in the above case, if the table has a p1 partition column, the partition 
should be added and file1 should follow the setting (ignore/fail); likewise if 
it doesn't.
I actually wonder if the patch should be updated to look for a specific level. 
I.e., if the table is partitioned on a and b, adding an a=1/00_0 file makes no 
sense.

This brings it back to using the right tools for the right job. msck needs to 
be strict, as it's primarily intended for repair, and its use for ETL is 
incidental. If we need a more flexible "load my partitions" command for ETL, it 
should be a separate feature...

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14533:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   1.3.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch [~tfriedr]! Committed patch to master and branch-1.
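
The gist of the change, as a sketch (simplified, not the exact patch; assumes 
the writable's internal Text field is named value):

{code}
import org.apache.hadoop.io.Text;

public class EnforceMaxLengthSketch {
  private final Text value = new Text();

  // Guarded version: UTF-8 uses at least one byte per character, so a value
  // whose byte length is <= maxLength can never exceed maxLength characters
  // and needs no decode/truncate/re-encode round trip.
  public void enforceMaxLength(int maxLength) {
    if (value.getLength() > maxLength) {
      truncateTo(maxLength); // stands in for set(getHiveVarchar(), maxLength)
    }
  }

  private void truncateTo(int maxLength) {
    // placeholder for the expensive decode + truncate + re-encode path
  }
}
{code}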

> improve performance of enforceMaxLength in 
> HiveCharWritable/HiveVarcharWritable
> ---
>
> Key: HIVE-14533
> URL: https://issues.apache.org/jira/browse/HIVE-14533
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>  Labels: performance
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-14533.patch
>
>
> The enforceMaxLength method in HiveVarcharWritable calls 
> set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), 
> maxLength); no matter how long the string is. The calls to getHiveVarchar() 
> and getHiveChar() decode the string every time the method is called 
> (Text.toString() calls Text.decode). This can be very expensive and is 
> unnecessary if the string is shorter than maxLength for HiveVarcharWritable 
> or different than maxLength for HiveCharWritable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14503) Remove explicit order by in qfiles for union tests

2016-08-15 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421703#comment-15421703
 ] 

Prasanth Jayachandran commented on HIVE-14503:
--

I will batch the tests based on features and put up separate patches so that 
it's easier to review. 
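
As an illustration of the mechanical change in each qfile (hypothetical query; 
SORT_QUERY_RESULTS is the directive, placed at the top of the .q file, that 
tells the test driver to sort query output before diffing against the expected 
results):

{code}
-- SORT_QUERY_RESULTS

-- Before the change, the query carried an ORDER BY purely for determinism:
--   select key, value from t1 union all select key, value from t2 order by key;
-- After the change, the directive above makes the driver sort the output:
select key, value from t1 union all select key, value from t2;
{code}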

> Remove explicit order by in qfiles for union tests
> --
>
> Key: HIVE-14503
> URL: https://issues.apache.org/jira/browse/HIVE-14503
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Identify qfiles with explicit order by and replace them with 
> SORT_QUERY_RESULTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable

2016-08-15 Thread Thomas Friedrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421700#comment-15421700
 ] 

Thomas Friedrich commented on HIVE-14533:
-

I checked the test failures and they are not related to the patch. The 
Tez-related tests fail in other pre-commit builds as well, and I ran 
TestJdbcWithMiniHS2.testAddJarConstructorUnCaching successfully locally.

> improve performance of enforceMaxLength in 
> HiveCharWritable/HiveVarcharWritable
> ---
>
> Key: HIVE-14533
> URL: https://issues.apache.org/jira/browse/HIVE-14533
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Thomas Friedrich
>Assignee: Thomas Friedrich
>Priority: Minor
>  Labels: performance
> Attachments: HIVE-14533.patch
>
>
> The enforceMaxLength method in HiveVarcharWritable calls 
> set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), 
> maxLength); no matter how long the string is. The calls to getHiveVarchar() 
> and getHiveChar() decode the string every time the method is called 
> (Text.toString() calls Text.decode). This can be very expensive and is 
> unnecessary if the string is shorter than maxLength for HiveVarcharWritable 
> or different than maxLength for HiveCharWritable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests

2016-08-15 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14503:
-
Summary: Remove explicit order by in qfiles for union tests  (was: Remove 
explicit order by in qfiles and replace them with SORT_QUERY_RESULTS)

> Remove explicit order by in qfiles for union tests
> --
>
> Key: HIVE-14503
> URL: https://issues.apache.org/jira/browse/HIVE-14503
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Identify qfiles with explicit order by and replace them with 
> SORT_QUERY_RESULTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14170) Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used

2016-08-15 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421695#comment-15421695
 ] 

Sahil Takiar commented on HIVE-14170:
-

[~taoli-hwx] or [~thejas] any other comments on this?

--Sahil
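
For reference, a rough sketch of the incremental re-buffering the issue below 
describes (hypothetical class and method names, not the actual patch):

{code}
import java.util.ArrayList;
import java.util.List;

// Buffer up to BATCH_SIZE rows, compute column widths over the buffer, print
// the batch, then repeat -- bounded memory, much better alignment than
// printing one row at a time.
class IncrementalRowsSketch {
  static final int BATCH_SIZE = 1000; // the configurable "x"

  void print(Iterable<String[]> rows) {
    List<String[]> buffer = new ArrayList<>();
    for (String[] row : rows) {
      buffer.add(row);
      if (buffer.size() == BATCH_SIZE) {
        flush(buffer);
      }
    }
    flush(buffer); // print any remaining rows
  }

  void flush(List<String[]> buffer) {
    if (buffer.isEmpty()) return;
    int cols = buffer.get(0).length;
    int[] width = new int[cols];
    for (String[] row : buffer) {           // widths over this batch only
      for (int c = 0; c < cols; c++) {
        width[c] = Math.max(width[c], row[c].length());
      }
    }
    for (String[] row : buffer) {
      StringBuilder line = new StringBuilder();
      for (int c = 0; c < cols; c++) {
        line.append(String.format("| %-" + width[c] + "s ", row[c]));
      }
      System.out.println(line.append('|'));
    }
    buffer.clear();
  }
}
{code}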

> Beeline IncrementalRows should buffer rows and incrementally re-calculate 
> width if TableOutputFormat is used
> 
>
> Key: HIVE-14170
> URL: https://issues.apache.org/jira/browse/HIVE-14170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Beeline
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-14170.1.patch, HIVE-14170.2.patch, 
> HIVE-14170.3.patch, HIVE-14170.4.patch
>
>
> If {{--incremental}} is specified in Beeline, rows are meant to be printed 
> out immediately. However, if {{TableOutputFormat}} is used with this option 
> the formatting can look really off.
> The reason is that {{IncrementalRows}} does not do a global calculation of 
> the optimal width size for {{TableOutputFormat}} (it can't because it only 
> sees one row at a time). The output of {{BufferedRows}} looks much better 
> because it can do this global calculation.
> If {{--incremental}} is used, and {{TableOutputFormat}} is used, the width 
> should be re-calculated every "x" rows ("x" can be configurable and by 
> default it can be 1000).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Subramanyam Pattipaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421694#comment-15421694
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

Yes, that's correct. We should also check that no files exist at depths 
shallower than the required partition depth; maybe that's what you wanted here? 
For example, if files like

tbldir/file1
tbldir/p1=1/file2

exist, then partition creation should fail. If the ignore config option is set, 
then we should probably move ahead, ignoring these files. But please log them 
(under debug mode) so they can be collected; the user may want to act on the 
whole list at once instead of deleting one file at a time and rerunning msck.

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13930) upgrade Hive to latest Hadoop version

2016-08-15 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421691#comment-15421691
 ] 

Sahil Takiar edited comment on HIVE-13930 at 8/15/16 9:19 PM:
--

Sorry for the delay, I was out of the office for a few weeks. I looked into 
this some more and believe I found the root cause.

Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on 
EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each 
run. It just does a git pull, git clean, and mvn clean before the job starts. 
Looking at the itests/pom.xml file (contains the script to download the Spark 
tar-ball), it seems that the tar-ball will not be downloaded if it is already 
present on the local filesystem. So even though the file on S3 has been 
updated, the PTest2 Infra will not re-download it. This explains why the error 
is still occurring.

I can think of a few solutions to this:

1: Simply delete the file on the PTest2 Infra Master 
(/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz).
 This should trigger the build to download the new version of the tar-ball. 
This may cause HoS itests to fail in other Hive QA runs since the new tar-ball 
includes Hadoop 2.7 jars, but it should be fine.

2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean 
is invoked. Nice because it will avoid this in the future, at least until 
HIVE-14240 has been resolved.

3: Re-name the Spark tar-ball to something like 
spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and 
update the itests/pom.xml file to use the new name (the file name may need to 
be updated in a few other places)


was (Author: stakiar):
Sorry for the delay, I was out of the office for a few weeks. I looked into 
this some more and believe I found the root cause.

Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on 
EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each 
run. It just does a git pull, git clean, and mvn clean before the job starts. 
Looking at the itests/pom.xml file (contains the script to download the Spark 
tar-ball), it seems that the tar-ball will not be downloaded if it is already 
present on the local filesystem. So even though the file on S3 has been 
updated, the PTest2 Infra will not re-download it. This explains why the error 
is still occurring.

I can think of a few solutions to this:

1: Simply delete the file on the PTest2 Infra Master 
(/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz).
 This should trigger the build to download the new version of the tar-ball. 
This may cause HoS itests to fail in other Hive QA runs since the new tar-ball 
includes Hadoop 2.7 jars, but it should be fine.

2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean 
is invoked. Nice because it will avoid this in the future, at least until 
HIVE-14240 has been resolved.

3: Re-name the Spark tar-ball to something like 
spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and 
update the itests/pom.xml file to use the new name (the file name may need to 
be updated in a few other places)

> upgrade Hive to latest Hadoop version
> -
>
> Key: HIVE-13930
> URL: https://issues.apache.org/jira/browse/HIVE-13930
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, 
> HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, 
> HIVE-13930.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13930) upgrade Hive to latest Hadoop version

2016-08-15 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421691#comment-15421691
 ] 

Sahil Takiar commented on HIVE-13930:
-

Sorry for the delay, I was out of the office for a few weeks. I looked into 
this some more and believe I found the root cause.

Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on 
EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each 
run. It just does a git pull, git clean, and mvn clean before the job starts. 
Looking at the itests/pom.xml file (contains the script to download the Spark 
tar-ball), it seems that the tar-ball will not be downloaded if it is already 
present on the local filesystem. So even though the file on S3 has been 
updated, the PTest2 Infra will not re-download it. This explains why the error 
is still occurring.

I can think of a few solutions to this:

1: Simply delete the file on the PTest2 Infra Master 
(/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz).
 This should trigger the build to download the new version of the tar-ball. 
This may cause HoS itests to fail in other Hive QA runs since the new tar-ball 
includes Hadoop 2.7 jars, but it should be fine.

2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean 
is invoked. Nice because it will avoid this in the future, at least until 
HIVE-14240 has been resolved.

3: Re-name the Spark tar-ball to something like 
spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and 
update the itests/pom.xml file to use the new name (the file name may need to 
be updated in a few other places)
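
For context, the skip-if-present guard in question typically looks something 
like this (an illustrative sketch, not the exact script embedded in 
itests/pom.xml; $SPARK_TARBALL_URL is a placeholder):

{code}
tarball=thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz
mkdir -p thirdparty
if [ ! -f "$tarball" ]; then
  curl -Ssfo "$tarball" "$SPARK_TARBALL_URL"
fi
# Because the check keys only on the file name, a re-uploaded tar-ball with
# the same name is never re-fetched; renaming it (option 3) or deleting it on
# 'mvn clean' (option 2) avoids the stale copy.
{code}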

> upgrade Hive to latest Hadoop version
> -
>
> Key: HIVE-13930
> URL: https://issues.apache.org/jira/browse/HIVE-13930
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, 
> HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, 
> HIVE-13930.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 9:09 PM:


Example compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}

add the example CompDe to the Hive lib folder, and start beeline with
{noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat}


Alternatively you can use the Docker image that I have been using for 
development: https://github.com/kliewkliew/docker-hive-dev/tree/HIVE-13680


was (Author: kliew):
Example compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}

add the example CompDe to the Hive lib folder, and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421676#comment-15421676
 ] 

Ashutosh Chauhan commented on HIVE-14418:
-

Can you add a comment in the code, and here on the JIRA in the release notes, 
on how reset differs from unset?

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2016-08-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14483:

   Resolution: Fixed
Fix Version/s: 2.0.2
   2.1.1
   1.3.0
   Status: Resolved  (was: Patch Available)

Committed to branches.
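
For reference, the one-line fix described in the issue text below, sketched at 
its call site:

{code}
// Sketch of the call site in BytesColumnVectorUtil#commonReadByteArrays, per
// the description below: grow the scratch vector before reading batchSize
// values into it, mirroring the guard StringDictionaryTreeReader already has.
scratchlcv.ensureSize((int) batchSize, false);   // the added line
lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
{code}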

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
> Fix For: 1.3.0, 2.2.0, 2.1.1, 2.0.2
>
> Attachments: HIVE-14483.01.patch
>
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at 
> org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure StringTreeReader  which contains StringDirectTreeReader as 
> TreeReader (DIRECT or DIRECT_V2 column encoding)
> batchSize = 1026;
> invoke method nextVector(ColumnVector previousVector,boolean[] isNull, final 
> int batchSize)
> scratchlcv is LongColumnVector with long[] vector  (length 1024)
>  which execute BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, 
> scratchlcv,result, batchSize);
> as result in method commonReadByteArrays(stream, lengths, scratchlcv,
> result, (int) batchSize) we received 
> ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader, then there is no exception, as we have 
> a verification  scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by corresponding commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task  https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley
> How to fix?
> add  only one line :
> scratchlcv.ensureSize((int) batchSize, false) ;
> in method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths,
> LongColumnVector scratchlcv,
> BytesColumnVector result, final int batchSize) before invocation 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14189) backport HIVE-13945 to branch-1

2016-08-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14189:

Attachment: HIVE-14189.05-branch-1.patch

Same patch again...

> backport HIVE-13945 to branch-1
> ---
>
> Key: HIVE-14189
> URL: https://issues.apache.org/jira/browse/HIVE-14189
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC1.3
> Attachments: HIVE-14189-branch-1.patch, HIVE-14189.01-branch-1.patch, 
> HIVE-14189.02-branch-1.patch, HIVE-14189.03-branch-1.patch, 
> HIVE-14189.04-branch-1.patch, HIVE-14189.05-branch-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14189) backport HIVE-13945 to branch-1

2016-08-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421624#comment-15421624
 ] 

Sergey Shelukhin commented on HIVE-14189:
-

[~spena] is it still supposed to work? It seems it never gets picked up. I 
wonder if it's just Hive QA operating as usual, or something specific to 
branch-1.

> backport HIVE-13945 to branch-1
> ---
>
> Key: HIVE-14189
> URL: https://issues.apache.org/jira/browse/HIVE-14189
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC1.3
> Attachments: HIVE-14189-branch-1.patch, HIVE-14189.01-branch-1.patch, 
> HIVE-14189.02-branch-1.patch, HIVE-14189.03-branch-1.patch, 
> HIVE-14189.04-branch-1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14418:

Attachment: HIVE-14418.03.patch

Hmm, the old approach makes it impossible to set a string parameter to an empty 
string. Adding an UnsetProcessor and an explicit unset command.

[~ashutoshc] can you review the new patch? It even has tests ;)
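
Presumably the usage then becomes something like the following (illustrative; 
the exact syntax is whatever the patch defines):
{noformat}
hive> unset hive.tez.task.scale.memory.reserve.fraction.max;
{noformat}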

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap

2016-08-15 Thread Peter Slawski (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Slawski updated HIVE-14290:
-
Attachment: HIVE-14290.1.patch

Re-uploading patch to trigger Hive QA
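
For context, the pattern the refactor adopts (a minimal self-contained sketch, 
not the patch itself):

{code}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class NewSetFromMapSketch {
  public static void main(String[] args) {
    // A thread-safe Set view backed by a ConcurrentHashMap, instead of
    // hand-rolling a Map with ignored Boolean values.
    Set<String> seenPaths =
        Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    seenPaths.add("/warehouse/tbl/p=1"); // hypothetical path
    System.out.println(seenPaths.contains("/warehouse/tbl/p=1")); // true
  }
}
{code}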

> Refactor HIVE-14054 to use Collections#newSetFromMap
> 
>
> Key: HIVE-14290
> URL: https://issues.apache.org/jira/browse/HIVE-14290
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Peter Slawski
>Assignee: Peter Slawski
>Priority: Trivial
> Attachments: HIVE-14290.1.patch, HIVE-14290.1.patch, 
> HIVE-14290.1.patch
>
>
> There is a minor refactor that can be made to HiveMetaStoreChecker so that it 
> cleanly creates and uses a set that is backed by a Map implementation. In 
> this case, the underlying Map implementation is ConcurrentHashMap. This 
> refactor will help prevent issues such as the one reported in HIVE-14054.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421596#comment-15421596
 ] 

Hive QA commented on HIVE-12806:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12781930/HIVE-12806.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/887/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/887/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-887/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-887/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at e841edc HIVE-14345 : Beeline result table has erroneous 
characters (Miklos Csanady via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at e841edc HIVE-14345 : Beeline result table has erroneous 
characters (Miklos Csanady via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12781930 - PreCommit-HIVE-MASTER-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
> ---
>
> Key: HIVE-12806
> URL: https://issues.apache.org/jira/browse/HIVE-12806
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Vineet Garg
> Attachments: HIVE-12806.1.patch
>
>
> Step to reproduce:
> mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q 
> -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true
> Query :
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> Stack trace :
> {code}
> 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: 
> SessionState (SessionState.java:printError(1010)) - Vertex failed, 
> vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, 
> diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] 
> killed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, 
> vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: 
> java.lang.RuntimeException: Failed to load plan: null: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: subq1:amerge.xml
> at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640)
> a

[jira] [Commented] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure

2016-08-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421590#comment-15421590
 ] 

Hive QA commented on HIVE-14505:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823722/HIVE-14505.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10470 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1]
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/886/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/886/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-886/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823722 - PreCommit-HIVE-MASTER-Build

>  Analyze 
> org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching 
> failure
> 
>
> Key: HIVE-14505
> URL: https://issues.apache.org/jira/browse/HIVE-14505
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14505.1.patch
>
>
> Flaky test failure. Fails ~50% of the time locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14405) Have tests log to the console along with hive.log

2016-08-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421583#comment-15421583
 ] 

Siddharth Seth commented on HIVE-14405:
---

hadoop.ipc etc. is already set up to log at a higher level. I'll see if I can 
change console logging to INFO level; otherwise I will commit the patch as is.

> Have tests log to the console along with hive.log
> -
>
> Key: HIVE-14405
> URL: https://issues.apache.org/jira/browse/HIVE-14405
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14405.01.patch
>
>
> When running tests from the IDE (not itests), logs end up going to hive.log - 
> making it difficult to debug tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure

2016-08-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-12806:
--

Assignee: Vineet Garg

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
> ---
>
> Key: HIVE-12806
> URL: https://issues.apache.org/jira/browse/HIVE-12806
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Vineet Garg
> Attachments: HIVE-12806.1.patch
>
>
> Step to reproduce:
> mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q 
> -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true
> Query :
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> Stack trace :
> {code}
> 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: 
> SessionState (SessionState.java:printError(1010)) - Vertex failed, 
> vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, 
> diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] 
> killed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, 
> vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: 
> java.lang.RuntimeException: Failed to load plan: null: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: subq1:amerge.xml
> at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:640)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:629)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Failed to load plan: null: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: subq1:amerge.xml
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getMergeWork(Utilities.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.populateMapWork(SplitGrouper.java:260)
> at 
> org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:172)
> at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:277)
> ... 12 more
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: subq1:amerge.xml
> at org.apache.hadoop.fs.Path.initialize(Path.java:206)
> at org.apache.hadoop.fs.Path.(Path.java:172)
> at org.apache.hadoop.fs.Path.(Path.java:94)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getPlanPath(Utilities.java:588)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:387)
> ... 16 more
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> subq1:amerge.xml
> at java.net.URI.checkPath(URI.java:1804)
> at java.net.URI.(URI.java:752)
> at org.apache.hadoop.fs.Path.initialize(Path.java:203)
> ... 20 more
> ]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure

2016-08-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421505#comment-15421505
 ] 

Vineet Garg commented on HIVE-12806:


Still failing. I'll take a look

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
> ---
>
> Key: HIVE-12806
> URL: https://issues.apache.org/jira/browse/HIVE-12806
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12806.1.patch
>
>
> Step to reproduce:
> mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q 
> -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true
> Query :
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> Stack trace :
> {code}
> 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: 
> SessionState (SessionState.java:printError(1010)) - Vertex failed, 
> vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, 
> diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] 
> killed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, 
> vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: 
> java.lang.RuntimeException: Failed to load plan: null: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: subq1:amerge.xml
> at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:640)
> at 
> org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:629)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Failed to load plan: null: 
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: subq1:amerge.xml
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getMergeWork(Utilities.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.populateMapWork(SplitGrouper.java:260)
> at 
> org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:172)
> at 
> org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:277)
> ... 12 more
> Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: subq1:amerge.xml
> at org.apache.hadoop.fs.Path.initialize(Path.java:206)
> at org.apache.hadoop.fs.Path.(Path.java:172)
> at org.apache.hadoop.fs.Path.(Path.java:94)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getPlanPath(Utilities.java:588)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:387)
> ... 16 more
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: 
> subq1:amerge.xml
> at java.net.URI.checkPath(URI.java:1804)
> at java.net.URI.(URI.java:752)
> at org.apache.hadoop.fs.Path.initialize(Path.java:203)
> ... 20 more
> ]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14480) ORC ETLSplitStrategy should use thread pool when computing splits

2016-08-15 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421480#comment-15421480
 ] 

Sergey Shelukhin commented on HIVE-14480:
-

Already +1d above... 

> ORC ETLSplitStrategy should use thread pool when computing splits
> -
>
> Key: HIVE-14480
> URL: https://issues.apache.org/jira/browse/HIVE-14480
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14480.1.patch, HIVE-14480.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2016-08-15 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-11581:
-
Assignee: Vaibhav Gumashta  (was: Arpit Gupta)

> HiveServer2 should store connection params in ZK when using dynamic service 
> discovery for simpler client connection string.
> ---
>
> Key: HIVE-11581
> URL: https://issues.apache.org/jira/browse/HIVE-11581
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
> HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch
>
>
> Currently, the client needs to specify several parameters based on which an 
> appropriate connection is created with the server. In case of dynamic service 
> discovery, when multiple HS2 instances are running, it is much more usable 
> for the server to add its config parameters to ZK which the driver can use to 
> configure the connection, instead of the jdbc/odbc user adding those in 
> connection string.
> However, at minimum, client will need to specify zookeeper ensemble and that 
> she wants the JDBC driver to use ZooKeeper:
> {noformat}
> beeline> !connect 
> jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
> {noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.

2016-08-15 Thread Arpit Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Gupta reassigned HIVE-11581:
--

Assignee: Arpit Gupta  (was: Vaibhav Gumashta)

> HiveServer2 should store connection params in ZK when using dynamic service 
> discovery for simpler client connection string.
> ---
>
> Key: HIVE-11581
> URL: https://issues.apache.org/jira/browse/HIVE-11581
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Arpit Gupta
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, 
> HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch
>
>
> Currently, the client needs to specify several parameters based on which an 
> appropriate connection is created with the server. In case of dynamic service 
> discovery, when multiple HS2 instances are running, it is much more usable 
> for the server to add its config parameters to ZK which the driver can use to 
> configure the connection, instead of the jdbc/odbc user adding those in 
> connection string.
> However, at minimum, client will need to specify zookeeper ensemble and that 
> she wants the JDBC driver to use ZooKeeper:
> {noformat}
> beeline> !connect 
> jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
> {noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12203) CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results

2016-08-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421433#comment-15421433
 ] 

Vineet Garg commented on HIVE-12203:


Interestingly, I am seeing a NullPointerException on my local system. I am going 
to take a look at this.

> CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results
> ---
>
> Key: HIVE-12203
> URL: https://issues.apache.org/jira/browse/HIVE-12203
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12203.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12203) CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results

2016-08-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-12203:
--

Assignee: Vineet Garg  (was: Jesus Camacho Rodriguez)

> CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results
> ---
>
> Key: HIVE-12203
> URL: https://issues.apache.org/jira/browse/HIVE-12203
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vineet Garg
> Attachments: HIVE-12203.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:16 PM:


Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM:


Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM:


Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM:


Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{code}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM:


Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{noformat}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code:xml}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM:


Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{code}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM:


Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, 
and start beeline with
{code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code}


was (Author: kliew):
Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}

and start beeline with
{code}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{code}

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390
 ] 

Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:10 PM:


Example Snappy compressor attached. Configure the server with

{code}
<property>
  <name>hive.server2.thrift.resultset.server.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}

and start beeline with
{code}--hiveconf 
hive.server2.thrift.resultset.compressor.list=snappy.snappy{code}


was (Author: kliew):
Example Snappy compressor attached.

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: SnappyCompDe.zip

Example Snappy compressor attached.

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-08-15 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421386#comment-15421386
 ] 

Wei Zheng commented on HIVE-13249:
--

[~leftylev] Wiki has been updated

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>  Labels: TODOC1.3, TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13249.1.patch, HIVE-13249.10.patch, 
> HIVE-13249.11.patch, HIVE-13249.12.patch, HIVE-13249.2.patch, 
> HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, 
> HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, 
> HIVE-13249.9.patch, HIVE-13249.branch-1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions, to 
> avoid a huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.
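For illustration, a hedged sketch of the guard described above (not the committed patch; the method name and message are assumptions):

{code}
public class OpenTxnGuardSketch {
  static void checkOpenTxnLimit(long currentOpenTxns, long maxOpenTxns) {
    if (currentOpenTxns >= maxOpenTxns) {
      // Once the hard upper bound is hit, new open-transaction requests fail.
      throw new IllegalStateException("Maximum allowed number of open transactions ("
          + maxOpenTxns + ") has been reached; open_txns request rejected");
    }
  }

  public static void main(String[] args) {
    checkOpenTxnLimit(10, 100);  // under the limit: no-op
    checkOpenTxnLimit(100, 100); // at the limit: throws
  }
}
{code}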



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14505:
-
Status: Patch Available  (was: Open)

>  Analyze 
> org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching 
> failure
> 
>
> Key: HIVE-14505
> URL: https://issues.apache.org/jira/browse/HIVE-14505
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14505.1.patch
>
>
> Flaky test failure. Fails ~50% of the time locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure

2016-08-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14505:
-
Attachment: HIVE-14505.1.patch

>  Analyze 
> org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching 
> failure
> 
>
> Key: HIVE-14505
> URL: https://issues.apache.org/jira/browse/HIVE-14505
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14505.1.patch
>
>
> Flaky test failure. Fails ~50% of the time locally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2016-08-15 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421358#comment-15421358
 ] 

Xuefu Zhang commented on HIVE-14412:


[~lirui], your proposal looks good to me, especially since it's backward 
compatible. I'm not sure whether this has any impact on vectorization, but it 
would be great to make the encoding work well in vectorized mode.

> Add a timezone-aware timestamp
> --
>
> Key: HIVE-14412
> URL: https://issues.apache.org/jira/browse/HIVE-14412
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, 
> HIVE-14412.1.patch
>
>
> Java's Timestamp stores the time elapsed since the epoch. While it's by 
> itself unambiguous, ambiguity comes when we parse a string into timestamp, or 
> convert a timestamp to string, causing problems like HIVE-14305.
> To solve the issue, I think we should make timestamp aware of timezone.
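To see the ambiguity concretely, a small standalone sketch (plain java.time, independent of Hive):

{code}
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class TimestampAmbiguitySketch {
  public static void main(String[] args) {
    // One unambiguous point on the time line...
    Instant instant = Instant.ofEpochMilli(0L);
    DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // ...renders as different wall-clock strings depending on the zone, which
    // is exactly where string<->timestamp conversion becomes ambiguous.
    System.out.println(fmt.format(instant.atZone(ZoneId.of("UTC"))));                 // 1970-01-01 00:00:00
    System.out.println(fmt.format(instant.atZone(ZoneId.of("America/Los_Angeles")))); // 1969-12-31 16:00:00
  }
}
{code}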



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12637) make retryable SQLExceptions in TxnHandler configurable

2016-08-15 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421357#comment-15421357
 ] 

Wei Zheng commented on HIVE-12637:
--

[~leftylev] Wiki has been updated.

> make retryable SQLExceptions in TxnHandler configurable
> ---
>
> Key: HIVE-12637
> URL: https://issues.apache.org/jira/browse/HIVE-12637
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC1.3, TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12637.1.patch, HIVE-12637.2.patch
>
>
> Same for CompactionTxnHandler.
> It would be convenient if the user could specify some RegEx (perhaps by db 
> type) which will tell TxnHandler.checkRetryable() that this should be retried.
> The regex should probably apply to String produced by 
> {noformat}
>   private static String getMessage(SQLException ex) {
> return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode=" 
> + ex.getErrorCode() + ")";
>   }
> {noformat}
> This makes it flexible.
> See if we need to add the DB type (and possibly the version) of the DB being used.
> With 5 different DBs supported, this gives control to end users.
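For illustration, a hedged sketch of how such a regex could be applied to the message format quoted above (the config wiring is an assumption):

{code}
import java.sql.SQLException;
import java.util.regex.Pattern;

public class RetryRegexSketch {
  // Mirrors the getMessage() format quoted in the description.
  static String getMessage(SQLException ex) {
    return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode="
        + ex.getErrorCode() + ")";
  }

  public static void main(String[] args) {
    // A user-supplied pattern, e.g. retrying on serialization failures.
    Pattern retryable = Pattern.compile(".*\\(SQLState=40001.*");
    SQLException ex = new SQLException("could not serialize access", "40001", 8177);
    boolean retry = retryable.matcher(getMessage(ex)).matches();
    System.out.println(retry); // true: SQLState 40001 matches the pattern
  }
}
{code}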



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421350#comment-15421350
 ] 

Ashutosh Chauhan commented on HIVE-14511:
-

[~pattipaka] Just to clarify, your dir structure is as follows:
{code}
tbldir/p1=1/p2=1/p3=1
tbldir/p1=1/p2=1/p3=2
tbldir/p1=1/p2=1/p3=3
tbldir/p1=1/p2=2/p3=1
tbldir/p1=1/p2=2/p3=2
tbldir/p1=2/p2=1/p3=1
{code}
and your tbl is partitioned on (p1,p2). Correct?

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Status: Patch Available  (was: In Progress)

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13680 started by Kevin Liew.
-
> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-15 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: HIVE-13680.patch

First patch submitted. 

HiveConf settings submitted by the client are required to be prefixed by 
"set:hiveconf", so in ThriftCLIService we have to preserve this prefix to ensure 
that the SessionState is generated correctly. 
To make the code cleaner, we could expose functions (which are currently out of 
scope) to parse the prefix, or we could have the client send the list of 
compressors and list of configs in new fields in the Thrift message (after 
compressor negotiation, the CompDe would be stored in a new field in 
SessionState instead of in SessionState.sessConf). I prefer the second option.
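For illustration, a hedged sketch of the prefix handling described above (the literal and method names are assumptions, not the patch itself):

{code}
import java.util.HashMap;
import java.util.Map;

public class ConfPrefixSketch {
  private static final String SET_HIVECONF_PREFIX = "set:hiveconf:"; // assumed literal

  // Strip the client-side prefix when copying overrides into the session conf.
  static Map<String, String> extractHiveConf(Map<String, String> sessionConfMap) {
    Map<String, String> out = new HashMap<>();
    for (Map.Entry<String, String> e : sessionConfMap.entrySet()) {
      if (e.getKey().startsWith(SET_HIVECONF_PREFIX)) {
        out.put(e.getKey().substring(SET_HIVECONF_PREFIX.length()), e.getValue());
      }
    }
    return out;
  }
}
{code}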

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.patch, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-15 Thread Subramanyam Pattipaka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421268#comment-15421268
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

I mean to add only p1=1/p2=1. For example, if you have the following structure

data/p1=1/p2=1/p3=1
data/p1=1/p2=1/p3=2
data/p1=1/p2=1/p3=3
data/p1=1/p2=2/p3=1
data/p1=1/p2=2/p3=2
data/p1=2/p2=1/p3=1

now I want to add only (1,1), (1,2) and (2,1) as partitions. If you remove the 
above check then this is possible.

In the first iteration you would list

p1=1
p1=2

and in the next iteration you would list

/p1=1/p2=1
/p1=1/p2=2
/p1=2/p2=1

As the depth is 0 we stop here, and these are the partition paths if the user 
wants to create partitions on p1 and p2 as the partition columns. If you want, 
you can also check for the use of the configs mapred.input.dir.recursive and 
hive.mapred.supports.subdirectories.
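For illustration, a hedged sketch of that fixed-depth listing (plain Hadoop FileSystem API, not the actual MSCK code):

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartitionDepthSketch {
  // Collect directories exactly 'depth' levels below the table dir
  // (depth = number of partition columns, e.g. 2 for (p1, p2)).
  // Anything deeper, like p3 above, is deliberately not descended into.
  static void collect(FileSystem fs, Path dir, int depth, List<Path> out)
      throws IOException {
    if (depth == 0) {
      out.add(dir);
      return;
    }
    for (FileStatus st : fs.listStatus(dir)) {
      if (st.isDirectory()) {
        collect(fs, st.getPath(), depth - 1, out);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    List<Path> partitionDirs = new ArrayList<>();
    collect(fs, new Path("data"), 2, partitionDirs); // yields .../p1=.../p2=...
    System.out.println(partitionDirs);
  }
}
{code}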

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14345) Beeline result table has erroneous characters

2016-08-15 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14345:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Beeline result table has erroneous characters 
> --
>
> Key: HIVE-14345
> URL: https://issues.apache.org/jira/browse/HIVE-14345
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Jeremy Beard
>Assignee: Miklos Csanady
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14345.3.patch, HIVE-14345.4.patch, 
> HIVE-14345.5.patch, HIVE-14345.patch
>
>
> Beeline returns query results with erroneous characters. For example:
> {code}
> 0: jdbc:hive2://:1/def> select 10;
> +--+--+
> | _c0  |
> +--+--+
> | 10   |
> +--+--+
> 1 row selected (3.207 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-15 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421097#comment-15421097
 ] 

Naveen Gangam commented on HIVE-14513:
--


[~leftylev] I have updated the LDAP authentication configuration documentation 
[here|https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2].
 Could you please review it to make it consistent with the other pages? Thank you 
in advance.

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14513.patch
>
>
> The LDAP Authenticator can be configured to use a result set from an LDAP query 
> to authenticate. However, it is expected that this LDAP query would only result 
> in a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that 
> return users. For example, say you would like to allow "all users from group1 
> and group2" to be authenticated. The LDAP query has to return a union of all 
> members of group1 and group2.
> For example, one common configuration is that groups contain a list of their 
> users:
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of the 
> "member" attributes. (LDAP client tools are able to do this by filtering the 
> attributes on these entries.)
> So it will be useful to have such support to be able to specify queries that 
> return groups.
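For illustration, a standalone JNDI sketch that pulls just the member values out of those group entries (server details are placeholders; this is not part of the patch):

{code}
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;

import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class GroupMembersSketch {
  public static void main(String[] args) throws NamingException {
    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // placeholder

    DirContext ctx = new InitialDirContext(env);
    SearchControls sc = new SearchControls();
    sc.setSearchScope(SearchControls.SUBTREE_SCOPE);
    sc.setReturningAttributes(new String[] {"member"}); // only the member values

    NamingEnumeration<SearchResult> results = ctx.search("dc=example,dc=com",
        "(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))", sc);

    List<String> members = new ArrayList<>();
    while (results.hasMore()) {
      Attribute member = results.next().getAttributes().get("member");
      if (member != null) {
        for (int i = 0; i < member.size(); i++) {
          members.add((String) member.get(i)); // user DNs, e.g. uid=user1,...
        }
      }
    }
    ctx.close();
    System.out.println(members);
  }
}
{code}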



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13936) Add streaming support for row_number

2016-08-15 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421010#comment-15421010
 ] 

Chaoyu Tang commented on HIVE-13936:


Similar to the streaming support in rank. LGTM, +1

> Add streaming support for row_number
> 
>
> Key: HIVE-13936
> URL: https://issues.apache.org/jira/browse/HIVE-13936
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Johndee Burks
>Assignee: Yongzhi Chen
> Attachments: HIVE-13936.1.patch
>
>
> Without this support, row_number will cause heap issues in reducers. The 
> example query below against 10 million records will cause failure. 
> {code}
> select a, row_number() over (partition by a order by a desc) as row_num from 
> j100mil;
> {code}
> Same issue different function in JIRA HIVE-7062



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-15 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421007#comment-15421007
 ] 

Naveen Gangam commented on HIVE-14513:
--


I already documented this feature in the past, but yes, I plan on enhancing 
it at 
https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2#UserandGroupFilterSupportwithLDAPAtnProviderinHiveServer2-CustomQueryString

which needed a bit more detail to begin with. I do not think it was clear 
what the custom query should result in.

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14513.patch
>
>
> The LDAP Authenticator can be configured to use a result set from an LDAP query 
> to authenticate. However, it is expected that this LDAP query would only result 
> in a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that 
> return users. For example, say you would like to allow "all users from group1 
> and group2" to be authenticated. The LDAP query has to return a union of all 
> members of group1 and group2.
> For example, one common configuration is that groups contain a list of their 
> users:
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of the 
> "member" attributes. (LDAP client tools are able to do this by filtering the 
> attributes on these entries.)
> So it will be useful to have such support to be able to specify queries that 
> return groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-08-15 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420986#comment-15420986
 ] 

Illya Yalovyy commented on HIVE-14373:
--

[~kgyrtkirk],

Thank you for the heads up. I think [~ayousufi] is actively working on his 
patch. I have added my implementation only for reference. If for any reason 
he is not able to finish this project, I can pick it up. I think it makes 
sense to update this CR: https://reviews.apache.org/r/50938/

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Abdullah Yousufi
> Attachments: HIVE-14373.02.patch, HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests won't be able to be executed by HiveQA because they will need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14525) beeline still writing log data to stdout as of version 2.1.0

2016-08-15 Thread Miklos Csanady (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420959#comment-15420959
 ] 

Miklos Csanady commented on HIVE-14525:
---

The prompt and other debug info are echoed to stdout only if --silent=true is 
not set.
They are suppressed if both -f file AND --silent=true are present on the 
command line.

Beeline cannot detect command-line output redirection.

> beeline still writing log data to stdout as of version 2.1.0
> 
>
> Key: HIVE-14525
> URL: https://issues.apache.org/jira/browse/HIVE-14525
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.0
>Reporter: stephen sprague
>
> simple test. note that i'm looking to get a tsv file back.
> {code}
> $ beeline -u dwrdevnn1 --showHeader=false --outputformat=tsv2  2>stderr
> > select count(*)
> > from default.dual;
> > SQL
> {code}
> instead i get this in stdout:
> {code}
> $ cat stdout
> 0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> select count(*)
> . . . . . . . . . . . . . . . . . . . . . . .> from default.dual;
> 0
> 0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> 
> {code}
> i should only get one row which is the *result* of the query (which is 0) - 
> not the loggy kind of lines you see above. that stuff goes to stderr, my 
> friends.
> also i refer to this ticket b/c the last comment suggested so - it's close 
> but not exactly the same.
> https://issues.apache.org/jira/browse/HIVE-14183



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp

2016-08-15 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420848#comment-15420848
 ] 

Rui Li commented on HIVE-14412:
---

Somehow the tests worked this time. {{TestStandardObjectInspectors}} is related 
because of the new type.
I'd like to get more feedback before moving on. Pinging [~sershe], 
[~ashutoshc], [~xuefuz] for opinions. Do you think the proposal makes sense, or 
is there a better way to achieve this? Thanks.

> Add a timezone-aware timestamp
> --
>
> Key: HIVE-14412
> URL: https://issues.apache.org/jira/browse/HIVE-14412
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, 
> HIVE-14412.1.patch
>
>
> Java's Timestamp stores the time elapsed since the epoch. While that is by 
> itself unambiguous, ambiguity arises when we parse a string into a timestamp 
> or convert a timestamp to a string, causing problems like HIVE-14305.
> To solve the issue, I think we should make timestamp aware of timezone.
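
A minimal illustration of that ambiguity in plain Java (not Hive code): the 
same epoch instant renders as different wall-clock strings depending on the 
JVM's time zone.
{code}
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TsAmbiguity {
  public static void main(String[] args) {
    Timestamp ts = new Timestamp(0L); // one unambiguous instant: the epoch
    SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
    f.setTimeZone(TimeZone.getTimeZone("UTC"));
    System.out.println(f.format(ts)); // 1970-01-01 00:00:00
    f.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"));
    System.out.println(f.format(ts)); // 1969-12-31 16:00:00
  }
}
{code}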



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2016-08-15 Thread Sergey Zadoroshnyak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420716#comment-15420716
 ] 

Sergey Zadoroshnyak commented on HIVE-14483:


[~sershe]

The patch looks good and there are no test failures.

Who is responsible for pushing it into master? 

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14483.01.patch
>
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at 
> org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure a StringTreeReader that contains a StringDirectTreeReader as its 
> TreeReader (DIRECT or DIRECT_V2 column encoding), set batchSize = 1026, and 
> invoke the method nextVector(ColumnVector previousVector, boolean[] isNull, 
> final int batchSize).
> scratchlcv is a LongColumnVector whose long[] vector has length 1024. It is 
> passed to BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, scratchlcv, 
> result, batchSize), so in the method commonReadByteArrays(stream, lengths, 
> scratchlcv, result, (int) batchSize) we get the 
> ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader there is no exception, because there is 
> a check scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by corresponding commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task  https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley
> How to fix?
> Add a single line:
> scratchlcv.ensureSize((int) batchSize, false);
> in the method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths, LongColumnVector scratchlcv, 
> BytesColumnVector result, final int batchSize) before the invocation of 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
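
A tiny self-contained demo of the guard the fix adds (storage-api classes; 
an illustration of the technique, not the actual patch):
{code}
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;

public class EnsureSizeDemo {
  public static void main(String[] args) {
    // the default constructor allocates a vector of length 1024
    LongColumnVector scratchlcv = new LongColumnVector();
    int batchSize = 1026;
    // the one-line fix: grow the backing array before batchSize entries are written
    scratchlcv.ensureSize(batchSize, false);
    scratchlcv.vector[batchSize - 1] = 42L; // would throw AIOOBE without ensureSize
    System.out.println(scratchlcv.vector.length >= batchSize); // true
  }
}
{code}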



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13175) Disallow making external tables transactional

2016-08-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420715#comment-15420715
 ] 

Lefty Leverenz commented on HIVE-13175:
---

Removed both of the TODOC labels.  Woo hoo for the docs, [~wzheng]!

> Disallow making external tables transactional
> -
>
> Key: HIVE-13175
> URL: https://issues.apache.org/jira/browse/HIVE-13175
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13175.1.patch, HIVE-13175.2.patch, 
> HIVE-13175.3.patch, HIVE-13175.4.patch
>
>
> The fact that the compactor rewrites the contents of ACID tables conflicts 
> with what is expected of external tables.
> Conversely, an end user can write to an external table, which is certainly 
> not what is expected of an ACID table.
> So we should explicitly disallow making an external table ACID.
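
A sketch of the conversion this issue disallows (table name and location are 
illustrative):
{code}
CREATE EXTERNAL TABLE ext_t (c INT) STORED AS ORC
  LOCATION '/user/hive/warehouse/ext_t';

-- with this fix, the following now fails instead of silently producing an
-- external ACID table
ALTER TABLE ext_t SET TBLPROPERTIES ('transactional'='true');
{code}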



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13175) Disallow making external tables transactional

2016-08-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-13175:
--
Labels:   (was: TODOC1.3 TODOC2.1)

> Disallow making external tables transactional
> -
>
> Key: HIVE-13175
> URL: https://issues.apache.org/jira/browse/HIVE-13175
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13175.1.patch, HIVE-13175.2.patch, 
> HIVE-13175.3.patch, HIVE-13175.4.patch
>
>
> The fact that the compactor rewrites the contents of ACID tables conflicts 
> with what is expected of external tables.
> Conversely, an end user can write to an external table, which is certainly 
> not what is expected of an ACID table.
> So we should explicitly disallow making an external table ACID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12634) Add command to kill an ACID transaction

2016-08-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420714#comment-15420714
 ] 

Lefty Leverenz commented on HIVE-12634:
---

Removed both of the TODOC labels.  Thanks for the docs, [~wzheng]!

> Add command to kill an ACID transaction
> ---
>
> Key: HIVE-12634
> URL: https://issues.apache.org/jira/browse/HIVE-12634
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12634.1.patch, HIVE-12634.2.patch, 
> HIVE-12634.3.patch, HIVE-12634.4.patch, HIVE-12634.5.patch, 
> HIVE-12634.6.patch, HIVE-12634.7.patch, HIVE-12634.branch-1.patch
>
>
> We should add a CLI command to abort a (runaway) transaction.
> This should clean up all state related to the txn.
> The initiator (if still alive) will get an error when trying to 
> heartbeat/commit, i.e. it will become aware that the txn is dead.
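
A sketch of the command this issue added (the transaction id is illustrative; 
see the Hive Transactions wiki for the documented syntax):
{code}
SHOW TRANSACTIONS;       -- locate the runaway transaction's id
ABORT TRANSACTIONS 1234; -- clean up its state; the initiator's next
                         -- heartbeat/commit will then fail
{code}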



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12634) Add command to kill an ACID transaction

2016-08-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12634:
--
Labels:   (was: TODOC1.3 TODOC2.1)

> Add command to kill an ACID transaction
> ---
>
> Key: HIVE-12634
> URL: https://issues.apache.org/jira/browse/HIVE-12634
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12634.1.patch, HIVE-12634.2.patch, 
> HIVE-12634.3.patch, HIVE-12634.4.patch, HIVE-12634.5.patch, 
> HIVE-12634.6.patch, HIVE-12634.7.patch, HIVE-12634.branch-1.patch
>
>
> We should add a CLI command to abort a (runaway) transaction.
> This should clean up all state related to the txn.
> The initiator (if still alive) will get an error when trying to 
> heartbeat/commit, i.e. it will become aware that the txn is dead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-08-15 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420709#comment-15420709
 ] 

Lefty Leverenz commented on HIVE-12366:
---

Removed the TODOC1.3 label.  Thanks for the docs, [~wzheng].

Here are the doc links:

* [Configuration Properties -- hive.txn.heartbeat.threadpool.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.txn.heartbeat.threadpool.size]
* [Hive Transactions -- New Configuration Parameters for Transactions | 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-NewConfigurationParametersforTransactions]
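
A minimal hive-site.xml sketch for the new property (5 is the default the 
wiki lists; shown here only for illustration):
{code}
<property>
  <name>hive.txn.heartbeat.threadpool.size</name>
  <value>5</value>
</property>
{code}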

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, 
> HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, 
> HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, 
> HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch
>
>
> Currently there is a gap between lock acquisition and the first heartbeat 
> being sent out. Normally the gap is negligible, but when it's big it will 
> cause the query to fail, since the locks have timed out by the time the 
> heartbeat is sent.
> We need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2016-08-15 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-12366:
--
Labels:   (was: TODOC1.3)

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, 
> HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, 
> HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, 
> HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, 
> HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, 
> HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch
>
>
> Currently there is a gap between lock acquisition and the first heartbeat 
> being sent out. Normally the gap is negligible, but when it's big it will 
> cause the query to fail, since the locks have timed out by the time the 
> heartbeat is sent.
> We need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)