[jira] [Commented] (HIVE-6590) Hive does not work properly with boolean partition columns (wrong results and inserts to incorrect HDFS path)
[ https://issues.apache.org/jira/browse/HIVE-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422294#comment-15422294 ] Alexander Behm commented on HIVE-6590: -- A related and severe issue is that ALTER TABLE DROP PARTITION may drop partitions that were not specified in the ALTER statement. This could lead to accidental data loss. Reproduction: {code} CREATE TABLE broken (c int) PARTITIONED BY (b1 BOOLEAN, s STRING, b2 BOOLEAN, i INT); # Insert a few variants of 'false' partition-key values. INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=false,i=0) VALUES(1); INSERT INTO TABLE broken PARTITION(b1=FALSE,s='a',b2=false,i=0) VALUES(3); INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=False,i=0) VALUES(5); INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=FalsE,i=0) VALUES(7); # Insert a few variants of 'true' partition-key values. INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=true,i=0) VALUES(2); INSERT INTO TABLE broken PARTITION(b1=TRUE,s='a',b2=true,i=0) VALUES(4); INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=True,i=0) VALUES(6); INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=TruE,i=0) VALUES(8); # Insert a few variants of mixed 'true'/'false' partition-key values. INSERT INTO TABLE broken PARTITION(b1=false,s='a',b2=true,i=0) VALUES(100); INSERT INTO TABLE broken PARTITION(b1=FALSE,s='a',b2=TRUE,i=0) VALUES(1000); INSERT INTO TABLE broken PARTITION(b1=true,s='a',b2=false,i=0) VALUES(1); INSERT INTO TABLE broken PARTITION(b1=tRUe,s='a',b2=fALSe,i=0) VALUES(10); # Very broken partition drop. hive> ALTER TABLE broken DROP PARTITION(b1=true,s='a',b2=true,i=0); Dropped the partition b1=false/s=a/b2=false/i=0 Dropped the partition b1=false/s=a/b2=False/i=0 Dropped the partition b1=false/s=a/b2=FalsE/i=0 Dropped the partition b1=FALSE/s=a/b2=false/i=0 Dropped the partition b1=false/s=a/b2=true/i=0 Dropped the partition b1=FALSE/s=a/b2=TRUE/i=0 Dropped the partition b1=true/s=a/b2=false/i=0 Dropped the partition b1=tRUe/s=a/b2=fALSe/i=0 Dropped the partition b1=true/s=a/b2=true/i=0 Dropped the partition b1=true/s=a/b2=True/i=0 Dropped the partition b1=true/s=a/b2=TruE/i=0 Dropped the partition b1=TRUE/s=a/b2=true/i=0 OK Time taken: 1.387 seconds {code} > Hive does not work properly with boolean partition columns (wrong results and > inserts to incorrect HDFS path) > - > > Key: HIVE-6590 > URL: https://issues.apache.org/jira/browse/HIVE-6590 > Project: Hive > Issue Type: Bug > Components: Database/Schema, Metastore >Affects Versions: 0.10.0 >Reporter: Lenni Kuff > > Hive does not work properly with boolean partition columns. Queries return > wrong results and also insert to incorrect HDFS paths. > {code} > create table bool_table(int_col int) partitioned by(bool_col boolean); > # This works, creating 3 unique partitions! > ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE); > ALTER TABLE bool_table ADD PARTITION (bool_col=false); > ALTER TABLE bool_table ADD PARTITION (bool_col=False); > {code} > The first problem is that Hive cannot filter on a bool partition key column. > "select * from bool_table" returns the correct results, but if you apply a > filter on the bool partition key column Hive won't return any results. > The second problem is that Hive seems to just call "toString()" on the > boolean literal value. This means you can end up with multiple partitions > (FALSE, false, FaLSE, etc.) mapping to the literal value 'FALSE'. 
For example, > you can add three partitions in Hive for the same logical value "false" by doing: > ALTER TABLE bool_table ADD PARTITION (bool_col=FALSE) -> > /test-warehouse/bool_table/bool_col=FALSE/ > ALTER TABLE bool_table ADD PARTITION (bool_col=false) -> > /test-warehouse/bool_table/bool_col=false/ > ALTER TABLE bool_table ADD PARTITION (bool_col=False) -> > /test-warehouse/bool_table/bool_col=False/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
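For readers following along: a minimal sketch of one possible guard, using an invented helper class (this is not the actual Hive fix), which would collapse the case variants above into a single partition value before they are stored or compared.
{code}
// Hypothetical sketch, not the actual Hive fix: canonicalize boolean
// partition-key literals so case variants map to one partition directory.
public final class BoolPartitionValue {
  public static String canonicalize(String colType, String literal) {
    if ("boolean".equalsIgnoreCase(colType)) {
      // Boolean.parseBoolean ignores case: "TRUE"/"TruE" -> true, else false
      return Boolean.toString(Boolean.parseBoolean(literal));
    }
    return literal;
  }

  public static void main(String[] args) {
    System.out.println(canonicalize("BOOLEAN", "FalsE")); // prints "false"
    System.out.println(canonicalize("BOOLEAN", "TRUE"));  // prints "true"
  }
}
{code}
With such a normalization applied at partition-spec parsing time, both the duplicate-partition inserts and the over-eager DROP PARTITION above would operate on a single canonical b1=true/b2=true partition.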
[jira] [Commented] (HIVE-14405) Have tests log to the console along with hive.log
[ https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422218#comment-15422218 ] Hive QA commented on HIVE-14405: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823801/HIVE-14405.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10472 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/892/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/892/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-892/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823801 - PreCommit-HIVE-MASTER-Build > Have tests log to the console along with hive.log > - > > Key: HIVE-14405 > URL: https://issues.apache.org/jira/browse/HIVE-14405 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14405.01.patch, HIVE-14405.02.patch > > > When running tests from the IDE (not itests), logs end up going to hive.log - > making it difficult to debug tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14542) VirtualColumn::equals() should use object equality
[ https://issues.apache.org/jira/browse/HIVE-14542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-14542: --- Description: The VirtualColumn() constructor is private and is only called to initialize 5 static objects. !virtual-columns.png! There's no reason for VirtualColumn::equals() to do a deep type inspection for each access of a complex type like ROW__ID. {code} else if(vc.equals(VirtualColumn.ROWID)) { if(ctx.getIoCxt().getRecordIdentifier() == null) { vcValues[i] = null; } {code} was: The VirtualColumn() constructor is private and is only called to initialize 5 static objects. !virtual-columns.png! There's no reason for VirtualColumn::equals() to do a deep type inspection for each access of a complex type like ROW__ID. > VirtualColumn::equals() should use object equality > -- > > Key: HIVE-14542 > URL: https://issues.apache.org/jira/browse/HIVE-14542 > Project: Hive > Issue Type: Improvement >Reporter: Gopal V >Priority: Minor > Attachments: virtual-columns.png > > > The VirtualColumn() constructor is private and is only called to initialize 5 > static objects. > !virtual-columns.png! > There's no reason for VirtualColumn::equals() to do a deep type inspection > for each access of a complex type like ROW__ID. > {code} > else if(vc.equals(VirtualColumn.ROWID)) { > if(ctx.getIoCxt().getRecordIdentifier() == null) { > vcValues[i] = null; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
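A hedged illustration of the proposed change, with made-up class and field names rather than the actual patch: when a class has a private constructor and a fixed set of static instances, equals() can be plain reference equality, so no deep inspection of a complex type like ROW__ID happens on each access.
{code}
// Illustrative sketch: instances are canonical singletons, so identity
// comparison is both correct and O(1).
public final class SingletonColumn {
  public static final SingletonColumn ROWID = new SingletonColumn("ROW__ID");
  public static final SingletonColumn FILENAME = new SingletonColumn("INPUT__FILE__NAME");

  private final String name;
  private SingletonColumn(String name) { this.name = name; }

  @Override
  public boolean equals(Object o) {
    return this == o;  // object identity: no field-by-field comparison
  }

  @Override
  public int hashCode() { return System.identityHashCode(this); }
}
{code}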
[jira] [Updated] (HIVE-14542) VirtualColumn::equals() should use object equality
[ https://issues.apache.org/jira/browse/HIVE-14542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-14542: --- Attachment: virtual-columns.png > VirtualColumn::equals() should use object equality > -- > > Key: HIVE-14542 > URL: https://issues.apache.org/jira/browse/HIVE-14542 > Project: Hive > Issue Type: Improvement >Reporter: Gopal V >Priority: Minor > Attachments: virtual-columns.png > > > The VirtualColumn() constructor is private and is only called to initialize 5 > static objects. > !virtual-columns.png! > There's no reason for VirtualColumn::equals() to do a deep type inspection > for each access of a complex type like ROW__ID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp
[ https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422200#comment-15422200 ] Rui Li commented on HIVE-14412: --- [~xuefuz], thanks for your comments. In {{TimestampColumnVector}}, we store the time and nanos of each timestamp, then we read/write them in {{TimestampTreeReader}} and {{TimestampTreeWriter}} accordingly. I guess we can't maintain compatibility here since we're adding a new field. Another possible solution is, instead of extending timestamp, we treat HiveTimestamp as a totally new data type. What do you think? > Add a timezone-aware timestamp > -- > > Key: HIVE-14412 > URL: https://issues.apache.org/jira/browse/HIVE-14412 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, > HIVE-14412.1.patch > > > Java's Timestamp stores the time elapsed since the epoch. While it's by > itself unambiguous, ambiguity comes when we parse a string into timestamp, or > convert a timestamp to string, causing problems like HIVE-14305. > To solve the issue, I think we should make timestamp aware of timezone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
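A minimal sketch of the idea under discussion, using an invented HiveTimestampSketch class (this is not the attached patch): carrying an explicit zone alongside the epoch seconds and nanos makes string conversion unambiguous, which is the ambiguity described above.
{code}
// Hypothetical sketch, not the actual patch: pair the epoch time (seconds
// plus nanos, as stored in TimestampColumnVector) with an explicit zone so
// parsing/formatting no longer depends on the session timezone.
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public final class HiveTimestampSketch {
  private final long seconds;  // seconds since epoch
  private final int nanos;     // sub-second nanoseconds
  private final ZoneId zone;   // explicit timezone, the new field

  public HiveTimestampSketch(long seconds, int nanos, ZoneId zone) {
    this.seconds = seconds;
    this.nanos = nanos;
    this.zone = zone;
  }

  @Override
  public String toString() {
    // unambiguous: the zone travels with the value
    return ZonedDateTime.ofInstant(
        Instant.ofEpochSecond(seconds, nanos), zone).toString();
  }
}
{code}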
[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default
[ https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422123#comment-15422123 ] Ashutosh Chauhan commented on HIVE-12656: - +1 > Turn hive.compute.query.using.stats on by default > - > > Key: HIVE-12656 > URL: https://issues.apache.org/jira/browse/HIVE-12656 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12656.01.patch, HIVE-12656.02.patch, > HIVE-12656.03.patch, HIVE-12656.04.patch, HIVE-12656.05.patch > > > We now have hive.compute.query.using.stats=false by default. We plan to turn > it on by default so that we can have better performance. We can also set it > to false in some test cases to maintain the original purpose of those tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422117#comment-15422117 ] Hive QA commented on HIVE-14463: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823785/HIVE-14463.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10472 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/891/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/891/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-891/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823785 - PreCommit-HIVE-MASTER-Build > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. 
> {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) >
[jira] [Commented] (HIVE-14503) Remove explicit order by in qfiles for union tests
[ https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422037#comment-15422037 ] Hive QA commented on HIVE-14503: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823782/HIVE-14503.1.patch {color:green}SUCCESS:{color} +1 due to 35 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 10472 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_view] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[unionDistinct_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[union_type_chk] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union23] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union32] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union34] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_10] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_11] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_12] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_13] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_14] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_15] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_16] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_17] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_18] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_19] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_1] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_20] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_21] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_22] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_23] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_24] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_25] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_2] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_3] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_4] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_5] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_6] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_6_subq] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_7] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_8] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_9] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_script] org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_view] org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection 
{noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/890/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/890/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-890/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 40 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823782 - PreCommit-HIVE-MASTER-Build > Remove explicit order by in qfiles for union tests > -- > > Key: HIVE-14503 > URL: https://issues.apache.org/jira/browse/HIVE-14503 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14503.1.patch > > > Identify qfiles with explicit order by and replace them with > SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
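For illustration, a hedged sketch of the kind of qfile change implied (the query itself is made up): the SORT_QUERY_RESULTS directive tells the test driver to sort query output before diffing against the golden file, so the query no longer needs an explicit ORDER BY in its plan.
{code}
-- Illustrative qfile sketch (made-up query): the directive below makes the
-- harness sort results before comparison, replacing an explicit ORDER BY.
-- SORT_QUERY_RESULTS

select key, value from src
union all
select key, value from src;
{code}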
[jira] [Updated] (HIVE-14405) Have tests log to the console along with hive.log
[ https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14405: -- Attachment: HIVE-14405.02.patch Updated patch which defaults the console output to INFO logging. > Have tests log to the console along with hive.log > - > > Key: HIVE-14405 > URL: https://issues.apache.org/jira/browse/HIVE-14405 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14405.01.patch, HIVE-14405.02.patch > > > When running tests from the IDE (not itests), logs end up going to hive.log - > making it difficult to debug tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
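A hypothetical log4j2 properties sketch of that behavior (the appender names and the test.tmp.dir system property are assumptions, not the actual patch): both appenders stay attached to the root logger, with the console reference capped at INFO while hive.log keeps the full stream.
{code}
# Hypothetical sketch: log to hive.log and the console simultaneously.
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n

appender.file.type = File
appender.file.name = file
appender.file.fileName = ${sys:test.tmp.dir}/hive.log
appender.file.layout.type = PatternLayout
appender.file.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n

rootLogger.level = DEBUG
rootLogger.appenderRef.file.ref = file
rootLogger.appenderRef.console.ref = console
# console defaults to INFO and above; hive.log keeps DEBUG
rootLogger.appenderRef.console.level = INFO
{code}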
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421970#comment-15421970 ] Subramanyam Pattipaka commented on HIVE-14511: -- [~pxiong], can you please make the following extra changes: 1. Check that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are enabled, and if they are, tolerate directories found below a depth equal to the number of partition columns instead of flagging them. 2. If you find any files at unexpected locations, check for a config (I can't remember the config name) and produce an error for each such path and move on; otherwise, fail the operation. > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
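One reading of suggestion 1 above, as a minimal Java sketch with illustrative names (this is not the patch): directories deeper than the partition-column count are acceptable only when both recursive-directory configs are enabled.
{code}
// Hypothetical sketch of suggestion 1 (illustrative names, not the patch).
public final class MsckDepthCheck {
  public static boolean isAcceptableDir(int depth, int numPartitionCols,
      boolean recursiveInputDirs,        // mapred.input.dir.recursive
      boolean supportsSubdirectories) {  // hive.mapred.supports.subdirectories
    if (depth <= numPartitionCols) {
      return true;  // still within the partition directory structure
    }
    // deeper than the partition columns: OK only if recursion is enabled
    return recursiveInputDirs && supportsSubdirectories;
  }
}
{code}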
[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default
[ https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421967#comment-15421967 ] Pengcheng Xiong commented on HIVE-12656: [~ashutoshc], I believe the failed tests are related to golden file updates. Could you please take a look? Thanks. > Turn hive.compute.query.using.stats on by default > - > > Key: HIVE-12656 > URL: https://issues.apache.org/jira/browse/HIVE-12656 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12656.01.patch, HIVE-12656.02.patch, > HIVE-12656.03.patch, HIVE-12656.04.patch, HIVE-12656.05.patch > > > We now have hive.compute.query.using.stats=false by default. We plan to turn > it on by default so that we can have better performance. We can also set it > to false in some test cases to maintain the original purpose of those tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421962#comment-15421962 ] Ashutosh Chauhan commented on HIVE-14463: - +1 > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > 
org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStre
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421924#comment-15421924 ] Hive QA commented on HIVE-14418: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823749/HIVE-14418.03.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10442 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-auto_sortmerge_join_7.q-cbo_windowing.q-scriptfile1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-dynamic_partition_pruning.q-vector_char_mapjoin1.q-unionDistinct_2.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[mapjoin_mapjoin] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/889/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/889/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-889/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823749 - PreCommit-HIVE-MASTER-Build > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
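A minimal sketch of one way to allow resets, with invented names rather than the actual patch: treat an empty or "null" value as reset-to-default and skip the type validation that produces the FLOAT error shown above.
{code}
// Hypothetical sketch, not the actual patch: an empty or "null" SET value
// resets the variable instead of being validated against its declared type.
public final class SetCommandSketch {
  public static String resolve(String newValue, String defaultValue) {
    if (newValue == null || newValue.isEmpty()
        || "null".equalsIgnoreCase(newValue)) {
      return defaultValue;  // reset to default; skip type validation
    }
    return newValue;        // normal path: caller validates and applies
  }
}
{code}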
[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14463: - Status: Patch Available (was: Open) > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > 
org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBuf
[jira] [Comment Edited] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421904#comment-15421904 ] Hari Sankar Sivarama Subramaniyan edited comment on HIVE-14463 at 8/15/16 11:55 PM: The root cause of this problem is similar to HIVE-14424. The query is failing because of: {code} java.lang.RuntimeException: Error applying authorization policy on hive configuration: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest {code} The reason the test used to hang is that consumer.receive() is a blocking call. Part of the issue, i.e. the hang, has been fixed by HIVE-14520 by converting the blocking calls to non-blocking calls. The other issue, i.e. the actual error, is fixed by the attached patch. was (Author: hsubramaniyan): Looks like a problem similar to HIVE-14424. The query is failing and the consumer.receive() is a blocking call. Part of the issue, i.e. hang has been fixed by HIVE-14520 by converting the blocking to non-blocking calls. The other issue, i.e. the actual error is fixed by the patch attached. > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > 
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.re
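As a hedged aside on the hang mechanics: javax.jms.MessageConsumer.receive() with no argument blocks indefinitely, while the timed overload returns null once the timeout elapses. A small illustrative wrapper (the class and method names are invented; this is not the HIVE-14520 change):
{code}
// Illustrative sketch: a bounded receive lets the test fail fast with a
// clear error instead of hanging when the expected message never arrives.
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;

public final class BoundedReceive {
  public static Message receiveOrFail(MessageConsumer consumer, long timeoutMs)
      throws JMSException {
    Message msg = consumer.receive(timeoutMs);  // returns null after timeout
    if (msg == null) {
      throw new IllegalStateException("No message within " + timeoutMs + " ms");
    }
    return msg;
  }
}
{code}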
[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14463: - Attachment: HIVE-14463.1.patch > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) 
> at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedI
[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14463: - Attachment: HIVE-14463.1.patch cc [~ashutoshc] for review. > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > 
org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.active
[jira] [Updated] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14463: - Attachment: (was: HIVE-14463.1.patch) > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14463.1.patch > > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > 
org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.T
[jira] [Commented] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421904#comment-15421904 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-14463: -- Looks like a problem similar to HIVE-14424. The query is failing and the consumer.receive() is a blocking call. Part of the issue, i.e. hang has been fixed by HIVE-14520 by converting the blocking to non-blocking calls. The other issue, i.e. the actual error is fixed by the patch attached. > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > 
org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUN
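As background to the fix described above: in JMS, MessageConsumer#receive() with no arguments blocks indefinitely, while the timed overload returns null once the wait expires, letting a test fail fast instead of hanging. A minimal sketch, assuming an ActiveMQ broker; the broker URL, topic name, and timeout are illustrative, not taken from TestMsgBusConnection or the patch.
{code:java}
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;

public class TimedReceiveSketch {
  public static void main(String[] args) throws Exception {
    ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
    Connection connection = factory.createConnection();
    connection.start();
    Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    // Illustrative topic name, not the one used by the test.
    MessageConsumer consumer = session.createConsumer(session.createTopic("test.topic"));

    // Blocking form: if the producer side already failed, this never returns
    // and the test suite hangs, as seen in the thread dump above.
    // Message msg = consumer.receive();

    // Timed form: bounded wait; a failed producer surfaces as a null message
    // and the test can fail with a diagnostic instead of hanging for days.
    Message msg = consumer.receive(30 * 1000L);
    if (msg == null) {
      throw new IllegalStateException("no message within 30s; check the producer side");
    }
    connection.close();
  }
}
{code}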
[jira] [Assigned] (HIVE-14463) hcatalog server extensions test cases getting stuck
[ https://issues.apache.org/jira/browse/HIVE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan reassigned HIVE-14463: Assignee: Hari Sankar Sivarama Subramaniyan > hcatalog server extensions test cases getting stuck > --- > > Key: HIVE-14463 > URL: https://issues.apache.org/jira/browse/HIVE-14463 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Rajat Khandelwal >Assignee: Hari Sankar Sivarama Subramaniyan > > The module is getting stuck in tests and not coming out for as long as 2 > days. > Specifically, TestMsgBusConnection is the test case which has this problem. I > ran the tests on local environment and took a thread dump after it got stuck. > {noformat} > Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode): > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@2c040428[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d89e000 nid=0x8827 waiting on condition > [0x000117b74000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "InactivityMonitor Async Task: > java.util.concurrent.ThreadPoolExecutor$Worker@182a483f[State = -1, empty > queue]" daemon prio=5 tid=0x7fe90d801000 nid=0x585f waiting on condition > [0x00011786b000] >java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x00078166f0b8> (a > java.util.concurrent.SynchronousQueue$TransferStack) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > at > java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) > at > java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359) > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:942) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp:///127.0.0.1:56883" daemon prio=5 > tid=0x7fe90c83e800 nid=0x8403 runnable [0x0001196ab000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBufferedInputStream.java:50) > at > org.apache.activemq.transport.tcp.TcpTransport$2.fill(TcpTransport.java:576) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.read(TcpBufferedInputStream.java:58) > at > 
org.apache.activemq.transport.tcp.TcpTransport$2.read(TcpTransport.java:561) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) > at > org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) > at java.lang.Thread.run(Thread.java:745) > "ActiveMQ Transport: tcp://localhost/127.0.0.1:61616" prio=5 > tid=0x7fe90b81e800 nid=0x8003 runnable [0x0001194a5000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at > org.apache.activemq.transport.tcp.TcpBufferedInputStream.fill(TcpBuffe
[jira] [Commented] (HIVE-14165) Enable faster S3 Split Computation
[ https://issues.apache.org/jira/browse/HIVE-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421901#comment-15421901 ] Abdullah Yousufi commented on HIVE-14165: - So I did try the listFiles() optimization locally and modified Hive to call the function on the root directory of a partitioned table. While this does give a speedup for a select * query on a partitioned table, this approach is not really extensible to queries that do partition elimination, since in those cases it makes sense to just pass in the relevant partitions, as Hive currently does. I'm thinking it might make sense to remove the following list call in Hive in the case of S3 partitioned tables, since the listing for the split computation is going to happen later anyway in Hadoop's FileInputFormat.java. FetchOperator.java#getNextPath()
{code}
if (fs.exists(currPath)) {
  for (FileStatus fStat : listStatusUnderPath(fs, currPath)) {
    if (fStat.getLen() > 0) {
      return true;
    }
  }
}
{code}
My question is whether it sounds good to remove this check. It seems FileInputFormat.java#getSplits() may raise errors if the partition directory does not have any files; is there a better way to handle that? > Enable faster S3 Split Computation > -- > > Key: HIVE-14165 > URL: https://issues.apache.org/jira/browse/HIVE-14165 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Abdullah Yousufi >Assignee: Abdullah Yousufi > > Split size computation may be improved by the optimizations for listFiles() > in HADOOP-13208 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
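For reference, a rough sketch of the listFiles() call being discussed: HADOOP-13208 makes the recursive variant efficient on S3 by issuing flat paged listings instead of walking each directory. This illustrates the Hadoop FileSystem API only; the path is made up and this is not the actual Hive change.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class S3ListingSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path tableRoot = new Path("s3a://bucket/warehouse/my_table"); // illustrative path
    FileSystem fs = tableRoot.getFileSystem(conf);

    // One recursive listing from the table root. Cheap on S3 once
    // HADOOP-13208 is in, but it enumerates every partition, so it only
    // helps queries that read all partitions (e.g. select *).
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(tableRoot, true);
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      if (status.getLen() > 0) { // same non-empty check as the snippet above
        System.out.println(status.getPath());
      }
    }
  }
}
{code}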
[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests
[ https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14503: - Attachment: HIVE-14503.1.patch > Remove explicit order by in qfiles for union tests > -- > > Key: HIVE-14503 > URL: https://issues.apache.org/jira/browse/HIVE-14503 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14503.1.patch > > > Identify qfiles with explicit order by and replace them with > SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests
[ https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14503: - Status: Patch Available (was: Open) > Remove explicit order by in qfiles for union tests > -- > > Key: HIVE-14503 > URL: https://issues.apache.org/jira/browse/HIVE-14503 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-14503.1.patch > > > Identify qfiles with explicit order by and replace them with > SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14527) Schema evolution tests are not running in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421857#comment-15421857 ] Prasanth Jayachandran edited comment on HIVE-14527 at 8/15/16 11:11 PM: Thinking about it, do we need to run these tests in CliDriver? With MiniLlap these tests are really fast (~18 mins). I don't think schema evolution is tied to the execution engine, so it should be safe to drop these tests from CliDriver. [~mmccline] Thoughts? was (Author: prasanth_j): Thinking about it, do we need to run these tests in CliDriver. With MiniLlap these tests are really fast (~18 mins). I don't think schema evolution is tied to the execution engine, so it should be safe to drop these tests from CliDriver. [~mmccline] Thoughts? > Schema evolution tests are not running in TestCliDriver > --- > > Key: HIVE-14527 > URL: https://issues.apache.org/jira/browse/HIVE-14527 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Matt McCline >Assignee: Prasanth Jayachandran > Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch > > > HIVE-14376 broke something that makes schema evolution tests being excluded > from TestCliDriver test suite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14527) Schema evolution tests are not running in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421857#comment-15421857 ] Prasanth Jayachandran commented on HIVE-14527: -- Thinking about it, do we need to run these tests in CliDriver? With MiniLlap these tests are really fast (~18 mins). I don't think schema evolution is tied to the execution engine, so it should be safe to drop these tests from CliDriver. [~mmccline] Thoughts? > Schema evolution tests are not running in TestCliDriver > --- > > Key: HIVE-14527 > URL: https://issues.apache.org/jira/browse/HIVE-14527 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Matt McCline >Assignee: Prasanth Jayachandran > Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch > > > HIVE-14376 broke something that makes schema evolution tests being excluded > from TestCliDriver test suite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14480) ORC ETLSplitStrategy should use thread pool when computing splits
[ https://issues.apache.org/jira/browse/HIVE-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14480: - Resolution: Fixed Fix Version/s: 2.1.1 2.2.0 Status: Resolved (was: Patch Available) Thanks for the patch [~rajesh.balamohan]! Committed patch to master and branch-2.1 > ORC ETLSplitStrategy should use thread pool when computing splits > - > > Key: HIVE-14480 > URL: https://issues.apache.org/jira/browse/HIVE-14480 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14480.1.patch, HIVE-14480.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13563) Hive Streaming does not honor orc.compress.size and orc.stripe.size table properties
[ https://issues.apache.org/jira/browse/HIVE-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421784#comment-15421784 ] Wei Zheng commented on HIVE-13563: -- [~leftylev] Wiki has been updated. > Hive Streaming does not honor orc.compress.size and orc.stripe.size table > properties > > > Key: HIVE-13563 > URL: https://issues.apache.org/jira/browse/HIVE-13563 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Labels: TODOC2.1 > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13563.1.patch, HIVE-13563.2.patch, > HIVE-13563.3.patch, HIVE-13563.4.patch, HIVE-13563.branch-1.patch > > > According to the doc: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-HiveQLSyntax > One should be able to specify tblproperties for many ORC options. > But the settings for orc.compress.size and orc.stripe.size don't take effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
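For context on what "honoring" these two properties would mean, a hedged sketch that maps them onto ORC writer options instead of letting global defaults win. The property keys come from the report above; how Hive Streaming actually threads them through in the fix is not shown here, so the wiring below is an assumption.
{code:java}
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.orc.OrcFile;

public class OrcTblPropsSketch {
  // Assumed wiring: read the two tblproperties from the report and apply
  // them to the writer options, falling back to defaults when absent.
  static OrcFile.WriterOptions fromTableProps(Configuration conf, Properties tblProps) {
    OrcFile.WriterOptions opts = OrcFile.writerOptions(conf);
    String bufferSize = tblProps.getProperty("orc.compress.size");
    if (bufferSize != null) {
      opts = opts.bufferSize(Integer.parseInt(bufferSize)); // compression buffer bytes
    }
    String stripeSize = tblProps.getProperty("orc.stripe.size");
    if (stripeSize != null) {
      opts = opts.stripeSize(Long.parseLong(stripeSize)); // target stripe bytes
    }
    return opts;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("orc.compress.size", "8192");
    props.setProperty("orc.stripe.size", "67108864");
    fromTableProps(new Configuration(), props);
    System.out.println("writer options built from table properties");
  }
}
{code}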
[jira] [Commented] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
[ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421780#comment-15421780 ] Hive QA commented on HIVE-14290: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823745/HIVE-14290.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10472 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection org.apache.hive.spark.client.TestSparkClient.testJobSubmission {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/888/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/888/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-888/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823745 - PreCommit-HIVE-MASTER-Build > Refactor HIVE-14054 to use Collections#newSetFromMap > > > Key: HIVE-14290 > URL: https://issues.apache.org/jira/browse/HIVE-14290 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Peter Slawski >Assignee: Peter Slawski >Priority: Trivial > Attachments: HIVE-14290.1.patch, HIVE-14290.1.patch, > HIVE-14290.1.patch > > > There is a minor refactor that can be made to HiveMetaStoreChecker so that it > cleanly creates and uses a set that is backed by a Map implementation. In > this case, the underlying Map implementation is ConcurrentHashMap. This > refactor will help prevent issues such as the one reported in HIVE-14054. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
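The suggested refactor is small enough to show inline. A minimal sketch; the Path element type is assumed from HiveMetaStoreChecker's context, not copied from the patch.
{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.fs.Path;

public class NewSetFromMapSketch {
  public static void main(String[] args) {
    // A Set view backed by a concurrent map: add()/contains() are
    // thread-safe and the set semantics are explicit, instead of
    // hand-rolling map.put(key, Boolean.TRUE) as a pseudo-set.
    Set<Path> seenPaths =
        Collections.newSetFromMap(new ConcurrentHashMap<Path, Boolean>());
    seenPaths.add(new Path("/warehouse/t/p=1"));
    System.out.println(seenPaths.contains(new Path("/warehouse/t/p=1"))); // true
  }
}
{code}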
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421777#comment-15421777 ] Sergey Shelukhin commented on HIVE-14511: - Ignoring the files makes sense in this case > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request
[ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421768#comment-15421768 ] Wei Zheng commented on HIVE-13354: -- [~leftylev] Wiki has been updated: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL > Add ability to specify Compaction options per table and per request > --- > > Key: HIVE-13354 > URL: https://issues.apache.org/jira/browse/HIVE-13354 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.3.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC2.1 > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13354.1.patch, > HIVE-13354.1.withoutSchemaChange.patch, HIVE-13354.2.patch, > HIVE-13354.3.patch, HIVE-13354.branch-1.patch > > > Currently the are a few options that determine when automatic compaction is > triggered. They are specified once for the warehouse. > This doesn't make sense - some table may be more important and need to be > compacted more often. > We should allow specifying these on per table basis. > Also, compaction is an MR job launched from within the metastore. There is > currently no way to control job parameters (like memory, for example) except > to specify it in hive-site.xml for metastore which means they are site wide. > Should add a way to specify these per table (perhaps even per compaction if > launched via ALTER TABLE) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421763#comment-15421763 ] Subramanyam Pattipaka edited comment on HIVE-14511 at 8/15/16 10:07 PM: [~sershe], even if we introduce another command that is flexible enough to cater to this scenario, what if the user's data has changed in terms of directory structure? Why does the user have to recreate all tables again? Why not make repair table flexible as well (with this patch), so that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are honored when adding the relevant partitions? Further, having two commands may be confusing. I don't mean that a file like a=1/00_0 should be added here. I mean only to ignore such files and list them in the error log if a config is enabled, so that users can act on them. An error is better than debug. This way, all configurations would give these details. For example, if we have the following files
tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt
tbldir/a=2/b=1/c=1/file3.txt
and we are trying to create a partitioned table with partitions on a and b with root directory tbldir, the ERROR log would say that file tbldir/a=1/file1.txt is ignored due to incorrect structure if the ignore config is set. Otherwise, the operation fails. We add only one partition, with values (2, 1). msck is still strict, and the ask here is to support the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories. was (Author: pattipaka): [~sershe], even if we introduce another command that is flexible enough to cater to this scenario, what if the user's data has changed in terms of directory structure? Why does the user have to recreate all tables again? Why not make repair table flexible as well (with this patch), so that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are honored when adding the relevant partitions? Further, having two commands may be confusing. I don't mean that a file like a=1/00_0 should be added here. I mean only to ignore such files and list them in the error log if a config is enabled, so that users can act on them. An error is better than debug. This way, all configurations would give these details. For example, if we have the following files
tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt
and we are trying to create a partitioned table with partitions on a and b with root directory tbldir, the ERROR log would say that file tbldir/a=1/file1.txt is ignored due to incorrect structure if the ignore config is set. Otherwise, the operation fails. We add only one partition, with values (2, 1). msck is still strict, and the ask here is to support the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories. > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure
[ https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14505: - Assignee: Vaibhav Gumashta (was: Hari Sankar Sivarama Subramaniyan) > Analyze > org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching > failure > > > Key: HIVE-14505 > URL: https://issues.apache.org/jira/browse/HIVE-14505 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Vaibhav Gumashta > Attachments: HIVE-14505.1.patch > > > Flaky test failure. Fails ~50% of the time locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14527) Schema evolution tests are not running in TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421766#comment-15421766 ] Prasanth Jayachandran commented on HIVE-14527: -- I verified that schema evolution tests ran in TestCliDriver in last run. Other failures are unrelated. [~sseth] Can you please take a look? > Schema evolution tests are not running in TestCliDriver > --- > > Key: HIVE-14527 > URL: https://issues.apache.org/jira/browse/HIVE-14527 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Matt McCline >Assignee: Prasanth Jayachandran > Attachments: HIVE-14527.1.patch, HIVE-14527.2.patch > > > HIVE-14376 broke something that makes schema evolution tests being excluded > from TestCliDriver test suite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421763#comment-15421763 ] Subramanyam Pattipaka commented on HIVE-14511: -- [~sershe], even if we introduce another command that is flexible enough to cater to this scenario, what if the user's data has changed in terms of directory structure? Why does the user have to recreate all tables again? Why not make repair table flexible as well (with this patch), so that the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories are honored when adding the relevant partitions? Further, having two commands may be confusing. I don't mean that a file like a=1/00_0 should be added here. I mean only to ignore such files and list them in the error log if a config is enabled, so that users can act on them. An error is better than debug. This way, all configurations would give these details. For example, if we have the following files
tbldir/a=1/file1.txt
tbldir/a=2/b=1/file2.txt
and we are trying to create a partitioned table with partitions on a and b with root directory tbldir, the ERROR log would say that file tbldir/a=1/file1.txt is ignored due to incorrect structure if the ignore config is set. Otherwise, the operation fails. We add only one partition, with values (2, 1). msck is still strict, and the ask here is to support the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories. > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421717#comment-15421717 ] Sergey Shelukhin commented on HIVE-14511: - Shouldn't the table schema inform the correct partition directory structure? So, in the above case, if the table has a p1 partition column, the partition should be added and file1 should follow the setting (ignore/fail); likewise if it doesn't. I actually wonder if the patch should be updated to look for a specific level. I.e., if the table is partitioned on a and b, adding an a=1/00_0 file makes no sense. This brings it back to using the right tools for the right job. msck needs to be strict, as it's primarily intended for repair, and the use for ETL is incidental. If we need a "load my partitions" command that is more flexible for ETL, it should be a separate feature... > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
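To make the level check concrete, a hedged sketch of the validation being discussed, reusing the example file layout from the thread: a data file must sit exactly one name=value directory per partition column below the table root. The method name and boolean contract are illustrative, not from the attached patch; whether a mismatch is ignored or fails would be driven by the config debated above.
{code:java}
import java.util.Arrays;
import java.util.List;

public class PartitionDepthCheck {
  // True when a file's path relative to the table root matches the partition
  // spec exactly: one name=value directory per partition column, then the file.
  static boolean matchesPartitionSpec(String relativePath, List<String> partCols) {
    String[] parts = relativePath.split("/");
    if (parts.length != partCols.size() + 1) { // N partition dirs + the file itself
      return false;
    }
    for (int i = 0; i < partCols.size(); i++) {
      if (!parts[i].startsWith(partCols.get(i) + "=")) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("a", "b");
    System.out.println(matchesPartitionSpec("a=2/b=1/file2.txt", cols));     // true
    System.out.println(matchesPartitionSpec("a=1/file1.txt", cols));         // false: too shallow
    System.out.println(matchesPartitionSpec("a=2/b=1/c=1/file3.txt", cols)); // false: too deep
  }
}
{code}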
[jira] [Updated] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable
[ https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14533: - Resolution: Fixed Fix Version/s: 2.2.0 1.3.0 Status: Resolved (was: Patch Available) Thanks for the patch [~tfriedr]! Committed patch to master and branch-1. > improve performance of enforceMaxLength in > HiveCharWritable/HiveVarcharWritable > --- > > Key: HIVE-14533 > URL: https://issues.apache.org/jira/browse/HIVE-14533 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 1.2.1, 2.1.0 >Reporter: Thomas Friedrich >Assignee: Thomas Friedrich >Priority: Minor > Labels: performance > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-14533.patch > > > The enforceMaxLength method in HiveVarcharWritable calls > set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), > maxLength); no matter how long the string is. The calls to getHiveVarchar() > and getHiveChar() decode the string every time the method is called > (Text.toString() calls Text.decode). This can be very expensive and is > unnecessary if the string is shorter than maxLength for HiveVarcharWritable > or different than maxLength for HiveCharWritable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
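For context on why skipping the decode matters, a toy sketch of the varchar-side shortcut, relying only on the fact that a string's character count can never exceed its UTF-8 byte count. This is not the actual HiveVarcharWritable code; the buffer field stands in for the Text value the writables keep.
{code:java}
import java.nio.charset.StandardCharsets;

public class EnforceMaxLengthSketch {
  private byte[] utf8; // stand-in for the Text buffer the writables keep

  EnforceMaxLengthSketch(String s) {
    utf8 = s.getBytes(StandardCharsets.UTF_8);
  }

  void enforceMaxLength(int maxLength) {
    // A string of N characters occupies at least N UTF-8 bytes, so when the
    // byte length already fits, no truncation is possible and the expensive
    // decode (the Text.toString() path described above) can be skipped.
    if (utf8.length <= maxLength) {
      return;
    }
    String decoded = new String(utf8, StandardCharsets.UTF_8); // pay for the decode only now
    if (decoded.length() > maxLength) {
      utf8 = decoded.substring(0, maxLength).getBytes(StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) {
    EnforceMaxLengthSketch v = new EnforceMaxLengthSketch("short");
    v.enforceMaxLength(10); // 5 bytes <= 10: returns without decoding
    System.out.println("no decode needed");
  }
}
{code}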
[jira] [Commented] (HIVE-14503) Remove explicit order by in qfiles for union tests
[ https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421703#comment-15421703 ] Prasanth Jayachandran commented on HIVE-14503: -- I will batch the tests based on features and put up separate patches so that it's easier to review. > Remove explicit order by in qfiles for union tests > -- > > Key: HIVE-14503 > URL: https://issues.apache.org/jira/browse/HIVE-14503 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Identify qfiles with explicit order by and replace them with > SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14533) improve performance of enforceMaxLength in HiveCharWritable/HiveVarcharWritable
[ https://issues.apache.org/jira/browse/HIVE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421700#comment-15421700 ] Thomas Friedrich commented on HIVE-14533: - I checked the test failures and they are not related to the patch. The Tez-related tests fail in other prebuilds as well and I ran TestJdbcWithMiniHS2.testAddJarConstructorUnCaching successfully locally. > improve performance of enforceMaxLength in > HiveCharWritable/HiveVarcharWritable > --- > > Key: HIVE-14533 > URL: https://issues.apache.org/jira/browse/HIVE-14533 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 1.2.1, 2.1.0 >Reporter: Thomas Friedrich >Assignee: Thomas Friedrich >Priority: Minor > Labels: performance > Attachments: HIVE-14533.patch > > > The enforceMaxLength method in HiveVarcharWritable calls > set(getHiveVarchar(), maxLength); and in HiveCharWritable set(getHiveChar(), > maxLength); no matter how long the string is. The calls to getHiveVarchar() > and getHiveChar() decode the string every time the method is called > (Text.toString() calls Text.decode). This can be very expensive and is > unnecessary if the string is shorter than maxLength for HiveVarcharWritable > or different than maxLength for HiveCharWritable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14503) Remove explicit order by in qfiles for union tests
[ https://issues.apache.org/jira/browse/HIVE-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14503: - Summary: Remove explicit order by in qfiles for union tests (was: Remove explicit order by in qfiles and replace them with SORT_QUERY_RESULTS) > Remove explicit order by in qfiles for union tests > -- > > Key: HIVE-14503 > URL: https://issues.apache.org/jira/browse/HIVE-14503 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Identify qfiles with explicit order by and replace them with > SORT_QUERY_RESULTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14170) Beeline IncrementalRows should buffer rows and incrementally re-calculate width if TableOutputFormat is used
[ https://issues.apache.org/jira/browse/HIVE-14170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421695#comment-15421695 ] Sahil Takiar commented on HIVE-14170: - [~taoli-hwx] or [~thejas] any other comments on this? --Sahil > Beeline IncrementalRows should buffer rows and incrementally re-calculate > width if TableOutputFormat is used > > > Key: HIVE-14170 > URL: https://issues.apache.org/jira/browse/HIVE-14170 > Project: Hive > Issue Type: Sub-task > Components: Beeline >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-14170.1.patch, HIVE-14170.2.patch, > HIVE-14170.3.patch, HIVE-14170.4.patch > > > If {{--incremental}} is specified in Beeline, rows are meant to be printed > out immediately. However, if {{TableOutputFormat}} is used with this option > the formatting can look really off. > The reason is that {{IncrementalRows}} does not do a global calculation of > the optimal width size for {{TableOutputFormat}} (it can't because it only > sees one row at a time). The output of {{BufferedRows}} looks much better > because it can do this global calculation. > If {{--incremental}} is used, and {{TableOutputFormat}} is used, the width > should be re-calculated every "x" rows ("x" can be configurable and by > default it can be 1000). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
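The proposed behavior is easy to sketch: buffer up to a window of rows, derive column widths from that window, print, and repeat. A rough illustration, not Beeline's actual classes; the description above suggests 1000 as the default window, while the demo below uses 2 to keep the output short.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IncrementalWidthSketch {
  // Print rows in windows, recomputing column widths per window, so the
  // formatting approaches BufferedRows' quality at bounded memory.
  static void printIncrementally(Iterator<String[]> rows, int window) {
    List<String[]> buffer = new ArrayList<>();
    while (rows.hasNext()) {
      buffer.add(rows.next());
      if (buffer.size() == window || !rows.hasNext()) {
        int cols = buffer.get(0).length;
        int[] widths = new int[cols];
        Arrays.fill(widths, 1); // format widths must stay positive
        for (String[] row : buffer) {
          for (int c = 0; c < cols; c++) {
            widths[c] = Math.max(widths[c], row[c].length());
          }
        }
        for (String[] row : buffer) {
          StringBuilder line = new StringBuilder();
          for (int c = 0; c < cols; c++) {
            line.append(String.format("| %-" + widths[c] + "s ", row[c]));
          }
          System.out.println(line.append('|'));
        }
        buffer.clear(); // widths reset for the next window
      }
    }
  }

  public static void main(String[] args) {
    List<String[]> data = Arrays.asList(
        new String[] {"1", "alice"},
        new String[] {"2", "bob"},
        new String[] {"3", "charlotte"});
    printIncrementally(data.iterator(), 2); // re-derive widths every 2 rows
  }
}
{code}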
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421694#comment-15421694 ] Subramanyam Pattipaka commented on HIVE-14511: -- Yes, that's correct. We should also check that no files exist at levels above the required partition depth. Maybe that's what you wanted here? For example, if files like tbldir/file1 or tbldir/p1=1/file2 exist, then partition creation should fail. If the ignore config option is set, then we should probably move ahead and ignore these files. But please log them in debug mode so that they can be collected; the user may want to act on the whole list instead of deleting one file at a time and rerunning msck. > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13930) upgrade Hive to latest Hadoop version
[ https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421691#comment-15421691 ] Sahil Takiar edited comment on HIVE-13930 at 8/15/16 9:19 PM: -- Sorry for the delay, I was out of the office for a few weeks. I looked into this some more and believe I found the root cause. Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each run. It just does a git pull, git clean, and mvn clean before the job starts. Looking at the itests/pom.xml file (contains the script to download the Spark tar-ball), it seems that the tar-ball will not be downloaded if it is already present on the local filesystem. So even though the file on S3 has been updated, the PTest2 Infra will not re-download it. This explains why the error is still occurring. I can think of a few solutions to this: 1: Simply delete the file on the PTest2 Infra Master (/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz). This should trigger the build to download the new version of the tar-ball. This may cause HoS itests to fail in other Hive QA runs since the new tar-ball includes Hadoop 2.7 jars, but it should be fine. 2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean is invoked. Nice because it will avoid this in the future, at least until HIVE-14240 has been resolved. 3: Re-name the Spark tar-ball to something like spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and update the itests/pom.xml file to use the new name (the file name may need to be updated in a few other places) was (Author: stakiar): Sorry for the delay, I was out of the office for a few weeks. I looked into this some more and believe I found the root cause. Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each run. It just does a git pull, git clean, and mvn clean before the job starts. Looking at the itests/pom.xml file (contains the script to download the Spark tar-ball), it seems that the tar-ball will not be downloaded if it is already present on the local filesystem. So even though the file on S3 has been updated, the PTest2 Infra will not re-download it. This explains why the error is still occurring. I can think of a few solutions to this: 1: Simply delete the file on the PTest2 Infra Master (/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz). This should trigger the build to download the new version of the tar-ball. This may cause HoS itests to fail in other Hive QA runs since the new tar-ball includes Hadoop 2.7 jars, but it should be fine. 2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean is invoked. Nice because it will avoid this in the future, at least until HIVE-14240 has been resolved. 
3: Re-name the Spark tar-ball to something like spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and update the itests/pom.xml file to use the new name (the file name may need to be updated in a few other places) > upgrade Hive to latest Hadoop version > - > > Key: HIVE-13930 > URL: https://issues.apache.org/jira/browse/HIVE-13930 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, > HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, > HIVE-13930.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13930) upgrade Hive to latest Hadoop version
[ https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421691#comment-15421691 ] Sahil Takiar commented on HIVE-13930: - Sorry for the delay, I was out of the office for a few weeks. I looked into this some more and believe I found the root cause. Based on the logs from a Jenkins job, the Hive PTest2 Infra Master (runs on EC2) doesn't do a fresh clone of the Hive repo, it uses the same repo for each run. It just does a git pull, git clean, and mvn clean before the job starts. Looking at the itests/pom.xml file (contains the script to download the Spark tar-ball), it seems that the tar-ball will not be downloaded if it is already present on the local filesystem. So even though the file on S3 has been updated, the PTest2 Infra will not re-download it. This explains why the error is still occurring. I can think of a few solutions to this: 1: Simply delete the file on the PTest2 Infra Master (/data/hive-ptest/working/apache-github-source-source/itests/thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz). This should trigger the build to download the new version of the tar-ball. This may cause HoS itests to fail in other Hive QA runs since the new tar-ball includes Hadoop 2.7 jars, but it should be fine. 2: Merge HIVE-12984 - this patch will delete the Spark tar-ball when mvn clean is invoked. Nice because it will avoid this in the future, at least until HIVE-14240 has been resolved. 3: Re-name the Spark tar-ball to something like spark-${spark.version}-bin-hadoop2.7-without-hive (instead of -hadoop2-), and update the itests/pom.xml file to use the new name (the file name may need to be updated in a few other places) > upgrade Hive to latest Hadoop version > - > > Key: HIVE-13930 > URL: https://issues.apache.org/jira/browse/HIVE-13930 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, > HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, > HIVE-13930.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 9:09 PM: Example compressor attached. Configure the server with
{code:xml}
<property>
  <name>hive.server2.thrift.resultset.compressor.list</name>
  <value>.</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat} Alternatively you can use the Docker image that I have been using for development: https://github.com/kliewkliew/docker-hive-dev/tree/HIVE-13680 was (Author: kliew): Example compressor attached. Configure the server with
{code:xml}
<property>
  <name>hive.server2.thrift.resultset.compressor.list</name>
  <value>snappy.snappy</value>
</property>
<property>
  <name>hive.server2.thrift.resultset.serialize.in.tasks</name>
  <value>true</value>
</property>
{code}
, add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421676#comment-15421676 ] Ashutosh Chauhan commented on HIVE-14418: - Can you add a comment in the code, and here on the JIRA in the release notes, on how reset differs from unset? > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14483: Resolution: Fixed Fix Version/s: 2.0.2 2.1.1 1.3.0 Status: Resolved (was: Patch Available) Committed to branches. > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays > -- > > Key: HIVE-14483 > URL: https://issues.apache.org/jira/browse/HIVE-14483 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Sergey Zadoroshnyak >Assignee: Sergey Zadoroshnyak >Priority: Critical > Fix For: 1.3.0, 2.2.0, 2.1.1, 2.0.2 > > Attachments: HIVE-14483.01.patch > > > Error message: > Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024 > at > org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268) > at > org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368) > at > org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212) > at > org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) > at > org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737) > at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350) > ... 22 more > How to reproduce? > Configure StringTreeReader which contains StringDirectTreeReader as > TreeReader (DIRECT or DIRECT_V2 column encoding) > batchSize = 1026; > invoke method nextVector(ColumnVector previousVector,boolean[] isNull, final > int batchSize) > scratchlcv is LongColumnVector with long[] vector (length 1024) > which execute BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, > scratchlcv,result, batchSize); > as result in method commonReadByteArrays(stream, lengths, scratchlcv, > result, (int) batchSize) we received > ArrayIndexOutOfBoundsException. > If we use StringDictionaryTreeReader, then there is no exception, as we have > a verification scratchlcv.ensureSize((int) batchSize, false) before > reader.nextVector(scratchlcv, scratchlcv.vector, batchSize); > These changes were made for Hive 2.1.0 by corresponding commit > https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467 > for task https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley > How to fix? 
> add only one line: scratchlcv.ensureSize((int) batchSize, false); in the method org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream stream, IntegerReader lengths, LongColumnVector scratchlcv, BytesColumnVector result, final int batchSize), before the invocation lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
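The mechanics of the reported fix in miniature: a toy stand-in for the ensureSize guard, not ORC code, showing why a 1024-slot scratch vector overflows at batch size 1026 and how growing it first avoids the exception.
{code:java}
public class EnsureSizeSketch {
  // Toy stand-in for LongColumnVector#ensureSize: grow the backing array
  // when the caller asks for a larger batch than the current capacity.
  static long[] ensureSize(long[] vector, int size) {
    return size <= vector.length ? vector : new long[size];
  }

  public static void main(String[] args) {
    int batchSize = 1026;            // the failing batch size from the report
    long[] scratch = new long[1024]; // LongColumnVector's default capacity

    // Without this guard, writing batchSize entries throws
    // ArrayIndexOutOfBoundsException: 1024, matching the stack trace above.
    scratch = ensureSize(scratch, batchSize);
    for (int i = 0; i < batchSize; i++) {
      scratch[i] = i;
    }
    System.out.println("wrote " + batchSize + " length entries safely");
  }
}
{code}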
[jira] [Updated] (HIVE-14189) backport HIVE-13945 to branch-1
[ https://issues.apache.org/jira/browse/HIVE-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14189: Attachment: HIVE-14189.05-branch-1.patch Same patch again... > backport HIVE-13945 to branch-1 > --- > > Key: HIVE-14189 > URL: https://issues.apache.org/jira/browse/HIVE-14189 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC1.3 > Attachments: HIVE-14189-branch-1.patch, HIVE-14189.01-branch-1.patch, > HIVE-14189.02-branch-1.patch, HIVE-14189.03-branch-1.patch, > HIVE-14189.04-branch-1.patch, HIVE-14189.05-branch-1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14189) backport HIVE-13945 to branch-1
[ https://issues.apache.org/jira/browse/HIVE-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421624#comment-15421624 ] Sergey Shelukhin commented on HIVE-14189: - [~spena] is it still supposed to work? It seems it never gets picked up. I wonder if it's just HiveQA operating as usual, or something specific to branch-1 > backport HIVE-13945 to branch-1 > --- > > Key: HIVE-14189 > URL: https://issues.apache.org/jira/browse/HIVE-14189 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Labels: TODOC1.3 > Attachments: HIVE-14189-branch-1.patch, HIVE-14189.01-branch-1.patch, > HIVE-14189.02-branch-1.patch, HIVE-14189.03-branch-1.patch, > HIVE-14189.04-branch-1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14418: Attachment: HIVE-14418.03.patch Hmm, the old approach makes it impossible to set a string parameter to an empty string. Adding UnsetProcessor and the explicit unset command. [~ashutoshc] can you review the new patch? It even has tests ;) > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
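For the set-to-empty problem mentioned above, a loose sketch using Hadoop's Configuration API (which HiveConf extends); only the need for an unset operation comes from this thread, and the actual UnsetProcessor may behave differently.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class UnsetSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("hive.tez.task.scale.memory.reserve.fraction.max", "0.5");

    // Setting the key to "" trips FLOAT validation (the bug above), and for
    // string parameters "" is itself a legal value, so it cannot double as
    // "remove this override". Deleting the entry is unambiguous: the key
    // simply falls back to whatever default applies.
    conf.unset("hive.tez.task.scale.memory.reserve.fraction.max");

    System.out.println(
        conf.get("hive.tez.task.scale.memory.reserve.fraction.max")); // null
  }
}
{code}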
[jira] [Updated] (HIVE-14290) Refactor HIVE-14054 to use Collections#newSetFromMap
[ https://issues.apache.org/jira/browse/HIVE-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Slawski updated HIVE-14290: - Attachment: HIVE-14290.1.patch Re-uploading patch to trigger Hive QA > Refactor HIVE-14054 to use Collections#newSetFromMap > > > Key: HIVE-14290 > URL: https://issues.apache.org/jira/browse/HIVE-14290 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 2.1.0 >Reporter: Peter Slawski >Assignee: Peter Slawski >Priority: Trivial > Attachments: HIVE-14290.1.patch, HIVE-14290.1.patch, > HIVE-14290.1.patch > > > There is a minor refactor that can be made to HiveMetaStoreChecker so that it > cleanly creates and uses a set that is backed by a Map implementation. In > this case, the underlying Map implementation is ConcurrentHashMap. This > refactor will help prevent issues such as the one reported in HIVE-14054. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
[ https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421596#comment-15421596 ] Hive QA commented on HIVE-12806: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12781930/HIVE-12806.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/887/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/887/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-887/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]] + export JAVA_HOME=/usr/java/jdk1.8.0_25 + JAVA_HOME=/usr/java/jdk1.8.0_25 + export PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-887/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at e841edc HIVE-14345 : Beeline result table has erroneous characters (Miklos Csanady via Ashutosh Chauhan) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at e841edc HIVE-14345 : Beeline result table has erroneous characters (Miklos Csanady via Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12781930 - PreCommit-HIVE-MASTER-Build > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure > --- > > Key: HIVE-12806 > URL: https://issues.apache.org/jira/browse/HIVE-12806 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Vineet Garg > Attachments: HIVE-12806.1.patch > > > Step to reproduce: > mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q > -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true > Query : > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > Stack trace : > {code} > 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: > SessionState (SessionState.java:printError(1010)) - Vertex failed, > vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, > diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] k\ > illed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, > vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: > java.lang.RuntimeException: Failed to load plan: null: > java.lang.IllegalArgumentException: java.net.URISyntaxException: \ > Relative path in absolute URI: subq1:amerge.xml > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640) > a
[jira] [Commented] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure
[ https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421590#comment-15421590 ] Hive QA commented on HIVE-14505: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12823722/HIVE-14505.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10470 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_2] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[load_dyn_part1] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez_join_hash] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[transform_ppr1] org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/886/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/886/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-886/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12823722 - PreCommit-HIVE-MASTER-Build > Analyze > org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching > failure > > > Key: HIVE-14505 > URL: https://issues.apache.org/jira/browse/HIVE-14505 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14505.1.patch > > > Flaky test failure. Fails ~50% of the time locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14405) Have tests log to the console along with hive.log
[ https://issues.apache.org/jira/browse/HIVE-14405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421583#comment-15421583 ] Siddharth Seth commented on HIVE-14405: --- hadoop.ipc etc. is already set up to log at a higher level. I'll see if I can change console logging to INFO level; otherwise I will commit the patch as is. > Have tests log to the console along with hive.log > - > > Key: HIVE-14405 > URL: https://issues.apache.org/jira/browse/HIVE-14405 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14405.01.patch > > > When running tests from the IDE (not itests), logs end up going to hive.log - > making it difficult to debug tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
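A minimal Java sketch of the kind of adjustment being discussed (illustrative only, not the attached patch; the logger name org.apache.hadoop.ipc follows the hadoop.ipc example from the comment above):
{code:java}
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;

// Illustrative sketch: keep general (console) logging at INFO while quieting
// packages such as hadoop.ipc that are already intended to log at a higher level.
public class TestLoggingSketch {
  public static void main(String[] args) {
    Configurator.setRootLevel(Level.INFO);
    Configurator.setLevel("org.apache.hadoop.ipc", Level.WARN);
  }
}
{code}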
[jira] [Assigned] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
[ https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-12806: -- Assignee: Vineet Garg > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure > --- > > Key: HIVE-12806 > URL: https://issues.apache.org/jira/browse/HIVE-12806 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Vineet Garg > Attachments: HIVE-12806.1.patch > > > Step to reproduce: > mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q > -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true > Query : > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > Stack trace : > {code} > 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: > SessionState (SessionState.java:printError(1010)) - Vertex failed, > vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, > diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] k\ > illed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, > vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: > java.lang.RuntimeException: Failed to load plan: null: > java.lang.IllegalArgumentException: java.net.URISyntaxException: \ > Relative path in absolute URI: subq1:amerge.xml > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:640) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:629) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: Failed to load plan: null: > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative > path in absolute URI: subq1:amerge.xml > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451) > at > org.apache.hadoop.hive.ql.exec.Utilities.getMergeWork(Utilities.java:339) > at > org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.populateMapWork(SplitGrouper.java:260) > at > org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:172) > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:277) > ... 
12 more > Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: subq1:amerge.xml > at org.apache.hadoop.fs.Path.initialize(Path.java:206) > at org.apache.hadoop.fs.Path.(Path.java:172) > at org.apache.hadoop.fs.Path.(Path.java:94) > at > org.apache.hadoop.hive.ql.exec.Utilities.getPlanPath(Utilities.java:588) > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:387) > ... 16 more > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > subq1:amerge.xml > at java.net.URI.checkPath(URI.java:1804) > at java.net.URI.(URI.java:752) > at org.apache.hadoop.fs.Path.initialize(Path.java:203) > ... 20 more > ] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
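The "Relative path in absolute URI" failure above can be reproduced with Hadoop's Path class alone; a minimal sketch, assuming only the pathname taken from the stack trace:
{code:java}
import org.apache.hadoop.fs.Path;

// Minimal reproduction sketch (not Hive code): Path treats everything before the first
// ':' as a URI scheme when the string has no leading '/'. The merge-plan key
// "subq1:amerge.xml" from the stack trace is therefore parsed as scheme "subq1" plus
// the relative path "amerge.xml", which java.net.URI rejects.
public class PathColonRepro {
  public static void main(String[] args) {
    // Throws IllegalArgumentException caused by java.net.URISyntaxException:
    //   Relative path in absolute URI: subq1:amerge.xml
    Path broken = new Path("subq1:amerge.xml");
    System.out.println(broken);
  }
}
{code}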
[jira] [Commented] (HIVE-12806) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure
[ https://issues.apache.org/jira/browse/HIVE-12806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421505#comment-15421505 ] Vineet Garg commented on HIVE-12806: Still failing. I'll take a look > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver vector_auto_smb_mapjoin_14.q failure > --- > > Key: HIVE-12806 > URL: https://issues.apache.org/jira/browse/HIVE-12806 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12806.1.patch > > > Step to reproduce: > mvn test -Dtest=TestMiniTezCliDriver -Dqfile=vector_auto_smb_mapjoin_14.q > -Dhive.cbo.returnpath.hiveop=true -Dtest.output.overwrite=true > Query : > {code} > select count(*) from ( > select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 > b on a.key = b.key > ) subq1 > {code} > Stack trace : > {code} > 2016-01-07T14:08:04,803 ERROR [da534038-d792-4d16-86e9-87b9f971adda main[]]: > SessionState (SessionState.java:printError(1010)) - Vertex failed, > vertexName=Map 1, vertexId=vertex_1452204324051_0001_33_00, > diagnostics=[Vertex vertex_1452204324051_0001_33_00 [Map 1] k\ > illed/failed due to:AM_USERCODE_FAILURE, Exception in VertexManager, > vertex:vertex_1452204324051_0001_33_00 [Map 1], java.lang.RuntimeException: > java.lang.RuntimeException: Failed to load plan: null: > java.lang.IllegalArgumentException: java.net.URISyntaxException: \ > Relative path in absolute URI: subq1:amerge.xml > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:314) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventRootInputInitialized.invoke(VertexManager.java:624) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:645) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:640) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:640) > at > org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:629) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: Failed to load plan: null: > java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative > path in absolute URI: subq1:amerge.xml > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451) > at > org.apache.hadoop.hive.ql.exec.Utilities.getMergeWork(Utilities.java:339) > at > org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.populateMapWork(SplitGrouper.java:260) > at > org.apache.hadoop.hive.ql.exec.tez.SplitGrouper.generateGroupedSplits(SplitGrouper.java:172) > at > org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.onRootVertexInitialized(CustomPartitionVertex.java:277) > ... 
12 more > Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: subq1:amerge.xml > at org.apache.hadoop.fs.Path.initialize(Path.java:206) > at org.apache.hadoop.fs.Path.(Path.java:172) > at org.apache.hadoop.fs.Path.(Path.java:94) > at > org.apache.hadoop.hive.ql.exec.Utilities.getPlanPath(Utilities.java:588) > at > org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:387) > ... 16 more > Caused by: java.net.URISyntaxException: Relative path in absolute URI: > subq1:amerge.xml > at java.net.URI.checkPath(URI.java:1804) > at java.net.URI.(URI.java:752) > at org.apache.hadoop.fs.Path.initialize(Path.java:203) > ... 20 more > ] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14480) ORC ETLSplitStrategy should use thread pool when computing splits
[ https://issues.apache.org/jira/browse/HIVE-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421480#comment-15421480 ] Sergey Shelukhin commented on HIVE-14480: - Already +1d above... > ORC ETLSplitStrategy should use thread pool when computing splits > - > > Key: HIVE-14480 > URL: https://issues.apache.org/jira/browse/HIVE-14480 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14480.1.patch, HIVE-14480.2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.
[ https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-11581: - Assignee: Vaibhav Gumashta (was: Arpit Gupta) > HiveServer2 should store connection params in ZK when using dynamic service > discovery for simpler client connection string. > --- > > Key: HIVE-11581 > URL: https://issues.apache.org/jira/browse/HIVE-11581 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 1.3.0, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Labels: TODOC1.3 > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, > HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch > > > Currently, the client needs to specify several parameters based on which an > appropriate connection is created with the server. In case of dynamic service > discovery, when multiple HS2 instances are running, it is much more usable > for the server to add its config parameters to ZK which the driver can use to > configure the connection, instead of the jdbc/odbc user adding those in > connection string. > However, at minimum, client will need to specify zookeeper ensemble and that > she wants the JDBC driver to use ZooKeeper: > {noformat} > beeline> !connect > jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 > vgumashta vgumashta org.apache.hive.jdbc.HiveDriver > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11581) HiveServer2 should store connection params in ZK when using dynamic service discovery for simpler client connection string.
[ https://issues.apache.org/jira/browse/HIVE-11581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta reassigned HIVE-11581: -- Assignee: Arpit Gupta (was: Vaibhav Gumashta) > HiveServer2 should store connection params in ZK when using dynamic service > discovery for simpler client connection string. > --- > > Key: HIVE-11581 > URL: https://issues.apache.org/jira/browse/HIVE-11581 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 1.3.0, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Arpit Gupta > Labels: TODOC1.3 > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11581.1.patch, HIVE-11581.2.patch, > HIVE-11581.3.patch, HIVE-11581.3.patch, HIVE-11581.4.patch > > > Currently, the client needs to specify several parameters based on which an > appropriate connection is created with the server. In case of dynamic service > discovery, when multiple HS2 instances are running, it is much more usable > for the server to add its config parameters to ZK which the driver can use to > configure the connection, instead of the jdbc/odbc user adding those in > connection string. > However, at minimum, client will need to specify zookeeper ensemble and that > she wants the JDBC driver to use ZooKeeper: > {noformat} > beeline> !connect > jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 > vgumashta vgumashta org.apache.hive.jdbc.HiveDriver > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
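As background, the connection string above already names the ZooKeeper ensemble and the hiveserver2 namespace. A rough client-side sketch of the discovery idea, using Apache Curator (the znode layout and contents shown are assumptions for illustration, not what the patch actually writes):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.List;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Illustrative sketch only: each live HS2 instance is assumed to register a child znode
// under the "hiveserver2" namespace, whose data would carry the connection parameters
// the JDBC driver needs (per this issue). Ensemble string is from the comment above.
public class Hs2ZkDiscoverySketch {
  public static void main(String[] args) throws Exception {
    String ensemble = "vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183";
    try (CuratorFramework zk = CuratorFrameworkFactory.newClient(
        ensemble, new ExponentialBackoffRetry(1000, 3))) {
      zk.start();
      List<String> servers = zk.getChildren().forPath("/hiveserver2"); // one child per live HS2
      byte[] data = zk.getData().forPath("/hiveserver2/" + servers.get(0));
      System.out.println(new String(data, StandardCharsets.UTF_8));
    }
  }
}
{code}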
[jira] [Commented] (HIVE-12203) CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results
[ https://issues.apache.org/jira/browse/HIVE-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421433#comment-15421433 ] Vineet Garg commented on HIVE-12203: Interestingly, I am seeing a NullPointerException on my local system. I am going to take a look at this. > CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results > --- > > Key: HIVE-12203 > URL: https://issues.apache.org/jira/browse/HIVE-12203 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12203.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12203) CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results
[ https://issues.apache.org/jira/browse/HIVE-12203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-12203: -- Assignee: Vineet Garg (was: Jesus Camacho Rodriguez) > CBO (Calcite Return Path): groupby_grouping_id2.q returns wrong results > --- > > Key: HIVE-12203 > URL: https://issues.apache.org/jira/browse/HIVE-12203 > Project: Hive > Issue Type: Sub-task > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Vineet Garg > Attachments: HIVE-12203.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:16 PM: Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} was (Author: kliew): Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM: Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat} was (Author: kliew): Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM: Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} was (Author: kliew): Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=.{noformat} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM: Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code} was (Author: kliew): Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{code} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:12 PM: Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {noformat}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{noformat} was (Author: kliew): Example compressor attached. Configure the server with {code:xml} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM: Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{code} was (Author: kliew): Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:11 PM: Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list . hive.server2.thrift.resultset.serialize.in.tasks true {code} , add the example CompDe to the Hive lib folder, and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=.{code} was (Author: kliew): Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{code} > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421390#comment-15421390 ] Kevin Liew edited comment on HIVE-13680 at 8/15/16 6:10 PM: Example compressor attached. Configure the server with {code} hive.server2.thrift.resultset.server.compressor.list snappy.snappy hive.server2.thrift.resultset.serialize.in.tasks true {code} and start beeline with {code}--hiveconf hive.server2.thrift.resultset.compressor.list=snappy.snappy{code} was (Author: kliew): Example Snappy compressor attached. > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Attachment: SnappyCompDe.zip Example Snappy compressor attached. > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421386#comment-15421386 ] Wei Zheng commented on HIVE-13249: -- [~leftylev] Wiki has been updated > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Labels: TODOC1.3, TODOC2.1 > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13249.1.patch, HIVE-13249.10.patch, > HIVE-13249.11.patch, HIVE-13249.12.patch, HIVE-13249.2.patch, > HIVE-13249.3.patch, HIVE-13249.4.patch, HIVE-13249.5.patch, > HIVE-13249.6.patch, HIVE-13249.7.patch, HIVE-13249.8.patch, > HIVE-13249.9.patch, HIVE-13249.branch-1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure
[ https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14505: - Status: Patch Available (was: Open) > Analyze > org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching > failure > > > Key: HIVE-14505 > URL: https://issues.apache.org/jira/browse/HIVE-14505 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14505.1.patch > > > Flaky test failure. Fails ~50% of the time locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14505) Analyze org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching failure
[ https://issues.apache.org/jira/browse/HIVE-14505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-14505: - Attachment: HIVE-14505.1.patch > Analyze > org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching > failure > > > Key: HIVE-14505 > URL: https://issues.apache.org/jira/browse/HIVE-14505 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14505.1.patch > > > Flaky test failure. Fails ~50% of the time locally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp
[ https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421358#comment-15421358 ] Xuefu Zhang commented on HIVE-14412: [~lirui], your proposal looks good to me, especially since it's backward compatible. I'm not sure whether this has any impact on vectorization, but it would be great to make the encoding work well in vectorized mode. > Add a timezone-aware timestamp > -- > > Key: HIVE-14412 > URL: https://issues.apache.org/jira/browse/HIVE-14412 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, > HIVE-14412.1.patch > > > Java's Timestamp stores the time elapsed since the epoch. While it's by > itself unambiguous, ambiguity comes when we parse a string into timestamp, or > convert a timestamp to string, causing problems like HIVE-14305. > To solve the issue, I think we should make timestamp aware of timezone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
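The ambiguity described in the issue is easy to demonstrate in plain Java: parsing the same string yields different instants depending on the JVM's default time zone (a minimal sketch):
{code:java}
import java.sql.Timestamp;
import java.util.TimeZone;

// Demonstrates the ambiguity described above: java.sql.Timestamp holds an unambiguous
// epoch offset, but parsing the SAME string yields different instants depending on the
// JVM default time zone.
public class TimestampAmbiguity {
  public static void main(String[] args) {
    TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
    long utc = Timestamp.valueOf("2016-08-15 12:00:00").getTime();

    TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
    long pst = Timestamp.valueOf("2016-08-15 12:00:00").getTime();

    System.out.println(pst - utc); // 25200000 ms = 7 hours (PDT offset)
  }
}
{code}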
[jira] [Commented] (HIVE-12637) make retryable SQLExceptions in TxnHandler configurable
[ https://issues.apache.org/jira/browse/HIVE-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421357#comment-15421357 ] Wei Zheng commented on HIVE-12637: -- [~leftylev] Wiki has been updated. > make retryable SQLExceptions in TxnHandler configurable > --- > > Key: HIVE-12637 > URL: https://issues.apache.org/jira/browse/HIVE-12637 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC1.3, TODOC2.1 > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12637.1.patch, HIVE-12637.2.patch > > > same for CompactionTxnHandler > would be convenient if the user could specify some RegEx (perhaps by db type) > which will tell TxnHandler.checkRetryable() that this is should be retried. > The regex should probably apply to String produced by > {noformat} > private static String getMessage(SQLException ex) { > return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode=" > + ex.getErrorCode() + ")"; > } > {noformat} > This make it flexible. > See if we need to add Db type (and possibly version) of the DB being used. > With 5 different DBs supported this gives control end users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
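A sketch of what the configurable check could look like, built around the getMessage(SQLException) helper quoted above (the example regexes are illustrative; no config key is defined here):
{code:java}
import java.sql.SQLException;
import java.util.regex.Pattern;

// Sketch of the proposed mechanism: match a user-supplied regex against the same string
// that TxnHandler's getMessage(SQLException) produces (quoted in the issue above).
public class RetryableSqlExceptionCheck {
  private static String getMessage(SQLException ex) {
    return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode=" + ex.getErrorCode() + ")";
  }

  static boolean isRetryable(SQLException ex, String configuredRegex) {
    return Pattern.compile(configuredRegex).matcher(getMessage(ex)).matches();
  }

  public static void main(String[] args) {
    SQLException ex = new SQLException("deadlock detected", "40001", 1213);
    System.out.println(isRetryable(ex, ".*deadlock.*"));       // true
    System.out.println(isRetryable(ex, ".*SQLState=40001.*")); // true
  }
}
{code}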
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421350#comment-15421350 ] Ashutosh Chauhan commented on HIVE-14511: - [~pattipaka] Just to clarify, your dir structure is as follows: {code} tbldir/p1=1/p2=1/p3=1 tbldir/p1=1/p2=1/p3=2 tbldir/p1=1/p2=1/p3=3 tbldir/p1=1/p2=2/p3=1 tbldir/p1=1/p2=2/p3=2 tbldir/p1=2/p2=1/p3=1 {code} and your tbl is partitioned on (p1,p2). Correct? > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Status: Patch Available (was: In Progress) > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13680 started by Kevin Liew. - > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Attachment: HIVE-13680.patch First patch submitted. HiveConf settings submitted by the client are required to be prefixed by "set:hiveconf" so in ThriftCLIService we have to preserve this prefix to ensure that the SessionState is generated correctly. To make the code cleaner, we could expose functions (which are currently out of scope) to parse the prefix, or we could have the client send the list of compressors and list of configs in new fields in the Thrift message (after compressor negotiation, the CompDe would be stored in a new field in SessionState instead of in SessionState.sessConf). I prefer the second option. > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.patch, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
[ https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421268#comment-15421268 ] Subramanyam Pattipaka commented on HIVE-14511: -- I mean to add only p1=1/p2=1. For example, if you have the structure data/p1=1/p2=1/p3=1, data/p1=1/p2=1/p3=2, data/p1=1/p2=1/p3=3, data/p1=1/p2=2/p3=1, data/p1=1/p2=2/p3=2 and data/p1=2/p2=1/p3=1, and I want to add only (1,1), (1,2) and (2,1) as partitions, then removing the above check makes this possible. In the first iteration you would list p1=1 and p1=2; in the next iteration you would list /p1=1/p2=1, /p1=1/p2=2 and /p1=2/p2=1. As the depth is now 0 we stop here, and these are the partition paths if the user wants to create partitions with p1 and p2 as the partition columns. If you want, you can check for the use of the configs mapred.input.dir.recursive and hive.mapred.supports.subdirectories. > Improve MSCK for partitioned table to deal with special cases > - > > Key: HIVE-14511 > URL: https://issues.apache.org/jira/browse/HIVE-14511 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14511.01.patch > > > Some users will have a folder rather than a file under the last partition > folder. However, msck is going to search for the leaf folder rather than the > last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
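A sketch of the depth-limited listing described in this comment (illustrative, not the MSCK patch itself): walk exactly one directory level per partition column and return whatever sits at that depth as the partition paths, regardless of extra nesting below:
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch of the iteration described above (not the actual MSCK patch):
// list one directory level per partition column, so tbldir/p1=1/p2=1 is returned as a
// partition path even if p3=... subdirectories exist below it.
public class DepthLimitedPartitionScan {
  static List<Path> listToDepth(FileSystem fs, Path root, int depth) throws IOException {
    List<Path> current = new ArrayList<>();
    current.add(root);
    for (int level = 0; level < depth; level++) {  // equivalently: count depth down to 0
      List<Path> next = new ArrayList<>();
      for (Path p : current) {
        for (FileStatus child : fs.listStatus(p)) {
          if (child.isDirectory()) {
            next.add(child.getPath());
          }
        }
      }
      current = next;
    }
    return current; // e.g. [.../p1=1/p2=1, .../p1=1/p2=2, .../p1=2/p2=1] for depth 2
  }
}
{code}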
[jira] [Updated] (HIVE-14345) Beeline result table has erroneous characters
[ https://issues.apache.org/jira/browse/HIVE-14345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-14345: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Pushed to master. Thanks, Miklos! > Beeline result table has erroneous characters > -- > > Key: HIVE-14345 > URL: https://issues.apache.org/jira/browse/HIVE-14345 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0, 2.2.0 >Reporter: Jeremy Beard >Assignee: Miklos Csanady >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-14345.3.patch, HIVE-14345.4.patch, > HIVE-14345.5.patch, HIVE-14345.patch > > > Beeline returns query results with erroneous characters. For example: > {code} > 0: jdbc:hive2://:1/def> select 10; > +--+--+ > | _c0 | > +--+--+ > | 10 | > +--+--+ > 1 row selected (3.207 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups
[ https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421097#comment-15421097 ] Naveen Gangam commented on HIVE-14513: -- [~leftylev] I have updated the LDAP authentication configuration documentation [here|https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2]. Could you please review it to make it consistent with the other pages? Thank you in advance > Enhance custom query feature in LDAP atn to support resultset of ldap groups > > > Key: HIVE-14513 > URL: https://issues.apache.org/jira/browse/HIVE-14513 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14513.patch > > > The LDAP Authenticator can be configured to use a result set from an LDAP query to > authenticate. However, it is expected that this LDAP query would return only > a set of users (i.e. full DNs for the users in LDAP). > It's not always straightforward to author queries that > return users. For example, say you would like to allow "all users from group1 > and group2" to be authenticated. The LDAP query has to return a union of all > members of group1 and group2. > For example, one common configuration is that groups contain a list of their > users: > "dn: uid=group1,ou=Groups,dc=example,dc=com", > "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com", > "objectClass: top", > "objectClass: groupOfNames", > "objectClass: ExtensibleObject", > "cn: group1", > "ou: Groups", > "sn: group1", > "member: uid=user1,ou=People,dc=example,dc=com", > The query > {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}} > will return the entries > uid=group1,ou=Groups,dc=example,dc=com > uid=group2,ou=Groups,dc=example,dc=com > but there is no means to form a query that would return just the values of > the "member" attributes. (LDAP client tools are able to do this by filtering out the > attributes on these entries.) > So it would be useful to have support for specifying queries that > return groups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
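For reference, the "member" values that a plain LDAP query cannot return directly can be collected on the client side; a minimal JNDI sketch, assuming the group filter and attribute names quoted above (the LDAP URL and base DN are placeholders):
{code:java}
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.*;

// Sketch only: runs the group filter quoted in the issue and collects the "member"
// attribute values (the user DNs). The LDAP URL and base DN below are placeholders.
public class LdapGroupMembersSketch {
  public static void main(String[] args) throws Exception {
    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389"); // placeholder
    DirContext ctx = new InitialDirContext(env);

    SearchControls sc = new SearchControls();
    sc.setSearchScope(SearchControls.SUBTREE_SCOPE);
    sc.setReturningAttributes(new String[] {"member"});

    NamingEnumeration<SearchResult> groups = ctx.search(
        "dc=example,dc=com", "(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))", sc);
    while (groups.hasMore()) {
      Attribute member = groups.next().getAttributes().get("member");
      for (int i = 0; member != null && i < member.size(); i++) {
        System.out.println(member.get(i)); // e.g. uid=user1,ou=People,dc=example,dc=com
      }
    }
    ctx.close();
  }
}
{code}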
[jira] [Commented] (HIVE-13936) Add streaming support for row_number
[ https://issues.apache.org/jira/browse/HIVE-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421010#comment-15421010 ] Chaoyu Tang commented on HIVE-13936: Similar to the streaming support in rank. LGTM, +1 > Add streaming support for row_number > > > Key: HIVE-13936 > URL: https://issues.apache.org/jira/browse/HIVE-13936 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Johndee Burks >Assignee: Yongzhi Chen > Attachments: HIVE-13936.1.patch > > > Without this support row_number will cause heap issues in reducers. Example > query below against 10 million records will cause failure. > {code} > select a, row_number() over (partition by a order by a desc) as row_num from > j100mil; > {code} > Same issue different function in JIRA HIVE-7062 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups
[ https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421007#comment-15421007 ] Naveen Gangam commented on HIVE-14513: -- I already documented this feature in the past but yes, I plan on enhancing it at https://cwiki.apache.org/confluence/display/Hive/User+and+Group+Filter+Support+with+LDAP+Atn+Provider+in+HiveServer2#UserandGroupFilterSupportwithLDAPAtnProviderinHiveServer2-CustomQueryString which needed a bit more detail to begin with. I do not think it was clear as to what the custom query should return. > Enhance custom query feature in LDAP atn to support resultset of ldap groups > > > Key: HIVE-14513 > URL: https://issues.apache.org/jira/browse/HIVE-14513 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Fix For: 2.2.0, 2.1.1 > > Attachments: HIVE-14513.patch > > > The LDAP Authenticator can be configured to use a result set from an LDAP query to > authenticate. However, it is expected that this LDAP query would return only > a set of users (i.e. full DNs for the users in LDAP). > It's not always straightforward to author queries that > return users. For example, say you would like to allow "all users from group1 > and group2" to be authenticated. The LDAP query has to return a union of all > members of group1 and group2. > For example, one common configuration is that groups contain a list of their > users: > "dn: uid=group1,ou=Groups,dc=example,dc=com", > "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com", > "objectClass: top", > "objectClass: groupOfNames", > "objectClass: ExtensibleObject", > "cn: group1", > "ou: Groups", > "sn: group1", > "member: uid=user1,ou=People,dc=example,dc=com", > The query > {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}} > will return the entries > uid=group1,ou=Groups,dc=example,dc=com > uid=group2,ou=Groups,dc=example,dc=com > but there is no means to form a query that would return just the values of > the "member" attributes. (LDAP client tools are able to do this by filtering out the > attributes on these entries.) > So it would be useful to have support for specifying queries that > return groups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420986#comment-15420986 ] Illya Yalovyy commented on HIVE-14373: -- [~kgyrtkirk], Thank you for the heads up. I think [~ayousufi] is actively working on his patch. I have added my implementation only for reference. If for any reason he is not able to finish this project, I can pick it up. I think it makes sense to update this CR: https://reviews.apache.org/r/50938/ > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > Attachments: HIVE-14373.02.patch, HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests won't be able to be executed by HiveQA because they will need > Amazon credentials. We need to write a suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify they work > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14525) beeline still writing log data to stdout as of version 2.1.0
[ https://issues.apache.org/jira/browse/HIVE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420959#comment-15420959 ] Miklos Csanady commented on HIVE-14525: --- The prompt and other debug info are echoed to stdout only if --silent=true is not set. They are suppressed if both -f file and --silent=true are present on the command line. Beeline cannot detect command line output redirection. > beeline still writing log data to stdout as of version 2.1.0 > > > Key: HIVE-14525 > URL: https://issues.apache.org/jira/browse/HIVE-14525 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 2.1.0 >Reporter: stephen sprague > > Simple test. Note that I'm looking to get a TSV file back. > {code} > $ beeline -u dwrdevnn1 --showHeader=false --outputformat=tsv2 2>stderr <<SQL > > select count(*) > > from default.dual; > > SQL > {code} > instead I get this in stdout: > {code} > $ cat stdout > 0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> select count(*) > . . . . . . . . . . . . . . . . . . . . . . .> from default.dual; > 0 > 0: jdbc:hive2://dwrdevnn1.sv2.trulia.com:1000> > {code} > I should only get one row, which is the *result* of the query (which is 0) - > not the loggy kind of lines you see above. That stuff goes to stderr, my > friends. > Also, I refer to this ticket b/c the last comment suggested so - it's close but > not exactly the same. > https://issues.apache.org/jira/browse/HIVE-14183 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14412) Add a timezone-aware timestamp
[ https://issues.apache.org/jira/browse/HIVE-14412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420848#comment-15420848 ] Rui Li commented on HIVE-14412: --- Somehow the tests worked this time. {{TestStandardObjectInspectors}} is related because of the new type. I'd like to get more feedback before moving on. Pinging [~sershe], [~ashutoshc], [~xuefuz] for opinions. Do you think the proposal makes sense, or is there better way to achieve this? Thanks. > Add a timezone-aware timestamp > -- > > Key: HIVE-14412 > URL: https://issues.apache.org/jira/browse/HIVE-14412 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-14412.1.patch, HIVE-14412.1.patch, > HIVE-14412.1.patch > > > Java's Timestamp stores the time elapsed since the epoch. While it's by > itself unambiguous, ambiguity comes when we parse a string into timestamp, or > convert a timestamp to string, causing problems like HIVE-14305. > To solve the issue, I think we should make timestamp aware of timezone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
[ https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420716#comment-15420716 ] Sergey Zadoroshnyak commented on HIVE-14483: [~sershe] .patch looks good and no test failures. Who has responsibility to push into master? > java.lang.ArrayIndexOutOfBoundsException > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays > -- > > Key: HIVE-14483 > URL: https://issues.apache.org/jira/browse/HIVE-14483 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Sergey Zadoroshnyak >Assignee: Sergey Zadoroshnyak >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-14483.01.patch > > > Error message: > Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024 > at > org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231) > at > org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268) > at > org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368) > at > org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212) > at > org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) > at > org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737) > at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350) > ... 22 more > How to reproduce? > Configure StringTreeReader which contains StringDirectTreeReader as > TreeReader (DIRECT or DIRECT_V2 column encoding) > batchSize = 1026; > invoke method nextVector(ColumnVector previousVector,boolean[] isNull, final > int batchSize) > scratchlcv is LongColumnVector with long[] vector (length 1024) > which execute BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, > scratchlcv,result, batchSize); > as result in method commonReadByteArrays(stream, lengths, scratchlcv, > result, (int) batchSize) we received > ArrayIndexOutOfBoundsException. > If we use StringDictionaryTreeReader, then there is no exception, as we have > a verification scratchlcv.ensureSize((int) batchSize, false) before > reader.nextVector(scratchlcv, scratchlcv.vector, batchSize); > These changes were made for Hive 2.1.0 by corresponding commit > https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467 > for task https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley > How to fix? 
> add only one line : > scratchlcv.ensureSize((int) batchSize, false) ; > in method > org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream > stream, IntegerReader lengths, > LongColumnVector scratchlcv, > BytesColumnVector result, final int batchSize) before invocation > lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
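In code form, the one-line fix described above amounts to (sketch of the relevant lines only, not a full patch):
{code:java}
// Inside BytesColumnVectorUtil.commonReadByteArrays, per the description above:
// grow the scratch vector to the requested batch size before reading the lengths,
// mirroring the ensureSize guard that StringDictionaryTreeReader already performs.
scratchlcv.ensureSize((int) batchSize, false);
lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
{code}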
[jira] [Commented] (HIVE-13175) Disallow making external tables transactional
[ https://issues.apache.org/jira/browse/HIVE-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420715#comment-15420715 ] Lefty Leverenz commented on HIVE-13175: --- Removed both of the TODOC labels. Woo hoo for the docs, [~wzheng]! > Disallow making external tables transactional > - > > Key: HIVE-13175 > URL: https://issues.apache.org/jira/browse/HIVE-13175 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13175.1.patch, HIVE-13175.2.patch, > HIVE-13175.3.patch, HIVE-13175.4.patch > > > The fact that compactor rewrites contents of ACID tables is in conflict with > what is expected of external tables. > Conversely, end user can write to External table which certainly not what is > expected of ACID table. > So we should explicitly disallow making an external table ACID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13175) Disallow making external tables transactional
[ https://issues.apache.org/jira/browse/HIVE-13175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-13175: -- Labels: (was: TODOC1.3 TODOC2.1) > Disallow making external tables transactional > - > > Key: HIVE-13175 > URL: https://issues.apache.org/jira/browse/HIVE-13175 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13175.1.patch, HIVE-13175.2.patch, > HIVE-13175.3.patch, HIVE-13175.4.patch > > > The fact that compactor rewrites contents of ACID tables is in conflict with > what is expected of external tables. > Conversely, end user can write to External table which certainly not what is > expected of ACID table. > So we should explicitly disallow making an external table ACID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12634) Add command to kill an ACID transaction
[ https://issues.apache.org/jira/browse/HIVE-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420714#comment-15420714 ] Lefty Leverenz commented on HIVE-12634: --- Removed both of the TODOC labels. Thanks for the docs, [~wzheng]! > Add command to kill an ACID transaction > --- > > Key: HIVE-12634 > URL: https://issues.apache.org/jira/browse/HIVE-12634 > Project: Hive > Issue Type: New Feature > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12634.1.patch, HIVE-12634.2.patch, > HIVE-12634.3.patch, HIVE-12634.4.patch, HIVE-12634.5.patch, > HIVE-12634.6.patch, HIVE-12634.7.patch, HIVE-12634.branch-1.patch > > > Should add a CLI command to abort a (runaway) transaction. > This should clean up all state related to this txn. > The initiator of this (if still alive) will get an error trying to > heartbeat/commit, i.e. will become aware that the txn is dead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12634) Add command to kill an ACID transaction
[ https://issues.apache.org/jira/browse/HIVE-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-12634:
----------------------------------

Labels:   (was: TODOC1.3 TODOC2.1)

> Add command to kill an ACID transaction
> ----------------------------------------
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420709#comment-15420709 ]

Lefty Leverenz commented on HIVE-12366:
---------------------------------------

Removed the TODOC1.3 label. Thanks for the docs, [~wzheng]. Here are the doc links:

* [Configuration Properties -- hive.txn.heartbeat.threadpool.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.txn.heartbeat.threadpool.size]
* [Hive Transactions -- New Configuration Parameters for Transactions | https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-NewConfigurationParametersforTransactions]

> Refactor Heartbeater logic for transaction
> -------------------------------------------
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Wei Zheng
> Assignee: Wei Zheng
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12366.1.patch, HIVE-12366.11.patch, HIVE-12366.12.patch, HIVE-12366.13.patch, HIVE-12366.14.patch, HIVE-12366.15.patch, HIVE-12366.2.patch, HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch, HIVE-12366.branch-1.patch, HIVE-12366.branch-2.0.patch
>
> Currently there is a gap between the time the locks are acquired and the first heartbeat being sent out. Normally the gap is negligible, but when it is large the query fails, since the locks have already timed out by the time the heartbeat is sent. We need to remove this gap.
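For reference, the property name comes from the doc links above; a quick way to inspect it from the Hive CLI or Beeline (the actual value would normally be configured in hive-site.xml, and the shipped default is not quoted in this thread):

{code}
-- Print the current value of the heartbeat thread pool setting.
SET hive.txn.heartbeat.threadpool.size;
{code}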
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-12366:
----------------------------------

Labels:   (was: TODOC1.3)

> Refactor Heartbeater logic for transaction
> -------------------------------------------