[jira] [Commented] (HIVE-11883) 'transactional' table property for ACID should be case insensitive

2015-10-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942147#comment-14942147
 ] 

Lefty Leverenz commented on HIVE-11883:
---

Doc note:  The "Table Properties" section of the Hive Transactions wikidoc 
needs to be updated for this change, with version information.

* [Hive Transactions -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-TableProperties]

The DDL wikidoc doesn't need any change -- it just shows lowercase 
"transactional" and refers to Hive Transactions for more information.  (However 
it does need revision for HIVE-8308, which made "NO_AUTO_COMPACTION" 
case-insensitive in 1.1.0.)

> 'transactional' table property for ACID should be case insensitive
> --
>
> Key: HIVE-11883
> URL: https://issues.apache.org/jira/browse/HIVE-11883
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: TODOC1.3
> Fix For: 1.3.0
>
> Attachments: HIVE-11883.patch
>
>
> Given:
> {noformat}
> CREATE TABLE mytable (col1 int, col2 string)
> CLUSTERED BY (col1) INTO 2 BUCKETS
> STORED AS ORC TBLPROPERTIES('TRANSACTIONAL'='TRUE');
> {noformat}
> update/delete statements will fail with 
> {noformat}
> FAILED: SemanticException [Error 10122]: Bucketized tables do not support 
> INSERT INTO: Table: default.mytable
> {noformat}
> but 'transactional' (in lower case) works fine
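
The behaviour requested in the description can be sketched as a lookup that ignores the case of the property key. This is a minimal illustration in Python, not Hive's implementation: the helper name `is_transactional` and the plain dict standing in for TBLPROPERTIES are assumptions, and comparing the value without regard to case is also an assumption, since the issue title only mentions the property name.

```python
def is_transactional(tbl_properties):
    """Return True if a 'transactional' property is set to true, ignoring case.

    tbl_properties is a plain dict standing in for a table's TBLPROPERTIES;
    this is a hypothetical representation, not Hive's metastore API.
    """
    for key, value in tbl_properties.items():
        # Compare the key case-insensitively so 'TRANSACTIONAL' and
        # 'transactional' are treated as the same property.
        if key.lower() == "transactional":
            # Assumption: the boolean value is also compared without case.
            return str(value).lower() == "true"
    return False


print(is_transactional({"TRANSACTIONAL": "TRUE"}))  # upper case, as in the report
print(is_transactional({"transactional": "true"}))  # lower case
print(is_transactional({"orc.compress": "ZLIB"}))   # property absent
```

With a lookup like this, the upper-case and lower-case forms of the CREATE TABLE statement above would be classified the same way.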



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11883) 'transactional' table property for ACID should be case insensitive

2015-10-02 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11883:
--
Labels: TODOC1.3  (was: )



[jira] [Commented] (HIVE-12026) Add test case to check permissions when truncating partition

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942133#comment-14942133
 ] 

Hive QA commented on HIVE-12026:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764853/HIVE-12026.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9643 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5507/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5507/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5507/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764853 - PreCommit-HIVE-TRUNK-Build

> Add test case to check permissions when truncating partition
> 
>
> Key: HIVE-12026
> URL: https://issues.apache.org/jira/browse/HIVE-12026
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12026.1.patch
>
>
> Add to the tests added during HIVE-9474, for TRUNCATE PARTITION





[jira] [Commented] (HIVE-11699) Support special characters in quoted table names

2015-10-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942117#comment-14942117
 ] 

Lefty Leverenz commented on HIVE-11699:
---

+1 on the configuration parameter description.  (Thanks for the changes.)

> Support special characters in quoted table names
> 
>
> Key: HIVE-11699
> URL: https://issues.apache.org/jira/browse/HIVE-11699
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, 
> HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch, 
> HIVE-11699.06.patch, HIVE-11699.07.patch, HIVE-11699.08.patch, 
> HIVE-11699.09.patch
>
>
> Right now table names can only be "[a-zA-Z_0-9]+". This patch tries to 
> investigate how much change there should be if we would like to support 
> special characters, e.g., "/" in table names.





[jira] [Commented] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942098#comment-14942098
 ] 

Hive QA commented on HIVE-11983:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764855/HIVE-11983.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9646 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5506/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5506/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5506/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764855 - PreCommit-HIVE-TRUNK-Build

> Hive streaming API uses incorrect logic to assign buckets to incoming records
> -
>
> Key: HIVE-11983
> URL: https://issues.apache.org/jira/browse/HIVE-11983
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.2.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: streaming, streaming_api
> Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, 
> HIVE-11983.5.patch, HIVE-11983.patch
>
>
> The Streaming API tries to distribute records evenly into buckets. 
> All records in every Transaction that is part of a TransactionBatch go to the 
> same bucket, and a new bucket number is chosen for each TransactionBatch.
> Fix: the API needs to hash each record to determine which bucket it belongs to. 
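
The fix described above, hashing each record rather than picking one bucket per TransactionBatch, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name `bucket_for` is hypothetical, and `zlib.crc32` is only a stand-in for a stable hash. Hive uses its own hash function, which this sketch does not reproduce.

```python
import zlib


def bucket_for(record_key, num_buckets):
    """Map a record's bucketing-column value to a bucket in [0, num_buckets)."""
    # zlib.crc32 is a stand-in chosen because it is stable across runs;
    # it is NOT the hash Hive applies to bucketing columns.
    digest = zlib.crc32(str(record_key).encode("utf-8"))
    return digest % num_buckets


# Each record is routed individually, so records with the same key always
# land in the same bucket regardless of which batch they arrive in.
records = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
buckets = {}
for col1, col2 in records:
    buckets.setdefault(bucket_for(col1, 2), []).append((col1, col2))

for bucket_id in sorted(buckets):
    print(bucket_id, buckets[bucket_id])
```

The point of the sketch is the routing rule: the bucket depends only on the record's key and the bucket count, never on the batch.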





[jira] [Commented] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942066#comment-14942066
 ] 

Hive QA commented on HIVE-11954:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764563/HIVE-11954.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5505/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5505/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5505/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5505/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at bbb312f HIVE-11913 : Verify existence of tests for new changes 
in HiveQA (Szehon, reviewed by Sergio Pena)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java.orig
Removing ql/src/test/queries/clientpositive/union36.q
Removing ql/src/test/results/clientpositive/union36.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at bbb312f HIVE-11913 : Verify existence of tests for new changes 
in HiveQA (Szehon, reviewed by Sergio Pena)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764563 - PreCommit-HIVE-TRUNK-Build

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.01.patch, HIVE-11954.02.patch, 
> HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of the side table (in-memory/hash table) in the MapJoin Conversion 
> algorithm needs to be more sophisticated.
> In an N-way Map Join, Hive should pick as the side table (in-memory table) 
> the input stream that has the least cost in producing the relation (like TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path it's going to 
> be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection.
> The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash table) 
> from the inputs (based on conditional task size).
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavy weight relation op (Join, 

[jira] [Commented] (HIVE-11919) Hive Union Type Mismatch

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942065#comment-14942065
 ] 

Hive QA commented on HIVE-11919:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764527/HIVE-11919.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9644 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5503/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5503/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5503/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764527 - PreCommit-HIVE-TRUNK-Build

> Hive Union Type Mismatch
> 
>
> Key: HIVE-11919
> URL: https://issues.apache.org/jira/browse/HIVE-11919
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-11919.1.patch, HIVE-11919.2.patch
>
>
> In Hive, for a union, the rightmost type wins out for most primitive types during 
> plan generation. However, when the union op gets initialized the type gets switched.
> This could result in bad data and type exceptions.
> This happens only in non-CBO mode.
> In CBO mode, Hive adds explicit type casts that prevent such type 
> issues.
> Sample Query: 
> select cd/sum(cd) over() from(select cd from u1 union all select cd from u2 
> union all select cd from u3)u4;





[jira] [Commented] (HIVE-12005) Remove hbase based stats collection mechanism

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942023#comment-14942023
 ] 

Hive QA commented on HIVE-12005:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764521/HIVE-12005.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9622 tests executed
*Failed tests:*
{noformat}
TestHS2AuthzContext - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-vector_interval_2.q-load_dyn_part2.q-and-12-more
 - did not produce a TEST-*.xml file
TestStorageBasedMetastoreAuthorizationProviderWithACL - did not produce a 
TEST-*.xml file
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5502/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5502/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5502/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764521 - PreCommit-HIVE-TRUNK-Build

> Remove hbase based stats collection mechanism
> -
>
> Key: HIVE-12005
> URL: https://issues.apache.org/jira/browse/HIVE-12005
> Project: Hive
>  Issue Type: Task
>  Components: Statistics
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12005.patch
>
>
> Currently, HBase is one of the mechanisms to collect and store statistics. I 
> have never come across anyone using it. The FileSystem-based collection mechanism 
> has been the default for a few releases and is working well. We should remove the 
> HBase stats collector.





[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941997#comment-14941997
 ] 

Gopal V commented on HIVE-11969:


[~sershe]: LGTM - +1.

Minor comment on isOpenOrOpening(): can you split it into two different 
functions, so that isOpen() does not change?

That allows the if(isOpening()) branch to cancel the future and the isOpen() 
branch to close the session.
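
The split suggested in this comment can be sketched as below. This is a hypothetical Python illustration, not the patch itself: the class `SessionState`, its fields, and the methods `begin_open`, `finish_open`, and `shutdown` are all assumed names, and the standard-library `Future` stands in for whatever handle the background startup returns.

```python
from concurrent.futures import Future


class SessionState:
    """Hypothetical stand-in for a session that is opened in the background."""

    def __init__(self):
        self._open_future = None  # set while the background open is pending
        self._session = None      # set once the session is usable
        self.closed = False

    def begin_open(self):
        # A bare Future models the pending background start in this sketch.
        self._open_future = Future()
        return self._open_future

    def finish_open(self, session):
        self._session = session
        self._open_future = None

    def is_open(self):
        # Unchanged meaning: True only when the session is fully usable.
        return self._session is not None

    def is_opening(self):
        # Separate check for the "still starting" state.
        return self._open_future is not None

    def shutdown(self):
        if self.is_opening():
            # Still starting: cancel the pending future.
            self._open_future.cancel()
            self._open_future = None
        if self.is_open():
            # Fully open: close the session.
            self._session = None
            self.closed = True
```

Keeping the two predicates separate lets the shutdown path take a different action for each state, which is the design point the comment makes.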

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch, Screen Shot 2015-10-02 at 14.23.17 .png
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11642:

Attachment: HIVE-11642.18.patch

Updated the patch for conflicts. We will eventually do the merge again, but I 
don't want to do it twice a day...

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, 
> HIVE-11642.17.patch, HIVE-11642.18.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization

2015-10-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941981#comment-14941981
 ] 

Thejas M Nair commented on HIVE-11175:
--

Yes, I think that should take care of it. (you are sharp! :) )


> create function using jar does not work with sql std authorization
> --
>
> Key: HIVE-11175
> URL: https://issues.apache.org/jira/browse/HIVE-11175
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Olaf Flebbe
> Fix For: 2.0.0
>
> Attachments: HIVE-11175.1.patch
>
>
> {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} 
> gives an error saying that accessing the local foo.jar resource requires ADMIN 
> privileges. Same for HDFS (DFS_URI).
> The problem is that the semantic analysis enforces the ADMIN privilege for write, 
> but the jar is clearly input, not output. 
> Patch and test case attached.





[jira] [Commented] (HIVE-12027) simplify file metadata cache ppd api

2015-10-02 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941934#comment-14941934
 ] 

Alan Gates commented on HIVE-12027:
---

+1

> simplify file metadata cache ppd api
> 
>
> Key: HIVE-12027
> URL: https://issues.apache.org/jira/browse/HIVE-12027
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: HIVE-12027.nogen.patch, HIVE-12027.patch
>
>
> I made it unwieldy for the iterator model when the number of files is too large. 
> Fix it. Not shipped anywhere yet.





[jira] [Commented] (HIVE-11913) Verify existence of tests for new changes in HiveQA

2015-10-02 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941918#comment-14941918
 ] 

Thejas M Nair commented on HIVE-11913:
--

Thanks a lot for working on this [~szehon]!


> Verify existence of tests for new changes in HiveQA
> ---
>
> Key: HIVE-11913
> URL: https://issues.apache.org/jira/browse/HIVE-11913
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-11913.patch
>
>
> It would be great if HiveQA could report whether there are test files (Test*, 
> *Test, or qfiles) that are added or changed.
> Note that not every change would need this, but the check should be done to the best of its ability.
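
A check of this kind can be sketched as a filename match over the paths a patch touches. This is a minimal Python illustration, not the HiveQA code: the function name `touches_tests` is hypothetical, and the concrete patterns (a `.java` suffix for Test* and *Test, a `.q` suffix for qfiles) are assumptions about how the categories in the description map to file names.

```python
import fnmatch
import posixpath

# Assumed filename patterns for "Test*, *Test, or qfiles".
TEST_PATTERNS = ("Test*.java", "*Test.java", "*.q")


def touches_tests(changed_paths):
    """Return True if any changed path looks like a test file."""
    for path in changed_paths:
        name = posixpath.basename(path)
        # fnmatchcase keeps the match case-sensitive on every platform.
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in TEST_PATTERNS):
            return True
    return False


print(touches_tests(["ql/src/test/queries/clientpositive/union36.q"]))
print(touches_tests(["ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java"]))
```

A filename heuristic like this can only be best effort: it reports whether test files were touched, not whether the tests cover the change.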





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Fix Version/s: (was: hbase-metastore-branch)

> implement an option to have single ETL strategy for multiple directories
> 
>
> Key: HIVE-11777
> URL: https://issues.apache.org/jira/browse/HIVE-11777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11777.01.patch, HIVE-11777.patch
>
>
> In case of metastore footer PPD we don't want to make a PPD call with all the 
> attendant SARG, MS and HBase overhead for each directory. If we wait for some 
> time (10ms? some fraction of inputs?) we can do one call without losing 
> overall perf. 
> For now make it time-based.
> NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-11777) implement an option to have single ETL strategy for multiple directories

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11777:

Attachment: HIVE-11777.01.patch

Rebased the patch on top of HIVE-11856 for now, that one is easier to commit



[jira] [Updated] (HIVE-11856) allow split strategies to run on threadpool

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11856:

Fix Version/s: (was: hbase-metastore-branch)

> allow split strategies to run on threadpool
> ---
>
> Key: HIVE-11856
> URL: https://issues.apache.org/jira/browse/HIVE-11856
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11856.01.patch, HIVE-11856.patch
>
>
> If a split strategy makes metastore cache calls, it should probably run on 
> the threadpool.





[jira] [Updated] (HIVE-11514) Vectorized version of auto_sortmerge_join_1.q fails during execution with NPE

2015-10-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11514:

Attachment: HIVE-11514.01.patch

> Vectorized version of auto_sortmerge_join_1.q fails during execution with NPE
> -
>
> Key: HIVE-11514
> URL: https://issues.apache.org/jira/browse/HIVE-11514
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-11514.01.patch, auto_sortmerge_join_1.q
>
>
> Query from auto_sortmerge_join_1.q:
> {code}
> select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key
> {code}
> generates stack trace:
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.initializeOp(VectorMapJoinOperator.java:177)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:131)
> {code}





[jira] [Commented] (HIVE-12012) select query on json table with map containing numeric values fails

2015-10-02 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941874#comment-14941874
 ] 

Sushanth Sowmyan commented on HIVE-12012:
-

Thanks for the report, Jason. Sure, I can look further into this. I had looked 
at HCATALOG-630 a long time back but I seem to remember that I could not 
reproduce that at the time. If we have a more recent reproduction, it 
definitely is worth investigating.

Tests for JsonSerDe are mostly in TestJsonSerDe, instead of in .q files, since 
it descends from HCatalog - that seems to test map and 
map> as the basic cases, which work.

I'll try to reproduce and dig further.

> select query on json table with map containing numeric values fails
> ---
>
> Key: HIVE-12012
> URL: https://issues.apache.org/jira/browse/HIVE-12012
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Jagruti Varia
>Assignee: Jason Dere
> Attachments: HIVE-12012.1.patch
>
>
> select query on json table throws this error if table contains map type 
> column:
> {noformat}
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: 
> org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not 
> numeric, can not use numeric value accessors
>  at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26]
> {noformat}
> steps to reproduce the issue:
> {noformat}
> hive> create table c_complex(a array<string>,b map<string,int>) row format 
> serde 'org.apache.hive.hcatalog.data.JsonSerDe';
> OK
> Time taken: 0.319 seconds
> hive> insert into table c_complex select array('aaa'),map('aaa',1) from 
> studenttab10k limit 2;
> Query ID = hrt_qa_20150826183232_47deb33a-19c0-4d2b-a92f-726659eb9413
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id 
> application_1440603993714_0010)
> 
> VERTICES      STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> 
> Map 1 ..      SUCCEEDED      1          1        0        0       0       0
> Reducer 2 ..  SUCCEEDED      1          1        0        0       0       0
> 
> VERTICES: 02/02  [==>>] 100%  ELAPSED TIME: 11.75 s
> 
> Loading data to table default.c_complex
> Table default.c_complex stats: [numFiles=1, numRows=2, totalSize=56, 
> rawDataSize=0]
> OK
> Time taken: 13.706 seconds
> hive> select * from c_complex;
> OK
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: 
> org.codehaus.jackson.JsonParseException: Current token (FIELD_NAME) not 
> numeric, can not use numeric value accessors
>  at [Source: java.io.ByteArrayInputStream@295f79b; line: 1, column: 26]
> Time taken: 0.115 seconds
> hive> select count(*) from c_complex;
> OK
> 2
> Time taken: 0.205 seconds, Fetched: 1 row(s)
> {noformat}





[jira] [Updated] (HIVE-11856) allow split strategies to run on threadpool

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11856:

Description: 
If a split strategy makes metastore cache calls, it should probably run on the 
threadpool.



  was:
If a split strategy makes metastore cache calls, it should probably run on the 
threadpool.


NO PRECOMMIT TESTS


> allow split strategies to run on threadpool
> ---
>
> Key: HIVE-11856
> URL: https://issues.apache.org/jira/browse/HIVE-11856
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11856.01.patch, HIVE-11856.patch
>
>
> If a split strategy makes metastore cache calls, it should probably run on 
> the threadpool.





[jira] [Updated] (HIVE-11856) allow split strategies to run on threadpool

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11856:

Attachment: HIVE-11856.01.patch

Rebased the patch on master

> allow split strategies to run on threadpool
> ---
>
> Key: HIVE-11856
> URL: https://issues.apache.org/jira/browse/HIVE-11856
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11856.01.patch, HIVE-11856.patch
>
>
> If a split strategy makes metastore cache calls, it should probably run on 
> the threadpool.
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-11786) Deprecate the use of redundant column in colunm stats related tables

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941867#comment-14941867
 ] 

Szehon Ho commented on HIVE-11786:
--

Chatted with Chaoyu offline; this change actually looks good to me, +1.  

It is a big change, but Chaoyu gave a good explanation of the performance impact of 
the join, and there stands to be more gain for cases like changing the 
part/table names.  There also seems to be good coverage via the qtests.

> Deprecate the use of redundant column in colunm stats related tables
> 
>
> Key: HIVE-11786
> URL: https://issues.apache.org/jira/browse/HIVE-11786
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11786.1.patch, HIVE-11786.1.patch, 
> HIVE-11786.2.patch, HIVE-11786.patch
>
>
> The stats tables such as TAB_COL_STATS and PART_COL_STATS have redundant columns 
> such as DB_NAME, TABLE_NAME, and PARTITION_NAME, since these tables already have 
> foreign keys such as TBL_ID or PART_ID referencing TBLS or PARTITIONS. 
> These redundant columns violate database normalization rules and cause a lot 
> of inconvenience (and sometimes difficulty) in implementing column stats related 
> features. For example, when renaming a table, we have to update the 
> TABLE_NAME column in these tables as well, which is unnecessary.
> This JIRA first deprecates the use of these columns at the HMS code level. A 
> follow-up JIRA will be opened to focus on the DB schema change and upgrade.
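
The normalization argument above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and a drastically simplified, hypothetical schema: the real metastore tables have many more columns, and COLUMN_NAME and NUM_NULLS are only illustrative here. The stats table keeps just TBL_ID, so a rename touches one row in TBLS and the stats are still found through the join.

```python
import sqlite3

# Simplified, hypothetical stand-ins for the metastore tables; not the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT);
    CREATE TABLE TAB_COL_STATS (
        TBL_ID INTEGER REFERENCES TBLS(TBL_ID),
        COLUMN_NAME TEXT,
        NUM_NULLS INTEGER
    );
    INSERT INTO TBLS VALUES (1, 'old_name');
    INSERT INTO TAB_COL_STATS VALUES (1, 'col1', 7);
""")


def stats_for(table_name):
    """Look up column stats by table name through the TBL_ID foreign key."""
    return conn.execute(
        "SELECT s.COLUMN_NAME, s.NUM_NULLS "
        "FROM TAB_COL_STATS s JOIN TBLS t ON s.TBL_ID = t.TBL_ID "
        "WHERE t.TBL_NAME = ?",
        (table_name,),
    ).fetchall()


# Renaming the table touches only TBLS; the stats rows need no update.
conn.execute("UPDATE TBLS SET TBL_NAME = 'new_name' WHERE TBL_ID = 1")
renamed_stats = stats_for("new_name")
```

The cost of this design is the extra join on every lookup by name, which is the performance question discussed in the comment above.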





[jira] [Assigned] (HIVE-11421) Support Schema evolution for ACID tables

2015-10-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-11421:
---

Assignee: Matt McCline  (was: Eugene Koifman)

> Support Schema evolution for ACID tables
> 
>
> Key: HIVE-11421
> URL: https://issues.apache.org/jira/browse/HIVE-11421
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Matt McCline
>
> Currently schema evolution is not supported for ACID tables.
> Whatever limitations ORC based tables have in general with respect to schema 
> evolution apply to ACID tables.  Generally, it's possible to have an ORC based 
> table in Hive where different partitions have different schemas as long as all 
> data files in each partition have the same schema (and match the metastore 
> partition information).
> With ACID tables the above "as long as ..." part can easily be violated.
> {noformat}
> CREATE TABLE acid_partitioned2(a INT, b STRING) PARTITIONED BY(bkt INT) 
> CLUSTERED BY(a) INTO 2 BUCKETS STORED AS ORC;
> insert into table acid_partitioned2 partition(bkt=1) values(1, 'part 
> one'),(2, 'part one'), (3, 'part two'),(4, 'part three');
> alter table acid_partitioned2 add columns(c int, d string);
> insert into table acid_partitioned2 partition(bkt=2) values(1, 'part one', 
> 10, 'str10'),(2, 'part one', 20, 'str20'), (3, 'part two', 30, 'str30'),(4, 
> 'part three', 40, 'str40');
> insert into table acid_partitioned2 partition(bkt=1) values(5, 'part one', 1, 
> 'blah'),(6, 'part one', 2, 'doh!');
> {noformat}
> Now partition bkt=1 will have delta files with different schemas which have 
> to be merged on read, which leads to 
> {noformat}
> Error: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 9
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:247)
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.<init>(RecordReaderImpl.java:1864)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.createTreeReader(RecordReaderImpl.java:2263)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.access$000(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.<init>(RecordReaderImpl.java:1865)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.createTreeReader(RecordReaderImpl.java:2263)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:283)
> at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:492)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:181)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:460)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1109)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1007)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:245)
> ... 8 more
> {noformat}





[jira] [Updated] (HIVE-12018) beeline --help doesn't return to original prompt

2015-10-02 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated HIVE-12018:
-
Attachment: HIVE-12018.2.patch

Uploaded patch against 'master' branch.


> beeline --help doesn't return to original prompt
> 
>
> Key: HIVE-12018
> URL: https://issues.apache.org/jira/browse/HIVE-12018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
> Attachments: HIVE-12018.1.patch, HIVE-12018.2.patch
>
>
> "beeline --help"  displays the help message and returns to beeline prompt. 
> The common pattern is to return to the unix prompt. The intention of any 
> command help is to relaunch the same command with correct parameters.
> One such output is :
> {quote}
> $ beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>    -u <database url>               the JDBC URL to connect to
>    -n <username>                   the username to connect as
>    -p <password>                   the password to connect as
> .
> Beeline version .. by Apache Hive
> beeline> 
> {quote}
> The expected return prompt should be  "$" (the unix prompt).





[jira] [Updated] (HIVE-12018) beeline --help doesn't return to original prompt

2015-10-02 Thread Mohammad Kamrul Islam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohammad Kamrul Islam updated HIVE-12018:
-
Attachment: HIVE-12018.1.patch

patch uploaded.

> beeline --help doesn't return to original prompt
> 
>
> Key: HIVE-12018
> URL: https://issues.apache.org/jira/browse/HIVE-12018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
> Attachments: HIVE-12018.1.patch
>
>
> "beeline --help"  displays the help message and returns to beeline prompt. 
> The common pattern is to return to the unix prompt. The intention of any 
> command help is to relaunch the same command with correct parameters.
> One such output is :
> {quote}
> $ beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>    -u <database url>               the JDBC URL to connect to
>    -n <username>                   the username to connect as
>    -p <password>                   the password to connect as
> .
> Beeline version .. by Apache Hive
> beeline> 
> {quote}
> The expected return prompt should be  "$" (the unix prompt).





[jira] [Commented] (HIVE-12027) simplify file metadata cache ppd api

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941831#comment-14941831
 ] 

Sergey Shelukhin commented on HIVE-12027:
-

[~alangates] can you take a look? nogen patch has the actual change

> simplify file metadata cache ppd api
> 
>
> Key: HIVE-12027
> URL: https://issues.apache.org/jira/browse/HIVE-12027
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: HIVE-12027.nogen.patch, HIVE-12027.patch
>
>
> I made it unwieldy for iterator model when the number of files is too large. 
> Fix it. Not shipped anywhere yet.





[jira] [Updated] (HIVE-12027) simplify file metadata cache ppd api

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12027:

Attachment: HIVE-12027.nogen.patch
HIVE-12027.patch

Very simple patch. Most of the thrift changes are due to the stupid gen date 
feature, as usual.

> simplify file metadata cache ppd api
> 
>
> Key: HIVE-12027
> URL: https://issues.apache.org/jira/browse/HIVE-12027
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: HIVE-12027.nogen.patch, HIVE-12027.patch
>
>
> I made it unwieldy for iterator model when the number of files is too large. 
> Fix it. Not shipped anywhere yet.





[jira] [Updated] (HIVE-12027) simplify file metadata cache ppd api

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12027:

Description: I made it unwieldy for iterator model when the number of files 
is too large. Fix it. Not shipped anywhere yet.  (was: I made it unwieldy for 
iterator model when the number of files is too large. Fix it.)

> simplify file metadata cache ppd api
> 
>
> Key: HIVE-12027
> URL: https://issues.apache.org/jira/browse/HIVE-12027
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> I made it unwieldy for iterator model when the number of files is too large. 
> Fix it. Not shipped anywhere yet.





[jira] [Comment Edited] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941783#comment-14941783
 ] 

Sergey Shelukhin edited comment on HIVE-11969 at 10/2/15 9:29 PM:
--

Tested the recent patch on cluster, clogging the nodes with spurious llap 
processes :)
The CLI looks like the attached while the corresponding Tez app is submitted 
but not running. When I stopped the llap cluster, the query started running.
I could run queries (such as "use database") without Tez AM (see OK above the 
query start).
Query also compiled without waiting for AM.


!Screen Shot 2015-10-02 at 14.23.17 .png!





was (Author: sershe):
Tested the recent patch on cluster, clogging it with spurious llap processes :)
The CLI looks like the attached while Tez app is submitted but not running. 
When I stopped the llap cluster, the query started running.
I could run the queries (such as "use database") without Tez AM (see OK above 
the query start).
Query also compiled without waiting for AM.


!Screen Shot 2015-10-02 at 14.23.17 .png!




> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch, Screen Shot 2015-10-02 at 14.23.17 .png
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.
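
The idea in the description can be sketched with a future: the slow session start is submitted to a background thread, local commands run in the meantime, and only the step that needs the session waits for it. This is a minimal Python illustration, not the patch itself; open_session and run_local_command are hypothetical stand-ins, and the sleep only simulates the delay of waiting for the AM.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def open_session():
    """Hypothetical stand-in for the slow step (waiting for the AM to come up)."""
    time.sleep(0.2)
    return "session-ready"


def run_local_command(command):
    """Hypothetical local command that needs no session."""
    return f"OK: {command}"


executor = ThreadPoolExecutor(max_workers=1)
pending_session = executor.submit(open_session)  # returns immediately

# Local work proceeds while the session is still starting.
local_result = run_local_command("use database")

# Only a step that really needs the session blocks on it.
session = pending_session.result()
executor.shutdown()
```

This matches the behaviour reported in the comment above: commands such as "use database" return OK without the AM, and the query itself is the first thing that has to wait.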





[jira] [Updated] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records

2015-10-02 Thread Roshan Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roshan Naik updated HIVE-11983:
---
Attachment: HIVE-11983.5.patch

Uploading patch v5 addressing [~ekoifman]'s comments

> Hive streaming API uses incorrect logic to assign buckets to incoming records
> -
>
> Key: HIVE-11983
> URL: https://issues.apache.org/jira/browse/HIVE-11983
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.2.1
>Reporter: Roshan Naik
>Assignee: Roshan Naik
>  Labels: streaming, streaming_api
> Attachments: HIVE-11983.3.patch, HIVE-11983.4.patch, 
> HIVE-11983.5.patch, HIVE-11983.patch
>
>
> The Streaming API tries to distribute records evenly into buckets. 
> All records in every Transaction that is part of a TransactionBatch go to the 
> same bucket, and a new bucket number is chosen for each TransactionBatch.
> Fix: the API needs to hash each record to determine which bucket it belongs to. 
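
A minimal sketch of the proposed fix, in Python for brevity: the bucket is derived from each record's bucketing-column value instead of being picked once per TransactionBatch. The names bucket_for and assign_buckets are hypothetical, and zlib.crc32 is only a stand-in hash chosen to keep the sketch deterministic; it is not the hash function Hive uses, so the bucket numbers it produces will not match Hive's.

```python
import zlib


def bucket_for(record_key, num_buckets):
    """Map a record's bucketing-column value to a bucket number.

    zlib.crc32 is a stand-in hash for this sketch, not Hive's hash function.
    """
    digest = zlib.crc32(str(record_key).encode("utf-8"))
    return digest % num_buckets


def assign_buckets(keys, num_buckets):
    """Group keys by bucket: one hash per record, not one bucket per batch."""
    buckets = {}
    for key in keys:
        buckets.setdefault(bucket_for(key, num_buckets), []).append(key)
    return buckets
```

The property that matters is that the same key always lands in the same bucket, regardless of which batch carried it; readers that rely on bucketing depend on exactly that.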





[jira] [Comment Edited] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941783#comment-14941783
 ] 

Sergey Shelukhin edited comment on HIVE-11969 at 10/2/15 9:22 PM:
--

Tested the recent patch on cluster, clogging it with spurious llap processes :)
The CLI looks like the attached while Tez app is submitted but not running. 
When I stopped the llap cluster, the query started running.
I could run the queries (such as "use database") without Tez AM (see OK above 
the query start).
Query also compiled without waiting for AM.


!Screen Shot 2015-10-02 at 14.23.17 .png!





was (Author: sershe):
Tested the recent patch on cluster, clogging it with spurious llap processes :)
The CLI looks like the attached while Tez app is submitted but not running. 
When I stopped the llap cluster, the query started running.
I could run the queries (such as "use database") without Tez AM (see OK above 
the query start).
Query also compiled without waiting for AM.

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch, Screen Shot 2015-10-02 at 14.23.17 .png
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11969:

Attachment: Screen Shot 2015-10-02 at 14.23.17 .png

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch, Screen Shot 2015-10-02 at 14.23.17 .png
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Updated] (HIVE-12026) Add test case to check permissions when truncating partition

2015-10-02 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-12026:
--
Attachment: HIVE-12026.1.patch

> Add test case to check permissions when truncating partition
> 
>
> Key: HIVE-12026
> URL: https://issues.apache.org/jira/browse/HIVE-12026
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12026.1.patch
>
>
> Add to the tests added during HIVE-9474, for TRUNCATE PARTITION





[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941783#comment-14941783
 ] 

Sergey Shelukhin commented on HIVE-11969:
-

Tested the recent patch on cluster, clogging it with spurious llap processes :)
The CLI looks like the attached while Tez app is submitted but not running. 
When I stopped the llap cluster, the query started running.
I could run the queries (such as "use database") without Tez AM (see OK above 
the query start).
Query also compiled without waiting for AM.

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Commented] (HIVE-11913) Verify existence of tests for new changes in HiveQA

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941776#comment-14941776
 ] 

Szehon Ho commented on HIVE-11913:
--

Sergio reviewed it on RB, and I chatted with him offline. I will check it in and 
monitor the build to see if it works.

> Verify existence of tests for new changes in HiveQA
> ---
>
> Key: HIVE-11913
> URL: https://issues.apache.org/jira/browse/HIVE-11913
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-11913.patch
>
>
> Would be great if HiveQA could report whether there are test files (Test*, 
> *Test, or qfiles) that are added or changed.
> Note that not every change would need this, so the check should be best effort.
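
The check described above amounts to matching the changed paths of a patch against the three naming patterns. A rough Python sketch follows; it assumes qfiles carry a .q extension and test classes are .java files, and the function names are hypothetical rather than taken from the attached patch.

```python
import os


def is_test_file(path):
    """Return True for names matching Test*, *Test, or a .q query file."""
    name = os.path.basename(path)
    stem, ext = os.path.splitext(name)
    if ext == ".q":
        return True
    return ext == ".java" and (stem.startswith("Test") or stem.endswith("Test"))


def changed_test_files(changed_paths):
    """Filter a patch's changed paths down to the test files among them."""
    return [p for p in changed_paths if is_test_file(p)]
```

An empty result would be reported as a note rather than a failure, since not every change needs a test.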





[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11969:

Attachment: HIVE-11969.03.patch

Hmm, that was a wrong patch

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.01.patch, HIVE-11969.02.patch, 
> HIVE-11969.03.patch, HIVE-11969.patch
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Commented] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941764#comment-14941764
 ] 

Szehon Ho commented on HIVE-11866:
--

No problem, +1

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-11866.2.patch, HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using a 
> LDAP Server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic usecases.





[jira] [Updated] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2015-10-02 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-11866:
-
Attachment: HIVE-11866.2.patch

Incorporated comments from the RB review. Thanks Szehon!!! 

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-11866.2.patch, HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using a 
> LDAP Server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic usecases.





[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941749#comment-14941749
 ] 

Hive QA commented on HIVE-11634:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764798/HIVE-11634.98.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9627 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_partition_diff_num_cols.q-vectorization_10.q-orc_merge9.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5500/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5500/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5500/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764798 - PreCommit-HIVE-TRUNK-Build

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.
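
The rewrite can be pictured as projecting each struct in the IN list onto the partition columns and using the result as an extra, partition-only predicate. The Python sketch below shows only that projection on the example from the description; partition_only_tuples is a hypothetical helper, not the optimizer code.

```python
def partition_only_tuples(columns, in_list, partition_columns):
    """Project each tuple of an IN list onto the partition columns.

    Returns the kept column names and the de-duplicated projected tuples,
    preserving first-seen order.
    """
    keep = [i for i, col in enumerate(columns) if col in partition_columns]
    projected = []
    for row in in_list:
        reduced = tuple(row[i] for i in keep)
        if reduced not in projected:
            projected.append(reduced)
    return [columns[i] for i in keep], projected


cols, values = partition_only_tuples(
    ["ds", "key"],
    [("2000-04-08", 1), ("2000-04-09", 2)],
    {"ds"},
)

# Partitions that survive pruning under the derived predicate.
all_partitions = ["2000-04-08", "2000-04-09", "2000-04-10"]
kept_partitions = [p for p in all_partitions if (p,) in values]
```

The derived predicate is implied by the original one, so adding it cannot change the query result; it only gives the pruner something it can evaluate from partition values alone.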





[jira] [Commented] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941695#comment-14941695
 ] 

Szehon Ho commented on HIVE-11866:
--

Sorry, I took another glance and there's one more item: the version needs to be 
moved to the parent pom.  Apologies for missing this in the first pass.

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using a 
> LDAP Server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic usecases.





[jira] [Commented] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941690#comment-14941690
 ] 

Szehon Ho commented on HIVE-11866:
--

Thanks, +1

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using a 
> LDAP Server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic usecases.





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941665#comment-14941665
 ] 

Aaron Dossett commented on HIVE-11977:
--

Thanks, [~ashutoshc], I checked the Avro JIRA based on your suggestion.  The 
Avro project declined that option in AVRO-1530 and suggested clients ignore 
zero length files.  That also led me to HIVE-7316, which my issue duplicates.

[~brocknoland] Your thoughts, since you are on both of the above JIRAs?

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro 
> table,  all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).  
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more
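
One way to "behave reasonably" is to check the file length before handing the file to the Avro reader, so that an empty file yields no records instead of an exception. The sketch below shows that guard in Python as an illustration of the idea only; readable_data_files is a hypothetical name, and the attached patches may handle the case differently.

```python
import os


def readable_data_files(paths):
    """Drop zero-length files so a reader never tries to parse them."""
    return [p for p in paths if os.path.getsize(p) > 0]
```

A length check is cheaper than catching the "Not a data file" error after the fact, and it does not hide genuinely corrupt non-empty files, which would still fail when opened.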





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941648#comment-14941648
 ] 

Ashutosh Chauhan commented on HIVE-11977:
-

If I understand correctly, the reason for your suggestion is that the Reader should 
be resilient to such invalid files. If so, I think the better place to skip such 
files is Avro's native reader itself. 
That way all Avro users get the advantage of this, not just Hive. E.g., if you 
read that data directly (i.e., outside of Hive) then this fix will again be 
needed.

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro 
> table,  all hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).  
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more





[jira] [Commented] (HIVE-11852) numRows and rawDataSize table properties are not replicated

2015-10-02 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941642#comment-14941642
 ] 

Sushanth Sowmyan commented on HIVE-11852:
-

[~ashutoshc], the problem with a config property here is that the stats squish 
I'm trying to prevent does not happen on the ql side. It happens in the 
metastore, in the AlterTableHandler, when an alter table is issued from the 
client side. The metastore then decides that since the table has been altered, 
the table is now different, and thus the stats must be nuked.

I feel it would be cleaner, and would avoid this problem, if the decision to 
nuke the stats were made on the ql side rather than by the metastore. But if 
stats squishing and table altering were then two different metastore calls, we 
would run into issues where one succeeding and the other failing would lead to 
incorrect data elsewhere, apart from other performance implications.
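
A toy model may make the failure mode concrete. The sketch below is an illustration only and is not Hive's AlterTableHandler: `toy_alter_table` and `missing_properties` are hypothetical names, and the assumption that exactly numRows and rawDataSize are dropped is taken from the assertion quoted in the issue description.

```python
# Row-level statistics that went missing in the assertion quoted in the issue.
ROW_STATS = ("numRows", "rawDataSize")


def toy_alter_table(new_props):
    """Toy stand-in for a server-side alter: drop row-level stats unconditionally."""
    return {key: value for key, value in new_props.items() if key not in ROW_STATS}


def missing_properties(expected, found):
    """Return the property names present in `expected` but absent from `found`."""
    return sorted(set(expected) - set(found))


# Properties carried by the exported table, as listed in the assertion.
exported = {"numFiles": 1, "numRows": 2, "totalSize": 560, "rawDataSize": 440}

# The import issues an alter; the toy server side discards the row-level stats.
imported = toy_alter_table(exported)

print(imported)
print(missing_properties(exported, imported))
```

Because the caller has no say in what the server side discards, a client-side config property alone would not change the outcome in this model.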

> numRows and rawDataSize table properties are not replicated
> ---
>
> Key: HIVE-11852
> URL: https://issues.apache.org/jira/browse/HIVE-11852
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.2.1
>Reporter: Paul Isaychuk
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-11852.patch
>
>
> numRows and rawDataSize table properties are not replicated when exported for 
> replication and re-imported.
> {code}
> Table drdbnonreplicatabletable.vanillatable has different TblProps from 
> drdbnonreplicatabletable.vanillatable expected [{numFiles=1, numRows=2, 
> totalSize=560, rawDataSize=440}] but found [{numFiles=1, totalSize=560}]
> java.lang.AssertionError: Table drdbnonreplicatabletable.vanillatable has 
> different TblProps from drdbnonreplicatabletable.vanillatable expected 
> [{numFiles=1, numRows=2, totalSize=560, rawDataSize=440}] but found 
> [{numFiles=1, totalSize=560}]
> {code}





[jira] [Commented] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table acess

2015-10-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941620#comment-14941620
 ] 

Aihua Xu commented on HIVE-10755:
-

Thanks for reviewing, Daniel. I will take a look and try to add the test case. 

> Rework on HIVE-5193 to enhance the column oriented table acess
> --
>
> Key: HIVE-10755
> URL: https://issues.apache.org/jira/browse/HIVE-10755
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog
>Affects Versions: 1.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.0.0
>
> Attachments: HIVE-10755.patch
>
>
> Add the support of column pruning for column oriented table access which was 
> done in HIVE-5193 but was reverted due to the join issue in HIVE-10720.
> In 1.3.0, the patch posted by Viray didn't work, probably due to some jar 
> reference. That seems to have been fixed, and the patch now works in 2.0.0.





[jira] [Updated] (HIVE-11793) SHOW LOCKS with DbTxnManager ignores filter options

2015-10-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11793:
--
Priority: Minor  (was: Major)

> SHOW LOCKS with DbTxnManager ignores filter options
> ---
>
> Key: HIVE-11793
> URL: https://issues.apache.org/jira/browse/HIVE-11793
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
>
> https://cwiki.apache.org/confluence/display/Hive/Locking and 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowLocks
>  list various options that can be used with SHOW LOCKS, e.g. 
> When ACID is enabled, all these options are ignored and a full list is 
> returned.
> (also only ext lock id is shown, int lock id is not).
> see DDLTask.showLocks() and TxnHandler.showLocks()
> requires extending ShowLocksRequest which is a Thrift object





[jira] [Updated] (HIVE-4726) HCatDelegator should not have to parse error text; should instead rely on canonical errorCode

2015-10-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-4726:
-
Assignee: (was: Eugene Koifman)

> HCatDelegator should not have to parse error text; should instead rely on 
> canonical errorCode
> -
>
> Key: HIVE-4726
> URL: https://issues.apache.org/jira/browse/HIVE-4726
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> HCatDelegator.descOnePartition() and addOnePartition() look for specific 
> exception names in Hive stderr.
> This is unreliable.  Hive should produce errorCode value from 
> org.apache.hadoop.hive.ql.ErrorMsg.  This likely needs to be fixed in DDLTask 
> which are called on behalf of HCatDelegator methods.





[jira] [Updated] (HIVE-5001) [WebHCat] JobState is read/written with different user credentials

2015-10-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5001:
-
Assignee: (was: Eugene Koifman)

> [WebHCat] JobState is read/written with different user credentials
> --
>
> Key: HIVE-5001
> URL: https://issues.apache.org/jira/browse/HIVE-5001
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, WebHCat
>Affects Versions: 0.11.0
>Reporter: Eugene Koifman
>Priority: Minor
>
> JobState can be persisted to HDFS or Zookeeper.  At various points in the 
> lifecycle it's accessed with different user credentials thus may cause errors 
> depending on how permissions are set.
> Example:
> When submitting a MR job, templeton.JarDelegator is used.
> It calls LauncherDelegator#queueAsUser() which runs TempletonControllerJob 
> with UserGroupInformation.doAs().
> TempletonControllerJob will in turn create JobState and persist it.
> LauncherDelegator.registerJob() also modifies JobState but w/o doing a doAs()
> So in the latter case it's possible that the persisted state of JobState is 
> modified by a different user than the one that created/owns the file.
> templeton.tool.HDFSCleanup tries to delete these files w/o doAs.
> The 'childid' file, for example, is created with rw-r--r--,
> and its parent directory (job_201308051224_0001) has rwxr-xr-x.
> HDFSStorage doesn't set file permissions explicitly so it must be using 
> default permissions.
> So there is a potential issue here (depending on UMASK) especially once 
> HIVE-4601 is addressed.
> Actually, even w/o HIVE-4601 the user that owns the WebHCat process is likely 
> different than the one submitting a request.
> The default for templeton.storage.class is 
> org.apache.hcatalog.templeton.tool.HDFSStorage, but it's likely that most 
> production environments change it to Zookeeper, which may explain why this 
> issue is not commonly seen.
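
The permissions mentioned above are what a umask of 022 would produce from the conventional defaults of 666 for files and 777 for directories. The sketch below works through that arithmetic. It assumes the usual POSIX-style rule (requested mode AND NOT umask); `apply_umask` and `to_symbolic` are hypothetical helper names, not WebHCat or HDFS APIs.

```python
def apply_umask(requested_mode, umask):
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask


def to_symbolic(mode):
    """Render the low nine permission bits as an rwxrwxrwx-style string."""
    flags = "rwxrwxrwx"
    return "".join(
        flag if mode & (1 << (8 - index)) else "-"
        for index, flag in enumerate(flags)
    )


# Conventional defaults before the umask is applied.
FILE_DEFAULT = 0o666
DIR_DEFAULT = 0o777

for umask in (0o022, 0o077):
    file_mode = apply_umask(FILE_DEFAULT, umask)
    dir_mode = apply_umask(DIR_DEFAULT, umask)
    print(oct(umask), to_symbolic(file_mode), to_symbolic(dir_mode))
```

With a umask of 022 other users can read but not modify or delete the files; with 077 they cannot even read them, which is why the outcome depends on the umask in effect.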





[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Aaron Dossett (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941603#comment-14941603
 ] 

Aaron Dossett commented on HIVE-11977:
--

[~ashutoshc] Thank you for your response! My thought is that any process for 
generating this data could have failure scenarios that result in zero-length 
files; this was the case when I initially ran into this issue.  A file was 
opened on HDFS and "held" as a zero-length file before data was written to it, 
and the process crashed before any data could be written.  The consequence of 
these cases, that the entire table is unreadable (based on my experience), 
seems disproportionate to the actual problem.  Likewise, a process deleting 
empty files could expose small windows where the table was unusable.

Would adding a warning and/or adding an option like 
{{hive.exec.orc.skip.corrupt.data}} be more appropriate than silently ignoring 
the files?  This is my first foray into Hive internals, so perhaps that orc 
option is not an exact comparison to this situation, but as a user it seems 
similar.

Thank you again for the response and your feedback!
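
As a rough sketch of the option floated above, the snippet below filters zero-length files out of an input list and logs a warning for each one. It is an illustration in Python rather than Hive code; `select_readable_files` and its `skip_empty` flag are hypothetical names, not an existing Hive setting.

```python
import logging
import os
import tempfile

logger = logging.getLogger("empty_file_filter")


def select_readable_files(paths, skip_empty=True):
    """Return the paths to read, optionally dropping zero-length files.

    With skip_empty=False a zero-length file is treated as a hard error,
    which mirrors the behavior described in the issue.
    """
    selected = []
    for path in paths:
        if os.path.getsize(path) == 0:
            if skip_empty:
                logger.warning("Skipping zero-length file: %s", path)
                continue
            raise IOError("Not a data file (zero length): %s" % path)
        selected.append(path)
    return selected


# Demo: one empty file and one file with some content.
workdir = tempfile.mkdtemp()
empty_file = os.path.join(workdir, "part-00000.avro")
data_file = os.path.join(workdir, "part-00001.avro")
open(empty_file, "wb").close()
with open(data_file, "wb") as handle:
    handle.write(b"Obj\x01")

print(select_readable_files([empty_file, data_file]) == [data_file])
```

Logging a warning keeps the skip visible, which addresses the concern about silently ignoring files.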

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro 
> table,  all hive queries on the table fail.
> This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).  
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more





[jira] [Updated] (HIVE-12021) HivePreFilteringRule may introduce wrong common operands

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12021:
---
Attachment: HIVE-12021.01.patch

> HivePreFilteringRule may introduce wrong common operands
> 
>
> Key: HIVE-12021
> URL: https://issues.apache.org/jira/browse/HIVE-12021
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12021.01.patch
>
>






[jira] [Updated] (HIVE-11699) Support special characters in quoted table names

2015-10-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11699:
---
Attachment: HIVE-11699.09.patch

> Support special characters in quoted table names
> 
>
> Key: HIVE-11699
> URL: https://issues.apache.org/jira/browse/HIVE-11699
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, 
> HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch, 
> HIVE-11699.06.patch, HIVE-11699.07.patch, HIVE-11699.08.patch, 
> HIVE-11699.09.patch
>
>
> Right now table names can only be "[a-zA-Z_0-9]+". This patch tries to 
> investigate how much change there should be if we would like to support 
> special characters, e.g., "/" in table names.
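
To illustrate the restriction described above, the following sketch checks candidate names against a letters, digits and underscore pattern. It is a Python illustration, not Hive's parser; `TABLE_NAME` and `is_valid_table_name` are hypothetical names.

```python
import re

# Letters, digits and underscore only, one or more characters.
TABLE_NAME = re.compile(r"[a-zA-Z_0-9]+")


def is_valid_table_name(name):
    """Return True only if the whole name matches the restricted pattern."""
    return TABLE_NAME.fullmatch(name) is not None


for candidate in ("mytable", "my_table_01", "tmp/table", "sales-2015"):
    print(candidate, is_valid_table_name(candidate))
```

Names containing "/" or "-" are rejected under this pattern, which is the kind of name the patch investigates supporting when quoted.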





[jira] [Commented] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-10-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941569#comment-14941569
 ] 

Ashutosh Chauhan commented on HIVE-9447:


This patch needs more work. First, it needs to be rebased on master. Second, it 
executes the query outside of a transaction, which needs to be fixed.

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when dropping or adding 
> partitions or tables. To query for an unused column descriptor, the current 
> implementation uses DataNucleus' range function, which is basically 
> equivalent to LIMIT syntax. However, Oracle does not support LIMIT, so the 
> query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of 
> rows in our case). Metastore may become unresponsive because of this. 
> Metastore only needs to know whether the specific CD_ID is referenced in the 
> SDS table and does not need to access the whole row, so we can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, 
> which is faster. 
> For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
> exist. However, the new query does not hurt.  
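
The existence check proposed above can be tried out with any SQL engine. The sketch below uses SQLite from the Python standard library purely as a stand-in for the metastore database; the table is trimmed to three columns, and the index name `SDS_CD_ID_IDX` and the helper `is_cd_referenced` are hypothetical.

```python
import sqlite3

# In-memory stand-in for the metastore's SDS table (columns trimmed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER, LOCATION TEXT)"
)
conn.execute("CREATE INDEX SDS_CD_ID_IDX ON SDS (CD_ID)")

# Many storage descriptors (one per partition) sharing one column descriptor.
conn.executemany(
    "INSERT INTO SDS (CD_ID, LOCATION) VALUES (?, ?)",
    [(7, "/warehouse/t/p=%d" % part) for part in range(1000)],
)


def is_cd_referenced(connection, cd_id):
    """Check whether any SDS row still references the column descriptor."""
    (count,) = connection.execute(
        "SELECT COUNT(1) FROM SDS WHERE CD_ID = ?", (cd_id,)
    ).fetchone()
    return count > 0


print(is_cd_referenced(conn, 7))
print(is_cd_referenced(conn, 8))
```

The query touches only the CD_ID column, so an engine can answer it from the index without fetching the wider rows. Whether a given database actually does so depends on its planner.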





[jira] [Issue Comment Deleted] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-10-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-9447:
---
Comment: was deleted

(was: This patch needs more work. First, it needs to be rebased on master. 
Second, it executes the query outs)

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when dropping or adding 
> partitions or tables. To query for an unused column descriptor, the current 
> implementation uses DataNucleus' range function, which is basically 
> equivalent to LIMIT syntax. However, Oracle does not support LIMIT, so the 
> query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of 
> rows in our case). Metastore may become unresponsive because of this. 
> Metastore only needs to know whether the specific CD_ID is referenced in the 
> SDS table and does not need to access the whole row, so we can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, 
> which is faster. 
> For other DBs that support LIMIT syntax, such as MySQL, this problem does not 
> exist. However, the new query does not hurt.  





[jira] [Updated] (HIVE-12021) HivePreFilteringRule may introduce wrong common operands

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12021:
---
Attachment: (was: HIVE-12021.patch)

> HivePreFilteringRule may introduce wrong common operands
> 
>
> Key: HIVE-12021
> URL: https://issues.apache.org/jira/browse/HIVE-12021
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
>






[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

2015-10-02 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941548#comment-14941548
 ] 

Ashutosh Chauhan commented on HIVE-11977:
-

Thanks for patch [~dossett] 
A 0-length file is an invalid Avro file, since Avro's {{DataFileWriter}} 
always writes the MAGIC header for the version. That's the reason 
{{DataFileReader}} expects it and throws when it doesn't get one.
It seems these 0-length files got there because of some faulty generator 
process. Isn't it better to just not generate those 0-length files? Or, 
alternatively, to delete these faulty files?

> Hive should handle an external avro table with zero length files present
> 
>
> Key: HIVE-11977
> URL: https://issues.apache.org/jira/browse/HIVE-11977
> Project: Hive
>  Issue Type: Bug
>Reporter: Aaron Dossett
>Assignee: Aaron Dossett
> Attachments: HIVE-11977-2.patch, HIVE-11977.patch
>
>
> If a zero length file is in the top level directory housing an external avro 
> table,  all hive queries on the table fail.
> This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader 
> creates a new org.apache.avro.file.DataFileReader and DataFileReader throws 
> an exception when trying to read an empty file (because the empty file lacks 
> the magic number marking it as avro).  
> AvroGenericRecordReader should detect an empty file and then behave 
> reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more





[jira] [Updated] (HIVE-12021) HivePreFilteringRule may introduce wrong common operands

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12021:
---
Attachment: HIVE-12021.patch

> HivePreFilteringRule may introduce wrong common operands
> 
>
> Key: HIVE-12021
> URL: https://issues.apache.org/jira/browse/HIVE-12021
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 1.3.0, 2.0.0, 1.2.2
>
> Attachments: HIVE-12021.patch
>
>






[jira] [Commented] (HIVE-11952) disable q tests that are both slow and less relevant

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941538#comment-14941538
 ] 

Sergey Shelukhin commented on HIVE-11952:
-

[~jpullokkaran] see HIVE-11992; they should be investigated for perf 
improvement. If you are referring to the minimr tests on CBO, perhaps they 
would become faster by just moving them away from MR; we are de-emphasizing 
and eventually removing MR in Hive 2 anyway.

> disable q tests that are both slow and less relevant
> 
>
> Key: HIVE-11952
> URL: https://issues.apache.org/jira/browse/HIVE-11952
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-11952.01.patch, HIVE-11952.patch
>
>
> We will disable several tests that test obscure and old features and take 
> inordinate amount of time, and file JIRAs to look at their perf if someone 
> still cares about them.





[jira] [Commented] (HIVE-11952) disable q tests that are both slow and less relevant

2015-10-02 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941532#comment-14941532
 ] 

Laljo John Pullokkaran commented on HIVE-11952:
---

[~sershe] What's the plan to re-enable these tests?

> disable q tests that are both slow and less relevant
> 
>
> Key: HIVE-11952
> URL: https://issues.apache.org/jira/browse/HIVE-11952
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.0.0
>
> Attachments: HIVE-11952.01.patch, HIVE-11952.patch
>
>
> We will disable several tests that test obscure and old features and take 
> inordinate amount of time, and file JIRAs to look at their perf if someone 
> still cares about them.





[jira] [Commented] (HIVE-11866) Add framework to enable testing using LDAPServer using LDAP protocol

2015-10-02 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941523#comment-14941523
 ] 

Szehon Ho commented on HIVE-11866:
--

This looks good; it's great to have some tests for LDAP to verify the behavior.

Left comments on the review board.

> Add framework to enable testing using LDAPServer using LDAP protocol
> 
>
> Key: HIVE-11866
> URL: https://issues.apache.org/jira/browse/HIVE-11866
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.3.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-11866.patch
>
>
> Currently there is no unit test coverage for HS2's LDAP Atn provider using a 
> LDAP Server on the backend. This prevents testing of the LDAPAtnProvider with 
> some realistic usecases.





[jira] [Commented] (HIVE-12016) Update log4j2 version to 2.4

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941504#comment-14941504
 ] 

Sergey Shelukhin commented on HIVE-12016:
-

+1

> Update log4j2 version to 2.4
> 
>
> Key: HIVE-12016
> URL: https://issues.apache.org/jira/browse/HIVE-12016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch
>
>
> The latest 2.4 release of log4j2 brought back properties file based 
> configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4
> bump up the version number to 2.4. 





[jira] [Commented] (HIVE-12016) Update log4j2 version to 2.4

2015-10-02 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941501#comment-14941501
 ] 

Prasanth Jayachandran commented on HIVE-12016:
--

Created HIVE-12020 for reverting the XML configuration back to 
properties-based configuration.

> Update log4j2 version to 2.4
> 
>
> Key: HIVE-12016
> URL: https://issues.apache.org/jira/browse/HIVE-12016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch
>
>
> The latest 2.4 release of log4j2 brought back properties file based 
> configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4
> bump up the version number to 2.4. 





[jira] [Commented] (HIVE-12016) Update log4j2 version to 2.4

2015-10-02 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941490#comment-14941490
 ] 

Prasanth Jayachandran commented on HIVE-12016:
--

[~sershe] The old properties file is still not compatible with the new 
properties file, so we still have to migrate from the old properties format to 
the new one, but that should be much easier, I guess. I think it makes sense to 
convert the XML configurations back to the new properties format (IMO, 
properties files are easier to read than XML). I will create a follow-up JIRA 
for that work. In any case we should upgrade to 2.4 :)

> Update log4j2 version to 2.4
> 
>
> Key: HIVE-12016
> URL: https://issues.apache.org/jira/browse/HIVE-12016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch
>
>
> The latest 2.4 release of log4j2 brought back properties file based 
> configuration. https://logging.apache.org/log4j/2.0/changes-report.html#a2.4
> bump up the version number to 2.4. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11990) Loading data inpath from a temporary table dir fails on Windows

2015-10-02 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941489#comment-14941489
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11990:
--

The test failures are not related to this change.


> Loading data inpath from a temporary table dir fails on Windows
> ---
>
> Key: HIVE-11990
> URL: https://issues.apache.org/jira/browse/HIVE-11990
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11990.1.patch, HIVE-11990.2.patch
>
>
> The query runs:
> {noformat}
> load data inpath 'wasb:///tmp/testtemptable/temptablemisc_5/data' overwrite 
> into table temp2;
> {noformat}
> It fails with:
> {noformat}
> FAILED: SemanticException [Error 10028]: Line 2:37 Path is not legal 
> ''wasb:///tmp/testtemptable/temptablemisc_5/data'': Move from: 
> wasb://humb23-hi...@humboldttesting3.blob.core.windows.net/tmp/testtemptable/temptablemisc_5/data
>  to: 
> hdfs://headnode0.humb23-hive1-ssh.h2.internal.cloudapp.net:8020/tmp/hive/hrt_qa/0d5f8b31-5908-44bf-ae4c-9eee956da066/_tmp_space.db/75b44252-42a7-4d28-baf8-4977daa5d49c
>  is not valid. Please check that values for params "default.fs.name" and 
> "hive.metastore.warehouse.dir" do not conflict.
> {noformat}
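
The error above reports a move between a wasb:// location and an hdfs:// location. One way to picture that kind of check is to compare the scheme and authority of the two URIs, as in the sketch below. This is an assumption about the general idea and not Hive's actual validation code; the helper names and the example hostnames are hypothetical.

```python
from urllib.parse import urlparse


def filesystem_of(uri):
    """Return the (scheme, authority) pair that identifies a filesystem."""
    parsed = urlparse(uri)
    return (parsed.scheme, parsed.netloc)


def same_filesystem(source_uri, target_uri):
    """A move is treated as a simple rename only within one filesystem."""
    return filesystem_of(source_uri) == filesystem_of(target_uri)


# Hypothetical locations standing in for the paths in the error message.
source = "wasb://container@account.example.net/tmp/data"
target = "hdfs://namenode.example.net:8020/tmp/hive/scratch"

print(filesystem_of(source))
print(filesystem_of(target))
print(same_filesystem(source, target))
```

When the two pairs differ, the data would have to be copied across filesystems rather than renamed in place.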



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12019) Create unit test for HIVE-10732

2015-10-02 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-12019:
--
Attachment: HIVE-12019.1.patch

> Create unit test for HIVE-10732
> ---
>
> Key: HIVE-12019
> URL: https://issues.apache.org/jira/browse/HIVE-12019
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-12019.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941485#comment-14941485
 ] 

Hive QA commented on HIVE-9447:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764509/HIVE-9447.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5499/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5499/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5499/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5499/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   ff9822e..a1bac80  master -> origin/master
+ git reset --hard HEAD
HEAD is now at ff9822e HIVE-12004 : SDPO doesnt set colExprMap correctly on new 
RS (Ashutosh Chauhan via Prasanth J)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at a1bac80 HIVE-11998 - Improve Compaction process logging (Eugene 
Koifman, reviewed by Jason Dere)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764509 - PreCommit-HIVE-TRUNK-Build

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0
>Reporter: Selina Zhang
>Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> The metastore needs to remove unused column descriptors when partitions or 
> tables are added or dropped. To query for an unused column descriptor, the 
> current implementation uses DataNucleus' range function, which is basically 
> equivalent to LIMIT syntax. However, Oracle does not support LIMIT, so the 
> query is converted to:
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of 
> rows in our case). The metastore may become unresponsive because of this. 
> The metastore only needs to know whether the specific CD_ID is referenced in 
> the SDS table and does not need to access the whole row. We c

[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2015-10-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11553:

Attachment: (was: HIVE-11553.06.patch)

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.03.patch, HIVE-11553.04.patch, HIVE-11553.06.patch, 
> HIVE-11553.07.patch, HIVE-11553.patch
>
>
> This is the first step; uses the simple footer-getting API, without PPD.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11971) testResultSetMetaData() in TestJdbcDriver2.java is failing on CBO AST path

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941480#comment-14941480
 ] 

Hive QA commented on HIVE-11971:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764508/HIVE-11971.02.patch

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9642 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_ctas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_print_header
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ctasnullcol
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestJdbcDriver2.testBuiltInUDFCol
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData
org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaDataDuplicateColumnNames
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5498/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5498/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5498/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764508 - PreCommit-HIVE-TRUNK-Build

> testResultSetMetaData() in TestJdbcDriver2.java is failing on CBO AST path
> --
>
> Key: HIVE-11971
> URL: https://issues.apache.org/jira/browse/HIVE-11971
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11971.01.patch, HIVE-11971.02.patch
>
>
> test is passing because wrong golden file is used.





[jira] [Commented] (HIVE-12016) Update log4j2 version to 2.4

2015-10-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941479#comment-14941479
 ] 

Sergey Shelukhin commented on HIVE-12016:
-

Should we just revert all the xml configuration stuff then and go back to 
properties? No conversion during upgrade required that way.

> Update log4j2 version to 2.4
> 
>
> Key: HIVE-12016
> URL: https://issues.apache.org/jira/browse/HIVE-12016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-12016.1.patch, HIVE-12016.2.patch
>
>
> The latest 2.4 release of log4j2 brought back properties-file-based 
> configuration: https://logging.apache.org/log4j/2.0/changes-report.html#a2.4
> Bump the version number up to 2.4.





[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-10-02 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.98.patch

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we could prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following:
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) 
> is used by the partition pruner to prune partitions that would otherwise not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.
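
The rewrite described above can be sketched roughly as follows. This is a minimal illustration in Python, not the HIVE-11634 patch: the function names and the tuple-based data layout are hypothetical. It projects each struct in the IN list onto the partition columns, which yields the extra predicate a partition pruner could evaluate on its own.

```python
def derive_partition_predicate(columns, in_list, partition_columns):
    """Project an IN list of structs onto the partition columns only.

    columns           -- column names inside struct(...), e.g. ("ds", "key")
    in_list           -- tuples of constants, one per struct in the IN list
    partition_columns -- set of column names that are partition columns
    Returns (kept_columns, projected_values) with duplicates removed.
    """
    keep = [i for i, name in enumerate(columns) if name in partition_columns]
    kept_columns = tuple(columns[i] for i in keep)
    projected = []
    for row in in_list:
        value = tuple(row[i] for i in keep)
        if value not in projected:
            projected.append(value)
    return kept_columns, projected


def prune_partitions(partitions, kept_columns, projected):
    """Keep only partitions whose values appear in the projected IN list."""
    return [p for p in partitions
            if tuple(p[c] for c in kept_columns) in projected]


# struct(ds, key) IN (struct('2000-04-08', 1), struct('2000-04-09', 2))
cols, values = derive_partition_predicate(
    ("ds", "key"),
    [("2000-04-08", 1), ("2000-04-09", 2)],
    {"ds"},
)
partitions = [{"ds": "2000-04-08"}, {"ds": "2000-04-09"}, {"ds": "2000-04-10"}]
remaining = prune_partitions(partitions, cols, values)
```

In this sketch the partition ds='2000-04-10' is dropped because no struct in the IN list carries that value; the original struct(ds, key) predicate still has to be applied to the surviving rows.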





[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-10-02 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941398#comment-14941398
 ] 

Wei Zheng commented on HIVE-11306:
--

Test failures unrelated. [~gopalv] Can you verify the patch in terms of 
performance and commit the patch? Thanks!

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Wei Zheng
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch, 
> HIVE-11306.3.patch, HIVE-11306.5.patch, HIVE-11306.6.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.
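
To illustrate the idea, here is a rough Python sketch. It assumes that "bloom-1" means a Bloom filter with a single hash function, which the issue text does not define, and the class and variable names are hypothetical. The filter is built from the small table's join keys and consulted before a big-table row is spilled, so rows that cannot match are never written.

```python
import zlib


class SingleHashBloomFilter:
    """Bloom filter with one hash function (an assumed reading of 'bloom-1')."""

    def __init__(self, num_bits):
        self.num_bits = num_bits
        self.bits = bytearray((num_bits + 7) // 8)

    def _position(self, key):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        return zlib.crc32(repr(key).encode("utf-8")) % self.num_bits

    def add(self, key):
        pos = self._position(key)
        self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        pos = self._position(key)
        return bool(self.bits[pos // 8] & (1 << (pos % 8)))


# Build the filter from the small table's join keys.
small_table_keys = [101, 205, 333]
bloom = SingleHashBloomFilter(num_bits=1024)
for key in small_table_keys:
    bloom.add(key)

# Only big-table rows that might match need to be spilled.
big_table_rows = [(101, "click"), (999, "view"), (333, "buy"), (42, "view")]
rows_to_spill = [row for row in big_table_rows if bloom.might_contain(row[0])]
```

A filter like this never drops a row that has a match, but it can let through some rows that do not, so the join still has to compare keys afterwards.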





[jira] [Updated] (HIVE-11989) vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master

2015-10-02 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11989:
---
Fix Version/s: 2.0.0

> vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master
> ---
>
> Key: HIVE-11989
> URL: https://issues.apache.org/jira/browse/HIVE-11989
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.0.0
>
> Attachments: HIVE-11989.01.patch
>
>
> need to update the golden files.





[jira] [Updated] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12017:
---
Attachment: (was: HIVE-12017.patch)

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).





[jira] [Updated] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12017:
---
Attachment: HIVE-12017.01.patch

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.01.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).





[jira] [Updated] (HIVE-12017) Do not disable CBO by default when number of joins in a query is equal or less than 1

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12017:
---
Attachment: HIVE-12017.patch

> Do not disable CBO by default when number of joins in a query is equal or 
> less than 1
> -
>
> Key: HIVE-12017
> URL: https://issues.apache.org/jira/browse/HIVE-12017
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12017.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the 
> query contains 1 or 0 joins. Implementation should be able to define easily 
> other query patterns for which we might disable some parts of CBO (in case we 
> want to do it in the future).





[jira] [Commented] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941323#comment-14941323
 ] 

Hive QA commented on HIVE-11720:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764682/HIVE-11720.4.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9612 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_join30.q-vector_data_types.q-filter_join_breaktask.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-scriptfile1.q-union2.q-and-12-more 
- did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5497/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5497/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5497/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764682 - PreCommit-HIVE-TRUNK-Build

> Allow HiveServer2 to set custom http request/response header size
> -
>
> Key: HIVE-11720
> URL: https://issues.apache.org/jira/browse/HIVE-11720
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, 
> HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch
>
>
> In HTTP transport mode, authentication information is sent over as part of 
> HTTP headers. Sometimes (observed when Kerberos is used) the default buffer 
> size for the headers is not enough, resulting in an HTTP 413 FULL head error. 
> We can expose those as customizable params.





[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization

2015-10-02 Thread Olaf Flebbe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941219#comment-14941219
 ] 

Olaf Flebbe commented on HIVE-11175:


{code}
!!{localmavencache}!! 
{code}

> create function using jar does not work with sql std authorization
> --
>
> Key: HIVE-11175
> URL: https://issues.apache.org/jira/browse/HIVE-11175
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Olaf Flebbe
> Fix For: 2.0.0
>
> Attachments: HIVE-11175.1.patch
>
>
> {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} 
> fails with an error saying that accessing the local foo.jar resource requires 
> ADMIN privileges. The same happens for HDFS (DFS_URI).
> The problem is that the semantic analysis enforces the ADMIN privilege for 
> write, but the jar is clearly an input, not an output. 
> Patch and test case attached.





[jira] [Commented] (HIVE-11175) create function using jar does not work with sql std authorization

2015-10-02 Thread Olaf Flebbe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941217#comment-14941217
 ] 

Olaf Flebbe commented on HIVE-11175:


Oops, the path names are relative to my infrastructure ;-)

It seems to me that I would need to enhance 
./beeline/src/java/org/apache/hive/beeline/util/QFileClient.java 
and introduce a !!{localmavencache}!! replacement  ?

Or am I missing something?


> create function using jar does not work with sql std authorization
> --
>
> Key: HIVE-11175
> URL: https://issues.apache.org/jira/browse/HIVE-11175
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Olaf Flebbe
> Fix For: 2.0.0
>
> Attachments: HIVE-11175.1.patch
>
>
> {code:sql}create function xxx as 'xxx' using jar 'file://foo.jar' {code} 
> fails with an error saying that accessing the local foo.jar resource requires 
> ADMIN privileges. The same happens for HDFS (DFS_URI).
> The problem is that the semantic analysis enforces the ADMIN privilege for 
> write, but the jar is clearly an input, not an output. 
> Patch and test case attached.





[jira] [Commented] (HIVE-12004) SDPO doesnt set colExprMap correctly on new RS

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941120#comment-14941120
 ] 

Hive QA commented on HIVE-12004:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764497/HIVE-12004.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5496/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5496/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5496/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764497 - PreCommit-HIVE-TRUNK-Build

> SDPO doesnt set colExprMap correctly on new RS
> --
>
> Key: HIVE-12004
> URL: https://issues.apache.org/jira/browse/HIVE-12004
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12004.patch
>
>
> As a result plan gets into a bad state.





[jira] [Commented] (HIVE-11306) Add a bloom-1 filter for Hybrid MapJoin spills

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941041#comment-14941041
 ] 

Hive QA commented on HIVE-11306:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764494/HIVE-11306.6.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9624 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_bmj_schema_evolution.q-orc_merge5.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5495/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5495/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5495/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764494 - PreCommit-HIVE-TRUNK-Build

> Add a bloom-1 filter for Hybrid MapJoin spills
> --
>
> Key: HIVE-11306
> URL: https://issues.apache.org/jira/browse/HIVE-11306
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Gopal V
>Assignee: Wei Zheng
> Attachments: HIVE-11306.1.patch, HIVE-11306.2.patch, 
> HIVE-11306.3.patch, HIVE-11306.5.patch, HIVE-11306.6.patch
>
>
> HIVE-9277 implemented Spillable joins for Tez, which suffers from a 
> corner-case performance issue when joining wide small tables against a narrow 
> big table (like a user info table join events stream).
> The fact that the wide table is spilled causes extra IO, even though the nDV 
> of the join key might be in the thousands.
> A cheap bloom-1 filter would add a massive performance gain for such queries, 
> massively cutting down on the spill IO costs for the big-table spills.





[jira] [Commented] (HIVE-11822) vectorize NVL UDF

2015-10-02 Thread Takanobu Asanuma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941025#comment-14941025
 ] 

Takanobu Asanuma commented on HIVE-11822:
-

Sorry for the delay. Could you please confirm my plan?
- implement VectorNVL like VectorCoalesce.
- modify Vectorizer and VectorizationContext for VectorNVL.
- add some unit tests and qtests as Gopal said.

> vectorize NVL UDF
> -
>
> Key: HIVE-11822
> URL: https://issues.apache.org/jira/browse/HIVE-11822
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Takanobu Asanuma
>






[jira] [Updated] (HIVE-11976) Extend CBO rules to being able to apply rules only once on a given operator

2015-10-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11976:
---
Attachment: HIVE-11976.02.patch

> Extend CBO rules to being able to apply rules only once on a given operator
> ---
>
> Key: HIVE-11976
> URL: https://issues.apache.org/jira/browse/HIVE-11976
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11976.01.patch, HIVE-11976.02.patch, 
> HIVE-11976.patch
>
>
> Create a way to bail out quickly from HepPlanner if the rule has been already 
> applied on a certain operator.
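
One way to picture the bail-out is a registry that remembers which rule has already fired on which operator. The Python sketch below is only an illustration under that assumption; the class, the function, and the rule and operator names are hypothetical and are not taken from the attached patches or from the planner's API.

```python
class RuleRegistry:
    """Remembers (rule, operator) pairs that have already been processed."""

    def __init__(self):
        self._applied = set()

    def already_applied(self, rule_name, operator_id):
        return (rule_name, operator_id) in self._applied

    def mark_applied(self, rule_name, operator_id):
        self._applied.add((rule_name, operator_id))


def fire_rule(registry, rule_name, operator_id, action):
    """Run action once per (rule, operator); return False when skipped."""
    if registry.already_applied(rule_name, operator_id):
        return False  # bail out quickly, nothing to do
    action(operator_id)
    registry.mark_applied(rule_name, operator_id)
    return True


registry = RuleRegistry()
fired = []
first = fire_rule(registry, "ExampleRule", "join#1", fired.append)
second = fire_rule(registry, "ExampleRule", "join#1", fired.append)
other = fire_rule(registry, "ExampleRule", "join#2", fired.append)
```

The second call on join#1 returns immediately without running the action, while the same rule still fires on a different operator.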





[jira] [Commented] (HIVE-11699) Support special characters in quoted table names

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940911#comment-14940911
 ] 

Hive QA commented on HIVE-11699:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764496/HIVE-11699.08.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9645 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_special_character_in_tabnames_3
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5494/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5494/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5494/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764496 - PreCommit-HIVE-TRUNK-Build

> Support special characters in quoted table names
> 
>
> Key: HIVE-11699
> URL: https://issues.apache.org/jira/browse/HIVE-11699
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, 
> HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch, 
> HIVE-11699.06.patch, HIVE-11699.07.patch, HIVE-11699.08.patch
>
>
> Right now table names can only match "[a-zA-Z_0-9]+". This patch 
> investigates how much would need to change to support special characters, 
> e.g., "/", in table names.
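
A quick check of what such a pattern accepts, as a small Python sketch. It assumes the intended character class is letters, digits, and underscore; the helper name is hypothetical and this is not Hive's parser. It also shows that a range written A-z instead of A-Z would admit the ASCII punctuation that sits between "Z" and "a".

```python
import re

# Letters, digits, and underscore only (the assumed intent).
INTENDED = re.compile(r"[a-zA-Z_0-9]+")
# The range A-z also covers the ASCII characters [ \ ] ^ _ and backtick.
WIDER_RANGE = re.compile(r"[a-zA-z_0-9]+")


def is_plain_table_name(name, pattern=INTENDED):
    """Return True when the whole name matches the given pattern."""
    return pattern.fullmatch(name) is not None


plain_ok = is_plain_table_name("pcr_t1")
slash_ok = is_plain_table_name("a/b")
caret_intended = is_plain_table_name("t^1")
caret_wider = is_plain_table_name("t^1", WIDER_RANGE)
```

With the intended class, a name such as a/b is rejected, which is the restriction the issue proposes to lift for quoted table names.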





[jira] [Updated] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results

2015-10-02 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-11517:

Fix Version/s: (was: 1.2.0)
   (was: 1.0.0)
   1.2.2
   1.0.2

> Vectorized auto_smb_mapjoin_14.q produces different results
> ---
>
> Key: HIVE-11517
> URL: https://issues.apache.org/jira/browse/HIVE-11517
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.0.0, 1.0.2, 1.2.2
>
> Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch
>
>
> Converted Q file to use ORC and turned on vectorization.
> The query:
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> produces 10 instead of 22.
> The query:
> {code}
> select src1.key, src1.cnt1, src2.cnt1 from
> (
>   select key, count(*) as cnt1 from 
>   (
>     select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
>     tbl2 b on a.key = b.key
>   ) subq1 group by key
> ) src1
> join
> (
>   select key, count(*) as cnt1 from 
>   (
>     select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
>     tbl2 b on a.key = b.key
>   ) subq2 group by key
> ) src2
> {code}
> produces:
> {code}
> 0 3   3
> 2 1   1
> 4 1   1
> 5 3   3
> 8 1   1
> 9 1   1
> {code}
> instead of:
> {code}
> 0 9   9
> 2 1   1
> 4 1   1
> 5 9   9
> 8 1   1
> 9 1   1
> {code}





[jira] [Commented] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results

2015-10-02 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940878#comment-14940878
 ] 

Matt McCline commented on HIVE-11517:
-

Fixed.  Thank you!

> Vectorized auto_smb_mapjoin_14.q produces different results
> ---
>
> Key: HIVE-11517
> URL: https://issues.apache.org/jira/browse/HIVE-11517
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.0.0, 1.0.2, 1.2.2
>
> Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch
>
>
> Converted Q file to use ORC and turned on vectorization.
> The query:
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2 from tbl1 a join tbl2 
> b on a.key = b.key
> ) subq1
> {code}
> produces 10 instead of 22.
> The query:
> {code}
> select src1.key, src1.cnt1, src2.cnt1 from
> (
>   select key, count(*) as cnt1 from 
>   (
>     select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
>     tbl2 b on a.key = b.key
>   ) subq1 group by key
> ) src1
> join
> (
>   select key, count(*) as cnt1 from 
>   (
>     select a.key as key, a.value as val1, b.value as val2 from tbl1 a join 
>     tbl2 b on a.key = b.key
>   ) subq2 group by key
> ) src2
> {code}
> produces:
> {code}
> 0 3   3
> 2 1   1
> 4 1   1
> 5 3   3
> 8 1   1
> 9 1   1
> {code}
> instead of:
> {code}
> 0 9   9
> 2 1   1
> 4 1   1
> 5 9   9
> 8 1   1
> 9 1   1
> {code}





[jira] [Commented] (HIVE-11720) Allow HiveServer2 to set custom http request/response header size

2015-10-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940852#comment-14940852
 ] 

Lefty Leverenz commented on HIVE-11720:
---

+1 for the configuration parameter descriptions.

> Allow HiveServer2 to set custom http request/response header size
> -
>
> Key: HIVE-11720
> URL: https://issues.apache.org/jira/browse/HIVE-11720
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11720.1.patch, HIVE-11720.2.patch, 
> HIVE-11720.3.patch, HIVE-11720.4.patch, HIVE-11720.4.patch
>
>
> In HTTP transport mode, authentication information is sent over as part of 
> HTTP headers. Sometimes (observed when Kerberos is used) the default buffer 
> size for the headers is not enough, resulting in an HTTP 413 FULL head error. 
> We can expose those as customizable params.
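
Editor's note: a minimal sketch of how such parameters might be set in hive-site.xml. The two property names are not given in this message; they are what I believe the patch introduced and should be verified against HiveConf before use. The value 16384 is an arbitrary illustration, and header_size_properties is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def header_size_properties(request_bytes, response_bytes):
    """Render a hive-site.xml fragment with the two header-size settings."""
    # ASSUMPTION: property names as recalled from HiveConf; verify before use.
    props = {
        "hive.server2.thrift.http.request.header.size": request_bytes,
        "hive.server2.thrift.http.response.header.size": response_bytes,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# 16384 bytes is an example value only, not a recommendation.
print(header_size_properties(16384, 16384))
```

The generated property elements would go inside the existing configuration element of hive-site.xml on the HiveServer2 host, and would only matter when HTTP transport mode is in use.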





[jira] [Commented] (HIVE-11989) vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master

2015-10-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940848#comment-14940848
 ] 

Lefty Leverenz commented on HIVE-11989:
---

[~pxiong], this needs Fix Version 2.0.0.  Thanks.

> vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master
> ---
>
> Key: HIVE-11989
> URL: https://issues.apache.org/jira/browse/HIVE-11989
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11989.01.patch
>
>
> Need to update the golden files.





[jira] [Commented] (HIVE-11517) Vectorized auto_smb_mapjoin_14.q produces different results

2015-10-02 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940836#comment-14940836
 ] 

Lefty Leverenz commented on HIVE-11517:
---

[~mmccline], the next releases for branch-1.0 and branch-1.2 will be 1.0.2 and 
1.2.2 respectively (not 1.0.0 or 1.2.0) so Fix Version/s needs to be changed.

Also, I don't see the commit to branch-1.2 in email or on GitHub.

> Vectorized auto_smb_mapjoin_14.q produces different results
> ---
>
> Key: HIVE-11517
> URL: https://issues.apache.org/jira/browse/HIVE-11517
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.0.0, 1.2.0, 1.3.0, 2.0.0
>
> Attachments: HIVE-11517.01.patch, HIVE-11517.02.patch
>
>
> Converted Q file to use ORC and turned on vectorization.
> The query:
> {code}
> select count(*) from (
>   select a.key as key, a.value as val1, b.value as val2
>   from tbl1 a join tbl2 b on a.key = b.key
> ) subq1
> {code}
> produces 10 instead of 22.
> The query:
> {code}
> select src1.key, src1.cnt1, src2.cnt1 from
> (
>   select key, count(*) as cnt1 from
>   (
>     select a.key as key, a.value as val1, b.value as val2
>     from tbl1 a join tbl2 b on a.key = b.key
>   ) subq1 group by key
> ) src1
> join
> (
>   select key, count(*) as cnt1 from
>   (
>     select a.key as key, a.value as val1, b.value as val2
>     from tbl1 a join tbl2 b on a.key = b.key
>   ) subq2 group by key
> ) src2
> {code}
> produces:
> {code}
> 0 3   3
> 2 1   1
> 4 1   1
> 5 3   3
> 8 1   1
> 9 1   1
> {code}
> instead of:
> {code}
> 0 9   9
> 2 1   1
> 4 1   1
> 5 9   9
> 8 1   1
> 9 1   1
> {code}





[jira] [Commented] (HIVE-11998) Improve Compaction process logging

2015-10-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940833#comment-14940833
 ] 

Hive QA commented on HIVE-11998:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12764493/HIVE-11998.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9641 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5493/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5493/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5493/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12764493 - PreCommit-HIVE-TRUNK-Build

> Improve Compaction process logging
> --
>
> Key: HIVE-11998
> URL: https://issues.apache.org/jira/browse/HIVE-11998
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11998.patch
>
>



