[jira] [Updated] (HIVE-11538) Add an option to skip init script while running tests

2015-08-27 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11538:
--
Labels: TODOC2.0  (was: )

> Add an option to skip init script while running tests
> -
>
> Key: HIVE-11538
> URL: https://issues.apache.org/jira/browse/HIVE-11538
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11538.2.patch, HIVE-11538.3.patch, HIVE-11538.patch
>
>
> {{q_test_init.sql}} has grown over time. Now, it takes a substantial amount of 
> time. When debugging a particular query which doesn't need such 
> initialization, this delay is an annoyance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11669) OrcFileDump service should support directories

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718115#comment-14718115
 ] 

Hive QA commented on HIVE-11669:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752835/HIVE-11669.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9380 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5095/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5095/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5095/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752835 - PreCommit-HIVE-TRUNK-Build

> OrcFileDump service should support directories
> --
>
> Key: HIVE-11669
> URL: https://issues.apache.org/jira/browse/HIVE-11669
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11669.1.patch
>
>
> The orcfiledump service does not support directories. If a directory is specified, 
> the program should iterate through all the files in the directory and 
> perform a file dump on each.





[jira] [Commented] (HIVE-10978) Document fs.trash.interval wrt Hive and HDFS Encryption

2015-08-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718088#comment-14718088
 ] 

Lefty Leverenz commented on HIVE-10978:
---

A version note (or bug note) could also be included in the DROP TABLE and DROP 
PARTITION sections of the DDL doc:

* [DDL -- Drop Table | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropTable]
* [DDL -- Drop Partitions | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropPartitions]

> Document fs.trash.interval wrt Hive and HDFS Encryption
> ---
>
> Key: HIVE-10978
> URL: https://issues.apache.org/jira/browse/HIVE-10978
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, Security
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Priority: Critical
>  Labels: TODOC1.2
>
> This should be documented in the 1.2.1 Release Notes.
> When HDFS is encrypted (TDE is enabled), DROP TABLE and DROP PARTITION have 
> unexpected behavior when the Hadoop Trash feature is enabled.
> The latter is enabled by setting fs.trash.interval > 0 in core-site.xml.
> When Trash is enabled, the data file for the table should be "moved" to the 
> Trash bin. If the table is inside an Encryption Zone, this "move" operation 
> is not allowed.
> There are 2 ways to deal with this:
> 1. Use PURGE, as in DROP TABLE blah PURGE. This skips the Trash bin even if 
> it is enabled.
> 2. Set fs.trash.interval = 0. It is critical that this config change is made 
> in core-site.xml. Setting it in hive-site.xml may lead to very strange 
> behavior where the table metadata is deleted but the data file remains.  This 
> will lead to data corruption if a table with the same name is later created.
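
The core-site.xml setting referenced in workaround 2 would look roughly like this (a minimal sketch for illustration, not taken from the issue; the value is a checkpoint interval in minutes, and 0 disables Trash):

```xml
<!-- core-site.xml, NOT hive-site.xml (see the warning above) -->
<property>
  <name>fs.trash.interval</name>
  <!-- 0 disables the Trash feature; any value > 0 enables it -->
  <value>0</value>
</property>
```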





[jira] [Commented] (HIVE-10978) Document fs.trash.interval wrt Hive and HDFS Encryption

2015-08-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718083#comment-14718083
 ] 

Lefty Leverenz commented on HIVE-10978:
---

This can be documented in the AdminManual Configuration section "Hive 
Configuration Variables Used to Interact with Hadoop" -- fs.trash.interval 
belongs in the table of configs, and for extra visibility a small subsection 
could be added after the table.

* [Hive Configuration Variables Used to Interact with Hadoop | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariablesUsedtoInteractwithHadoop]

By the way, the section title "Hive Configuration Variables ..." is misleading 
since none of them are hive.* variables, so I recommend changing it to "Other 
Configuration Variables ..." or some such.

> Document fs.trash.interval wrt Hive and HDFS Encryption
> ---
>
> Key: HIVE-10978
> URL: https://issues.apache.org/jira/browse/HIVE-10978
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, Security
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Priority: Critical
>  Labels: TODOC1.2
>
> This should be documented in the 1.2.1 Release Notes.
> When HDFS is encrypted (TDE is enabled), DROP TABLE and DROP PARTITION have 
> unexpected behavior when the Hadoop Trash feature is enabled.
> The latter is enabled by setting fs.trash.interval > 0 in core-site.xml.
> When Trash is enabled, the data file for the table should be "moved" to the 
> Trash bin. If the table is inside an Encryption Zone, this "move" operation 
> is not allowed.
> There are 2 ways to deal with this:
> 1. Use PURGE, as in DROP TABLE blah PURGE. This skips the Trash bin even if 
> it is enabled.
> 2. Set fs.trash.interval = 0. It is critical that this config change is made 
> in core-site.xml. Setting it in hive-site.xml may lead to very strange 
> behavior where the table metadata is deleted but the data file remains.  This 
> will lead to data corruption if a table with the same name is later created.





[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718058#comment-14718058
 ] 

Hive QA commented on HIVE-11668:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752820/HIVE-11668.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9380 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5094/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5094/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5094/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752820 - PreCommit-HIVE-TRUNK-Build

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123





[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-08-27 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9481:
-
Labels:   (was: TODOC1.2)

> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.2.0
>
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c', 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the 
> data-producing statement has a schema that matches the target table schema.
> Since Hive doesn't support DEFAULT values for columns in CREATE TABLE, when 
> the target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE; otherwise an error will be raised.
> If/when the DEFAULT clause is supported, this can be enhanced to set the default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link)
>    VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by the VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.





[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717982#comment-14717982
 ] 

Hive QA commented on HIVE-11645:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752808/HIVE-11645.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9380 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5093/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5093/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5093/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752808 - PreCommit-HIVE-TRUNK-Build

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to the log file, and there is no visible progress on the console.





[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717975#comment-14717975
 ] 

Ashutosh Chauhan commented on HIVE-11668:
-

Hmm... the code is fragile, then.
+1

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123





[jira] [Updated] (HIVE-11678) Add AggregateProjectMergeRule

2015-08-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11678:

Attachment: HIVE-11678.patch

> Add AggregateProjectMergeRule
> -
>
> Key: HIVE-11678
> URL: https://issues.apache.org/jira/browse/HIVE-11678
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11678.patch
>
>
> This will help to get rid of extra projects on top of Aggregation, thus 
> compacting the query plan.





[jira] [Updated] (HIVE-11654) After HIVE-10289, HBase metastore tests failing

2015-08-27 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11654:
--
Attachment: HIVE-11654.1.patch

Some of the failures are fixed by HIVE-11621. I attached the patch for the 
remaining failures. Note this patch is on top of HIVE-11621.

> After HIVE-10289, HBase metastore tests failing
> ---
>
> Key: HIVE-11654
> URL: https://issues.apache.org/jira/browse/HIVE-11654
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Metastore
>Reporter: Alan Gates
>Assignee: Daniel Dai
>Priority: Blocker
> Attachments: HIVE-11654.1.patch
>
>
> After the latest merge from trunk a number of the HBase unit tests are 
> failing.





[jira] [Resolved] (HIVE-11674) LLAP: Update tez version to fix build

2015-08-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11674.
--
Resolution: Fixed

Committed to llap branch.

> LLAP: Update tez version to fix build
> -
>
> Key: HIVE-11674
> URL: https://issues.apache.org/jira/browse/HIVE-11674
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11674.patch
>
>
> With tez version 0.8.0-SNAPSHOT the llap branch build is broken. It throws the 
> following exception:
> {code}
> work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[60,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[61,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[62,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[63,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[111,37]
>  cannot find symbol
> [ERROR] symbol:   class VertexExecutionContext
> [ERROR] location: class org.apache.tez.dag.api.Vertex
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[673,11]
>  cannot find symbol
> {code}





[jira] [Comment Edited] (HIVE-11642) LLAP: make sure tests pass #3

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717873#comment-14717873
 ] 

Sergey Shelukhin edited comment on HIVE-11642 at 8/28/15 1:27 AM:
--

Will wait for the Tez release, do another master merge, and rerun; the failing unit 
test is an issue with the test itself, so I will comment it out unless it's fixed


was (Author: sershe):
Will wait for Tez release, do another master merge and rerun

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Assigned] (HIVE-11657) HIVE-2573 introduces some issues

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11657:
---

Assignee: Sergey Shelukhin

> HIVE-2573 introduces some issues
> 
>
> Key: HIVE-11657
> URL: https://issues.apache.org/jira/browse/HIVE-11657
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> HIVE-2573 introduced a static "reload functions" call.
> It has a few problems:
> 1) When the metastore client is initialized using an externally supplied config 
> (i.e. Hive.get(HiveConf)), it still gets called during static init using the 
> main service config. In my case, even though I have uris in the supplied 
> config to connect to a remote MS (which eventually happens), the static call 
> creates an objectstore, which is undesirable.
> 2) It breaks compat - old metastores do not support this call, so new clients 
> will fail, and there's no workaround like not using the new feature, because the 
> static call is always made.
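
The difference between the eager behavior described in (1) and a lazy alternative can be sketched generically (Python used for brevity; the names here are invented for illustration and this is not Hive's actual code):

```python
# Hypothetical sketch: an eager static/module-level call always runs with
# the default config, before any caller can supply its own. Deferring
# construction to first use lets an externally supplied config take effect.

class MetastoreClient:
    def __init__(self, conf):
        self.conf = conf

_DEFAULT_CONF = {"uris": None}  # stand-in for the main service config
_client = None

def get_client(conf=None):
    """Build the client on first use, honoring a caller-supplied config."""
    global _client
    if _client is None:
        _client = MetastoreClient(conf if conf is not None else _DEFAULT_CONF)
    return _client
```

With an eager static call, get_client's conf argument would never matter on first use, which is the essence of the problem reported above.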





[jira] [Commented] (HIVE-11657) HIVE-2573 introduces some issues

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717877#comment-14717877
 ] 

Sergey Shelukhin commented on HIVE-11657:
-

I will take a look tomorrow. This is causing annoying and non-obvious problems on 
some setups.

> HIVE-2573 introduces some issues
> 
>
> Key: HIVE-11657
> URL: https://issues.apache.org/jira/browse/HIVE-11657
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> HIVE-2573 introduced a static "reload functions" call.
> It has a few problems:
> 1) When the metastore client is initialized using an externally supplied config 
> (i.e. Hive.get(HiveConf)), it still gets called during static init using the 
> main service config. In my case, even though I have uris in the supplied 
> config to connect to a remote MS (which eventually happens), the static call 
> creates an objectstore, which is undesirable.
> 2) It breaks compat - old metastores do not support this call, so new clients 
> will fail, and there's no workaround like not using the new feature, because the 
> static call is always made.





[jira] [Commented] (HIVE-11627) Reduce the number of accesses to hashmaps in PPD

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717876#comment-14717876
 ] 

Hive QA commented on HIVE-11627:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752795/HIVE-11627.01.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9379 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5092/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5092/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5092/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752795 - PreCommit-HIVE-TRUNK-Build

> Reduce the number of accesses to hashmaps in PPD
> 
>
> Key: HIVE-11627
> URL: https://issues.apache.org/jira/browse/HIVE-11627
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11627.01.patch, HIVE-11627.patch
>
>
> We retrieve the ExprInfo from the hashmap each time we want to change any of 
> its properties. Instead, the number of calls to the hashmap could be 
> drastically reduced by retrieving the ExprInfo once, and changing all its 
> properties.
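
The fetch-once pattern described can be sketched generically (Python used for brevity; ExprInfo here is a stand-in class, not Hive's actual code):

```python
# Hypothetical illustration of the optimization above: instead of one
# hashmap lookup per property update, retrieve the entry once and mutate
# all of its properties through the retrieved reference.

class ExprInfo:
    """Stand-in for per-expression bookkeeping kept by the optimizer."""
    def __init__(self):
        self.is_candidate = False
        self.alias = None
        self.converted = False

def mark_expr_many_lookups(infos, key):
    # Before: three separate hashmap accesses for three updates.
    infos[key].is_candidate = True
    infos[key].alias = "t1"
    infos[key].converted = True

def mark_expr_single_lookup(infos, key):
    # After: a single lookup, then mutate the retrieved object.
    info = infos[key]
    info.is_candidate = True
    info.alias = "t1"
    info.converted = True
```

Both functions produce the same final state; the second simply touches the map once instead of once per property.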





[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717873#comment-14717873
 ] 

Sergey Shelukhin commented on HIVE-11642:
-

Will wait for Tez release, do another master merge and rerun

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.





[jira] [Commented] (HIVE-11565) LLAP: Tez counters for LLAP

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717872#comment-14717872
 ] 

Sergey Shelukhin commented on HIVE-11565:
-

can we add an API to remove the counter? :)

> LLAP: Tez counters for LLAP
> ---
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 





[jira] [Commented] (HIVE-7349) Consuming published Hive HCatalog artifacts in a Hadoop 2 build environment fails

2015-08-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717860#comment-14717860
 ] 

ASF GitHub Bot commented on HIVE-7349:
--

Github user jamescao commented on a diff in the pull request:

https://github.com/apache/flink/pull/1064#discussion_r38163609
  
--- Diff: flink-staging/flink-hcatalog/pom.xml ---
@@ -34,17 +34,64 @@ under the License.
 
 	<packaging>jar</packaging>
 
+	<repositories>
+		<repository>
+			<id>cloudera</id>
+			<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
+		</repository>
+	</repositories>
 
 	<dependencies>
 		<dependency>
 			<groupId>org.apache.flink</groupId>
 			<artifactId>flink-java</artifactId>
 			<version>${project.version}</version>
 		</dependency>
-
+		<dependency>
+			<groupId>org.apache.flink</groupId>
+			<artifactId>flink-scala</artifactId>
+			<version>${project.version}</version>
+		</dependency>
+		<dependency>
+			<groupId>org.apache.hive.hcatalog</groupId>
+			<artifactId>hive-hcatalog-core</artifactId>
+			<version>1.1.0-cdh5.4.0</version>
--- End diff --

This is a known issue with hcatalog: the maven artifact is compiled against 
hadoop1 and blocks unit testing. 
https://issues.apache.org/jira/browse/HIVE-7349
@fhueske 
I found there were no pre-existing tests when I began to work on this issue. How 
did hcatalog get tested before? If we stick to the vanilla hcatalog, I guess 
one way is to move it to the hadoop1 profile (flink compiled against the maven 
hcatalog jar can't be used in a hadoop2 env anyway), which will limit its usage 
to a large extent since almost all hive production servers run on hadoop2 
now.


> Consuming published Hive HCatalog artifacts in a Hadoop 2 build environment 
> fails
> --
>
> Key: HIVE-7349
> URL: https://issues.apache.org/jira/browse/HIVE-7349
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.0
>Reporter: Venkat Ranganathan
>
> The published Hive artifacts are built with the Hadoop 1 profile.  Even though 
> Hive has Hadoop 1 and Hadoop 2 shims, some of the HCatalog MapReduce classes 
> are still dependent on the environment they were compiled against.
> For example, using Hive artifacts published in a Sqoop HCatalog Hadoop 2 
> build environment results in the following failure:
> Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
> java.lang.IncompatibleClassChangeError: Found interface 
> org.apache.hadoop.mapreduce.JobContext, but class was expected
> at 
> org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:104)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:84)
> at 
> org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:73)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:418)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:333)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)





[jira] [Commented] (HIVE-9811) Hive on Tez leaks WorkMap objects

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717850#comment-14717850
 ] 

Sergey Shelukhin commented on HIVE-9811:


We did confirm that this patch also works. Perhaps they are redundant 
(HIVE-10778 might clean it up earlier). We can still commit it for now, +1

[~thejas] [~olegd] thoughts?

> Hive on Tez leaks WorkMap objects
> -
>
> Key: HIVE-9811
> URL: https://issues.apache.org/jira/browse/HIVE-9811
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Oleg Danilov
> Attachments: HIVE-9811.patch
>
>
> TezTask doesn't fully clean gWorkMap, so as a result Hive leaks WorkMap objects.
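
The leak pattern described above can be illustrated generically (Python used for brevity; g_work_map and the task functions are invented names for this sketch, not Hive's actual code):

```python
# Hypothetical sketch: a process-wide map that tasks add entries to.
# If a task never removes its entry, the entry outlives the task and
# accumulates across queries -- the leak described in this issue.

g_work_map = {}  # survives across tasks, like a static field

def run_task_leaky(task_id, work):
    g_work_map[task_id] = work
    # ... execute the task ...
    # missing cleanup: the entry is never removed

def run_task_fixed(task_id, work):
    g_work_map[task_id] = work
    try:
        pass  # ... execute the task ...
    finally:
        g_work_map.pop(task_id, None)  # always release the entry
```

Doing the removal in a finally block ensures the entry is released even when the task fails.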





[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717846#comment-14717846
 ] 

Sergey Shelukhin commented on HIVE-11675:
-

[~gopalv] fyi

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Need to take a look at the best flow. It won't be much different if we do a 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless they can be 
> pushed down to the metastore or fetched from the local cache; that way the only 
> slow threaded op is directory listings.





[jira] [Assigned] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11675:
---

Assignee: Sergey Shelukhin

> make use of file footer PPD API in ETL strategy or separate strategy
> 
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Need to take a look at the best flow. It won't be much different if we do a 
> filtering metastore call for each partition. So perhaps we'd need the custom 
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless they can be 
> pushed down to the metastore or fetched from the local cache; that way the only 
> slow threaded op is directory listings.





[jira] [Commented] (HIVE-10568) Select count(distinct()) can have more optimal execution plan

2015-08-27 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717834#comment-14717834
 ] 

Shannon Ladymon commented on HIVE-10568:


Added to documentation. Removed TODOC1.2.

> Select count(distinct()) can have more optimal execution plan
> -
>
> Key: HIVE-10568
> URL: https://issues.apache.org/jira/browse/HIVE-10568
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 
> 0.13.0, 0.14.0, 1.0.0, 1.1.0
>Reporter: Mostafa Mokhtar
>Assignee: Ashutosh Chauhan
> Fix For: 1.2.0
>
> Attachments: HIVE-10568.1.patch, HIVE-10568.2.patch, 
> HIVE-10568.patch, HIVE-10568.patch
>
>
> {code:sql}
> select count(distinct ss_ticket_number) from store_sales;
> {code}
> can be rewritten as
> {code:sql}
> select count(1) from (select distinct ss_ticket_number from store_sales) a;
> {code}
> which may run up to 3x faster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10568) Select count(distinct()) can have more optimal execution plan

2015-08-27 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-10568:
---
Labels:   (was: TODOC1.2)

> Select count(distinct()) can have more optimal execution plan
> -
>
> Key: HIVE-10568
> URL: https://issues.apache.org/jira/browse/HIVE-10568
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Logical Optimizer
>Affects Versions: 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 
> 0.13.0, 0.14.0, 1.0.0, 1.1.0
>Reporter: Mostafa Mokhtar
>Assignee: Ashutosh Chauhan
> Fix For: 1.2.0
>
> Attachments: HIVE-10568.1.patch, HIVE-10568.2.patch, 
> HIVE-10568.patch, HIVE-10568.patch
>
>
> {code:sql}
> select count(distinct ss_ticket_number) from store_sales;
> {code}
> can be rewritten as
> {code:sql}
> select count(1) from (select distinct ss_ticket_number from store_sales) a;
> {code}
> which may run up to 3x faster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11671) Optimize RuleRegExp in DPP codepath

2015-08-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717829#comment-14717829
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11671:
--

+1 pending unit test run.

> Optimize RuleRegExp in DPP codepath
> ---
>
> Key: HIVE-11671
> URL: https://issues.apache.org/jira/browse/HIVE-11671
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11671.1.patch, cpu_with_patch.png, 
> cpu_without_patch.png, mem_with_patch.png, mem_without_patch.png
>
>
> When running a large query with DPP in its codepath, RuleRegExp came up as 
> hotspot. Creating this JIRA to optimize RuleRegExp.java.
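A common remedy for a regex hotspot like this — a generic sketch only, not necessarily the approach taken in HIVE-11671.1.patch — is to compile each pattern once and reuse the cached instance:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Generic sketch of regex-hotspot mitigation: compile each rule pattern
// once and cache it instead of recompiling on every match attempt.
// The pattern string below is illustrative, not taken from RuleRegExp.
public class PatternCache {
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    static Pattern get(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    public static void main(String[] args) {
        Pattern first = get("TS%.*RS%");
        Pattern second = get("TS%.*RS%");
        // Both lookups return the same cached, pre-compiled instance.
        System.out.println(first == second);
    }
}
```

Caching pays off whenever the same rule pattern is matched against many operator paths, which is exactly the DPP codepath scenario described above.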



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11671) Optimize RuleRegExp in DPP codepath

2015-08-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11671:
-
Assignee: Rajesh Balamohan

> Optimize RuleRegExp in DPP codepath
> ---
>
> Key: HIVE-11671
> URL: https://issues.apache.org/jira/browse/HIVE-11671
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11671.1.patch, cpu_with_patch.png, 
> cpu_without_patch.png, mem_with_patch.png, mem_without_patch.png
>
>
> When running a large query with DPP in its codepath, RuleRegExp came up as 
> hotspot. Creating this JIRA to optimize RuleRegExp.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11674) LLAP: Update tez version to fix build

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717827#comment-14717827
 ] 

Prasanth Jayachandran edited comment on HIVE-11674 at 8/28/15 12:29 AM:


Updating to 0.8.1-SNAPSHOT fixes the build. [~sseth] Just wanted to make sure 
this version is correct before committing the patch.


was (Author: prasanth_j):
Updating to 0.8.1-SNAPSHOT fixes the build. [~sseth] Just wanted to make sure 
if that correct before committing the patch.

> LLAP: Update tez version to fix build
> -
>
> Key: HIVE-11674
> URL: https://issues.apache.org/jira/browse/HIVE-11674
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11674.patch
>
>
> With tez version 0.8.0-SNAPSHOT the llap branch build is broken. Throws the 
> following exception 
> {code}
> work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[60,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[61,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[62,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[63,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[111,37]
>  cannot find symbol
> [ERROR] symbol:   class VertexExecutionContext
> [ERROR] location: class org.apache.tez.dag.api.Vertex
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[673,11]
>  cannot find symbol
> 7:10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11674) LLAP: Update tez version to fix build

2015-08-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11674:
-
Attachment: HIVE-11674.patch

Updating to 0.8.1-SNAPSHOT fixes the build. [~sseth] Just wanted to make sure 
that is correct before committing the patch.

> LLAP: Update tez version to fix build
> -
>
> Key: HIVE-11674
> URL: https://issues.apache.org/jira/browse/HIVE-11674
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11674.patch
>
>
> With tez version 0.8.0-SNAPSHOT the llap branch build is broken. Throws the 
> following exception 
> {code}
> work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[60,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[61,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[62,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java:[63,41]
>  package org.apache.tez.serviceplugins.api does not exist
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[111,37]
>  cannot find symbol
> [ERROR] symbol:   class VertexExecutionContext
> [ERROR] location: class org.apache.tez.dag.api.Vertex
> [ERROR] 
> /work/hive/hive-git/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java:[673,11]
>  cannot find symbol
> 7:10
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717824#comment-14717824
 ] 

Sergey Shelukhin commented on HIVE-11642:
-

I see TestHCatClient.testTableSchemaPropagation fail on regular master runs as 
well; filed a bug for the flaky unit test. The q test failure is another 
trivial out-file diff. Almost there :)

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11587) Fix memory estimates for mapjoin hashtable

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717823#comment-14717823
 ] 

Sergey Shelukhin commented on HIVE-11587:
-

Can you please post an RB?

> Fix memory estimates for mapjoin hashtable
> --
>
> Key: HIVE-11587
> URL: https://issues.apache.org/jira/browse/HIVE-11587
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Attachments: HIVE-11587.01.patch
>
>
> Due to legacy in-memory mapjoin code and conservative planning, the memory 
> estimation code for the mapjoin hashtable is currently not very good. It 
> allocates the probe erring on the side of more memory, not taking data into 
> account: unlike the probe, the data is free to resize, so it's better for 
> perf to allocate a big probe and hope for the best with regard to future data 
> size. That is not true for the hybrid case.
> There's code to cap the initial allocation based on memory available 
> (memUsage argument), but due to some code rot, the memory estimates from 
> planning are not even passed to hashtable anymore (there used to be two 
> config settings, hashjoin size fraction by itself, or hashjoin size fraction 
> for group by case), so it never caps the memory anymore below 1 Gb. 
> Initial capacity is estimated from input key count, and in hybrid join cache 
> can exceed Java memory due to number of segments.
> There needs to be a review and fix of all this code.
> Suggested improvements:
> 1) Make sure "initialCapacity" argument from Hybrid case is correct given the 
> number of segments. See how it's calculated from keys for regular case; it 
> needs to be adjusted accordingly for hybrid case if not done already.
> 1.5) Note that, knowing the number of rows, the maximum capacity one will 
> ever need for probe size (in longs) is row count (assuming key per row, i.e. 
> maximum possible number of keys) divided by load factor, plus some very small 
> number to round up. That is for flat case. For hybrid case it may be more 
> complex due to skew, but that is still a good upper bound for the total probe 
> capacity of all segments.
> 2) Rename memUsage to maxProbeSize, or something, make sure it's passed 
> correctly based on estimates that take into account both probe and data size, 
> esp. in hybrid case.
> 3) Make sure that memory estimation for hybrid case also doesn't come up with 
> numbers that are too small, like 1-byte hashtable. I am not very familiar 
> with that code but it has happened in the past.
> Other issues we have seen:
> 4) Cap single write buffer size to 8-16Mb. The whole point of WBs is that you 
> should not allocate large array in advance. Even if some estimate passes 
> 500Mb or 40Mb or whatever, it doesn't make sense to allocate that.
> 5) For hybrid, don't pre-allocate WBs - only allocate on write.
> 6) Change everywhere rounding up to power of two is used to rounding down, at 
> least for hybrid case (?)
> I wanted to put all of these items in single JIRA so we could keep track of 
> fixing all of them.
> I think there are JIRAs for some of these already, feel free to link them to 
> this one.
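Item 1.5 above reduces to simple arithmetic. A sketch of just that bound — not Hive's actual sizing code, and the load factor value below is only the common default — assuming at most one key per row:

```java
// Sketch of the upper bound from item 1.5: with at most one key per row,
// total probe capacity (in slots) never needs to exceed
// ceil(rowCount / loadFactor). Values below are illustrative.
public class ProbeBound {
    static long maxProbeSlots(long rowCount, double loadFactor) {
        return (long) Math.ceil(rowCount / loadFactor);
    }

    public static void main(String[] args) {
        // 3M estimated rows at a typical 0.75 load factor
        System.out.println(maxProbeSlots(3_000_000L, 0.75));
    }
}
```

For the hybrid case this still holds as an upper bound on the total probe capacity summed across all segments, even if skew changes how it is split per segment.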



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11673) LLAP: TestLlapTaskSchedulerService is flaky

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11673:

Description: 
{noformat}
java.lang.Exception: test timed out after 5000 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at 
org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$LlapTaskSchedulerServiceForTest.forTestAwaitSchedulingRun(TestLlapTaskSchedulerService.java:442)
at 
org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$TestTaskSchedulerServiceWrapper.awaitSchedulerRun(TestLlapTaskSchedulerService.java:361)
at 
org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$TestTaskSchedulerServiceWrapper.(TestLlapTaskSchedulerService.java:349)
at 
org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService.testPreemption(TestLlapTaskSchedulerService.java:119)
{noformat}

Cannot repro locally. See HIVE-11642

  was:
{noformat}
java.lang.Exception: test timed out after 1 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
at 
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.awaitCompletion(TestTaskExecutorService.java:244)
at 
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.access$000(TestTaskExecutorService.java:208)
at 
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption(TestTaskExecutorService.java:168)
{noformat}

Cannot repro locally. See HIVE-11642


> LLAP: TestLlapTaskSchedulerService is flaky
> ---
>
> Key: HIVE-11673
> URL: https://issues.apache.org/jira/browse/HIVE-11673
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> {noformat}
> java.lang.Exception: test timed out after 5000 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$LlapTaskSchedulerServiceForTest.forTestAwaitSchedulingRun(TestLlapTaskSchedulerService.java:442)
>   at 
> org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$TestTaskSchedulerServiceWrapper.awaitSchedulerRun(TestLlapTaskSchedulerService.java:361)
>   at 
> org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService$TestTaskSchedulerServiceWrapper.(TestLlapTaskSchedulerService.java:349)
>   at 
> org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService.testPreemption(TestLlapTaskSchedulerService.java:119)
> {noformat}
> Cannot repro locally. See HIVE-11642



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11660) LLAP: TestTaskExecutorService is flaky

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717819#comment-14717819
 ] 

Sergey Shelukhin commented on HIVE-11660:
-

Nm, TestLlapTaskSchedulerService is also flaky :)

> LLAP: TestTaskExecutorService is flaky
> --
>
> Key: HIVE-11660
> URL: https://issues.apache.org/jira/browse/HIVE-11660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Siddharth Seth
>
> {noformat}
> java.lang.Exception: test timed out after 1 milliseconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.awaitCompletion(TestTaskExecutorService.java:244)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService$TaskExecutorServiceForTest$InternalCompletionListenerForTest.access$000(TestTaskExecutorService.java:208)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testWaitQueuePreemption(TestTaskExecutorService.java:168)
> {noformat}
> Cannot repro locally. See HIVE-11642



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717818#comment-14717818
 ] 

Prasanth Jayachandran commented on HIVE-11595:
--

+1

> refactor ORC footer reading to make it usable from outside
> --
>
> Key: HIVE-11595
> URL: https://issues.apache.org/jira/browse/HIVE-11595
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
> HIVE-11595.02.patch, HIVE-11595.03.patch
>
>
> If ORC footer is read from cache, we want to parse it without having the 
> reader, opening a file, etc. I thought it would be as simple as protobuf 
> parseFrom bytes, but apparently there's bunch of stuff going on there. It 
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717814#comment-14717814
 ] 

Sergey Shelukhin commented on HIVE-11595:
-

Full footer buffer is not always available; see the comments. I considered 
that, but the only way around this is to have a "fake" full footer buffer with 
fake offsets into it (i.e. the buffer will only have the footer and the offsets 
will cover the entire buffer), and that's hacky.

> refactor ORC footer reading to make it usable from outside
> --
>
> Key: HIVE-11595
> URL: https://issues.apache.org/jira/browse/HIVE-11595
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
> HIVE-11595.02.patch, HIVE-11595.03.patch
>
>
> If ORC footer is read from cache, we want to parse it without having the 
> reader, opening a file, etc. I thought it would be as simple as protobuf 
> parseFrom bytes, but apparently there's bunch of stuff going on there. It 
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11669) OrcFileDump service should support directories

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717807#comment-14717807
 ] 

Sergey Shelukhin commented on HIVE-11669:
-

+1

> OrcFileDump service should support directories
> --
>
> Key: HIVE-11669
> URL: https://issues.apache.org/jira/browse/HIVE-11669
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11669.1.patch
>
>
> orcfiledump service does not support directories. If a directory is specified, 
> the program should iterate through all the files in the directory and perform 
> a file dump on each.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10615) LLAP: Invalid containerId prefix

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717799#comment-14717799
 ] 

Prasanth Jayachandran commented on HIVE-10615:
--

[~daijy] If you have a repro, maybe [~sseth] can help. Last time I had this 
issue I wasn't able to repro it.

> LLAP: Invalid containerId prefix
> 
>
> Key: HIVE-10615
> URL: https://issues.apache.org/jira/browse/HIVE-10615
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>
> I encountered this error when I ran a simple query in llap mode today. 
> {code}org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.lang.IllegalArgumentException: Invalid ContainerId prefix: 
>   at 
> org.apache.hadoop.yarn.api.records.ContainerId.fromString(ContainerId.java:211)
>   at 
> org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:178)
>   at 
> org.apache.tez.dag.app.TezTaskCommunicatorImpl$TezTaskUmbilicalProtocolImpl.heartbeat(TezTaskCommunicatorImpl.java:311)
>   at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$LlapTaskUmbilicalProtocolImpl.heartbeat(LlapTaskCommunicator.java:398)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server$WritableRpcInvoker.call(WritableRpcEngine.java:514)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:244)
>   at com.sun.proxy.$Proxy14.heartbeat(Unknown Source)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:256)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:184)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 15/05/05 15:24:22 [Task-Executor-0] INFO task.TezTaskRunner : Interrupted 
> while waiting for task to complete. Interrupting task
> 15/05/05 15:24:22 [TezTaskRunner_attempt_1430816501738_0034_1_00_00_0] 
> INFO task.TezTaskRunner : Encounted an error while executing task: 
> attempt_1430816501738_0034_1_00_00_0
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:218)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:177)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.Futu

[jira] [Commented] (HIVE-10672) Analyze command on a table using row format serde JsonSerDe fails with NoClassDefFoundError

2015-08-27 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717787#comment-14717787
 ] 

Shannon Ladymon commented on HIVE-10672:


Here are some ideas for where it could be documented in the wiki:
* [LanguageManual Cli - Hive Resources | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-HiveResources]
* [HCatalog InputOutput - Running MapReduce with HCatalog | 
https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-RunningMapReducewithHCatalog]
* [HCatalog LoadStore - Jars and Configuration Files | 
https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-JarsandConfigurationFiles]
* [StatsDev - Existing Tables - ANALYZE | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE]

> Analyze command on a table using row format serde JsonSerDe fails with 
> NoClassDefFoundError
> ---
>
> Key: HIVE-10672
> URL: https://issues.apache.org/jira/browse/HIVE-10672
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 1.2.0
>Reporter: Jason Dere
>Assignee: Jason Dere
>  Labels: TODOC1.2
> Fix For: 1.2.1
>
> Attachments: HIVE-10672.1.patch
>
>
> Found by [~deepesh].
> Running analyze command on a table created using the following DDL:
> {noformat}
> create external table all100kjson(
> s string,
> i int,
> d double,
> m map,
> bb array>)
> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
> STORED AS TEXTFILE 
> location '/user/hcat/tests/data/all100kjson';
> {noformat}
> analyze command
> {noformat}
> analyze table all100kjson compute statistics;
> {noformat}
> throws the following error:
> {noformat}
> Vertex failed, vertexName=Map 1, vertexId=vertex_1431071702167_0006_1_00, 
> diagnostics=[Task failed, taskId=task_1431071702167_0006_1_00_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
> initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:331)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>   ... 14 more
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hive/metastore/IMetaStoreClient
>   at 
> org.apache.hive.hcatalog.data.schema.HCatFieldSchema.(HCatFieldSchema.java:225)
>   at 
> org.apache.hive.hcatalog.data.schema.HCatSchemaUtils.getHCatFieldSchema(HCatSchemaUtils.java:122)
>   at 
> org.apache.hive.hcatalog.data.schema.HCatSchemaUtils.constructHCatSchema(HCatSchemaUtils.java:154)
>   at 
> org.apache.hive.hcatalog.data.schema.HCatSchemaUtils.getHCatSchema(HCatSchemaUtils.java:165)
>   at 
> org.apache.hive.hcatalog.data.JsonSerDe.initialize(JsonSerDe.java:141)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:527)
>   at 
> org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConv

[jira] [Commented] (HIVE-11487) Add getNumPartitionsByFilter api in metastore api

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717774#comment-14717774
 ] 

Hive QA commented on HIVE-11487:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752731/HIVE-11487.01.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9380 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5091/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5091/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5091/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752731 - PreCommit-HIVE-TRUNK-Build

> Add getNumPartitionsByFilter api in metastore api
> -
>
> Key: HIVE-11487
> URL: https://issues.apache.org/jira/browse/HIVE-11487
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Amareshwari Sriramadasu
>Assignee: Akshay Goyal
> Attachments: HIVE-11487.01.patch
>
>
> Adding an API for getting the number of partitions matching a filter is more 
> efficient when we are only interested in the count. getAllPartitions constructs 
> all the partition objects, which can be time-consuming and is not required.
> Here is a commit we pushed in a forked repo in our organization - 
> https://github.com/inmobi/hive/commit/68b3534d3e6c4d978132043cec668798ed53e444.
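The rationale can be illustrated with plain Java (this is not the metastore Thrift API; the partition names and filter are made up): a count-only path returns a single number without keeping per-partition objects around.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Plain-Java illustration (not the metastore API) of count-only vs
// fetch-all: the count path materializes no per-partition result objects.
public class CountVsFetch {
    public static void main(String[] args) {
        List<String> partitions = IntStream.rangeClosed(1, 28)
            .mapToObj(d -> String.format("ds=2015-08-%02d", d))
            .collect(Collectors.toList());

        // Count-only: a single long result, analogous in spirit to a
        // getNumPartitionsByFilter-style call.
        long n = partitions.stream()
            .filter(p -> p.compareTo("ds=2015-08-15") >= 0)
            .count();
        System.out.println(n);
    }
}
```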



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8458) Potential null dereference in Utilities#clearWork()

2015-08-27 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8458:
-
Description: 
{code}
Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);

// if the plan path hasn't been initialized just return, nothing to clean.
if (mapPath == null && reducePath == null) {
  return;
}

try {
  FileSystem fs = mapPath.getFileSystem(conf);
{code}
If mapPath is null but reducePath is not null, the getFileSystem() call would 
produce an NPE.
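A minimal sketch of the null-safe pattern (hypothetical; not the attached patch). Plain String stand-ins are used for Hadoop's Path/FileSystem so the guard logic can be shown in isolation: pick whichever plan path is non-null before resolving the filesystem.

```java
// Hypothetical sketch of a null-safe Utilities#clearWork. String stands in
// for org.apache.hadoop.fs.Path; "fs-for:" stands in for getFileSystem(conf).
final class ClearWorkSketch {
    // Returns the first non-null of the two plan paths, or null if both are null.
    static String firstNonNull(String mapPath, String reducePath) {
        return mapPath != null ? mapPath : reducePath;
    }

    static String resolveFileSystem(String mapPath, String reducePath) {
        // If neither plan path has been initialized, there is nothing to clean.
        String anyPath = firstNonNull(mapPath, reducePath);
        if (anyPath == null) {
            return null;
        }
        // In the real code this would be anyPath.getFileSystem(conf);
        // here we just echo the path that would be used.
        return "fs-for:" + anyPath;
    }
}
```

With this guard, a null mapPath alongside a non-null reducePath no longer dereferences null.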

  was:
{code}
Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);

// if the plan path hasn't been initialized just return, nothing to clean.
if (mapPath == null && reducePath == null) {
  return;
}

try {
  FileSystem fs = mapPath.getFileSystem(conf);
{code}

If mapPath is null but reducePath is not null, getFileSystem() call would 
produce NPE


> Potential null dereference in Utilities#clearWork()
> ---
>
> Key: HIVE-8458
> URL: https://issues.apache.org/jira/browse/HIVE-8458
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Ted Yu
>Assignee: skrho
>Priority: Minor
> Attachments: HIVE-8458_001.patch
>
>
> {code}
> Path mapPath = getPlanPath(conf, MAP_PLAN_NAME);
> Path reducePath = getPlanPath(conf, REDUCE_PLAN_NAME);
> // if the plan path hasn't been initialized just return, nothing to clean.
> if (mapPath == null && reducePath == null) {
>   return;
> }
> try {
>   FileSystem fs = mapPath.getFileSystem(conf);
> {code}
> If mapPath is null but reducePath is not null, the getFileSystem() call would 
> produce an NPE





[jira] [Assigned] (HIVE-11510) Metatool updateLocation fails on views

2015-08-27 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-11510:


Assignee: Wei Zheng

> Metatool updateLocation fails on views
> --
>
> Key: HIVE-11510
> URL: https://issues.apache.org/jira/browse/HIVE-11510
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 0.14.0
>Reporter: Eric Czech
>Assignee: Wei Zheng
>
> If views are present in a hive database, issuing a 'hive metatool 
> -updateLocation' command will result in an error like this:
> ...
> Warning: Found records with bad LOCATION in SDS table.. 
> bad location URI: null
> bad location URI: null
> bad location URI: null
> 
> Based on the source code for Metatool, it looks like there would then be a 
> "bad location URI: null" message for every view, and it appears this happens 
> simply because the SDS table in the Hive schema has a LOCATION column that is 
> NULL only for views.





[jira] [Commented] (HIVE-10615) LLAP: Invalid containerId prefix

2015-08-27 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717752#comment-14717752
 ] 

Daniel Dai commented on HIVE-10615:
---

Yes, I am seeing this. How do you work around this?

> LLAP: Invalid containerId prefix
> 
>
> Key: HIVE-10615
> URL: https://issues.apache.org/jira/browse/HIVE-10615
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>
> I encountered this error when I ran a simple query in llap mode today. 
> {code}org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.lang.IllegalArgumentException: Invalid ContainerId prefix: 
>   at 
> org.apache.hadoop.yarn.api.records.ContainerId.fromString(ContainerId.java:211)
>   at 
> org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:178)
>   at 
> org.apache.tez.dag.app.TezTaskCommunicatorImpl$TezTaskUmbilicalProtocolImpl.heartbeat(TezTaskCommunicatorImpl.java:311)
>   at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$LlapTaskUmbilicalProtocolImpl.heartbeat(LlapTaskCommunicator.java:398)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server$WritableRpcInvoker.call(WritableRpcEngine.java:514)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:244)
>   at com.sun.proxy.$Proxy14.heartbeat(Unknown Source)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:256)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:184)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 15/05/05 15:24:22 [Task-Executor-0] INFO task.TezTaskRunner : Interrupted 
> while waiting for task to complete. Interrupting task
> 15/05/05 15:24:22 [TezTaskRunner_attempt_1430816501738_0034_1_00_00_0] 
> INFO task.TezTaskRunner : Encounted an error while executing task: 
> attempt_1430816501738_0034_1_00_00_0
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:218)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:177)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecuto

[jira] [Updated] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-08-27 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-10934:
---
Labels:   (was: TODOC1.2)

> Restore support for DROP PARTITION PURGE
> 
>
> Key: HIVE-10934
> URL: https://issues.apache.org/jira/browse/HIVE-10934
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.2.1
>
> Attachments: HIVE-10934.patch
>
>
> HIVE-9086 added support for PURGE in 
> {noformat}
> ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = "sayonara") 
> IGNORE PROTECTION PURGE;
> {noformat}
> looks like this was accidentally lost in HIVE-10228





[jira] [Updated] (HIVE-11605) Incorrect results with bucket map join in tez.

2015-08-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11605:
--
Attachment: HIVE-11606.branch-1.patch
HIVE-11606.2.patch

> Incorrect results with bucket map join in tez.
> --
>
> Key: HIVE-11605
> URL: https://issues.apache.org/jira/browse/HIVE-11605
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.0, 1.2.0, 1.0.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Critical
> Attachments: HIVE-11605.1.patch, HIVE-11606.2.patch, 
> HIVE-11606.branch-1.patch
>
>
> In some cases, we aggressively try to convert to a bucket map join and this 
> ends up producing incorrect results.





[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE

2015-08-27 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717738#comment-14717738
 ] 

Shannon Ladymon commented on HIVE-10934:


Added to documentation.  Removed TODOC1.2 label.

> Restore support for DROP PARTITION PURGE
> 
>
> Key: HIVE-10934
> URL: https://issues.apache.org/jira/browse/HIVE-10934
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.2.1
>
> Attachments: HIVE-10934.patch
>
>
> HIVE-9086 added support for PURGE in 
> {noformat}
> ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = "sayonara") 
> IGNORE PROTECTION PURGE;
> {noformat}
> looks like this was accidentally lost in HIVE-10228





[jira] [Commented] (HIVE-10978) Document fs.trash.interval wrt Hive and HDFS Encryption

2015-08-27 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717735#comment-14717735
 ] 

Eugene Koifman commented on HIVE-10978:
---

it applies everywhere

> Document fs.trash.interval wrt Hive and HDFS Encryption
> ---
>
> Key: HIVE-10978
> URL: https://issues.apache.org/jira/browse/HIVE-10978
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, Security
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Priority: Critical
>  Labels: TODOC1.2
>
> This should be documented in 1.2.1 Release Notes
> When HDFS is encrypted (TDE is enabled), DROP TABLE and DROP PARTITION have 
> unexpected behavior when Hadoop Trash feature is enabled.
> The latter is enabled by setting fs.trash.interval > 0 in core-site.xml.
> When Trash is enabled, the data file for the table should be "moved" to the 
> Trash bin. If the table is inside an Encryption Zone, this "move" operation 
> is not allowed.
> There are 2 ways to deal with this:
> 1. use PURGE, as in DROP TABLE blah PURGE. This skips the Trash bin even if 
> enabled.
> 2. set fs.trash.interval = 0. It is critical that this config change is done 
> in core-site.xml. Setting it in hive-site.xml may lead to very strange 
> behavior where the table metadata is deleted but the data file remains.  This 
> will lead to data corruption if a table with the same name is later created.
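A sketch of option 2 above. fs.trash.interval is a standard Hadoop property, and per the description it must be set in core-site.xml (not hive-site.xml) to take effect safely:

```xml
<!-- core-site.xml: disable the Hadoop Trash feature entirely, so that
     DROP TABLE / DROP PARTITION delete data directly instead of attempting
     a "move" into the Trash bin (which fails inside an Encryption Zone). -->
<property>
  <name>fs.trash.interval</name>
  <value>0</value>
</property>
```

Option 1 (PURGE) avoids the Trash bin per statement without changing cluster configuration.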





[jira] [Commented] (HIVE-10978) Document fs.trash.interval wrt Hive and HDFS Encryption

2015-08-27 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717732#comment-14717732
 ] 

Lefty Leverenz commented on HIVE-10978:
---

[~ekoifman], can this go in the HiveServer2 doc or is it more general?

> Document fs.trash.interval wrt Hive and HDFS Encryption
> ---
>
> Key: HIVE-10978
> URL: https://issues.apache.org/jira/browse/HIVE-10978
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation, Security
>Affects Versions: 1.2.0
>Reporter: Eugene Koifman
>Priority: Critical
>  Labels: TODOC1.2
>
> This should be documented in 1.2.1 Release Notes
> When HDFS is encrypted (TDE is enabled), DROP TABLE and DROP PARTITION have 
> unexpected behavior when Hadoop Trash feature is enabled.
> The latter is enabled by setting fs.trash.interval > 0 in core-site.xml.
> When Trash is enabled, the data file for the table should be "moved" to the 
> Trash bin. If the table is inside an Encryption Zone, this "move" operation 
> is not allowed.
> There are 2 ways to deal with this:
> 1. use PURGE, as in DROP TABLE blah PURGE. This skips the Trash bin even if 
> enabled.
> 2. set fs.trash.interval = 0. It is critical that this config change is done 
> in core-site.xml. Setting it in hive-site.xml may lead to very strange 
> behavior where the table metadata is deleted but the data file remains.  This 
> will lead to data corruption if a table with the same name is later created.





[jira] [Updated] (HIVE-11671) Optimize RuleRegExp in DPP codepath

2015-08-27 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-11671:

Attachment: cpu_with_patch.png
mem_with_patch.png
cpu_without_patch.png
mem_without_patch.png

||Query (with DPP codepath)||Before Patch: Explain Plan (time in 
seconds)||After Patch: Explain Plan (time in seconds)||
|large_query_1| 349.346 | 38.513 |
|large_query_2| 732.198 |76.406|
|large_query_3| 18.273  |4.051|
|large_query_4| 17.678  |4.693|
|large_query_5| 3.943   |2.896|


> Optimize RuleRegExp in DPP codepath
> ---
>
> Key: HIVE-11671
> URL: https://issues.apache.org/jira/browse/HIVE-11671
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
> Attachments: HIVE-11671.1.patch, cpu_with_patch.png, 
> cpu_without_patch.png, mem_with_patch.png, mem_without_patch.png
>
>
> When running a large query with DPP in its codepath, RuleRegExp came up as 
> hotspot. Creating this JIRA to optimize RuleRegExp.java.





[jira] [Commented] (HIVE-10615) LLAP: Invalid containerId prefix

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717727#comment-14717727
 ] 

Prasanth Jayachandran commented on HIVE-10615:
--

[~daijy] Are you seeing this issue as well?

> LLAP: Invalid containerId prefix
> 
>
> Key: HIVE-10615
> URL: https://issues.apache.org/jira/browse/HIVE-10615
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>
> I encountered this error when I ran a simple query in llap mode today. 
> {code}org.apache.hadoop.ipc.RemoteException(java.io.IOException): 
> java.lang.IllegalArgumentException: Invalid ContainerId prefix: 
>   at 
> org.apache.hadoop.yarn.api.records.ContainerId.fromString(ContainerId.java:211)
>   at 
> org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:178)
>   at 
> org.apache.tez.dag.app.TezTaskCommunicatorImpl$TezTaskUmbilicalProtocolImpl.heartbeat(TezTaskCommunicatorImpl.java:311)
>   at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$LlapTaskUmbilicalProtocolImpl.heartbeat(LlapTaskCommunicator.java:398)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server$WritableRpcInvoker.call(WritableRpcEngine.java:514)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:244)
>   at com.sun.proxy.$Proxy14.heartbeat(Unknown Source)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:256)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:184)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 15/05/05 15:24:22 [Task-Executor-0] INFO task.TezTaskRunner : Interrupted 
> while waiting for task to complete. Interrupting task
> 15/05/05 15:24:22 [TezTaskRunner_attempt_1430816501738_0034_1_00_00_0] 
> INFO task.TezTaskRunner : Encounted an error while executing task: 
> attempt_1430816501738_0034_1_00_00_0
> java.lang.InterruptedException
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at 
> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:218)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:177)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.Thr

[jira] [Updated] (HIVE-11671) Optimize RuleRegExp in DPP codepath

2015-08-27 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-11671:

Attachment: HIVE-11671.1.patch

> Optimize RuleRegExp in DPP codepath
> ---
>
> Key: HIVE-11671
> URL: https://issues.apache.org/jira/browse/HIVE-11671
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
> Attachments: HIVE-11671.1.patch
>
>
> When running a large query with DPP in its codepath, RuleRegExp came up as 
> hotspot. Creating this JIRA to optimize RuleRegExp.java.





[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717719#comment-14717719
 ] 

Sergey Shelukhin commented on HIVE-11668:
-

Because the query needs to be run for every txn

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123





[jira] [Updated] (HIVE-11671) Optimize RuleRegExp in DPP codepath

2015-08-27 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-11671:

Assignee: (was: Rajesh Balamohan)

> Optimize RuleRegExp in DPP codepath
> ---
>
> Key: HIVE-11671
> URL: https://issues.apache.org/jira/browse/HIVE-11671
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>
> When running a large query with DPP in its codepath, RuleRegExp came up as 
> hotspot. Creating this JIRA to optimize RuleRegExp.java.





[jira] [Commented] (HIVE-11670) Strip out password information from TezSessionState configuration

2015-08-27 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717630#comment-14717630
 ] 

Vikram Dixit K commented on HIVE-11670:
---

LGTM +1.

> Strip out password information from TezSessionState configuration
> -
>
> Key: HIVE-11670
> URL: https://issues.apache.org/jira/browse/HIVE-11670
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11670.1.patch
>
>
> Remove password information from configuration copy that is sent to Yarn/Tez. 
> We don't need it there. The config entries can potentially be visible to 
> other users.
> HIVE-10508 had the fix which removed this in certain places, however, when I 
> initiated a session via Hive Cli, I could still see the password information.
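A minimal sketch of the stripping technique being reviewed, under stated assumptions: a plain key/value map stands in for the Hadoop/Hive Configuration object, and matching on the substring "password" is an illustrative heuristic, not Hive's actual filter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: build a copy of a configuration, dropping any entry
// whose key mentions "password", so credentials never reach the copy that is
// handed to Yarn/Tez. (Hive's real fix operates on HiveConf, not a Map.)
final class StripPasswords {
    static Map<String, String> withoutPasswords(Map<String, String> conf) {
        Map<String, String> copy = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (!e.getKey().toLowerCase().contains("password")) {
                copy.put(e.getKey(), e.getValue());
            }
        }
        return copy;
    }
}
```

The point of copying rather than mutating is that the session's own configuration stays intact; only the copy shipped off-process is sanitized.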





[jira] [Commented] (HIVE-11664) Make tez container logs work with new log4j2 changes

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717626#comment-14717626
 ] 

Hive QA commented on HIVE-11664:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752672/HIVE-11664.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9380 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5090/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5090/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5090/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752672 - PreCommit-HIVE-TRUNK-Build

> Make tez container logs work with new log4j2 changes
> 
>
> Key: HIVE-11664
> URL: https://issues.apache.org/jira/browse/HIVE-11664
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logging, Tests
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.0.0
>
> Attachments: HIVE-11664.1.patch
>
>
> MiniTezCliDriver should log container logs to syslog file. With new log4j2 
> changes this file is not created anymore. 





[jira] [Commented] (HIVE-11670) Strip out password information from TezSessionState configuration

2015-08-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717618#comment-14717618
 ] 

Thejas M Nair commented on HIVE-11670:
--

[~vikram.dixit] Can you please take a look. You know the tez code flow better.

> Strip out password information from TezSessionState configuration
> -
>
> Key: HIVE-11670
> URL: https://issues.apache.org/jira/browse/HIVE-11670
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11670.1.patch
>
>
> Remove password information from configuration copy that is sent to Yarn/Tez. 
> We don't need it there. The config entries can potentially be visible to 
> other users.
> HIVE-10508 had the fix which removed this in certain places, however, when I 
> initiated a session via Hive Cli, I could still see the password information.





[jira] [Updated] (HIVE-11587) Fix memory estimates for mapjoin hashtable

2015-08-27 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-11587:
-
Attachment: HIVE-11587.01.patch

Attach patch 1 for testing.

This patch didn't touch the constructor for regular mapjoin path 
(MapJoinBytesTableContainer). It resolves all other problems in the description.

> Fix memory estimates for mapjoin hashtable
> --
>
> Key: HIVE-11587
> URL: https://issues.apache.org/jira/browse/HIVE-11587
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Wei Zheng
> Attachments: HIVE-11587.01.patch
>
>
> Due to legacy in-memory mapjoin code and conservative planning, the memory 
> estimation code for the mapjoin hashtable is currently not very good. It 
> allocates the probe erring on the side of more memory, not taking the data 
> into account, because unlike the probe the data is free to resize; so it's 
> better for perf to allocate a big probe and hope for the best with regard to 
> future data size. That is not true for the hybrid case.
> There's code to cap the initial allocation based on memory available 
> (memUsage argument), but due to some code rot, the memory estimates from 
> planning are not even passed to hashtable anymore (there used to be two 
> config settings, hashjoin size fraction by itself, or hashjoin size fraction 
> for group by case), so it never caps the memory anymore below 1 Gb. 
> Initial capacity is estimated from input key count, and in hybrid join cache 
> can exceed Java memory due to number of segments.
> There needs to be a review and fix of all this code.
> Suggested improvements:
> 1) Make sure "initialCapacity" argument from Hybrid case is correct given the 
> number of segments. See how it's calculated from keys for regular case; it 
> needs to be adjusted accordingly for hybrid case if not done already.
> 1.5) Note that, knowing the number of rows, the maximum capacity one will 
> ever need for probe size (in longs) is row count (assuming key per row, i.e. 
> maximum possible number of keys) divided by load factor, plus some very small 
> number to round up. That is for flat case. For hybrid case it may be more 
> complex due to skew, but that is still a good upper bound for the total probe 
> capacity of all segments.
> 2) Rename memUsage to maxProbeSize, or something, make sure it's passed 
> correctly based on estimates that take into account both probe and data size, 
> esp. in hybrid case.
> 3) Make sure that memory estimation for hybrid case also doesn't come up with 
> numbers that are too small, like 1-byte hashtable. I am not very familiar 
> with that code but it has happened in the past.
> Other issues we have seen:
> 4) Cap single write buffer size to 8-16Mb. The whole point of WBs is that you 
> should not allocate large array in advance. Even if some estimate passes 
> 500Mb or 40Mb or whatever, it doesn't make sense to allocate that.
> 5) For hybrid, don't pre-allocate WBs - only allocate on write.
> 6) Change everywhere rounding up to power of two is used to rounding down, at 
> least for hybrid case (?)
> I wanted to put all of these items in single JIRA so we could keep track of 
> fixing all of them.
> I think there are JIRAs for some of these already, feel free to link them to 
> this one.
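The bound in item 1.5 can be sketched as arithmetic: with at most one key per row, the probe never needs more than rowCount / loadFactor slots, plus a little to absorb rounding. This is an illustrative calculation only; the method name and the "+1" rounding slack are assumptions, not Hive code.

```java
// Hypothetical sketch of the upper bound from item 1.5: maximum probe
// capacity (in slots) for rowCount rows at the given hashtable load factor.
// For the hybrid case this bounds the total capacity across all segments.
final class ProbeSizeSketch {
    static long maxProbeSlots(long rowCount, double loadFactor) {
        // ceil(rowCount / loadFactor) keys at the load-factor limit,
        // plus one slot of slack for rounding.
        return (long) Math.ceil(rowCount / loadFactor) + 1;
    }
}
```

For example, 1,000 rows at a 0.75 load factor never need more than about 1,335 probe slots, however the keys skew across segments.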





[jira] [Commented] (HIVE-11618) Correct the SARG api to reunify the PredicateLeaf.Type INTEGER and LONG

2015-08-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717603#comment-14717603
 ] 

Sergio Peña commented on HIVE-11618:


[~owen.omalley] I'd like to commit this patch so that we can continue with the 
FLOAT types.
Is that ok?

> Correct the SARG api to reunify the PredicateLeaf.Type INTEGER and LONG
> ---
>
> Key: HIVE-11618
> URL: https://issues.apache.org/jira/browse/HIVE-11618
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-11618.patch
>
>
> The Parquet binding leaked implementation details into the generic SARG api. 
> Rather than make all users of the SARG api deal with each of the specific 
> types, reunify the INTEGER and LONG types. 





[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-08-27 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717588#comment-14717588
 ] 

Sushanth Sowmyan commented on HIVE-8678:


Closed as "Cannot reproduce"

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
> Fix For: 1.2.2
>
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, the 
> following error occurs when trying to DUMP the field:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}
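A minimal, self-contained illustration of the difference the reporter points out: when the object read back is a String, a direct cast to java.sql.Date fails at runtime, while Date.valueOf parses it. The class and helper below are illustrative only, not the actual PigHCatUtil code.

```java
import java.sql.Date;

public class DateConversionDemo {
    // Mirrors the failing pattern: the value handed back is a String, not a Date.
    static Date toDate(Object o) {
        if (o instanceof Date) {
            return (Date) o;               // safe only when it already is a Date
        }
        return Date.valueOf(o.toString()); // parses "yyyy-[m]m-[d]d", the proposed fix
    }

    public static void main(String[] args) {
        Object raw = "2014-10-30";         // what the loader actually receives
        try {
            Date bad = (Date) raw;         // the original code path
            System.out.println("cast ok: " + bad);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the bug report");
        }
        System.out.println("valueOf: " + toDate(raw)); // prints valueOf: 2014-10-30
    }
}
```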



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11670) Strip out password information from TezSessionState configuration

2015-08-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11670:
-
Attachment: HIVE-11670.1.patch

[~thejas] Can you please take a look at the change and see if it makes sense? I 
tested this locally.

Thanks
Hari

> Strip out password information from TezSessionState configuration
> -
>
> Key: HIVE-11670
> URL: https://issues.apache.org/jira/browse/HIVE-11670
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11670.1.patch
>
>
> Remove password information from configuration copy that is sent to Yarn/Tez. 
> We don't need it there. The config entries can potentially be visible to 
> other users.
> HIVE-10508 had the fix which removed this in certain places, however, when I 
> initiated a session via Hive Cli, I could still see the password information.
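A hedged sketch of the general approach described above: copy the configuration and drop any key that looks password-related before handing the copy to Yarn/Tez. The class name and the key-matching heuristic are hypothetical; the actual patch works on the Hive/Tez configuration objects, not java.util.Properties.

```java
import java.util.Properties;

public class ConfScrubber {
    // Hypothetical helper: returns a copy of the config with any key that
    // looks password-related removed, so the copy is safe to ship to Yarn/Tez.
    static Properties stripPasswords(Properties conf) {
        Properties copy = new Properties();
        for (String key : conf.stringPropertyNames()) {
            if (!key.toLowerCase().contains("password")) {
                copy.setProperty(key, conf.getProperty(key));
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("javax.jdo.option.ConnectionUserName", "hive");
        conf.setProperty("javax.jdo.option.ConnectionPassword", "secret");
        Properties scrubbed = stripPasswords(conf);
        // The password key is gone; other entries survive.
        System.out.println(scrubbed.containsKey("javax.jdo.option.ConnectionPassword")); // false
        System.out.println(scrubbed.getProperty("javax.jdo.option.ConnectionUserName")); // hive
    }
}
```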



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-08-27 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan reopened HIVE-8678:


(Actually, maybe "not a problem" is an incorrect status, since it would 
indicate that the report is accurate, but working as designed. Reopening to 
close it again.)

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
> Fix For: 1.2.2
>
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, when 
> trying to DUMP the field, the following error occurs:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-08-27 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-8678.

Resolution: Cannot Reproduce

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
> Fix For: 1.2.2
>
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, when 
> trying to DUMP the field, the following error occurs:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog

2015-08-27 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-8678.

   Resolution: Not A Problem
Fix Version/s: 1.2.2

Resolving as "Not a problem" as of branch-1.2, since this problem is not 
reproducible in the newer releases of hive.

> Pig fails to correctly load DATE fields using HCatalog
> --
>
> Key: HIVE-8678
> URL: https://issues.apache.org/jira/browse/HIVE-8678
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
>Reporter: Michael McLellan
>Assignee: Sushanth Sowmyan
> Fix For: 1.2.2
>
>
> Using:
> Hadoop 2.5.0-cdh5.2.0 
> Pig 0.12.0-cdh5.2.0
> Hive 0.13.1-cdh5.2.0
> When using pig -useHCatalog to load a Hive table that has a DATE field, when 
> trying to DUMP the field, the following error occurs:
> {code}
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
> at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
> at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
> java.sql.Date
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457)
> at 
> org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375)
> at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64)
> 2014-10-30 22:58:05,469 [main] ERROR 
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting 
> read value to tuple
> {code}
> It seems to be occurring here: 
> https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433
> and that it should be:
> {code}Date d = Date.valueOf(o);{code} 
> instead of 
> {code}Date d = (Date) o;{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11565) LLAP: Tez counters for LLAP

2015-08-27 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717559#comment-14717559
 ] 

Siddharth Seth commented on HIVE-11565:
---

Fixing or disabling the counters which are incorrect (FileSystem, CPU, GC) is 
not straightforward. Disabling them is the simplest option, but it would need to 
be done in Tez.

In terms of adding new counters: counters are associated with a task, and the 
Input / Output / ProcessorContext for a task gives a handle on them. To add a 
new counter, access the counters via one of these contexts and create the 
counter with the findCounter API.
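The findCounter pattern described above can be sketched with a hypothetical in-memory registry. In the real code the registry is the Tez TezCounters object obtained from the Input/Output/ProcessorContext; the class below only mimics its create-on-first-access shape.

```java
import java.util.HashMap;
import java.util.Map;

public class CounterRegistry {
    static final class Counter {
        private long value;
        void increment(long delta) { value += delta; }
        long getValue() { return value; }
    }

    private final Map<String, Counter> counters = new HashMap<>();

    // Mimics TezCounters.findCounter(group, name): creates the counter on
    // first access, returns the existing one on later lookups.
    Counter findCounter(String group, String name) {
        return counters.computeIfAbsent(group + "." + name, k -> new Counter());
    }

    public static void main(String[] args) {
        CounterRegistry registry = new CounterRegistry();
        // e.g. a per-fragment cache-hit counter, as proposed for LLAP
        registry.findCounter("LLAP", "CACHE_HIT").increment(1);
        registry.findCounter("LLAP", "CACHE_HIT").increment(1);
        System.out.println(registry.findCounter("LLAP", "CACHE_HIT").getValue()); // 2
    }
}
```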

> LLAP: Tez counters for LLAP
> ---
>
> Key: HIVE-11565
> URL: https://issues.apache.org/jira/browse/HIVE-11565
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> 1) Tez counters for LLAP are incorrect.
> 2) Some counters, such as cache hit ratio for a fragment, are not propagated.
> We need to make sure that Tez counters for LLAP are usable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11548) HCatLoader should support predicate pushdown.

2015-08-27 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717557#comment-14717557
 ] 

Mithun Radhakrishnan commented on HIVE-11548:
-

Alright. I was able to reproduce the 
{{TestHCatClient.testTableSchemaPropagation()}} problem. It seems to fail 
without this patch, so I'll work on that in a separate JIRA. I'm still having 
trouble getting {{TestPigHBaseStorageHandler}} to fail.

> HCatLoader should support predicate pushdown.
> -
>
> Key: HIVE-11548
> URL: https://issues.apache.org/jira/browse/HIVE-11548
> Project: Hive
>  Issue Type: New Feature
>  Components: HCatalog
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-11548.1.patch
>
>
> When one uses {{HCatInputFormat}}/{{HCatLoader}} to read from file-formats 
> that support predicate pushdown (such as ORC, with 
> {{hive.optimize.index.filter=true}}), one sees that the predicates aren't 
> actually pushed down into the storage layer.
> The forthcoming patch should allow for filter-pushdown, if any of the 
> partitions being scanned with {{HCatLoader}} support the functionality. The 
> patch should technically allow the same for users of {{HCatInputFormat}}, but 
> I don't currently have a neat interface to build a compound 
> predicate-expression. Will add this separately, if required.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10292) Add support for HS2 to use custom authentication class with kerberos environment

2015-08-27 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717512#comment-14717512
 ] 

Ravi Prakash commented on HIVE-10292:
-

Hi Thejas! Thanks for pointing out that JIRA; it was indeed very useful. We 
could start two HiveServer2 daemons (one accepting GSSAPI and another accepting 
custom authentication). However, we felt it may be better to allow the same 
HiveServer2 daemon to support both forms of authentication (possibly on 
different ports). Would you have any recommendations on how best to achieve 
this?

> Add support for HS2 to use custom authentication class with kerberos 
> environment
> 
>
> Key: HIVE-10292
> URL: https://issues.apache.org/jira/browse/HIVE-10292
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Heesoo Kim
>Assignee: HeeSoo Kim
> Attachments: HIVE-10292.patch
>
>
> In the kerberos environment, Hiveserver2 only supports GSSAPI and DIGEST-MD5 
> authentication mechanism. We would like to add the ability to use custom 
> authentication class in conjunction with Kerberos. 
> This is necessary to connect to HiveServer2 from a machine which cannot 
> authenticate with the KDC used inside the cluster environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-08-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11606:
--
Attachment: HIVE-11606.branch-1.patch

> Bucket map joins fail at hash table construction time
> -
>
> Key: HIVE-11606
> URL: https://issues.apache.org/jira/browse/HIVE-11606
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11606.1.patch, HIVE-11606.2.patch, 
> HIVE-11606.branch-1.patch
>
>
> {code}
> info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
> power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
> must be a power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>  
> {code}
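The "Capacity must be a power of two" assertion in the trace above comes from a hash table that sizes itself to a power of two so it can index with a bit mask instead of a modulo. A hedged sketch of the usual check and round-up (not the actual Hive hash-table code):

```java
public class Capacity {
    // A positive power of two has exactly one bit set.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Round a requested size up to the next power of two, as hash tables
    // that index with (hash & (capacity - 1)) must do.
    static int nextPowerOfTwo(int n) {
        return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(isPowerOfTwo(1024));   // true
        System.out.println(isPowerOfTwo(1000));   // false
        System.out.println(nextPowerOfTwo(1000)); // 1024
    }
}
```

A bucket-map-join sizing path that passes a non-rounded estimate straight to such a table would trip exactly this assertion.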



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717516#comment-14717516
 ] 

Ashutosh Chauhan commented on HIVE-11668:
-

Why is that not sufficient ?

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-08-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11606:
--
Attachment: HIVE-11606.2.patch

> Bucket map joins fail at hash table construction time
> -
>
> Key: HIVE-11606
> URL: https://issues.apache.org/jira/browse/HIVE-11606
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11606.1.patch, HIVE-11606.2.patch
>
>
> {code}
> info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
> power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
> must be a power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-08-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11606:
--
Attachment: (was: HIVE-11606.2.patch)

> Bucket map joins fail at hash table construction time
> -
>
> Key: HIVE-11606
> URL: https://issues.apache.org/jira/browse/HIVE-11606
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11606.1.patch, HIVE-11606.2.patch
>
>
> {code}
> info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
> power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
> must be a power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time

2015-08-27 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11606:
--
Attachment: HIVE-11606.2.patch

> Bucket map joins fail at hash table construction time
> -
>
> Key: HIVE-11606
> URL: https://issues.apache.org/jira/browse/HIVE-11606
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.0.1, 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11606.1.patch, HIVE-11606.2.patch
>
>
> {code}
> info=[Error: Failure while running task:java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a 
> power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
> at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
> must be a power of two
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
>  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11617) Explain plan for multiple lateral views is very slow

2015-08-27 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11617:

Attachment: HIVE-11617.patch

Reverted the ExplainTask.java change, since "explain logical" can be used to 
avoid the output issue.

> Explain plan for multiple lateral views is very slow
> 
>
> Key: HIVE-11617
> URL: https://issues.apache.org/jira/browse/HIVE-11617
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11617.patch, HIVE-11617.patch
>
>
> The following explain job will be very slow or never finish if there are many 
> lateral views involved. High CPU usage is also noticed.
> {noformat}
> CREATE TABLE `t1`(`pattern` array<string>);
>   
> explain select * from t1 
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1
> lateral view explode(pattern) tbl1 as col1;
> {noformat}
> From jstack, the job is busy with preorder tree traverse. 
> {noformat}
> at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
> at java.util.regex.Matcher.reset(Matcher.java:308)
> at java.util.regex.Matcher.<init>(Matcher.java:228)
> at java.util.regex.Pattern.matcher(Pattern.java:1088)
> at org.apache.hadoop.hive.ql.lib.RuleRegExp.cost(RuleRegExp.java:67)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:72)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
> at org.apache.hadoop.hive.ql.lib.PreOrd

[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717468#comment-14717468
 ] 

Hive QA commented on HIVE-11642:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752643/HIVE-11642.03.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9434 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.tez.dag.app.rm.TestLlapTaskSchedulerService.testPreemption
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5089/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5089/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5089/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752643 - PreCommit-HIVE-TRUNK-Build

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717456#comment-14717456
 ] 

Prasanth Jayachandran commented on HIVE-11595:
--

Mostly looks good. I am concerned about having references to two copies of the 
footer (footerBuffer and fullFooterBuffer). I am guessing footerBuffer is a 
subset of fullFooterBuffer (which includes metadata + ps). Can we store the 
postscript length and footer length in the FileMetaInfo? That way, we can seek 
to postscript length - footer length and read footer length bytes to extract 
the footer alone. 
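The offset arithmetic behind that suggestion can be sketched as follows. The tail layout assumed here (footer, then postscript, then a one-byte postscript length as the final byte of the file) matches the ORC file format, but the class and method names are illustrative, not Hive's actual FileMetaInfo API.

```java
// Illustrative only: shows where the footer sits once the postscript length
// and footer length are known, per the suggestion above.
public final class OrcTailOffsets {

    /** Byte offset (from file start) at which the footer begins. */
    static long footerOffset(long fileLength, int psLen, int footerLen) {
        // The very last byte of an ORC file stores the postscript length,
        // the postscript sits just before it, and the footer before that.
        return fileLength - 1 - psLen - footerLen;
    }

    public static void main(String[] args) {
        // A 1000-byte file with a 23-byte postscript and a 120-byte footer:
        // the footer occupies bytes [856, 976).
        System.out.println(footerOffset(1000, 23, 120)); // 856
    }
}
```

With those two lengths cached, a reader could seek straight to this offset and read footerLen bytes, without re-reading the rest of the tail.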

> refactor ORC footer reading to make it usable from outside
> --
>
> Key: HIVE-11595
> URL: https://issues.apache.org/jira/browse/HIVE-11595
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
> HIVE-11595.02.patch, HIVE-11595.03.patch
>
>
> If ORC footer is read from cache, we want to parse it without having the 
> reader, opening a file, etc. I thought it would be as simple as protobuf 
> parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10292) Add support for HS2 to use custom authentication class with kerberos environment

2015-08-27 Thread HeeSoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717439#comment-14717439
 ] 

HeeSoo Kim commented on HIVE-10292:
---

[~thejas] Thank you for reviewing.
The answer is "NO". 
The goal of this ticket is to support two types of authentication mechanisms 
with Kerberos in HS2: Kerberos itself, and custom authentication on the 
kerberized cluster.

> Add support for HS2 to use custom authentication class with kerberos 
> environment
> 
>
> Key: HIVE-10292
> URL: https://issues.apache.org/jira/browse/HIVE-10292
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Heesoo Kim
>Assignee: HeeSoo Kim
> Attachments: HIVE-10292.patch
>
>
> In the kerberos environment, Hiveserver2 only supports GSSAPI and DIGEST-MD5 
> authentication mechanism. We would like to add the ability to use custom 
> authentication class in conjunction with Kerberos. 
> This is necessary to connect to HiveServer2 from a machine which cannot 
> authenticate with the KDC used inside the cluster environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11669) OrcFileDump service should support directories

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717426#comment-14717426
 ] 

Prasanth Jayachandran commented on HIVE-11669:
--

I don't think there is anything special for the ACID case here. Since delta 
files are still ORC files, it should still work. ACID files can be identified 
only by looking at the schema: the printed schema will have a struct for the 
ACID metadata and another struct with the actual row schema. 
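As a rough sketch of that schema-based identification: the ACID metadata field names below (operation, originalTransaction, bucket, rowId, currentTransaction, plus the wrapped row struct) are an assumption based on the commonly documented Hive ACID ORC layout, not a quote of Hive's source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides whether a top-level ORC struct "looks like"
// an ACID file purely from its field names, as described in the comment above.
public final class AcidSchemaCheck {
    private static final List<String> ACID_FIELDS = Arrays.asList(
            "operation", "originalTransaction", "bucket",
            "rowId", "currentTransaction", "row");

    static boolean looksLikeAcid(List<String> topLevelFields) {
        return ACID_FIELDS.equals(topLevelFields);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeAcid(ACID_FIELDS));                   // true
        System.out.println(looksLikeAcid(Arrays.asList("key", "value"))); // false
    }
}
```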

> OrcFileDump service should support directories
> --
>
> Key: HIVE-11669
> URL: https://issues.apache.org/jira/browse/HIVE-11669
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11669.1.patch
>
>
> orcfiledump service does not support directories. If a directory is specified, 
> then the program should iterate through all the files in the directory and 
> perform a file dump on each.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11646) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix multiple window spec for PTF operator

2015-08-27 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717418#comment-14717418
 ] 

Pengcheng Xiong commented on HIVE-11646:


[~jcamachorodriguez], as per [~jpullokkaran]'s request, could you also review 
this patch? Thanks.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix multiple 
> window spec for PTF operator
> ---
>
> Key: HIVE-11646
> URL: https://issues.apache.org/jira/browse/HIVE-11646
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11646.01.patch
>
>
> Current return path only supports a single windowing spec. All the following 
> window spec will overwrite the first one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11669) OrcFileDump service should support directories

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717416#comment-14717416
 ] 

Sergey Shelukhin commented on HIVE-11669:
-

Will this work with ACID?  I assume it will just do the dumb dump of the deltas 
treating them like separate, regular ORC files with different structure.
Is that true (or will it fail in some way), and is it intended (maybe it should 
dump the acidified output, or something).

> OrcFileDump service should support directories
> --
>
> Key: HIVE-11669
> URL: https://issues.apache.org/jira/browse/HIVE-11669
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11669.1.patch
>
>
> orcfiledump service does not support directories. If a directory is specified, 
> then the program should iterate through all the files in the directory and 
> perform a file dump on each.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11629) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter expressions for full outer join and right outer join

2015-08-27 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11629:
---
Attachment: HIVE-11629.02.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix the filter 
> expressions for full outer join and right outer join
> --
>
> Key: HIVE-11629
> URL: https://issues.apache.org/jira/browse/HIVE-11629
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11629.01.patch, HIVE-11629.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717414#comment-14717414
 ] 

Sergey Shelukhin commented on HIVE-11595:
-

[~prasanth_j] ping?

> refactor ORC footer reading to make it usable from outside
> --
>
> Key: HIVE-11595
> URL: https://issues.apache.org/jira/browse/HIVE-11595
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-10595.patch, HIVE-11595.01.patch, 
> HIVE-11595.02.patch, HIVE-11595.03.patch
>
>
> If ORC footer is read from cache, we want to parse it without having the 
> reader, opening a file, etc. I thought it would be as simple as protobuf 
> parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11553:

Attachment: HIVE-11553.02.patch

Patch that actually works.
This just gets footers from metastore instead of HDFS.
PPD is next step...

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hbase-metastore-branch
>
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.patch
>
>
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-08-27 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11634:
-
Attachment: HIVE-11634.4.patch

Addressing [~jcamachorodriguez] and [~ashutoshc]'s comments in patch#4.

Thanks
Hari

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate, whereas we can prune partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) 
> is used by the partition pruner to prune partitions which otherwise will not 
> be pruned.
> This is an extension of the idea presented in HIVE-11573.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11669) OrcFileDump service should support directories

2015-08-27 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11669:
-
Attachment: HIVE-11669.1.patch

[~gopalv]/[~sershe] Can someone please take a look?

> OrcFileDump service should support directories
> --
>
> Key: HIVE-11669
> URL: https://issues.apache.org/jira/browse/HIVE-11669
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11669.1.patch
>
>
> orcfiledump service does not support directories. If a directory is specified, 
> then the program should iterate through all the files in the directory and 
> perform a file dump on each.
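The shape of that fix can be sketched as below. Hive's tool would go through Hadoop's FileSystem API; java.nio is used here only to keep the sketch self-contained, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Expands each command-line argument: a directory becomes the sorted list of
// its regular files, a plain file is passed through unchanged.
public final class DumpAll {

    static List<Path> expand(Path p) throws IOException {
        if (Files.isDirectory(p)) {
            try (Stream<Path> s = Files.list(p)) {
                return s.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
            }
        }
        return Collections.singletonList(p);
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            for (Path file : expand(Paths.get(arg))) {
                // Stand-in for the real per-file ORC dump.
                System.out.println("dumping " + file);
            }
        }
    }
}
```

A real version would also have to decide whether to recurse and whether to skip non-ORC side files; that policy belongs to the patch.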



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11544) LazyInteger should avoid throwing NumberFormatException

2015-08-27 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11544:
---
Attachment: HIVE-11544.4.patch

> LazyInteger should avoid throwing NumberFormatException
> ---
>
> Key: HIVE-11544
> URL: https://issues.apache.org/jira/browse/HIVE-11544
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0
>Reporter: William Slacum
>Assignee: Gopal V
>Priority: Minor
>  Labels: Performance
> Attachments: HIVE-11544.1.patch, HIVE-11544.2.patch, 
> HIVE-11544.3.patch, HIVE-11544.4.patch
>
>
> {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these 
> conditions:
> # bytes are null
> # radix is invalid
> # length is 0
> # the string is '+' or '-'
> # {{LazyInteger#parse}} throws a {{NumberFormatException}}
> Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the 
> exception is caught, swallowed, and {{isNull}} is set to {{true}}.
> This is generally a bad workflow, as exception creation is a performance 
> bottleneck, and repeating it for many rows in a query can have a 
> drastic performance consequence.
> It would be better if this method returned an {{Optional}}, which 
> would provide similar functionality with a higher throughput rate.
> I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so 
> I've marked those as affected. Any version in between would also suffer from 
> this.
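A throw-free variant of that idea can be sketched as follows. This is not Hive's LazyInteger code: it is a base-10-only illustration (the real method's radix parameter and overflow checks are omitted) of returning an empty OptionalInt for invalid input instead of constructing an exception.

```java
import java.util.OptionalInt;

public final class SafeParse {

    // Returns an empty OptionalInt for null input, an empty/invalid span,
    // or a bare sign, instead of throwing NumberFormatException.
    static OptionalInt parseInt(byte[] bytes, int start, int length) {
        if (bytes == null || length <= 0) return OptionalInt.empty();
        int i = start, end = start + length, sign = 1;
        if (bytes[i] == '+' || bytes[i] == '-') {
            sign = (bytes[i] == '-') ? -1 : 1;
            if (++i == end) return OptionalInt.empty(); // bare "+" or "-"
        }
        int value = 0;
        for (; i < end; i++) {
            int digit = bytes[i] - '0';
            if (digit < 0 || digit > 9) return OptionalInt.empty();
            value = value * 10 + digit;
        }
        return OptionalInt.of(sign * value);
    }

    public static void main(String[] args) {
        System.out.println(parseInt("42".getBytes(), 0, 2)); // OptionalInt[42]
        System.out.println(parseInt("-".getBytes(), 0, 1));  // OptionalInt.empty
    }
}
```

A caller like an init() method could then set isNull when isPresent() is false, with no exception ever being built on the hot path.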



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement

2015-08-27 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9481:
-

added note to 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Synopsis.1

> allow column list specification in INSERT statement
> ---
>
> Key: HIVE-9481
> URL: https://issues.apache.org/jira/browse/HIVE-9481
> Project: Hive
>  Issue Type: Bug
>  Components: Parser, Query Processor, SQL
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, 
> HIVE-9481.6.patch, HIVE-9481.patch
>
>
> Given a table FOO(a int, b int, c int), ANSI SQL supports insert into 
> FOO(c,b) select x,y from T.  The expectation is that 'x' is written to column 
> 'c' and 'y' is written to column 'b' and 'a' is set to NULL, assuming column 'a' 
> is NULLABLE.
> Hive does not support this.  In Hive one has to ensure that the data 
> producing statement has a schema that matches target table schema.
> Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when 
> target schema is explicitly provided, missing columns will be set to NULL if 
> they are NULLABLE, otherwise an error will be raised.
> If/when DEFAULT clause is supported, this can be enhanced to set default 
> value rather than NULL.
> Thus, given {noformat}
> create table source (a int, b int);
> create table target (x int, y int, z int);
> create table target2 (x int, y int, z int);
> {noformat}
> {noformat}insert into target(y,z) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, a, b from source;{noformat}
> and 
> {noformat}insert into target(z,y) select * from source;{noformat}
> will mean 
> {noformat}insert into target select null as x, b, a from source;{noformat}
> Also,
> {noformat}
> from source 
>   insert into target(y,z) select null as x, * 
>   insert into target2(y,z) select null as x, source.*;
> {noformat}
> and for partitioned tables, given
> {noformat}
> Given:
> CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING)
>   PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS 
> STORED AS ORC;
> INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) 
>  
>VALUES ('jsmith', 'mail.com');
> {noformat}
> And dynamic partitioning
> {noformat}
> INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) 
> VALUES ('jsmith', '2014-09-23', 'mail.com');
> {noformat}
> In all cases, the schema specification contains columns of the target table 
> which are matched by position to the values produced by VALUES clause/SELECT 
> statement.  If the producer side provides values for a dynamic partition 
> column, the column should be in the specified schema.  Static partition 
> values are part of the partition spec and thus are not produced by the 
> producer and should not be part of the schema specification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9353) make TABLE keyword optional in INSERT INTO TABLE foo...

2015-08-27 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-9353:
-
Labels:   (was: TODOC15)

added note to 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Synopsis.1

> make TABLE keyword optional in INSERT INTO TABLE foo...
> ---
>
> Key: HIVE-9353
> URL: https://issues.apache.org/jira/browse/HIVE-9353
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 1.1.0
>
> Attachments: HIVE-9353.patch
>
>
> standard SQL support INSERT INTO foo ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717336#comment-14717336
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

Alternatively, we can put that information (the loaded partition spec and its 
stats) under the hive.tez.exec.print.summary config. I often use the console log 
information to see if the stats (fast stats/no-scan stats) are loaded 
correctly. 


> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717329#comment-14717329
 ] 

Prasanth Jayachandran edited comment on HIVE-11645 at 8/27/15 7:06 PM:
---

Also "show partitions" shows all partitions. Not the recently loaded 
partitions. Looking at logs will not be user friendly. Is there any other way 
to view the recently loaded partition?


was (Author: prasanth_j):
Also "show partitions" shows all partitions. Not the recently loaded 
partitions. Looking at logs will not user friendly. Is there any other way to 
view the recently loaded partition?

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717331#comment-14717331
 ] 

Sergey Shelukhin commented on HIVE-11668:
-

runTestQuery is only run when the object is created, it's not run otherwise

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717329#comment-14717329
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

Also "show partitions" shows all partitions. Not the recently loaded 
partitions. Looking at logs will not be user friendly. Is there any other way 
to view the recently loaded partition?

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717323#comment-14717323
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

It's informative in the sense that it avoids issuing another command "show 
partitions" to view the list of partitions.

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717322#comment-14717322
 ] 

Ashutosh Chauhan commented on HIVE-11668:
-

Do we really need doDbSpecificInitializationsBeforeQuery() everywhere? I 
think we can just have it in runTestQuery(), so that it's there once we have 
opened the connection.

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717313#comment-14717313
 ] 

Ashutosh Chauhan commented on HIVE-11645:
-

Trash logs seem to be printed by HDFS. I will look into whether there is a way 
to keep that in the log; I don't think it's informative for the user to have it 
on the console. 
W.r.t. partition names, they are still printed in the client log as well as in 
the metastore log. The suggestion to remove them from the console was given by 
[~hagleitn], who is of the opinion that on the console the user just wants an 
indication that progress is being made, not all the details (for which she can 
refer to the logs).
Will add the unit of time.

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717287#comment-14717287
 ] 

Sergey Shelukhin commented on HIVE-11668:
-

https://reviews.apache.org/r/37852/

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11662) DP cannot be applied to external table which contains part-spec like directory

2015-08-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717281#comment-14717281
 ] 

Hive QA commented on HIVE-11662:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12752638/HIVE-11662.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9378 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5088/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5088/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5088/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12752638 - PreCommit-HIVE-TRUNK-Build

> DP cannot be applied to external table which contains part-spec like directory
> --
>
> Key: HIVE-11662
> URL: https://issues.apache.org/jira/browse/HIVE-11662
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-11662.1.patch.txt
>
>
> Some users want to use part-spec-like directory names in their partitioned 
> table locations, something like:
> {noformat}
> /something/warehouse/some_key=some_value
> {noformat}
> DP calculates additional partitions from the full path, and raises an 
> exception like:
> {noformat}
> Failed with exception Partition spec {some_key=some_value, 
> part_key=part_value} contains non-partition columns
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11668:

Component/s: Metastore

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717275#comment-14717275
 ] 

Ashutosh Chauhan commented on HIVE-11668:
-

Can you create a RB entry, please ?

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11645) Add in-place updates for dynamic partitions loading

2015-08-27 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717271#comment-14717271
 ] 

Prasanth Jayachandran commented on HIVE-11645:
--

[~ashutoshc] This doesn't work when fs.trash.interval is set to a value > 0, the 
table already has some data, and we do an insert overwrite. The log lines about 
moving existing files to the .Trash HDFS directory break the in-place updates, 
since they introduce new lines between the progress updates.

{code}
Loaded : 1/7 partitions.
Loaded : 2/7 partitions.
Loaded : 3/7 partitions.
Moved: 
'hdfs://localhost:9000/apps/hive/warehouse/ss_part/ss_store_sk=4/00_0' to 
trash at: hdfs://localhost:9000/user/pjayachandran/.Trash/Current
Loaded : 4/7 partitions.
Moved: 
'hdfs://localhost:9000/apps/hive/warehouse/ss_part/ss_store_sk=__HIVE_DEFAULT_PARTITION__/01_0'
 to trash at: hdfs://localhost:9000/user/pjayachandran/.Trash/Current
Moved: 
'hdfs://localhost:9000/apps/hive/warehouse/ss_part/ss_store_sk=__HIVE_DEFAULT_PARTITION__/02_0'
 to trash at: hdfs://localhost:9000/user/pjayachandran/.Trash/CurLoaded : 5/7 
partitions.
Loaded : 6/7 partitions.
Loaded : 7/7 partitions.
 Time taken for load dynamic partitions : 1471
 Time taken for adding to write entity : 1
{code}

Also, this is a change of behavior. Earlier it used to print the name of each 
partition that was loaded on the console (and also log it to a file), but now it 
just says "Loaded : m/n partitions." I think it would be useful to print all the 
partitions that were loaded at the end; that way we don't change the behaviour. 
nit:
Can you add the unit of time (ms) to the "Time taken" log lines? I don't think 
it is related to your changes, but it would be good to have :)
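The interleaving problem can be reproduced with a minimal sketch (Python here for brevity; Hive's implementation is Java): in-place progress works by rewriting the current console line with a carriage return, so any other component that prints a full line in between leaves the previous progress text stranded.

```python
import io

def report_progress(loaded, total, stream):
    # Rewrite the current line in place: '\r' moves the cursor back to the
    # start of the line without advancing to a new one.
    stream.write("\rLoaded : %d/%d partitions." % (loaded, total))
    stream.flush()

out = io.StringIO()
report_progress(1, 7, out)
report_progress(2, 7, out)
# An unrelated logger writes a full line in between, e.g. a trash-move message:
out.write("Moved: '/apps/hive/warehouse/...' to trash\n")
report_progress(3, 7, out)
```

After the interleaved newline, the next `\r` only returns to the start of the new (empty) line, so "Loaded : 2/7 partitions." stays visible above it and the display looks garbled, much like the pasted console output.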

> Add in-place updates for dynamic partitions loading
> ---
>
> Key: HIVE-11645
> URL: https://issues.apache.org/jira/browse/HIVE-11645
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-11645.2.patch, HIVE-11645.3.patch, HIVE-11645.patch
>
>
> Currently, updates go to log file and on console there is no visible progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717268#comment-14717268
 ] 

Sergey Shelukhin commented on HIVE-11123:
-

HIVE-11668

> Fix how to confirm the RDBMS product name at Metastore.
> ---
>
> Key: HIVE-11123
> URL: https://issues.apache.org/jira/browse/HIVE-11123
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0
> Environment: PostgreSQL
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11123.1.patch, HIVE-11123.2.patch, 
> HIVE-11123.3.patch, HIVE-11123.4.patch, HIVE-11123.4a.patch
>
>
> I use PostgreSQL for the Hive Metastore, and I saw the following messages in 
> the PostgreSQL log.
> {code}
> < 2015-06-26 10:58:15.488 JST >ERROR:  syntax error at or near "@@" at 
> character 5
> < 2015-06-26 10:58:15.488 JST >STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
> < 2015-06-26 10:58:15.489 JST >ERROR:  relation "v$instance" does not exist 
> at character 21
> < 2015-06-26 10:58:15.489 JST >STATEMENT:  SELECT version FROM v$instance
> < 2015-06-26 10:58:15.490 JST >ERROR:  column "version" does not exist at 
> character 10
> < 2015-06-26 10:58:15.490 JST >STATEMENT:  SELECT @@version
> {code}
> When the Hive CLI or Beeline in embedded mode is run, these messages are 
> written to the PostgreSQL log.
> These queries are issued from MetaStoreDirectSql#determineDbType. If we use 
> MetaStoreDirectSql#getProductName instead, we do not need to issue them.
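A sketch of the product-name approach (in Python for illustration; the real code is Java and can obtain the name via JDBC's `DatabaseMetaData.getDatabaseProductName()`, which sends no vendor-specific query): the DB type is derived by matching the reported product name, so no probing statements like `SELECT @@version` ever reach the server log. The type tags below are illustrative, not Hive's actual constants.

```python
def db_type_from_product_name(product_name):
    """Map a JDBC database product name to a DB type tag by substring match."""
    name = product_name.lower()
    if "postgresql" in name:
        return "POSTGRES"
    if "mysql" in name:
        return "MYSQL"
    if "oracle" in name:
        return "ORACLE"
    if "microsoft sql server" in name:
        return "MSSQL"
    if "derby" in name:
        return "DERBY"
    return "OTHER"

print(db_type_from_product_name("PostgreSQL"))  # -> POSTGRES
```

Because the detection never executes SQL against the server, the spurious ERROR lines disappear from the PostgreSQL log.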



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717261#comment-14717261
 ] 

Sergey Shelukhin commented on HIVE-11668:
-

[~ashutoshc] can you take a look?


> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11668) make sure directsql calls pre-query init when needed

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11668:

Attachment: HIVE-11668.patch

> make sure directsql calls pre-query init when needed
> 
>
> Key: HIVE-11668
> URL: https://issues.apache.org/jira/browse/HIVE-11668
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11668.patch
>
>
> See comments in HIVE-11123



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for ReduceSink operator

2015-08-27 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717237#comment-14717237
 ] 

Pengcheng Xiong commented on HIVE-11623:


The failed tests are unrelated and they passed on my laptop. Pushed to master. 
Thanks [~jcamachorodriguez] for the review.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the 
> tableAlias for ReduceSink operator
> 
>
> Key: HIVE-11623
> URL: https://issues.apache.org/jira/browse/HIVE-11623
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11623.01.patch, HIVE-11623.02.patch, 
> HIVE-11623.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11123) Fix how to confirm the RDBMS product name at Metastore.

2015-08-27 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11123:

Attachment: HIVE-11123.4a.patch

btw, the patch needed to be rebased. Here's what I committed

> Fix how to confirm the RDBMS product name at Metastore.
> ---
>
> Key: HIVE-11123
> URL: https://issues.apache.org/jira/browse/HIVE-11123
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.0
> Environment: PostgreSQL
>Reporter: Shinichi Yamashita
>Assignee: Shinichi Yamashita
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11123.1.patch, HIVE-11123.2.patch, 
> HIVE-11123.3.patch, HIVE-11123.4.patch, HIVE-11123.4a.patch
>
>
> I use PostgreSQL for the Hive Metastore, and I saw the following messages in 
> the PostgreSQL log.
> {code}
> < 2015-06-26 10:58:15.488 JST >ERROR:  syntax error at or near "@@" at 
> character 5
> < 2015-06-26 10:58:15.488 JST >STATEMENT:  SET @@session.sql_mode=ANSI_QUOTES
> < 2015-06-26 10:58:15.489 JST >ERROR:  relation "v$instance" does not exist 
> at character 21
> < 2015-06-26 10:58:15.489 JST >STATEMENT:  SELECT version FROM v$instance
> < 2015-06-26 10:58:15.490 JST >ERROR:  column "version" does not exist at 
> character 10
> < 2015-06-26 10:58:15.490 JST >STATEMENT:  SELECT @@version
> {code}
> When the Hive CLI or Beeline in embedded mode is run, these messages are 
> written to the PostgreSQL log.
> These queries are issued from MetaStoreDirectSql#determineDbType. If we use 
> MetaStoreDirectSql#getProductName instead, we do not need to issue them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

