[jira] [Commented] (HIVE-11940) "INSERT OVERWRITE" query is very slow because it creates one "distcp" per file to copy data from staging directory to target directory

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909107#comment-14909107
 ] 

Hive QA commented on HIVE-11940:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762161/HIVE-11940.2.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9590 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-enforce_order.q-constprog_dpp.q-auto_join1.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5419/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5419/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5419/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762161 - PreCommit-HIVE-TRUNK-Build

> "INSERT OVERWRITE" query is very slow because it creates one "distcp" per 
> file to copy data from staging directory to target directory
> --
>
> Key: HIVE-11940
> URL: https://issues.apache.org/jira/browse/HIVE-11940
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-11940.1.patch, HIVE-11940.2.patch
>
>
> When hive.exec.stagingdir is set to ".hive-staging", which will be placed 
> under the target directory when running an "INSERT OVERWRITE" query, Hive 
> will grab all files under the staging directory and copy them ONE BY ONE to 
> the target directory.
> When hive.exec.stagingdir is set to "/tmp/hive", Hive will simply do a 
> RENAME operation, which will be instant.
> This happens with files that are not encrypted. 
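A minimal sketch of the behavior described above (table names here are hypothetical, and whether the final move can be a rename also depends on the filesystem layout and encryption zones):

```sql
-- Staging under the target directory: Hive may fall back to copying the
-- staged files one by one (one distcp per file in the reported case).
SET hive.exec.stagingdir=.hive-staging;
INSERT OVERWRITE TABLE target_tbl SELECT * FROM source_tbl;

-- Staging on the same filesystem but outside the target directory: the
-- final publish step can be a single, near-instant RENAME.
SET hive.exec.stagingdir=/tmp/hive;
INSERT OVERWRITE TABLE target_tbl SELECT * FROM source_tbl;
```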



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11952) disable q tests that are both slow and less relevant

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908106#comment-14908106
 ] 

Ashutosh Chauhan commented on HIVE-11952:
-

I presume most of the time is spent on query execution. Can we just keep 
explain plans for queries in these tests?

> disable q tests that are both slow and less relevant
> 
>
> Key: HIVE-11952
> URL: https://issues.apache.org/jira/browse/HIVE-11952
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11952.patch
>
>
> We will disable several tests that test obscure and old features and take an 
> inordinate amount of time, and file JIRAs to look at their perf if someone 
> still cares about them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10048) JDBC - Support SSL encryption regardless of Authentication mechanism

2015-09-25 Thread Mubashir Kazia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubashir Kazia updated HIVE-10048:
--
Attachment: HIVE-10048.2.patch

New patch with changes that incorporate feedback.

> JDBC - Support SSL encryption regardless of Authentication mechanism
> 
>
> Key: HIVE-10048
> URL: https://issues.apache.org/jira/browse/HIVE-10048
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.0.0
>Reporter: Mubashir Kazia
>Assignee: Mubashir Kazia
>  Labels: newbie, patch
> Attachments: HIVE-10048.1.patch, HIVE-10048.2.patch
>
>
> The JDBC driver currently only supports SSL transport if the authentication 
> mechanism is SASL PLAIN with username and password. SSL transport should be 
> decoupled from the authentication mechanism. If the customer chooses to do 
> Kerberos authentication and SSL encryption over the wire, it should be 
> supported. The server side already supports this, but the driver does not.
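For illustration only, a HiveServer2 JDBC URL that combines Kerberos authentication with SSL (the combination this improvement aims to support) might look like the following; the host, realm, and truststore details are placeholders:

```text
jdbc:hive2://hs2.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM;ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit
```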



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10048) JDBC - Support SSL encryption regardless of Authentication mechanism

2015-09-25 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908278#comment-14908278
 ] 

Sergio Peña commented on HIVE-10048:


Thanks @Mubashir Kazia
+1

> JDBC - Support SSL encryption regardless of Authentication mechanism
> 
>
> Key: HIVE-10048
> URL: https://issues.apache.org/jira/browse/HIVE-10048
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.0.0
>Reporter: Mubashir Kazia
>Assignee: Mubashir Kazia
>  Labels: newbie, patch
> Attachments: HIVE-10048.1.patch, HIVE-10048.2.patch
>
>
> The JDBC driver currently only supports SSL transport if the authentication 
> mechanism is SASL PLAIN with username and password. SSL transport should be 
> decoupled from the authentication mechanism. If the customer chooses to do 
> Kerberos authentication and SSL encryption over the wire, it should be 
> supported. The server side already supports this, but the driver does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11952) disable q tests that are both slow and less relevant

2015-09-25 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908148#comment-14908148
 ] 

Sergio Peña commented on HIVE-11952:


It looks good [~sershe]. 

I just noticed that those files are in the miniSparkOnYarn.query.files 
variable as well. Although that variable is not used anywhere in the pom.xml, 
I think it would be better to remove them there too in case it is used in the 
future.

+1. Let's wait for the tests and make sure they were not executed on Jenkins.

> disable q tests that are both slow and less relevant
> 
>
> Key: HIVE-11952
> URL: https://issues.apache.org/jira/browse/HIVE-11952
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11952.patch
>
>
> We will disable several tests that test obscure and old features and take an 
> inordinate amount of time, and file JIRAs to look at their perf if someone 
> still cares about them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11714) Turn off hybrid grace hash join for cross product join

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908129#comment-14908129
 ] 

Hive QA commented on HIVE-11714:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761996/HIVE-11714.2.patch

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9583 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_join
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5412/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5412/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5412/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761996 - PreCommit-HIVE-TRUNK-Build

> Turn off hybrid grace hash join for cross product join
> --
>
> Key: HIVE-11714
> URL: https://issues.apache.org/jira/browse/HIVE-11714
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-11714.1.patch, HIVE-11714.2.patch
>
>
> Current partitioning calculation is solely based on hash value of the key. 
> For cross product join where keys are empty, all the rows will be put into 
> partition 0. This falls back to the regular mapjoin behavior where we only 
> have one hashtable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11951) DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES

2015-09-25 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-11951:
---
Description: 
Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
DATABASE EXTENDED}} even though [the 
documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase]
 says you should. To reproduce:
{code}
create database test with dbproperties('foo'='bar');
desc database extended test;
{code}

The output I see is
{code}
> desc database extended test;
OK
test    hdfs://:/path/to/test.db    ahsu
Time taken: 0.019 seconds, Fetched: 1 row(s)
{code}
I do not see the {{foo=bar}} property.

  was:
Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
DATABASE EXTENDED}}. To reproduce:
{code}
create database test with dbproperties('foo'='bar');
desc database extended test;
{code}

The output I see is
{code}
> desc database extended test;
OK
test    hdfs://:/path/to/test.db    ahsu
Time taken: 0.019 seconds, Fetched: 1 row(s)
{code}
I do not see the {{foo=bar}} property.


> DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES
> -
>
> Key: HIVE-11951
> URL: https://issues.apache.org/jira/browse/HIVE-11951
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Anthony Hsu
>
> Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
> DATABASE EXTENDED}} even though [the 
> documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase]
>  says you should. To reproduce:
> {code}
> create database test with dbproperties('foo'='bar');
> desc database extended test;
> {code}
> The output I see is
> {code}
> > desc database extended test;
> OK
> test    hdfs://:/path/to/test.db    ahsu
> Time taken: 0.019 seconds, Fetched: 1 row(s)
> {code}
> I do not see the {{foo=bar}} property.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11951) DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES

2015-09-25 Thread Anthony Hsu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Hsu updated HIVE-11951:
---
Description: 
Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
DATABASE EXTENDED}} even though [the 
documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase]
 says you should. To reproduce:
{code}
create database test with dbproperties('foo'='bar');
desc database extended test;
{code}

The output I see is
{code}
> desc database extended test;
OK
test    hdfs://:/path/to/test.db    ahsu
Time taken: 0.019 seconds, Fetched: 1 row(s)
{code}
I do not see the {{foo=bar}} property.

This issue may affect newer Hive versions, too, but I haven't checked.

  was:
Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
DATABASE EXTENDED}} even though [the 
documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase]
 says you should. To reproduce:
{code}
create database test with dbproperties('foo'='bar');
desc database extended test;
{code}

The output I see is
{code}
> desc database extended test;
OK
test    hdfs://:/path/to/test.db    ahsu
Time taken: 0.019 seconds, Fetched: 1 row(s)
{code}
I do not see the {{foo=bar}} property.


> DESCRIBE DATABASE EXTENDED does not show DBPROPERTIES
> -
>
> Key: HIVE-11951
> URL: https://issues.apache.org/jira/browse/HIVE-11951
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Anthony Hsu
>
> Using Hive 0.13.1, I do not see database properties when running {{DESCRIBE 
> DATABASE EXTENDED}} even though [the 
> documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeDatabase]
>  says you should. To reproduce:
> {code}
> create database test with dbproperties('foo'='bar');
> desc database extended test;
> {code}
> The output I see is
> {code}
> > desc database extended test;
> OK
> test    hdfs://:/path/to/test.db    ahsu
> Time taken: 0.019 seconds, Fetched: 1 row(s)
> {code}
> I do not see the {{foo=bar}} property.
> This issue may affect newer Hive versions, too, but I haven't checked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11934) Transaction lock retry logic results in infinite loop

2015-09-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908335#comment-14908335
 ] 

Eugene Koifman commented on HIVE-11934:
---

[~alangates] could you review please

> Transaction lock retry logic results in infinite loop
> -
>
> Key: HIVE-11934
> URL: https://issues.apache.org/jira/browse/HIVE-11934
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 1.2.1
>Reporter: Steve Howard
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-11934.patch
>
>
> We reset the deadlock count to 0 every time the lock() method is called in 
> org.apache.hadoop.hive.metastore.txn.TxnHandler, so the limit of ten is 
> never reached in checkRetryable().
> We should let checkRetryable() handle the deadlock count.
>   public LockResponse lock(LockRequest rqst)
> throws NoSuchTxnException, TxnAbortedException, MetaException
>   {
> >>>this.deadlockCnt = 0; <<<
> try
> {
>   Connection dbConn = null;
>   try
>   {
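To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical names, not the real TxnHandler code) of why resetting the counter inside the retried method keeps the retry limit from ever tripping:

```java
// Minimal sketch (hypothetical names, NOT the real TxnHandler): resetting
// the deadlock counter at the top of lock() means checkRetryable()'s limit
// of ten can never be reached across repeated deadlocks.
public class RetryDemo {
    static final int MAX_DEADLOCK_RETRIES = 10;

    // Simulates a deadlock on every call. Returns the call number at which
    // the retry limit finally trips, or -1 if it never trips in 1000 calls.
    public static int attemptsBeforeGivingUp(boolean resetEachCall) {
        int deadlockCnt = 0;
        for (int call = 1; call <= 1000; call++) {
            if (resetEachCall) {
                deadlockCnt = 0; // the reset the patch proposes to remove
            }
            deadlockCnt++;       // checkRetryable() bumps the count on deadlock
            if (deadlockCnt >= MAX_DEADLOCK_RETRIES) {
                return call;     // limit reached: stop retrying
            }
        }
        return -1;               // effectively an infinite retry loop
    }

    public static void main(String[] args) {
        System.out.println("with reset:    " + attemptsBeforeGivingUp(true));
        System.out.println("without reset: " + attemptsBeforeGivingUp(false));
    }
}
```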



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11937) Improve StatsOptimizer to deal with query with additional constant columns

2015-09-25 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11937:
---
Attachment: HIVE-11937.02.patch

> Improve StatsOptimizer to deal with query with additional constant columns
> --
>
> Key: HIVE-11937
> URL: https://issues.apache.org/jira/browse/HIVE-11937
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11937.01.patch, HIVE-11937.02.patch
>
>
> Right now StatsOptimizer can deal with query such as "select count(1) from 
> src" by directly looking into the metastore. However, it can not deal with 
> "select '1' as one, count(1) from src" which has an additional constant 
> column. We may improve it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11945) ORC with non-local reads may not be reusing connection to DN

2015-09-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908413#comment-14908413
 ] 

Prasanth Jayachandran commented on HIVE-11945:
--

[~rajesh.balamohan] Thanks for the S3 analysis. Can you link the new HDFS jira 
to this patch? 

The latest patch LGTM as well, +1

> ORC with non-local reads may not be reusing connection to DN
> 
>
> Key: HIVE-11945
> URL: https://issues.apache.org/jira/browse/HIVE-11945
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11945.1.patch, HIVE-11945.2.patch, 
> HIVE-11945.3.patch
>
>
> When “seek + readFully(buffer, offset, length)” is used, DFSInputStream ends 
> up going via “readWithStrategy()”. This sets up a BlockReader with a length 
> equal to the block size. So until that position is reached, 
> RemoteBlockReader2.peer is not added to the PeerCache (please refer to 
> RemoteBlockReader2.close() in HDFS). So eventually the next call to the same 
> DN ends up opening a new socket. In ORC, when a read is not data-local, this 
> has the possibility of opening/closing lots of connections to the DN. 
> In random reads, it would be good to set this length to the amount of data 
> that is to be read (e.g. the pread call in DFSInputStream, which sets up the 
> BlockReader’s length correctly & the code path returns the Peer to the peer 
> cache properly). “readFully(position, buffer, offset, length)” follows this 
> code path and ends up reusing the connections properly. Creating this JIRA 
> to fix this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6091) Empty pipeout files are created for connection create/close

2015-09-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6091:
---
Component/s: HiveServer2

> Empty pipeout files are created for connection create/close
> ---
>
> Key: HIVE-6091
> URL: https://issues.apache.org/jira/browse/HIVE-6091
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HIVE-6091.1.patch, HIVE-6091.2.patch, HIVE-6091.patch
>
>
> Pipeout files are created when a connection is established and removed only 
> when data was produced. Instead we should create them only when data has to 
> be fetched or remove them whether data is fetched or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11778) Merge beeline-cli branch to trunk

2015-09-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908462#comment-14908462
 ] 

Xuefu Zhang commented on HIVE-11778:


+1 to patch .1

> Merge beeline-cli branch to trunk
> -
>
> Key: HIVE-11778
> URL: https://issues.apache.org/jira/browse/HIVE-11778
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Attachments: HIVE-11778.1.patch, HIVE-11778.patch
>
>
> The team working on the beeline-cli branch would like to merge their work to 
> trunk. This jira will track that effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11968) make HS2 core components re-usable across projects

2015-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908484#comment-14908484
 ] 

Thejas M Nair commented on HIVE-11968:
--

[~cwsteinbach] had an early proposal for this here - 
https://cwiki.apache.org/confluence/display/Hive/AccessServer+Design+Proposal


> make HS2 core components re-usable across projects
> --
>
> Key: HIVE-11968
> URL: https://issues.apache.org/jira/browse/HIVE-11968
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>
> HS2 provides jdbc and odbc access to hive. There has been a lot of investment 
> into HS2 over time ( Fault tolerance, authentication modes, HTTP transport 
> mode, encryption, delegation tokens .. ).
> The thrift API that HS2 provides is generic and is applicable to other SQL 
> engines as well. Spark is already using a fork of HS2, but as it is a fork, 
> it is hard to maintain. 
> HS2 code is not structured to be easily re-used and extended. If we can make 
> improvements there, it can be easily re-used by other projects, and all 
> effort can be combined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11969:

Assignee: (was: Sergey Shelukhin)

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>
> The Tez session spins up an AM, which can cause delays, esp. if the cluster 
> is very busy.
> This can be done in the background, so the AM might get started while the 
> user is running local commands and doing other things.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11964) RelOptHiveTable.hiveColStatsMap might contain mismatched column stats

2015-09-25 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908655#comment-14908655
 ] 

Laljo John Pullokkaran commented on HIVE-11964:
---

Looks good. Wait for QA run

> RelOptHiveTable.hiveColStatsMap might contain mismatched column stats
> -
>
> Key: HIVE-11964
> URL: https://issues.apache.org/jira/browse/HIVE-11964
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11964.patch
>
>
> RelOptHiveTable.hiveColStatsMap might contain mismatched stats since it was 
> built by assuming the stats returned from
> ==
> hiveColStats =StatsUtils.getTableColumnStats(hiveTblMetadata, 
> hiveNonPartitionCols, nonPartColNamesThatRqrStats);
> or 
> HiveMetaStoreClient.getTableColumnStatistics(dbName, tableName, colNames)
> ==
> have the same order as the requested columns. But actually the order is 
> non-deterministic; therefore the returned stats should be re-ordered before 
> they are put in hiveColStatsMap.
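A hedged sketch of the fix the description suggests (the types here are illustrative stand-ins, not Hive's ColStatistics API): index the returned stats by column name, then emit them in the requested column order.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: the metastore may return column stats in any order,
// so re-key them by column name and emit them in the requested order.
public class ReorderStats {
    public static List<Map.Entry<String, Long>> reorder(
            List<String> requestedCols, List<Map.Entry<String, Long>> returnedStats) {
        // Index the returned entries by column name...
        Map<String, Map.Entry<String, Long>> byName = new HashMap<>();
        for (Map.Entry<String, Long> s : returnedStats) {
            byName.put(s.getKey(), s);
        }
        // ...then walk the requested columns to rebuild a matching order.
        List<Map.Entry<String, Long>> ordered = new ArrayList<>();
        for (String col : requestedCols) {
            Map.Entry<String, Long> s = byName.get(col);
            if (s != null) {
                ordered.add(s); // columns with no stats are simply skipped
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> returned = Arrays.asList(
                new AbstractMap.SimpleEntry<String, Long>("c", 30L),
                new AbstractMap.SimpleEntry<String, Long>("a", 10L));
        System.out.println(reorder(Arrays.asList("a", "b", "c"), returned));
    }
}
```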



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11304:
-
Labels: TODOC2.0 incompatibleChange  (was: TODOC2.0)

> Migrate to Log4j2 from Log4j 1.x
> 
>
> Key: HIVE-11304
> URL: https://issues.apache.org/jira/browse/HIVE-11304
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>  Labels: TODOC2.0, incompatibleChange
> Fix For: 2.0.0
>
> Attachments: HIVE-11304.10.patch, HIVE-11304.11.patch, 
> HIVE-11304.2.patch, HIVE-11304.3.patch, HIVE-11304.4.patch, 
> HIVE-11304.5.patch, HIVE-11304.6.patch, HIVE-11304.7.patch, 
> HIVE-11304.8.patch, HIVE-11304.9.patch, HIVE-11304.patch
>
>
> Log4J2 has some great benefits and can benefit Hive significantly. Some 
> notable features include:
> 1) Performance (parametrized logging, performance when logging is disabled 
> etc.) More details can be found here 
> https://logging.apache.org/log4j/2.x/performance.html
> 2) RoutingAppender - Route logs to different log files based on MDC context 
> (useful for HS2, LLAP etc.)
> 3) Asynchronous logging
> This is an umbrella jira to track changes related to Log4j2 migration.
> Log4J1 EOL - 
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
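As a rough illustration of point 1 (a tiny stand-in logger, not Log4j2 itself): with parametrized logging, message formatting is skipped entirely when the level is disabled, so disabled log statements cost almost nothing.

```java
// Tiny stand-in logger (NOT Log4j2) illustrating the "performance when
// logging is disabled" point: the "{}" template is only ever expanded when
// the level is actually enabled.
public class ParamLoggingDemo {
    public static int formatCalls = 0; // counts how often formatting is paid for

    static String format(String template, Object arg) {
        formatCalls++;
        return template.replace("{}", String.valueOf(arg));
    }

    public static void debug(boolean debugEnabled, String template, Object arg) {
        if (!debugEnabled) {
            return; // disabled: the template is never expanded
        }
        System.out.println(format(template, arg));
    }

    public static void main(String[] args) {
        debug(false, "processed rows={}", 12345); // no formatting cost
        debug(true, "processed rows={}", 12345);  // formats and prints once
    }
}
```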



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908616#comment-14908616
 ] 

Hive QA commented on HIVE-11473:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762435/HIVE-11473.2-spark.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7554 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/953/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/953/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-953/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762435 - PreCommit-HIVE-SPARK-Build

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch, 
> HIVE-11473.2-spark.patch, HIVE-11473.2-spark.patch
>
>
> In Spark 1.5, the SparkListener interface changed, so HoS may fail to create 
> the Spark client if an unimplemented event callback method is invoked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top

2015-09-25 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908619#comment-14908619
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11724:
--

lgtm +1

> WebHcat get jobs to order jobs on time order with latest at top
> ---
>
> Key: HIVE-11724
> URL: https://issues.apache.org/jira/browse/HIVE-11724
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Attachments: HIVE-11724.1.patch, HIVE-11724.2.patch, 
> HIVE-11724.3.patch, HIVE-11724.4.patch, HIVE-11724.5.patch, HIVE-11724.6.patch
>
>
> HIVE-5519 added pagination support to WebHCat. That implementation returns 
> the jobs in lexicographic order, resulting in older jobs showing at the 
> top. 
> The improvement is to order them by time, with the latest at the top. 
> Typically the latest (or running) jobs are the most relevant to the user, 
> so time-based ordering with pagination makes more sense. 
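The proposed ordering can be sketched as a comparator over job start times (the Job type below is a hypothetical stand-in for WebHCat's job listing, not its real API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: sort jobs by start time, newest first, instead of
// lexicographically by job id.
public class JobOrdering {
    public static class Job {
        public final String id;
        public final long startTime;
        public Job(String id, long startTime) {
            this.id = id;
            this.startTime = startTime;
        }
    }

    public static List<String> newestFirst(List<Job> jobs) {
        List<Job> copy = new ArrayList<>(jobs);
        // Descending by start time puts the latest/running jobs at the top.
        copy.sort((a, b) -> Long.compare(b.startTime, a.startTime));
        List<String> ids = new ArrayList<>();
        for (Job j : copy) {
            ids.add(j.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
                new Job("job_001", 100L),
                new Job("job_010", 300L),
                new Job("job_002", 200L));
        System.out.println(newestFirst(jobs)); // newest start time first
    }
}
```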



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11969:
---

Assignee: Sergey Shelukhin

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> The Tez session spins up an AM, which can cause delays, esp. if the cluster 
> is very busy.
> This can be done in the background, so the AM might get started while the 
> user is running local commands and doing other things.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4243) Fix column names in FileSinkOperator

2015-09-25 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4243:

Attachment: HIVE-4243.patch

Fixed order of setting precision and scale.

> Fix column names in FileSinkOperator
> 
>
> Key: HIVE-4243
> URL: https://issues.apache.org/jira/browse/HIVE-4243
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, 
> HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.patch, HIVE-4243.tmp.patch
>
>
> All of the ObjectInspectors given to SerDes by FileSinkOperator have virtual 
> column names. Since the files are part of tables, Hive knows the column 
> names. For self-describing file formats like ORC, having the real column 
> names will improve understandability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11970) COLUMNS_V2 table in metastore should have a longer name field

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11970:
---

Assignee: Sergey Shelukhin

> COLUMNS_V2 table in metastore should have a longer name field
> -
>
> Key: HIVE-11970
> URL: https://issues.apache.org/jira/browse/HIVE-11970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> In some cases, esp. with derived names, e.g. from Avro schemas, the column 
> names can be pretty long. COLUMNS_V2 name field has a very short length.
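A hypothetical upgrade-script sketch for a MySQL-backed metastore; the actual new width and the official schema-upgrade script belong to this JIRA's patch, not this digest:

```sql
-- Widen the column-name field. 767 is an example value only (chosen to fit
-- common InnoDB index-size limits), not the value decided in this JIRA.
ALTER TABLE COLUMNS_V2 MODIFY COLUMN_NAME VARCHAR(767);
```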



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11642:
-
Attachment: HIVE-11642.11.patch

> LLAP: make sure tests pass #3
> -
>
> Key: HIVE-11642
> URL: https://issues.apache.org/jira/browse/HIVE-11642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, 
> HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, 
> HIVE-11642.08.patch, HIVE-11642.09.patch, HIVE-11642.10.patch, 
> HIVE-11642.11.patch, HIVE-11642.patch
>
>
> Tests should pass against the most recent branch and Tez 0.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11908) LLAP: Merge branch to hive-2.0

2015-09-25 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908628#comment-14908628
 ] 

Gopal V commented on HIVE-11908:


Patch applies cleanly onto master & runs queries at 30Tb scale - I've followed 
the branch throughout, LGTM +1.

As expected, the biggest winners seem to be short BI queries. Something like 
query55 gets a ~3x boost (150s -> 54s) & query27 gets a ~1.9x boost (250s -> 
132s).

At 200Gb scale, the BI wins are more relevant (Query55, for example, easily 
runs on LLAP in under a second, but the stats fetch + planner takes ~1800ms).

{code}
INFO  : Status: DAG finished successfully in 0.58 seconds
INFO  :
INFO  : VERTICES    TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS  DURATION_SECONDS  CPU_TIME_MILLIS  GC_TIME_MILLIS  INPUT_RECORDS  OUTPUT_RECORDS
INFO  : Map 1                 1                0             0              0.00               40               0         10,000              31
INFO  : Map 2                31                0             0              0.20           45,730               0     18,630,711           4,898
INFO  : Map 5                 1                0             0              0.00               60               0         48,000             421
INFO  : Reducer 3             1                0             0              0.17              170               0          4,898             100
INFO  : Reducer 4             1                0             0              0.00               80               0            100               0
{code}

[~sershe]: one zero-byte file needed by the services WebApp is missing from 
the patch.

https://github.com/apache/hive/blob/llap/llap-server/src/main/resources/webapps/llap/.keep

> LLAP: Merge branch to hive-2.0
> --
>
> Key: HIVE-11908
> URL: https://issues.apache.org/jira/browse/HIVE-11908
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
>  Labels: TODOC-LLAP
> Attachments: HIVE-11908.patch
>
>
> Merge LLAP branch to hive-2.0.0 (only).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11970) COLUMNS_V2 table in metastore should have a longer name field

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11970:

Attachment: HIVE-11970.patch

[~sushanth] [~thejas] can you take a look?

> COLUMNS_V2 table in metastore should have a longer name field
> -
>
> Key: HIVE-11970
> URL: https://issues.apache.org/jira/browse/HIVE-11970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11970.patch
>
>
> In some cases, esp. with derived names, e.g. from Avro schemas, the column 
> names can be pretty long. COLUMNS_V2 name field has a very short length.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908726#comment-14908726
 ] 

Hive QA commented on HIVE-10982:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762068/HIVE-10982.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9621 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5416/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5416/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5416/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762068 - PreCommit-HIVE-TRUNK-Build

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-10982.1.patch
>
>
> The current Hive JDBC driver hard-codes the value of setFetchSize to 50, 
> which can be a performance bottleneck.
> Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> Also it has discussion in 
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E
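A configurable fetch size could be resolved from an optional connection property, falling back to the old hard-coded default. A minimal sketch under that assumption; the `fetchSize` property name and `FetchSizeResolver` class are illustrative, not Hive's actual API:

```java
import java.util.Properties;

class FetchSizeResolver {
    static final int DEFAULT_FETCH_SIZE = 50; // the currently hard-coded value

    // Returns the configured fetch size, or the default when absent/invalid.
    static int resolve(Properties connProps) {
        String v = connProps.getProperty("fetchSize"); // hypothetical property name
        if (v == null) {
            return DEFAULT_FETCH_SIZE;
        }
        try {
            int n = Integer.parseInt(v.trim());
            return n > 0 ? n : DEFAULT_FETCH_SIZE;
        } catch (NumberFormatException e) {
            return DEFAULT_FETCH_SIZE;
        }
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        System.out.println(resolve(p));   // 50, the default
        p.setProperty("fetchSize", "10000");
        System.out.println(resolve(p));   // 10000
    }
}
```

The resolved value would then be passed to java.sql.Statement.setFetchSize before fetching result rows.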



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908368#comment-14908368
 ] 

Ashutosh Chauhan commented on HIVE-11583:
-

Spilling is controlled by the config {{hive.join.cache.size}}. Perhaps you can 
set it to a very low value in the q test to trigger spilling, and thus test 
this without needing large input data.

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from disk, only the first split is actually 
> loaded; the second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  2
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11963) Llap: Disable web app for mini llap tests

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908418#comment-14908418
 ] 

Sergey Shelukhin commented on HIVE-11963:
-

Config is called HIVE_IN_TEST? :)

> Llap: Disable web app for mini llap tests
> -
>
> Key: HIVE-11963
> URL: https://issues.apache.org/jira/browse/HIVE-11963
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> We don't need the web app service for mini LLAP tests. Provide a config to 
> disable it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908439#comment-14908439
 ] 

Hive QA commented on HIVE-11724:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762006/HIVE-11724.6.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9604 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_grouping_sets.q-scriptfile1.q-union2.q-and-12-more 
- did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testAddPartition
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5413/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5413/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5413/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762006 - PreCommit-HIVE-TRUNK-Build

> WebHcat get jobs to order jobs on time order with latest at top
> ---
>
> Key: HIVE-11724
> URL: https://issues.apache.org/jira/browse/HIVE-11724
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Affects Versions: 0.14.0
>Reporter: Kiran Kumar Kolli
>Assignee: Kiran Kumar Kolli
> Attachments: HIVE-11724.1.patch, HIVE-11724.2.patch, 
> HIVE-11724.3.patch, HIVE-11724.4.patch, HIVE-11724.5.patch, HIVE-11724.6.patch
>
>
> HIVE-5519 added pagination support to WebHCat. That implementation returns 
> jobs lexicographically, so older jobs show at the top.
> The improvement is to order them by time, with the latest at the top. The 
> latest (or running) jobs are typically the most relevant to the user, so 
> time-based ordering with pagination makes more sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-25 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908437#comment-14908437
 ] 

Illya Yalovyy commented on HIVE-11583:
--

Oh... I was thinking about all possible ways to reduce the size of the file. 
Cache size is only one piece of the puzzle. The important thing is the 
physical file system block size, and it seems I cannot control it from within 
a Hive script.

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from disk, only the first split is actually 
> loaded; the second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  2
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908457#comment-14908457
 ] 

Ashutosh Chauhan commented on HIVE-11583:
-

Hive q tests use 
hive-shims-common/src/main/java/org/apache/hadoop/fs/ProxyLocalFileSystem.java. 
I think you can configure its block size via {{fs.local.block.size}}.

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from disk, only the first split is actually 
> loaded; the second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  2
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11663) Auto load/unload custom udf function for hive cli and hiveserver2

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908481#comment-14908481
 ] 

Ashutosh Chauhan commented on HIVE-11663:
-

Permanent UDFs, introduced in HIVE-6047, serve the use case you have outlined. 
I'm wondering why using permanent UDFs is not sufficient here?

> Auto load/unload custom udf function for hive cli and hiveserver2
> -
>
> Key: HIVE-11663
> URL: https://issues.apache.org/jira/browse/HIVE-11663
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration
>Affects Versions: 0.14.0, 1.0.0, 1.0.1, 1.1.1, 1.2.1
>Reporter: liuzongquan
>Assignee: liuzongquan
>  Labels: features, patch
> Attachments: HIVE-11663-2.patch
>
>   Original Estimate: 96h
>  Time Spent: 96h
>  Remaining Estimate: 0h
>
> When adding custom functions to HiveServer2, the most common method is to 
> rebuild the Hive source code, redistribute it, and restart HiveServer2. This 
> imposes a big cost on service users and cluster managers. In my opinion, a 
> custom UDF should be a plugin to HiveServer2 and the Hive CLI that users can 
> add and remove at runtime, especially for HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11964) RelOptHiveTable.hiveColStatsMap might contain mismatched column stats

2015-09-25 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908520#comment-14908520
 ] 

Laljo John Pullokkaran commented on HIVE-11964:
---

[~ctang.ma] Please see the comments on RB

> RelOptHiveTable.hiveColStatsMap might contain mismatched column stats
> -
>
> Key: HIVE-11964
> URL: https://issues.apache.org/jira/browse/HIVE-11964
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11964.patch
>
>
> RelOptHiveTable.hiveColStatsMap might contain mismatched stats, since it was 
> built assuming that the stats returned from
> ==
> hiveColStats = StatsUtils.getTableColumnStats(hiveTblMetadata, 
> hiveNonPartitionCols, nonPartColNamesThatRqrStats);
> or 
> HiveMetaStoreClient.getTableColumnStatistics(dbName, tableName, colNames)
> ==
> are in the same order as the requested columns. In fact the order is 
> non-deterministic, so the returned stats should be re-ordered before they 
> are put into hiveColStatsMap.
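The re-ordering the description calls for can be sketched as below; `ColStat` and `reorder` are illustrative stand-ins, not Hive's actual classes:

```java
import java.util.*;

class ColStatsReorder {
    // Minimal stand-in for a column statistics object.
    static class ColStat {
        final String colName;
        ColStat(String colName) { this.colName = colName; }
    }

    // Re-orders 'returned' to follow 'requestedOrder'; stats for columns that
    // were not requested are dropped.
    static List<ColStat> reorder(List<String> requestedOrder, List<ColStat> returned) {
        Map<String, ColStat> byName = new HashMap<>();
        for (ColStat cs : returned) {
            byName.put(cs.colName, cs);
        }
        List<ColStat> ordered = new ArrayList<>();
        for (String name : requestedOrder) {
            ColStat cs = byName.get(name);
            if (cs != null) {
                ordered.add(cs);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("id", "key", "grp");
        List<ColStat> returned = Arrays.asList(
            new ColStat("grp"), new ColStat("id"), new ColStat("key"));
        for (ColStat cs : reorder(requested, returned)) {
            System.out.println(cs.colName);  // id, key, grp
        }
    }
}
```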



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11684) Implement limit pushdown through outer join in CBO

2015-09-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11684:
---
Attachment: HIVE-11684.09.patch

> Implement limit pushdown through outer join in CBO
> --
>
> Key: HIVE-11684
> URL: https://issues.apache.org/jira/browse/HIVE-11684
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, 
> HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, 
> HIVE-11684.07.patch, HIVE-11684.08.patch, HIVE-11684.09.patch, 
> HIVE-11684.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11966) JDBC Driver parsing error when reading principal from ZooKeeper

2015-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908576#comment-14908576
 ] 

Thejas M Nair commented on HIVE-11966:
--

+1

> JDBC Driver parsing error when reading principal from ZooKeeper
> ---
>
> Key: HIVE-11966
> URL: https://issues.apache.org/jira/browse/HIVE-11966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11966.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908585#comment-14908585
 ] 

Hive QA commented on HIVE-11835:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762049/HIVE-11835.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5415/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5415/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5415/

Messages:
{noformat}
 This message was trimmed, see log for full details 
main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/tmp/conf
 [copy] Copying 10 files to 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-storage-api ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-storage-api ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-storage-api ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/hive-storage-api-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hive-storage-api ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
hive-storage-api ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/storage-api/target/hive-storage-api-2.0.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/hive-storage-api/2.0.0-SNAPSHOT/hive-storage-api-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/storage-api/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/hive-storage-api/2.0.0-SNAPSHOT/hive-storage-api-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Common 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-common ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/common/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/common 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-common ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-version-annotation) @ 
hive-common ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-common 
---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/common/src/gen added.
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-common ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hive-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-common ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-common ---
[INFO] Compiling 77 source files to 
/data/hive-ptest/working/apache-github-source-source/common/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java:
 
/data/hive-ptest/working/apache-github-source-source/common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java
 uses or overrides a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/common/src/java/org/apache/hadoop/hive/common/JvmPauseMonitor.java:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/common/src/java/org/apache/hadoop/hive/common/ObjectPair.java:
 Some input files use unchecked or unsafe operations.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/common/src/java/org/apache/hadoop/hive/common/ObjectPair.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-common ---
[INFO] Using 

[jira] [Commented] (HIVE-11946) TestNotificationListener is flaky

2015-09-25 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908497#comment-14908497
 ] 

Sergio Peña commented on HIVE-11946:


Why didn't the List add {{DROP_DATABASE}}? What's the difference between List 
and Vector?

> TestNotificationListener is flaky
> -
>
> Key: HIVE-11946
> URL: https://issues.apache.org/jira/browse/HIVE-11946
> Project: Hive
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11946.1.patch
>
>
> {noformat}
> expected:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE, DROP_DATABASE]> but 
> was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
> Stacktrace
> java.lang.AssertionError: expected:<[CREATE_DATABASE, CREATE_TABLE, 
> ADD_PARTITION, ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE, 
> DROP_DATABASE]> but was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, 
> ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hive.hcatalog.listener.TestNotificationListener.tearDown(TestNotificationListener.java:114)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11950) WebHCat status file doesn't show UTF8 character

2015-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908592#comment-14908592
 ] 

Thejas M Nair commented on HIVE-11950:
--

+1

> WebHCat status file doesn't show UTF8 character
> ---
>
> Key: HIVE-11950
> URL: https://issues.apache.org/jira/browse/HIVE-11950
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 1.2.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11950.1.patch
>
>
> If we do a select on a UTF8 table and store the console output into the 
> status file (enablelog=true), the UTF8 characters are garbled. The reason is 
> that we don't specify an encoding when opening stdout/stderr in statusdir. 
> This causes problems especially on Windows, where the default OS encoding is 
> not UTF8.
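The underlying fix can be sketched as: always name the charset when wrapping the output stream, rather than relying on the platform default. `Utf8StatusWriter` and `writeStatus` are a hypothetical illustration of the idea, not WebHCat's actual code:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

class Utf8StatusWriter {
    // Encodes console output as UTF-8 regardless of the JVM's default charset.
    static byte[] writeStatus(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for an in-memory stream
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeStatus("héllo");
        // 6 bytes: "é" encodes to two bytes in UTF-8
        System.out.println(bytes.length + " bytes");
    }
}
```

With the platform default charset (e.g. a Windows code page), the same string could round-trip incorrectly; naming `StandardCharsets.UTF_8` makes the encoding deterministic.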



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm

2015-09-25 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11954:
---
Attachment: HIVE-11954.patch

> Extend logic to choose side table in MapJoin Conversion algorithm
> -
>
> Key: HIVE-11954
> URL: https://issues.apache.org/jira/browse/HIVE-11954
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11954.patch, HIVE-11954.patch
>
>
> Selection of side table (in memory/hash table) in MapJoin Conversion 
> algorithm needs to be more sophisticated.
> In an N way Map Join, Hive should pick an input stream as side table (in 
> memory table) that has least cost in producing relation (like TS(FIL|Proj)*).
> A cost-based choice needs an extended cost model; without the return path, it 
> is going to be hard to do this.
> For the time being we could employ a modified cost-based algorithm for side 
> table selection. The new algorithm is described below:
> 1. Identify the candidate set of inputs for the side table (in-memory/hash 
> table) from the inputs, based on conditional task size.
> 2. For each input, identify its cost and memory requirement. Cost is 1 for 
> each heavyweight relational op (Join, GB, PTF/Windowing, TF, etc.); the cost 
> of an input is the total number of heavyweight ops in its branch.
> 3. Order the set from #1 by cost & memory requirement (ascending).
> 4. Pick the first element from #3 as the side table.
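The steps above can be sketched as follows; the `Input` class, its fields, and the size threshold are illustrative assumptions, not Hive's internals:

```java
import java.util.*;

class SideTableChooser {
    static class Input {
        final String name;
        final int heavyOpCount;  // step 2: number of heavy ops (Join, GB, PTF, ...)
        final long memBytes;     // estimated in-memory size of the input
        Input(String name, int heavyOpCount, long memBytes) {
            this.name = name; this.heavyOpCount = heavyOpCount; this.memBytes = memBytes;
        }
    }

    // Steps 1-4: candidates under the size threshold, sorted by cost then
    // memory requirement (ascending), first element wins.
    static Input choose(List<Input> inputs, long maxSideTableBytes) {
        List<Input> candidates = new ArrayList<>();
        for (Input in : inputs) {
            if (in.memBytes <= maxSideTableBytes) {  // step 1
                candidates.add(in);
            }
        }
        candidates.sort(Comparator
            .comparingInt((Input in) -> in.heavyOpCount)  // step 3: cost first
            .thenComparingLong(in -> in.memBytes));       //         then memory
        return candidates.isEmpty() ? null : candidates.get(0);  // step 4
    }

    public static void main(String[] args) {
        List<Input> inputs = Arrays.asList(
            new Input("scan+join", 2, 10L << 20),
            new Input("scan+filter", 0, 500L << 20),
            new Input("scan", 0, 50L << 20));
        // "scan" wins: lowest cost, then lowest memory requirement.
        System.out.println("side table: " + choose(inputs, 1024L << 20).name);
    }
}
```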



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908606#comment-14908606
 ] 

Sergey Shelukhin commented on HIVE-11969:
-

[~hagleitn] [~sseth] fyi

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>
> A Tez session spins up an AM, which can cause delays, especially if the 
> cluster is very busy.
> This can be done in the background, so the AM can start while the user is 
> running local commands and doing other things.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11714) Turn off hybrid grace hash join for cross product join

2015-09-25 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-11714:
-
Attachment: HIVE-11714.3.patch

Fixed golden file mismatches

> Turn off hybrid grace hash join for cross product join
> --
>
> Key: HIVE-11714
> URL: https://issues.apache.org/jira/browse/HIVE-11714
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-11714.1.patch, HIVE-11714.2.patch, 
> HIVE-11714.3.patch
>
>
> The current partitioning calculation is based solely on the hash value of the 
> key. For a cross product join, where keys are empty, all the rows will be put 
> into partition 0. This falls back to the regular mapjoin behavior, where we 
> only have one hashtable.
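The degenerate case can be illustrated with a generic hash partitioner (the class name and hash function are assumptions, not Hive's actual code): with an empty key, every row hashes to the same value, so all rows land in a single partition and the partitioning buys nothing:

```java
import java.util.Arrays;

class EmptyKeyPartition {
    // Typical hash partitioning: partition = hash(key) mod numPartitions.
    static int partitionFor(byte[] key, int numPartitions) {
        return (Arrays.hashCode(key) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        byte[] emptyKey = new byte[0];
        // In a cross product every row carries the empty key, so every row
        // maps to the same partition, no matter how many partitions exist.
        System.out.println("empty-key partition = " + partitionFor(emptyKey, 16));
    }
}
```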



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11946) TestNotificationListener is flaky

2015-09-25 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908537#comment-14908537
 ] 

Jimmy Xiang commented on HIVE-11946:


Good question. Vector is synchronized while ArrayList is not. The parameter 
actualMessages is updated in another thread, so without synchronization the 
assertion method may fail because it doesn't see the change.
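The difference can be illustrated with Collections.synchronizedList, which gives a plain list the same synchronization property as Vector (a generic sketch, not the test's actual code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedMessages {
    public static void main(String[] args) throws InterruptedException {
        // Synchronized wrapper: safe to mutate from a listener thread while
        // another thread reads it, as in the flaky test's teardown.
        List<String> actualMessages = Collections.synchronizedList(new ArrayList<>());

        Thread listener = new Thread(() -> actualMessages.add("DROP_DATABASE"));
        listener.start();
        listener.join();  // join() also establishes a happens-before edge here;
                          // the synchronized list matters when reads can
                          // interleave with writes without such an edge.
        System.out.println(actualMessages.size());  // 1
    }
}
```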

> TestNotificationListener is flaky
> -
>
> Key: HIVE-11946
> URL: https://issues.apache.org/jira/browse/HIVE-11946
> Project: Hive
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11946.1.patch
>
>
> {noformat}
> expected:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE, DROP_DATABASE]> but 
> was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
> Stacktrace
> java.lang.AssertionError: expected:<[CREATE_DATABASE, CREATE_TABLE, 
> ADD_PARTITION, ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE, 
> DROP_DATABASE]> but was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, 
> ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hive.hcatalog.listener.TestNotificationListener.tearDown(TestNotificationListener.java:114)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908582#comment-14908582
 ] 

Hive QA commented on HIVE-11110:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762271/HIVE-11110.17.patch

{color:red}ERROR:{color} -1 due to 589 failed/errored test(s), 9620 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4

[jira] [Updated] (HIVE-11970) COLUMNS_V2 table in metastore should have a longer name field

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11970:

Summary: COLUMNS_V2 table in metastore should have a longer name field  
(was: COLUMNS_V2 table in metastore should have longer name field)

> COLUMNS_V2 table in metastore should have a longer name field
> -
>
> Key: HIVE-11970
> URL: https://issues.apache.org/jira/browse/HIVE-11970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> In some cases, esp. with derived names, e.g. from Avro schemas, the column 
> names can be pretty long. COLUMNS_V2 has a very short length.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11970) COLUMNS_V2 table in metastore should have a longer name field

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11970:

Description: In some cases, esp. with derived names, e.g. from Avro 
schemas, the column names can be pretty long. COLUMNS_V2 name field has a very 
short length.  (was: In some cases, esp. with derived names, e.g. from Avro 
schemas, the column names can be pretty long. COLUMNS_V2 has a very short 
length.)

> COLUMNS_V2 table in metastore should have a longer name field
> -
>
> Key: HIVE-11970
> URL: https://issues.apache.org/jira/browse/HIVE-11970
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> In some cases, esp. with derived names, e.g. from Avro schemas, the column 
> names can be pretty long. COLUMNS_V2 name field has a very short length.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11946) TestNotificationListener is flaky

2015-09-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908610#comment-14908610
 ] 

Sergio Peña commented on HIVE-11946:


Good. Thanks for the explanation.
The change makes sense.
+1

> TestNotificationListener is flaky
> -
>
> Key: HIVE-11946
> URL: https://issues.apache.org/jira/browse/HIVE-11946
> Project: Hive
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11946.1.patch
>
>
> {noformat}
> expected:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE, DROP_DATABASE]> but 
> was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, ALTER_PARTITION, 
> DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
> Stacktrace
> java.lang.AssertionError: expected:<[CREATE_DATABASE, CREATE_TABLE, 
> ADD_PARTITION, ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE, 
> DROP_DATABASE]> but was:<[CREATE_DATABASE, CREATE_TABLE, ADD_PARTITION, 
> ALTER_PARTITION, DROP_PARTITION, ALTER_TABLE, DROP_TABLE]>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hive.hcatalog.listener.TestNotificationListener.tearDown(TestNotificationListener.java:114)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-10083) SMBJoin fails in case one table is uninitialized

2015-09-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10083:
---
Comment: was deleted

(was: Thanks for your email.
Unfortunately, you will no longer be able to reach me under this mailaccount.
Please note that your email will not be forwarded.
For urgent inquiries, please contact my colleague Philipp Kölmel via email 
p.koel...@bigpoint.net.
Best regards,
Alain Blankenburg-Schröder
)

> SMBJoin fails in case one table is uninitialized
> 
>
> Key: HIVE-10083
> URL: https://issues.apache.org/jira/browse/HIVE-10083
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0
> Environment: MapR Hive 0.13
>Reporter: Alain Schröder
>Assignee: Na Yang
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-10083.patch
>
>
> We experience an IndexOutOfBoundsException in an SMBJoin when one of the 
> tables used for the JOIN is uninitialized. Everything works if both are 
> uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver 
> (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException 
> Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
> at 
> org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> [...]
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE  TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE  TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2 
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the procedure 
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in 
> AbstractBucketJoinProc.java, and it does not seem to have changed from our 
> MapR Hive 0.13 to the current snapshot, so this should also be an error in 
> the current version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-25 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908338#comment-14908338
 ] 

Illya Yalovyy commented on HIVE-11583:
--

[~ashutoshc], I have a qTest for this issue, but it includes a rather big 
gz-compressed file. What is the best way to contribute it? Specifically, how 
do I create a patch for such a big binary file?


> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  Window has 50001 records (2 blocks on disk and 1 block in memory)
>  Size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from the disk, only the first split is actually 
> loaded; the second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  20000
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11484) Fix ObjectInspector for Char and VarChar

2015-09-25 Thread Deepak Barr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Barr reassigned HIVE-11484:
--

Assignee: Deepak Barr

> Fix ObjectInspector for Char and VarChar
> 
>
> Key: HIVE-11484
> URL: https://issues.apache.org/jira/browse/HIVE-11484
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Amareshwari Sriramadasu
>Assignee: Deepak Barr
>
> The creation of HiveChar and Varchar is not happening through ObjectInspector.
> Here is fix we pushed internally : 
> https://github.com/InMobi/hive/commit/fe95c7850e7130448209141155f28b25d3504216



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11887) spark tests break the build on a shared machine

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908414#comment-14908414
 ] 

Sergey Shelukhin commented on HIVE-11887:
-

It's part of itests build: 
{noformat}
$ grep -B 30 -A 2 UDFExa itests/pom.xml 

  
download-spark
generate-sources

  run


  

  set -x
  /bin/pwd
  BASE_DIR=./target
  HIVE_ROOT=$BASE_DIR/../../../
  DOWNLOAD_DIR=./../thirdparty
  download() {
url=$1;
finalName=$2
tarName=$(basename $url)
rm -rf $BASE_DIR/$finalName
if [[ ! -f $DOWNLOAD_DIR/$tarName ]]
then
 curl -Sso $DOWNLOAD_DIR/$tarName $url
fi
tar -zxf $DOWNLOAD_DIR/$tarName -C $BASE_DIR
mv 
$BASE_DIR/spark-${spark.version}-bin-hadoop2-without-hive $BASE_DIR/$finalName
  }
  mkdir -p $DOWNLOAD_DIR
  download "http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz" "spark"
  cp -f $HIVE_ROOT/data/conf/spark/log4j2.xml 
$BASE_DIR/spark/conf/
  sed '/package /d' 
${basedir}/${hive.path.to.root}/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleAdd.java
 > /tmp/UDFExampleAdd.java
  javac -cp  
${settings.localRepository}/org/apache/hive/hive-exec/${project.version}/hive-exec-${project.version}.jar
 /tmp/UDFExampleAdd.java -d /tmp
  jar -cf /tmp/udfexampleadd-1.0.jar -C /tmp 
UDFExampleAdd.class

  
{noformat}
See the last two lines

> spark tests break the build on a shared machine
> ---
>
> Key: HIVE-11887
> URL: https://issues.apache.org/jira/browse/HIVE-11887
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Spark download creates UDFExampleAdd jar in /tmp; when building on a shared 
> machine, someone else's jar from a build prevents this jar from being created 
> (I have no permissions to this file because it was created by a different 
> user) and the build fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-11887) spark tests break the build on a shared machine

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908414#comment-14908414
 ] 

Sergey Shelukhin edited comment on HIVE-11887 at 9/25/15 5:57 PM:
--

Sorry, java, not jar
It's part of itests build: 
{noformat}
$ grep -B 30 -A 2 UDFExa itests/pom.xml 

  
download-spark
generate-sources

  run


  

  set -x
  /bin/pwd
  BASE_DIR=./target
  HIVE_ROOT=$BASE_DIR/../../../
  DOWNLOAD_DIR=./../thirdparty
  download() {
url=$1;
finalName=$2
tarName=$(basename $url)
rm -rf $BASE_DIR/$finalName
if [[ ! -f $DOWNLOAD_DIR/$tarName ]]
then
 curl -Sso $DOWNLOAD_DIR/$tarName $url
fi
tar -zxf $DOWNLOAD_DIR/$tarName -C $BASE_DIR
mv 
$BASE_DIR/spark-${spark.version}-bin-hadoop2-without-hive $BASE_DIR/$finalName
  }
  mkdir -p $DOWNLOAD_DIR
  download "http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz" "spark"
  cp -f $HIVE_ROOT/data/conf/spark/log4j2.xml 
$BASE_DIR/spark/conf/
  sed '/package /d' 
${basedir}/${hive.path.to.root}/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleAdd.java
 > /tmp/UDFExampleAdd.java
  javac -cp  
${settings.localRepository}/org/apache/hive/hive-exec/${project.version}/hive-exec-${project.version}.jar
 /tmp/UDFExampleAdd.java -d /tmp
  jar -cf /tmp/udfexampleadd-1.0.jar -C /tmp 
UDFExampleAdd.class

  
{noformat}
See the last two lines


was (Author: sershe):
It's part of itests build: 
{noformat}
$ grep -B 30 -A 2 UDFExa itests/pom.xml 

  
download-spark
generate-sources

  run


  

  set -x
  /bin/pwd
  BASE_DIR=./target
  HIVE_ROOT=$BASE_DIR/../../../
  DOWNLOAD_DIR=./../thirdparty
  download() {
url=$1;
finalName=$2
tarName=$(basename $url)
rm -rf $BASE_DIR/$finalName
if [[ ! -f $DOWNLOAD_DIR/$tarName ]]
then
 curl -Sso $DOWNLOAD_DIR/$tarName $url
fi
tar -zxf $DOWNLOAD_DIR/$tarName -C $BASE_DIR
mv 
$BASE_DIR/spark-${spark.version}-bin-hadoop2-without-hive $BASE_DIR/$finalName
  }
  mkdir -p $DOWNLOAD_DIR
  download "http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz" "spark"
  cp -f $HIVE_ROOT/data/conf/spark/log4j2.xml 
$BASE_DIR/spark/conf/
  sed '/package /d' 
${basedir}/${hive.path.to.root}/contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleAdd.java
 > /tmp/UDFExampleAdd.java
  javac -cp  
${settings.localRepository}/org/apache/hive/hive-exec/${project.version}/hive-exec-${project.version}.jar
 /tmp/UDFExampleAdd.java -d /tmp
  jar -cf /tmp/udfexampleadd-1.0.jar -C /tmp 
UDFExampleAdd.class

  
{noformat}
See the last two lines

> spark tests break the build on a shared machine
> ---
>
> Key: HIVE-11887
> URL: https://issues.apache.org/jira/browse/HIVE-11887
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> Spark download creates UDFExampleAdd jar in /tmp; when building on a shared 
> machine, someone else's jar from a build prevents this jar from being created 
> (I have no permissions to this file because it was created by a different 
> user) and the build fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908415#comment-14908415
 ] 

Laljo John Pullokkaran commented on HIVE-11880:
---

[~sjtufighter] Could you upload the patch to RB?

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> For UNION ALL, when a union operand is a constant column (such as '0L', 
> BIGINT type) and its corresponding column has an incompatible type (such as 
> INT type), a query with a filter condition on the type-incompatible column 
> over this UNION ALL will cause an IndexOutOfBoundsException.
> For example, in the TPC-H table "orders", the type of 'orders'.'o_custkey' 
> is normally INT, while the type of the corresponding constant column "0" is 
> BIGINT (`0L AS o_custkey`).
> This query (with a filter on the type-incompatible column 'o_custkey') will 
> fail with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> When 
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11963) Llap: Disable web app for mini llap tests

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11963:
-
Attachment: HIVE-11963.patch

I seriously thought about that :)

> Llap: Disable web app for mini llap tests
> -
>
> Key: HIVE-11963
> URL: https://issues.apache.org/jira/browse/HIVE-11963
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11963.patch
>
>
> We don't need web app service for mini llap tests. Provide config to disable 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11935) Access HiveMetaStoreClient.currentMetaVars should be synchronized

2015-09-25 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11935:
--
Attachment: HIVE-11935.2.patch

Ok, let's use the local copy then.
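The local-copy pattern referred to here can be sketched as follows (the field and method names follow the stack trace in the description, but the body is an illustrative assumption, not the actual Hive code):

```java
import java.util.HashMap;
import java.util.Map;

public class LocalCopySketch {
    // Shared field that another thread may reset to null, as with
    // HiveMetaStoreClient.currentMetaVars.
    private volatile Map<String, String> currentMetaVars = new HashMap<>();

    boolean isCompatibleWith(Map<String, String> conf) {
        // Read the shared field once into a local. Null-checking the field and
        // then dereferencing it a second time would race with a concurrent
        // reset to null, producing the intermittent NullPointerException above.
        Map<String, String> metaVars = currentMetaVars;
        return metaVars != null && metaVars.equals(conf);
    }

    void close() {
        currentMetaVars = null; // concurrent reset by another thread
    }

    public static void main(String[] args) {
        LocalCopySketch client = new LocalCopySketch();
        System.out.println(client.isCompatibleWith(new HashMap<>())); // true
        client.close();
        System.out.println(client.isCompatibleWith(new HashMap<>())); // false, and no NPE
    }
}
```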

> Access HiveMetaStoreClient.currentMetaVars should be synchronized
> -
>
> Key: HIVE-11935
> URL: https://issues.apache.org/jira/browse/HIVE-11935
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11935.1.patch, HIVE-11935.2.patch
>
>
> We saw intermittent failure of the following stack:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.isCompatibleWith(HiveMetaStoreClient.java:287)
> at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
> at com.sun.proxy.$Proxy9.isCompatibleWith(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:206)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.createHiveDB(BaseSemanticAnalyzer.java:205)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.(DDLSemanticAnalyzer.java:223)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory.get(SemanticAnalyzerFactory.java:259)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:409)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1116)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:375)
> at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
> at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source)
> at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:274)
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
> at 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:171)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
> at 
> 

[jira] [Resolved] (HIVE-11963) Llap: Disable web app for mini llap tests

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11963.
--
   Resolution: Fixed
Fix Version/s: llap

> Llap: Disable web app for mini llap tests
> -
>
> Key: HIVE-11963
> URL: https://issues.apache.org/jira/browse/HIVE-11963
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
> Attachments: HIVE-11963.2.patch, HIVE-11963.patch
>
>
> We don't need web app service for mini llap tests. Provide config to disable 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11910) TestHCatLoaderEncryption should shutdown created MiniDFS instance

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908391#comment-14908391
 ] 

Ashutosh Chauhan commented on HIVE-11910:
-

+1

> TestHCatLoaderEncryption should shutdown created MiniDFS instance
> -
>
> Key: HIVE-11910
> URL: https://issues.apache.org/jira/browse/HIVE-11910
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-11910.1.patch
>
>
> setup() creates a MiniDFS instance, but this is never shut down. On my Linux 
> VM this causes this test to fail with OOM/cannot create new thread/other 
> errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11937) Improve StatsOptimizer to deal with query with additional constant columns

2015-09-25 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908398#comment-14908398
 ] 

Pengcheng Xiong commented on HIVE-11937:


Updated golden files for metadata_only_queries on the other test drivers. The 
other test failures are unrelated.

> Improve StatsOptimizer to deal with query with additional constant columns
> --
>
> Key: HIVE-11937
> URL: https://issues.apache.org/jira/browse/HIVE-11937
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11937.01.patch, HIVE-11937.02.patch
>
>
> Right now StatsOptimizer can deal with query such as "select count(1) from 
> src" by directly looking into the metastore. However, it can not deal with 
> "select '1' as one, count(1) from src" which has an additional constant 
> column. We may improve it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11945) ORC with non-local reads may not be reusing connection to DN

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908400#comment-14908400
 ] 

Sergey Shelukhin commented on HIVE-11945:
-

+1

> ORC with non-local reads may not be reusing connection to DN
> 
>
> Key: HIVE-11945
> URL: https://issues.apache.org/jira/browse/HIVE-11945
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-11945.1.patch, HIVE-11945.2.patch, 
> HIVE-11945.3.patch
>
>
> When “seek + readFully(buffer, offset, length)” is used, DFSInputStream ends 
> up going via “readWithStrategy()”. This sets up a BlockReader whose length 
> equals the block size, so until that position is reached, 
> RemoteBlockReader2.peer is not added to the PeerCache (please refer to 
> RemoteBlockReader2.close() in HDFS). As a result, the next call to the same 
> DN ends up opening a new socket. In ORC, when a read is not data-local, this 
> can open and close lots of connections to the DN. For random reads, it would 
> be good to set this length to the amount of data that is actually to be read 
> (e.g. the pread call in DFSInputStream, which sets up the BlockReader’s 
> length correctly, and that code path returns the Peer to the peer cache 
> properly). “readFully(position, buffer, offset, length)” follows this code 
> path and ends up reusing connections properly. Creating this JIRA to fix 
> this issue.
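
The stateful-vs-positional distinction described above can be illustrated with plain java.nio file channels, which behave analogously: a positional read (like HDFS pread) leaves the stream position untouched, while a stateful read (like seek + readFully) advances it. This is only an illustrative sketch, not HDFS code; the class and file contents are made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadDemo {
    // Returns the channel position after a stateful read and after a positional read.
    static long[] demo() {
        try {
            Path tmp = Files.createTempFile("pread-demo", ".bin");
            Files.write(tmp, new byte[]{0, 1, 2, 3, 4, 5, 6, 7});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                // Stateful read: advances the channel position (like seek + readFully).
                ch.read(ByteBuffer.allocate(4));
                long afterStateful = ch.position();
                // Positional read: leaves the channel position untouched (like pread).
                ch.read(ByteBuffer.allocate(2), 6);
                long afterPositional = ch.position();
                return new long[]{afterStateful, afterPositional};
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] p = demo();
        System.out.println("after stateful read: " + p[0]
                + ", after positional read: " + p[1]);
    }
}
```

The positional variant is what lets a reader hand a connection (or here, a channel position) back in a clean, reusable state.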





[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908396#comment-14908396
 ] 

Ashutosh Chauhan commented on HIVE-11880:
-

[~jpullokkaran] Do you want to take a look at this one?

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> In a UNION ALL, when one branch projects a constant column (such as 0L, 
> BIGINT type) and the corresponding column in the other branch has an 
> incompatible type (such as INT), a query with a filter condition on the 
> type-incompatible column over this UNION ALL will throw an 
> IndexOutOfBoundsException.
> For example, with the TPC-H table "orders": the type of orders.o_custkey is 
> normally INT, while the corresponding constant column 0 is BIGINT (`0L AS 
> o_custkey`). The following query, which filters on the type-incompatible 
> column o_custkey, fails with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> The failure occurs when 
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}





[jira] [Updated] (HIVE-11964) RelOptHiveTable.hiveColStatsMap might contain mismatched column stats

2015-09-25 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11964:
---
Attachment: HIVE-11964.patch

> RelOptHiveTable.hiveColStatsMap might contain mismatched column stats
> -
>
> Key: HIVE-11964
> URL: https://issues.apache.org/jira/browse/HIVE-11964
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Statistics
>Affects Versions: 1.2.1
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-11964.patch
>
>
> RelOptHiveTable.hiveColStatsMap might contain mismatched stats since it was 
> built by assuming that the stats returned from
> ==
> hiveColStats = StatsUtils.getTableColumnStats(hiveTblMetadata, 
> hiveNonPartitionCols, nonPartColNamesThatRqrStats);
> or 
> HiveMetaStoreClient.getTableColumnStatistics(dbName, tableName, colNames)
> ==
> come back in the same order as the requested columns. But actually the order 
> is non-deterministic; therefore the returned stats should be re-ordered 
> before they are put into hiveColStatsMap.
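
The re-ordering fix described above amounts to building a name-to-stats map and walking the requested column list. A minimal sketch, with a made-up ColStat class standing in for Hive's actual column statistics objects:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReorderStats {
    // Hypothetical stand-in for Hive's per-column statistics object.
    static final class ColStat {
        final String colName;
        ColStat(String colName) { this.colName = colName; }
    }

    // Reorder stats so they line up with the requested column order,
    // regardless of the (non-deterministic) order they were returned in.
    static List<ColStat> reorder(List<String> requested, List<ColStat> returned) {
        Map<String, ColStat> byName = new HashMap<>();
        for (ColStat s : returned) {
            byName.put(s.colName, s);
        }
        List<ColStat> ordered = new ArrayList<>(requested.size());
        for (String name : requested) {
            ColStat s = byName.get(name);
            if (s != null) {
                ordered.add(s);  // columns with no stats are simply skipped
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<ColStat> returned = Arrays.asList(new ColStat("b"), new ColStat("a"));
        List<ColStat> ordered = reorder(Arrays.asList("a", "b"), returned);
        System.out.println(ordered.get(0).colName + ", " + ordered.get(1).colName);
    }
}
```

The map lookup makes the result independent of whatever order the metastore happened to return.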





[jira] [Updated] (HIVE-11963) Llap: Disable web app for mini llap tests

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11963:
-
Attachment: HIVE-11963.2.patch

> Llap: Disable web app for mini llap tests
> -
>
> Key: HIVE-11963
> URL: https://issues.apache.org/jira/browse/HIVE-11963
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-11963.2.patch, HIVE-11963.patch
>
>
> We don't need the web app service for mini llap tests. Provide a config to 
> disable it.





[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908352#comment-14908352
 ] 

Xuefu Zhang commented on HIVE-11473:


Thanks, Rui. Yes, we can build Spark w/o parquet. I will do that and publish a 
new tarball.

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch, 
> HIVE-11473.2-spark.patch
>
>
> In Spark 1.5, the SparkListener interface has changed, so HoS may fail to 
> create the Spark client if an unimplemented event callback method is invoked.





[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908411#comment-14908411
 ] 

Laljo John Pullokkaran commented on HIVE-11880:
---

[~wangmeng] Could you upload the patch to RB?

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> In a UNION ALL, when one branch projects a constant column (such as 0L, 
> BIGINT type) and the corresponding column in the other branch has an 
> incompatible type (such as INT), a query with a filter condition on the 
> type-incompatible column over this UNION ALL will throw an 
> IndexOutOfBoundsException.
> For example, with the TPC-H table "orders": the type of orders.o_custkey is 
> normally INT, while the corresponding constant column 0 is BIGINT (`0L AS 
> o_custkey`). The following query, which filters on the type-incompatible 
> column o_custkey, fails with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> The failure occurs when 
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}





[jira] [Updated] (HIVE-11966) JDBC Driver parsing error when reading principal from ZooKeeper

2015-09-25 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-11966:

Attachment: HIVE-11966.1.patch

> JDBC Driver parsing error when reading principal from ZooKeeper
> ---
>
> Key: HIVE-11966
> URL: https://issues.apache.org/jira/browse/HIVE-11966
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-11966.1.patch
>
>






[jira] [Commented] (HIVE-11583) When PTF is used over a large partitions result could be corrupted

2015-09-25 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908480#comment-14908480
 ] 

Illya Yalovyy commented on HIVE-11583:
--

I tried, and when I did it from a hive script it didn't take effect. Is there 
any way to reconfigure it BEFORE the test? 

> When PTF is used over a large partitions result could be corrupted
> --
>
> Key: HIVE-11583
> URL: https://issues.apache.org/jira/browse/HIVE-11583
> Project: Hive
>  Issue Type: Bug
>  Components: PTF-Windowing
>Affects Versions: 0.14.0, 0.13.1, 0.14.1, 1.0.0, 1.2.0, 1.2.1
> Environment: Hadoop 2.6 + Apache hive built from trunk
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-11583.patch
>
>
> Dataset: 
>  The window has 50001 records (2 blocks on disk and 1 block in memory)
>  The size of the second block is >32Mb (2 splits)
> Result:
> When the last block is read from disk, only the first split is actually 
> loaded; the second split gets missed. The total count of the result dataset 
> is correct, but some records are missing and others are duplicated.
> Example:
> {code:sql}
> CREATE TABLE ptf_big_src (
>   id INT,
>   key STRING,
>   grp STRING,
>   value STRING
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
> LOAD DATA LOCAL INPATH '../../data/files/ptf_3blocks.txt.gz' OVERWRITE INTO 
> TABLE ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_src GROUP BY grp ORDER BY cnt desc;
> ---
> -- A  25000
> -- B  2
> -- C  5001
> ---
> CREATE TABLE ptf_big_trg AS SELECT *, row_number() OVER (PARTITION BY key 
> ORDER BY grp) grp_num FROM ptf_big_src;
> SELECT grp, COUNT(1) cnt FROM ptf_big_trg GROUP BY grp ORDER BY cnt desc;
> -- 
> -- A  34296
> -- B  15704
> -- C  1
> ---
> {code}
> Counts by 'grp' are incorrect!
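
The symptom above (only the first split of the last block being loaded) is the classic short-read pitfall: a single read() call on a stream may legally stop at an internal boundary, so the caller must loop until the buffer is full. A self-contained illustration; ChunkedStream is a stand-in for a block stored as several splits, not the actual PTF row-container code:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ShortReadDemo {
    // A stream that returns at most 'chunk' bytes per read() call,
    // mimicking a block that is stored as several splits on disk.
    static final class ChunkedStream extends InputStream {
        private final InputStream in;
        private final int chunk;
        ChunkedStream(byte[] data, int chunk) {
            this.in = new ByteArrayInputStream(data);
            this.chunk = chunk;
        }
        @Override public int read() throws IOException { return in.read(); }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            return in.read(b, off, Math.min(len, chunk));
        }
    }

    // Returns {bytes from a single read, bytes from a proper read loop}.
    static int[] demo() {
        try {
            byte[] block = new byte[100];
            // Buggy pattern: one read() stops at the first split boundary.
            int once = new ChunkedStream(block, 64).read(new byte[100], 0, 100);
            // Correct pattern: loop until the buffer is full or EOF.
            InputStream in = new ChunkedStream(block, 64);
            byte[] buf = new byte[100];
            int total = 0;
            int n;
            while (total < buf.length && (n = in.read(buf, total, buf.length - total)) >= 0) {
                total += n;
            }
            return new int[]{once, total};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("single read: " + r[0] + " bytes, looped read: " + r[1] + " bytes");
    }
}
```

With the single read, the second "split" is silently dropped, which is exactly the kind of missing-records corruption the report describes.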





[jira] [Resolved] (HIVE-11967) LLAP: Merge master to branch

2015-09-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-11967.
--
   Resolution: Fixed
Fix Version/s: llap

> LLAP: Merge master to branch
> 
>
> Key: HIVE-11967
> URL: https://issues.apache.org/jira/browse/HIVE-11967
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: llap
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
>






[jira] [Updated] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL

2015-09-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-11835:
---
Attachment: HIVE-11835.2.patch

> Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
> -
>
> Key: HIVE-11835
> URL: https://issues.apache.org/jira/browse/HIVE-11835
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Affects Versions: 1.2.0, 1.1.0, 2.0.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-11835.1.patch, HIVE-11835.2.patch, HIVE-11835.patch
>
>
> Steps to reproduce:
> 1. create a text file with values like 0.0, 0.00, etc.
> 2. create table in hive with type decimal(1,1).
> 3. run "load data local inpath ..." to load data into the table.
> 4. run select * on the table.
> You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these 
> should be read as 0.0.
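
Values like 0.00 do fit in decimal(1,1) (one digit after the point, so the range is -0.9 to 0.9), but a check that compares the literal's textual scale against the declared scale rejects them. A hedged sketch with java.math.BigDecimal; this illustrates the pitfall, it is not Hive's actual validation code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalFitDemo {
    // Declared type: decimal(precision=1, scale=1).
    static final int PRECISION = 1, SCALE = 1;

    // Naive check: compares the literal's textual scale to the declared scale,
    // so "0.00" (scale 2) is rejected even though its value fits.
    static boolean fitsNaive(String text) {
        BigDecimal d = new BigDecimal(text);
        return d.scale() <= SCALE && d.precision() - d.scale() <= PRECISION - SCALE;
    }

    // Better: rescale to the declared scale first, then check precision.
    static boolean fitsAfterRescale(String text) {
        BigDecimal d = new BigDecimal(text).setScale(SCALE, RoundingMode.HALF_UP);
        return d.precision() <= PRECISION;
    }

    public static void main(String[] args) {
        System.out.println("naive 0.00:   " + fitsNaive("0.00"));        // rejected -> NULL
        System.out.println("rescale 0.00: " + fitsAfterRescale("0.00")); // accepted -> 0.0
        System.out.println("rescale 1.0:  " + fitsAfterRescale("1.0"));  // genuinely too big
    }
}
```

Rescaling before the check accepts every textual form of zero while still rejecting values that truly overflow the type.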





[jira] [Commented] (HIVE-11733) UDF GenericUDFReflect cannot find classes added by "ADD JAR"

2015-09-25 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908836#comment-14908836
 ] 

Yibing Shi commented on HIVE-11733:
---

Sorry, got distracted by other stuff. Will add a test case for this.

> UDF GenericUDFReflect cannot find classes added by "ADD JAR"
> 
>
> Key: HIVE-11733
> URL: https://issues.apache.org/jira/browse/HIVE-11733
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.1
>Reporter: Yibing Shi
>Assignee: Yibing Shi
> Attachments: HIVE-11733.1.patch
>
>
> When running the below command:
> {quote}
> hive -e "add jar /root/hive/TestReflect.jar; \
> select reflect('com.yshi.hive.TestReflect', 'testReflect', code) from 
> sample_07 limit 3"
> {quote}
> The below error occurs:
> {noformat}
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> {noformat}
> The full stack trace is:
> {noformat}
> 15/09/04 07:00:37 [main]: INFO compress.CodecPool: Got brand-new decompressor 
> [.bz2]
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> 15/09/04 07:00:37 [main]: ERROR CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> UDFReflect evaluate
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect 
> evaluate
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:416)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
>   ... 13 more
> Caused by: java.lang.ClassNotFoundException: com.yshi.hive.TestReflect
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:190)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFReflect.evaluate(GenericUDFReflect.java:105)
>   ... 22 more
> {noformat}





[jira] [Updated] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators

2015-09-25 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11355:
--
Attachment: HIVE-11355.5.patch

Rebased to trunk.

> Hive on tez: memory manager for sort buffers (input/output) and operators
> -
>
> Key: HIVE-11355
> URL: https://issues.apache.org/jira/browse/HIVE-11355
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-11355.1.patch, HIVE-11355.2.patch, 
> HIVE-11355.3.patch, HIVE-11355.4.patch, HIVE-11355.5.patch
>
>
> We need to better manage the sort buffer allocations to ensure better 
> performance. Also, we need to provide configurations to certain operators to 
> stay within memory limits.





[jira] [Assigned] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-11969:
---

Assignee: Sergey Shelukhin

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Starting a Tez session spins up an AM, which can cause delays, especially if 
> the cluster is very busy.
> This can be done in the background, so the AM may already be started while 
> the user is running local commands and doing other things.
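
The pattern proposed here is plain asynchronous startup: submit the slow session creation to a background thread at CLI launch, and only block on it when the first query actually needs it. A generic sketch; the names and the sleep standing in for AM startup are illustrative, not Hive's actual classes:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundStartDemo {
    // Start the (slow) session in the background; the caller blocks only when
    // the session is first needed, by which time it may already be up.
    static String startAndUse() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> session = pool.submit(() -> {
                Thread.sleep(50);  // stands in for waiting on the Tez AM
                return "session-ready";
            });
            // Meanwhile the CLI stays responsive for local commands...
            String localWork = "prompt shown immediately";
            // ...and the first query waits only for whatever startup time remains.
            try {
                return localWork + " / " + session.get();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(startAndUse());
    }
}
```

The user-visible delay shrinks from "full AM startup" to "whatever startup time remains after the user's local work".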





[jira] [Updated] (HIVE-11941) Update committer list

2015-09-25 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11941:
---
Attachment: HIVE-11941.1.patch

Change Sushanth Sowmyan to PMC member and add Lars Francke as committer. Please 
take a look.

> Update committer list
> -
>
> Key: HIVE-11941
> URL: https://issues.apache.org/jira/browse/HIVE-11941
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-11941.1.patch, HIVE-11941.patch
>
>
> Please update the committer list in http://hive.apache.org/people.html:
> ---
> Name: Chaoyu Tang
> Apache ID: ctang
> Organization: Cloudera (www.cloudera.com)





[jira] [Commented] (HIVE-11901) StorageBasedAuthorizationProvider requires write permission on table for SELECT statements

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908942#comment-14908942
 ] 

Hive QA commented on HIVE-11901:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761542/HIVE-11901.01.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9591 tests executed
*Failed tests:*
{noformat}
TestCliDriver-skewjoinopt3.q-vector_acid3.q-ctas_date.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-orc_merge6.q-vector_outer_join0.q-mapreduce1.q-and-12-more 
- did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5417/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5417/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5417/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761542 - PreCommit-HIVE-TRUNK-Build

> StorageBasedAuthorizationProvider requires write permission on table for 
> SELECT statements
> --
>
> Key: HIVE-11901
> URL: https://issues.apache.org/jira/browse/HIVE-11901
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: HIVE-11901.01.patch
>
>
> With HIVE-7895, write permission on the table directory is required even for 
> a SELECT statement.
> Looking at the stack trace, it seems the method 
> {{StorageBasedAuthorizationProvider#authorize(Table table, Partition part, 
> Privilege[] readRequiredPriv, Privilege[] writeRequiredPriv)}} always treats 
> a null partition as a CREATE statement, when it can also be a SELECT.
> We may have to check {{readRequiredPriv}} and {{writeRequiredPriv}} first 
> in order to tell which statement it is.
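
One way to tell the statement type apart, as the last paragraph suggests, is to look at which privilege arrays are populated rather than at whether the partition is null. A hypothetical sketch; the Privilege enum and method shape are simplified stand-ins, not Hive's actual signatures:

```java
public class AuthorizeDemo {
    enum Privilege { SELECT, CREATE, INSERT }

    // Infer the requested access from which privilege arrays are populated,
    // instead of assuming a null partition always means CREATE.
    static String requestedAccess(Privilege[] readRequiredPriv, Privilege[] writeRequiredPriv) {
        boolean read = readRequiredPriv != null && readRequiredPriv.length > 0;
        boolean write = writeRequiredPriv != null && writeRequiredPriv.length > 0;
        if (read && !write) {
            return "read-only";  // e.g. SELECT: no write permission needed
        }
        if (write) {
            return "write";      // e.g. CREATE / INSERT OVERWRITE
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(requestedAccess(
                new Privilege[]{Privilege.SELECT}, new Privilege[0]));  // read-only
        System.out.println(requestedAccess(
                new Privilege[0], new Privilege[]{Privilege.CREATE}));  // write
    }
}
```

A SELECT then only ever triggers a read-permission check on the table directory.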





[jira] [Updated] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11969:

Attachment: HIVE-11969.patch

[~sseth] [~hagleitn] [~gopalv] fyi

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.patch
>
>
> Starting a Tez session spins up an AM, which can cause delays, especially if 
> the cluster is very busy.
> This can be done in the background, so the AM may already be started while 
> the user is running local commands and doing other things.





[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908938#comment-14908938
 ] 

Laljo John Pullokkaran commented on HIVE-11880:
---

[~sjtufighter] HIVE-11919 fixes the issue. I don't see the exception.
Could you try the patch?

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> In a UNION ALL, when one branch projects a constant column (such as 0L, 
> BIGINT type) and the corresponding column in the other branch has an 
> incompatible type (such as INT), a query with a filter condition on the 
> type-incompatible column over this UNION ALL will throw an 
> IndexOutOfBoundsException.
> For example, with the TPC-H table "orders": the type of orders.o_custkey is 
> normally INT, while the corresponding constant column 0 is BIGINT (`0L AS 
> o_custkey`). The following query, which filters on the type-incompatible 
> column o_custkey, fails with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> The failure occurs when 
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}





[jira] [Commented] (HIVE-11901) StorageBasedAuthorizationProvider requires write permission on table for SELECT statements

2015-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908967#comment-14908967
 ] 

Thejas M Nair commented on HIVE-11901:
--

Thanks for catching this and the patch [~chengbing.liu]!
Can you also please add a test case for this ? (You can refer to existing tests 
for examples).


> StorageBasedAuthorizationProvider requires write permission on table for 
> SELECT statements
> --
>
> Key: HIVE-11901
> URL: https://issues.apache.org/jira/browse/HIVE-11901
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.1
>Reporter: Chengbing Liu
>Assignee: Chengbing Liu
> Attachments: HIVE-11901.01.patch
>
>
> With HIVE-7895, write permission on the table directory is required even for 
> a SELECT statement.
> Looking at the stack trace, it seems the method 
> {{StorageBasedAuthorizationProvider#authorize(Table table, Partition part, 
> Privilege[] readRequiredPriv, Privilege[] writeRequiredPriv)}} always treats 
> a null partition as a CREATE statement, when it can also be a SELECT.
> We may have to check {{readRequiredPriv}} and {{writeRequiredPriv}} first 
> in order to tell which statement it is.





[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908980#comment-14908980
 ] 

Sergey Shelukhin commented on HIVE-11969:
-

Will test later to see if it works... also need to see if MiniTez passes

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.patch
>
>
> Starting a Tez session spins up an AM, which can cause delays, especially if 
> the cluster is very busy.
> This can be done in the background, so the AM may already be started while 
> the user is running local commands and doing other things.





[jira] [Comment Edited] (HIVE-11969) start Tez session in background when starting CLI

2015-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908973#comment-14908973
 ] 

Sergey Shelukhin edited comment on HIVE-11969 at 9/26/15 1:36 AM:
--

[~sseth] [~hagleitn] [~gopalv] fyi https://reviews.apache.org/r/38783/


was (Author: sershe):
[~sseth] [~hagleitn] [~gopalv] fyi

> start Tez session in background when starting CLI
> -
>
> Key: HIVE-11969
> URL: https://issues.apache.org/jira/browse/HIVE-11969
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11969.patch
>
>
> Tez session spins up AM, which can cause delays, esp. if the cluster is very 
> busy.
> This can be done in background, so the AM might get started while the user is 
> running local commands and doing other things.





[jira] [Updated] (HIVE-11916) TxnHandler.getOpenTxnsInfo() and getOpenTxns() may produce inconsistent result

2015-09-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11916:
--
Attachment: HIVE-11916.patch

> TxnHandler.getOpenTxnsInfo() and getOpenTxns() may produce inconsistent result
> --
>
> Key: HIVE-11916
> URL: https://issues.apache.org/jira/browse/HIVE-11916
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-11916.patch
>
>
> Both methods run at READ_COMMITTED isolation level and each runs 2 queries.
> Thus it's possible for a new txn to start/commit after the high water mark is 
> recorded by these methods, so the returned list may contain txns above the 
> HWM.
> This can be fixed by adding a "WHERE TXN_ID < hwm" clause to the second query 
> in each case.
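
The proposed fix can be mimicked in a few lines: record the HWM first, then discard from the second query's result any txn ids that are not below it, so the returned snapshot is internally consistent. A sketch in plain Java in place of the actual SQL; the comparison mirrors the "WHERE TXN_ID < hwm" clause suggested in the description:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HwmFilterDemo {
    // The second query may see txns opened after the HWM snapshot was taken;
    // keeping only TXN_ID < hwm restores consistency between the two reads.
    static List<Long> consistentOpenTxns(List<Long> openTxnsFromSecondQuery, long hwm) {
        return openTxnsFromSecondQuery.stream()
                .filter(id -> id < hwm)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long hwm = 10L;  // recorded by the first query
        // By the time the second query runs, txns 11 and 12 have started.
        System.out.println(consistentOpenTxns(Arrays.asList(3L, 7L, 11L, 12L), hwm));
    }
}
```

Under READ_COMMITTED this is cheaper than raising the isolation level, since only the second query needs the extra predicate.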





[jira] [Commented] (HIVE-11149) Fix issue with sometimes HashMap in PerfLogger.java hangs

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14909013#comment-14909013
 ] 

Hive QA commented on HIVE-11149:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762147/HIVE-11149.04.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9620 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5418/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5418/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5418/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762147 - PreCommit-HIVE-TRUNK-Build

> Fix issue with sometimes HashMap in PerfLogger.java hangs 
> --
>
> Key: HIVE-11149
> URL: https://issues.apache.org/jira/browse/HIVE-11149
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11149.01.patch, HIVE-11149.02.patch, 
> HIVE-11149.03.patch, HIVE-11149.04.patch
>
>
> In a multi-threaded environment, the HashMap in PerfLogger.java can sometimes 
> cause massive numbers of Java processes to hang and waste large amounts of 
> CPU and memory.





[jira] [Updated] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread WangMeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

WangMeng updated HIVE-11880:

Attachment: HIVE-11880.03.patch

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> In a UNION ALL, when one branch projects a constant column (such as 0L, 
> BIGINT type) and the corresponding column in the other branch has an 
> incompatible type (such as INT), a query with a filter condition on the 
> type-incompatible column over this UNION ALL will throw an 
> IndexOutOfBoundsException.
> For example, with the TPC-H table "orders": the type of orders.o_custkey is 
> normally INT, while the corresponding constant column 0 is BIGINT (`0L AS 
> o_custkey`). The following query, which filters on the type-incompatible 
> column o_custkey, fails with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> This occurs when:
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column

2015-09-25 Thread WangMeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907667#comment-14907667
 ] 

WangMeng commented on HIVE-11880:
-

[~xuefuz] I have rebased it and uploaded a new patch.
[~ashutoshc] I tried the patch from HIVE-11919 again after rebasing; it still 
does not fix this bug.
Please check it again. Thanks.

> filter bug  of UNION ALL when hive.ppd.remove.duplicatefilters=true and 
> filter condition is type incompatible column 
> -
>
> Key: HIVE-11880
> URL: https://issues.apache.org/jira/browse/HIVE-11880
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.2.1
>Reporter: WangMeng
>Assignee: WangMeng
> Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, 
> HIVE-11880.03.patch
>
>
> For UNION ALL, when one branch of the union projects a constant column (such 
> as 0L, BIGINT type) and the corresponding column in the other branch has an 
> incompatible type (such as INT), a query with a filter condition on the 
> type-incompatible column over this UNION ALL will throw an 
> IndexOutOfBoundsException.
> For example, in the TPC-H table "orders", the type of 'orders'.'o_custkey' is 
> normally INT, while the type of the corresponding constant column "0" is 
> BIGINT (`0L AS o_custkey`).
> The following query (with a filter on the type-incompatible column 
> 'o_custkey') fails with java.lang.IndexOutOfBoundsException:
> {code}
> SELECT Count(1)
> FROM   (
>   SELECT `o_orderkey` ,
>  `o_custkey`
>   FROM   `orders`
>   UNION ALL
>   SELECT `o_orderkey`,
>  0L  AS `o_custkey`
>   FROM   `orders`) `oo`
> WHERE  o_custkey<10 limit 4 ;
> {code}
> This occurs when:
> {code}
> set hive.ppd.remove.duplicatefilters=true
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10209) FetchTask with VC may fail because ExecMapper.done is true

2015-09-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907678#comment-14907678
 ] 

Lefty Leverenz commented on HIVE-10209:
---

Version note:  This was also committed to branch-1.0 (for release 1.0.2) on 
September 24th with commit 2801d2c4b1a61315ae7f28c0ea825580e30f411b.

> FetchTask with VC may fail because ExecMapper.done is true
> --
>
> Key: HIVE-10209
> URL: https://issues.apache.org/jira/browse/HIVE-10209
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.1.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 1.2.0
>
> Attachments: HIVE-10209.1-spark.patch, HIVE-10209.2-spark.patch
>
>
> ExecMapper.done is a static variable, and may cause issues in the following 
> example:
> {code}
> set hive.fetch.task.conversion=minimal;
> select * from src where key < 10 limit 1;
> set hive.fetch.task.conversion=more;
> select *, BLOCK__OFFSET_INSIDE__FILE from src where key < 10;
> {code}
> The second select won't return any result, if running in local mode.
> The issue is, the first select query will be converted to a MapRedTask with 
> only a mapper. And, when the task is done, because of the limit operator, 
> ExecMapper.done will be set to true.
> Then, when the second select query begin to execute, it will call 
> {{FetchOperator::getRecordReader()}}, and since here we have virtual column, 
> an instance of {{HiveRecordReader}} will be returned. The problem is, 
> {{HiveRecordReader::doNext()}} will check ExecMapper.done. In this case, 
> since the value is true, it will quit immediately.
> In short, I think making ExecMapper.done static is a bad idea. The first 
> query should in no way affect the second one.
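The hazard described above can be sketched with a hypothetical static flag (class and method names invented for illustration; this is not Hive's actual code). A static field survives across queries in the same JVM, while an instance field is scoped to one query:

```java
public class StaticFlagSketch {
    // Static flag: shared by every "query" in the JVM, like ExecMapper.done.
    static boolean staticDone = false;
    // Instance flag: scoped to a single query, the suggested alternative.
    boolean instanceDone = false;

    // Simulates a query; hitLimit mimics the LIMIT operator marking the
    // mapper as done. Returns the number of rows "read" (0 if the done flag
    // was already set, as HiveRecordReader.doNext() would quit immediately).
    int runWithStaticFlag(boolean hitLimit) {
        if (staticDone) return 0;
        if (hitLimit) staticDone = true;
        return 1;
    }

    int runWithInstanceFlag(boolean hitLimit) {
        if (instanceDone) return 0;
        if (hitLimit) instanceDone = true;
        return 1;
    }

    public static void main(String[] args) {
        // Query 1 hits a LIMIT and sets the static flag.
        System.out.println(new StaticFlagSketch().runWithStaticFlag(true));    // 1
        // Query 2 is a brand-new object, yet reads nothing: stale static state.
        System.out.println(new StaticFlagSketch().runWithStaticFlag(false));   // 0
        // With per-instance state, query 2 is unaffected by query 1.
        System.out.println(new StaticFlagSketch().runWithInstanceFlag(false)); // 1
    }
}
```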



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11943) Set old CLI as the default Client when using hive script

2015-09-25 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907677#comment-14907677
 ] 

Ferdinand Xu commented on HIVE-11943:
-

Thanks [~leftylev] for figuring it out. I have updated the related section in 
the wiki. Please help me review it. Thank you!

> Set old CLI as the default Client when using hive script
> 
>
> Key: HIVE-11943
> URL: https://issues.apache.org/jira/browse/HIVE-11943
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11943.1-beeline-cli.patch
>
>
> Since we have some concerns about deprecating the current CLI, we will set 
> the old CLI as default. Once we resolve the problems, we will set the new CLI 
> as default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11943) Set old CLI as the default Client when using hive script

2015-09-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907722#comment-14907722
 ] 

Lefty Leverenz commented on HIVE-11943:
---

Looks good, [~Ferd], thanks.  I tinkered with the wording a bit.

> Set old CLI as the default Client when using hive script
> 
>
> Key: HIVE-11943
> URL: https://issues.apache.org/jira/browse/HIVE-11943
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11943.1-beeline-cli.patch
>
>
> Since we have some concerns about deprecating the current CLI, we will set 
> the old CLI as default. Once we resolve the problems, we will set the new CLI 
> as default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10083) SMBJoin fails in case one table is uninitialized

2015-09-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907774#comment-14907774
 ] 

Alain Blankenburg-Schröder commented on HIVE-10083:
---



> SMBJoin fails in case one table is uninitialized
> 
>
> Key: HIVE-10083
> URL: https://issues.apache.org/jira/browse/HIVE-10083
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0
> Environment: MapR Hive 0.13
>Reporter: Alain Schröder
>Assignee: Na Yang
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-10083.patch
>
>
> We experience an IndexOutOfBoundsException in an SMBJoin when one of the 
> tables used for the JOIN is uninitialized (empty). Everything works if both 
> tables are uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver 
> (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException 
> Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
> at 
> org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> [...]
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE  TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE  TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2 
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the procedure 
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in 
> AbstractBucketJoinProc.java, and it does not seem to have changed between our 
> MapR Hive 0.13 and the current snapshot, so this should also be an error in 
> the current version.
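A hedged sketch of the kind of guard this failure implies (a hypothetical simplification with invented names, not the actual HIVE-10083 fix): check for an empty small-table bucket-file list before indexing into it, since an uninitialized table has no bucket files and `ArrayList.get(0)` on an empty list throws exactly the IndexOutOfBoundsException shown above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BucketMappingSketch {
    // Hypothetical simplification of the mapping step in
    // AbstractBucketJoinProc: for each big-table bucket file, pick a matching
    // small-table bucket file. Indexing into an empty small-table list is
    // what blows up when the small table is uninitialized.
    static List<String> mapBuckets(List<String> bigBuckets, List<String> smallBuckets) {
        if (smallBuckets.isEmpty()) {
            // Guard: an empty (uninitialized) table has nothing to join against.
            return Collections.emptyList();
        }
        List<String> mapping = new ArrayList<>();
        for (int i = 0; i < bigBuckets.size(); i++) {
            // Round-robin pairing as a stand-in for Hive's bucket matching.
            mapping.add(smallBuckets.get(i % smallBuckets.size()));
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Uninitialized small table: empty mapping instead of an exception.
        System.out.println(mapBuckets(List.of("b0", "b1"), List.of())); // []
    }
}
```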



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10083) SMBJoin fails in case one table is uninitialized

2015-09-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907773#comment-14907773
 ] 

Lefty Leverenz commented on HIVE-10083:
---

Version note: This was also committed to branch-1.0 (for release 1.0.2) on 
September 24 with commit a7618dfb9f93eab922f1939680dca4ae5d5a8f6b.

> SMBJoin fails in case one table is uninitialized
> 
>
> Key: HIVE-10083
> URL: https://issues.apache.org/jira/browse/HIVE-10083
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.13.0
> Environment: MapR Hive 0.13
>Reporter: Alain Schröder
>Assignee: Na Yang
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: HIVE-10083.patch
>
>
> We experience an IndexOutOfBoundsException in an SMBJoin when one of the 
> tables used for the JOIN is uninitialized (empty). Everything works if both 
> tables are uninitialized or both are initialized.
> {code}
> 2015-03-24 09:12:58,967 ERROR [main]: ql.Driver 
> (SessionState.java:printError(545)) - FAILED: IndexOutOfBoundsException 
> Index: 0, Size: 0
> java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.fillMappingBigTableBucketFileNameToSmallTableBucketFileNames(AbstractBucketJoinProc.java:486)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.convertMapJoinToBucketMapJoin(AbstractBucketJoinProc.java:429)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToBucketMapJoin(AbstractSMBJoinProc.java:540)
> at 
> org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.convertJoinToSMBJoin(AbstractSMBJoinProc.java:549)
> at 
> org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:51)
> [...]
> {code}
> Simplest way to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
> CREATE DATABASE IF NOT EXISTS tmp;
> USE tmp;
> CREATE  TABLE `test1` (
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> stored as orc;
> CREATE  TABLE `test2`(
>   `foo` bigint )
> CLUSTERED BY (
>   foo)
> SORTED BY (
>   foo ASC)
> INTO 384 BUCKETS
> STORED AS ORC;
> -- Initialize ONE table of the two tables with any data.
> INSERT INTO TABLE test1 SELECT foo FROM table_with_some_content LIMIT 100;
> SELECT t1.foo, t2.foo
> FROM test1 t1 INNER JOIN test2 t2 
> ON (t1.foo = t2.foo);
> {code}
> I took a look at the procedure 
> fillMappingBigTableBucketFileNameToSmallTableBucketFileNames in 
> AbstractBucketJoinProc.java, and it does not seem to have changed between our 
> MapR Hive 0.13 and the current snapshot, so this should also be an error in 
> the current version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-11473:
--
Attachment: HIVE-11473.1-spark.patch

Trying again. The Spark dependencies can be resolved on my side.

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch
>
>
> In Spark 1.5, the SparkListener interface has changed, so HoS may fail to 
> create the Spark client if an unimplemented event callback method is invoked.
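The failure mode can be illustrated, as a rough sketch with invented names (not Spark's actual API), by the standard Java mitigation for a listener interface that grows new callbacks between releases: give the new methods default no-op bodies so implementors written against the older interface keep working:

```java
public class ListenerSketch {
    // Hypothetical event listener; imagine the library adds onBlockUpdated in
    // a new release, analogous to SparkListener growing callbacks in Spark 1.5.
    interface EventListener {
        void onJobStart();

        // New in the upgraded interface. The default no-op keeps old
        // implementors compiling and running; without it, an implementor that
        // lacks the method fails when the new callback is invoked.
        default void onBlockUpdated() { /* no-op */ }
    }

    // An "old" client listener that predates onBlockUpdated.
    static class ClientListener implements EventListener {
        int jobStarts = 0;
        @Override public void onJobStart() { jobStarts++; }
    }

    public static void main(String[] args) {
        ClientListener l = new ClientListener();
        l.onJobStart();
        l.onBlockUpdated(); // falls through to the no-op default, no failure
        System.out.println(l.jobStarts); // 1
    }
}
```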



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11941) Update committer list

2015-09-25 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907766#comment-14907766
 ] 

Lars Francke commented on HIVE-11941:
-

Thanks! Sure:

* Name: Lars Francke
* Username: larsfrancke
* Organization: Freelancer / http://lars-francke.de/en/

> Update committer list
> -
>
> Key: HIVE-11941
> URL: https://issues.apache.org/jira/browse/HIVE-11941
> Project: Hive
>  Issue Type: Bug
>Reporter: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-11941.patch
>
>
> Please update the committer list in http://hive.apache.org/people.html:
> ---
> Name: Chaoyu Tang
> Apache ID: ctang
> Organization: Cloudera (www.cloudera.com)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-11473:
--
Attachment: HIVE-11473.2-spark.patch

Remove the repo since we no longer need snapshots.

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch, 
> HIVE-11473.2-spark.patch
>
>
> In Spark 1.5, the SparkListener interface has changed, so HoS may fail to 
> create the Spark client if an unimplemented event callback method is invoked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11943) Set old CLI as the default Client when using hive script

2015-09-25 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907728#comment-14907728
 ] 

Ferdinand Xu commented on HIVE-11943:
-

Thanks [~leftylev]

> Set old CLI as the default Client when using hive script
> 
>
> Key: HIVE-11943
> URL: https://issues.apache.org/jira/browse/HIVE-11943
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11943.1-beeline-cli.patch
>
>
> Since we have some concerns about deprecating the current CLI, we will set 
> the old CLI as default. Once we resolve the problems, we will set the new CLI 
> as default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907741#comment-14907741
 ] 

Hive QA commented on HIVE-11473:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762295/HIVE-11473.1-spark.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/950/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/950/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-950/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] 
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-core_2.10/1.5.0/spark-core_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-core_2.10/1.5.0/spark-core_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-core_2.10/1.5.0/spark-core_2.10-1.5.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-core_2.10/1.5.0/spark-core_2.10-1.5.0.pom
 (20 KB at 189.7 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-parent_2.10/1.5.0/spark-parent_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-parent_2.10/1.5.0/spark-parent_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-parent_2.10/1.5.0/spark-parent_2.10-1.5.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-parent_2.10/1.5.0/spark-parent_2.10-1.5.0.pom
 (85 KB at 1153.1 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-launcher_2.10/1.5.0/spark-launcher_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-launcher_2.10/1.5.0/spark-launcher_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-launcher_2.10/1.5.0/spark-launcher_2.10-1.5.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-launcher_2.10/1.5.0/spark-launcher_2.10-1.5.0.pom
 (5 KB at 191.7 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-network-common_2.10/1.5.0/spark-network-common_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-network-common_2.10/1.5.0/spark-network-common_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-network-common_2.10/1.5.0/spark-network-common_2.10-1.5.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-network-common_2.10/1.5.0/spark-network-common_2.10-1.5.0.pom
 (4 KB at 136.1 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-network-shuffle_2.10/1.5.0/spark-network-shuffle_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-network-shuffle_2.10/1.5.0/spark-network-shuffle_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-network-shuffle_2.10/1.5.0/spark-network-shuffle_2.10-1.5.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-network-shuffle_2.10/1.5.0/spark-network-shuffle_2.10-1.5.0.pom
 (4 KB at 142.4 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/apache/spark/spark-unsafe_2.10/1.5.0/spark-unsafe_2.10-1.5.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/org/apache/spark/spark-unsafe_2.10/1.5.0/spark-unsafe_2.10-1.5.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/spark/spark-unsafe_2.10/1.5.0/spark-unsafe_2.10-1.5.0.pom
Downloading: 
https://repository.apache.org/content/repositories/releases/org/apache/spark/spark-unsafe_2.10/1.5.0/spark-unsafe_2.10-1.5.0.pom
Downloaded: 
https://repository.apache.org/content/repositories/releases/org/apache/spark/spark-unsafe_2.10/1.5.0/spark-unsafe_2.10-1.5.0.pom
 (5 KB at 4.1 KB/sec)
Downloading: 
http://www.datanucleus.org/downloads/maven2/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.pom
Downloading: 
https://s3-us-west-1.amazonaws.com/hive-spark/maven2/spark_2.10-1.3-rc1/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.pom
Downloading: 
http://repo.maven.apache.org/maven2/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.pom
Downloaded: 
http://repo.maven.apache.org/maven2/net/jpountz/lz4/lz4/1.3.0/lz4-1.3.0.pom (2 
KB at 52.3 KB/sec)
Downloading: [...]
{noformat}

[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907767#comment-14907767
 ] 

Rui Li commented on HIVE-11473:
---

BTW, I think we can get rid of the {{spark-1.3}} repo?

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch
>
>
> In Spark 1.5, the SparkListener interface has changed, so HoS may fail to 
> create the Spark client if an unimplemented event callback method is invoked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907770#comment-14907770
 ] 

Hive QA commented on HIVE-11473:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762349/HIVE-11473.1-spark.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/951/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/951/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-951/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO]   CP: 
/data/hive-ptest/working/maven/com/sun/jersey/contribs/jersey-guice/1.9/jersey-guice-1.9.jar
[INFO]   CP: 
/data/hive-ptest/working/maven/com/google/inject/extensions/guice-servlet/3.0/guice-servlet-3.0.jar
[INFO]   CP: 
/data/hive-ptest/working/maven/io/netty/netty/3.6.2.Final/netty-3.6.2.Final.jar
[INFO]   CP: 
/data/hive-ptest/working/maven/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar
DataNucleus Enhancer (version 3.2.10) for API "JDO" using JRE "1.7"
DataNucleus Enhancer : Classpath
>>  /usr/local/apache-maven-3.0.5/boot/plexus-classworlds-2.4.jar
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MColumnDescriptor
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MStringList
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MPartition
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MIndex
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MRole
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MRoleMap
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MGlobalPrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MDBPrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MTablePrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MPartitionPrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MPartitionColumnPrivilege
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MPartitionEvent
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MMasterKey
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MDelegationToken
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MTableColumnStatistics
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MVersionTable
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MResourceUri
ENHANCED (PersistenceCapable) : org.apache.hadoop.hive.metastore.model.MFunction
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MNotificationLog
ENHANCED (PersistenceCapable) : 
org.apache.hadoop.hive.metastore.model.MNotificationNextId
DataNucleus Enhancer completed with success for 29 classes. Timings : input=182 
ms, enhance=313 ms, total=495 ms. Consult the log for full details
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
hive-metastore ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-git-source-source/metastore/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-metastore ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/metastore/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/metastore/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-git-source-source/metastore/target/tmp/conf
 [copy] Copying 10 files to 
/data/hive-ptest/working/apache-git-source-source/metastore/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) [...]
{noformat}

[jira] [Commented] (HIVE-11710) Beeline embedded mode doesn't output query progress after setting any session property

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907772#comment-14907772
 ] 

Hive QA commented on HIVE-11710:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761965/HIVE-11710.3.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9568 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_grouping_sets.q-scriptfile1.q-union2.q-and-12-more 
- did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5408/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5408/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5408/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761965 - PreCommit-HIVE-TRUNK-Build

> Beeline embedded mode doesn't output query progress after setting any session 
> property
> --
>
> Key: HIVE-11710
> URL: https://issues.apache.org/jira/browse/HIVE-11710
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-11710.2.patch, HIVE-11710.3.patch, HIVE-11710.patch
>
>
> Connect to Beeline in embedded mode with {{beeline -u jdbc:hive2://}}, then 
> set anything in the session, such as {{set aa=true;}}.
> After that, any query like {{select count(*) from src;}} will output only the 
> result, with no query progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11943) Set old CLI as the default Client when using hive script

2015-09-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907654#comment-14907654
 ] 

Lefty Leverenz commented on HIVE-11943:
---

This should be documented in the wiki when it is merged to master (unless the 
default gets changed by then).

* [Replacing the Implementation of Hive CLI Using Beeline | 
https://cwiki.apache.org/confluence/display/Hive/Replacing+the+Implementation+of+Hive+CLI+Using+Beeline]

Now that the doc-tracking jira HIVE-10810 is closed, we don't have a way to 
flag doc issues for the beeline-cli branch.

> Set old CLI as the default Client when using hive script
> 
>
> Key: HIVE-11943
> URL: https://issues.apache.org/jira/browse/HIVE-11943
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11943.1-beeline-cli.patch
>
>
> Since we have some concerns about deprecating the current CLI, we will set 
> the old CLI as default. Once we resolve the problems, we will set the new CLI 
> as default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11937) Improve StatsOptimizer to deal with query with additional constant columns

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907909#comment-14907909
 ] 

Hive QA commented on HIVE-11937:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761972/HIVE-11937.01.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9568 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-enforce_order.q-constprog_dpp.q-auto_join1.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_metadata_only_queries
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5409/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5409/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5409/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761972 - PreCommit-HIVE-TRUNK-Build

> Improve StatsOptimizer to deal with query with additional constant columns
> --
>
> Key: HIVE-11937
> URL: https://issues.apache.org/jira/browse/HIVE-11937
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11937.01.patch
>
>
> Right now StatsOptimizer can deal with queries such as "select count(1) from 
> src" by directly looking into the metastore. However, it cannot deal with 
> "select '1' as one, count(1) from src", which has an additional constant 
> column. We may improve it.
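A minimal sketch of the idea (in Python, purely illustrative — not Hive's actual StatsOptimizer, and the parsed-select-item representation is an assumption): pass constant select items through untouched, answer the remaining aggregates from pre-computed stats, and fall back to a scan for anything else.

```python
# Illustrative sketch only, not Hive's StatsOptimizer. Assumes the query's
# select list is already parsed into items that are either a constant
# literal or an aggregate answerable from table statistics.

TABLE_STATS = {"src": {"row_count": 500}}  # hypothetical metastore stats

def answer_from_stats(table, select_items):
    """Return one result row using only metastore stats, or None if the
    query needs a real scan. Constant columns are passed through."""
    row = []
    for kind, value in select_items:
        if kind == "const":          # e.g. '1' AS one
            row.append(value)
        elif kind == "count_star":   # count(1) / count(*)
            stats = TABLE_STATS.get(table)
            if stats is None:
                return None          # no stats: must fall back to a scan
            row.append(stats["row_count"])
        else:
            return None              # any other expression needs a scan
    return tuple(row)

# "select '1' as one, count(1) from src"
print(answer_from_stats("src", [("const", "1"), ("count_star", None)]))
```

The point of the sketch is that a constant column never forces a scan: it can simply be re-attached to the stats-derived answer.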





[jira] [Commented] (HIVE-11939) TxnDbUtil should turn off jdbc auto commit

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908029#comment-14908029
 ] 

Hive QA commented on HIVE-11939:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12761994/HIVE-11939.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9584 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.metastore.hbase.TestHBaseImport.org.apache.hadoop.hive.metastore.hbase.TestHBaseImport
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5411/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5411/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5411/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12761994 - PreCommit-HIVE-TRUNK-Build

> TxnDbUtil should turn off jdbc auto commit
> --
>
> Key: HIVE-11939
> URL: https://issues.apache.org/jira/browse/HIVE-11939
> Project: Hive
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11939.1.patch
>
>
> TxnDbUtil uses JDBC transactions but doesn't turn off auto-commit, so some 
> TestStreaming tests are flaky. For example,
> {noformat}
> testTransactionBatchAbortAndCommit(org.apache.hive.hcatalog.streaming.TestStreaming)
>   Time elapsed: 0.011 sec  <<< ERROR!
> java.sql.SQLException: Table/View 'TXNS' already exists in Schema 'APP'.
>   at org.apache.derby.iapi.error.StandardException.newException(Unknown 
> Source)
>   at 
> org.apache.derby.impl.sql.catalog.DataDictionaryImpl.duplicateDescriptorException(Unknown
>  Source)
>   at 
> org.apache.derby.impl.sql.catalog.DataDictionaryImpl.addDescriptor(Unknown 
> Source)
>   at 
> org.apache.derby.impl.sql.execute.CreateTableConstantAction.executeConstantAction(Unknown
>  Source)
>   at org.apache.derby.impl.sql.execute.MiscResultSet.open(Unknown Source)
>   at 
> org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
>   at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:72)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.prepDb(TxnDbUtil.java:131)
>   at 
> org.apache.hive.hcatalog.streaming.TestStreaming.(TestStreaming.java:160)
> {noformat}
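The failure mode can be illustrated with Python's sqlite3 standing in for Derby/JDBC (an illustrative analogy, not the Hive code): with auto-commit on, each DDL statement commits immediately, so a partially-created schema survives a failure and a retried setup hits "table already exists". With auto-commit off, the whole setup runs in one transaction that can be rolled back.

```python
import sqlite3

# Auto-commit on: every statement commits as it runs, so a second setup
# attempt collides with the schema left behind by the first one.
conn = sqlite3.connect(":memory:", isolation_level=None)  # auto-commit mode
conn.execute("CREATE TABLE TXNS (id INTEGER)")
try:
    conn.execute("CREATE TABLE TXNS (id INTEGER)")  # retried setup
except sqlite3.OperationalError as e:
    print("retry failed:", e)  # table TXNS already exists

# Auto-commit off (explicit transaction): a failed setup can be rolled
# back, so the retry starts from a clean slate and succeeds.
conn2 = sqlite3.connect(":memory:", isolation_level=None)
conn2.execute("BEGIN")
conn2.execute("CREATE TABLE TXNS (id INTEGER)")
conn2.execute("ROLLBACK")  # failure path: undo the partial setup
conn2.execute("CREATE TABLE TXNS (id INTEGER)")  # retry succeeds
print("retry succeeded")
```

This relies on the database supporting transactional DDL (SQLite and Derby both do), which is exactly what leaving auto-commit on throws away.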





[jira] [Commented] (HIVE-11903) Add zookeeper metrics to HS2

2015-09-25 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908006#comment-14908006
 ] 

Yongzhi Chen commented on HIVE-11903:
-

For ZooKeeper connections, HS2 does not have many:
1. The lock manager uses one (the singleton). It was changed to a singleton 
because of earlier ZooKeeper connection leak issues.
2. If dynamic service discovery is enabled, one connection is opened at HS2 
startup. The related watcher, DeRegisterWatcher, removes the server's 
ZooKeeper node when the server is removed from service discovery. This 
connection is closed when HS2 stops.
3. One is used in deleteServerInstancesFromZooKeeper; it is closed right 
after use within the method.

Do you want to count connections for cases 2 and 3 too? Or do you want to 
count other kinds of connections? Thanks

> Add zookeeper metrics to HS2
> 
>
> Key: HIVE-11903
> URL: https://issues.apache.org/jira/browse/HIVE-11903
> Project: Hive
>  Issue Type: Sub-task
>  Components: Diagnosability
>Reporter: Szehon Ho
>Assignee: Yongzhi Chen
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch
>
>
> Potential metrics are active zookeeper connections, locks taken by type, etc. 
>  Can refine as we go along.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14908031#comment-14908031
 ] 

Rui Li commented on HIVE-11473:
---

{{parquet_join}} failed because Hive and Spark depend on different Parquet 
versions (Hive on 1.8.1, Spark on 1.7.0). To work around this, we can build 
Spark with the {{parquet-provided}} profile. I built such a Spark package and 
the test passed locally.
[~xuefuz] do you have any other suggestions?

The other failures are unrelated.

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch, 
> HIVE-11473.2-spark.patch
>
>
> In Spark 1.5, the SparkListener interface changed, so HoS may fail to create 
> the Spark client if an unimplemented event callback method is invoked.





[jira] [Commented] (HIVE-11473) Upgrade Spark dependency to 1.5 [Spark Branch]

2015-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14907896#comment-14907896
 ] 

Hive QA commented on HIVE-11473:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12762354/HIVE-11473.2-spark.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 7483 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.initializationError
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.hcatalog.streaming.TestStreaming.testTimeOutReaper
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/952/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/952/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-952/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12762354 - PreCommit-HIVE-SPARK-Build

> Upgrade Spark dependency to 1.5 [Spark Branch]
> --
>
> Key: HIVE-11473
> URL: https://issues.apache.org/jira/browse/HIVE-11473
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Jimmy Xiang
>Assignee: Rui Li
> Attachments: HIVE-11473.1-spark.patch, HIVE-11473.1-spark.patch, 
> HIVE-11473.2-spark.patch
>
>
> In Spark 1.5, the SparkListener interface changed, so HoS may fail to create 
> the Spark client if an unimplemented event callback method is invoked.




