Re: Review Request 48233: HIVE-13884: Disallow queries fetching more than a configured number of partitions in PartitionPruner

2016-06-09 Thread Kapil Rastogi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48233/#review136927
---




metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (lines 
3175 - 3196)


There seems to be quite a bit of overlap in these functions. Is it worthwhile to 
factor out all lines except

int partitionCount = get_num_partitions_by_filter(db_name, tbl_name, 
filter);



metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (line 
3176)


What is the default for getIntVar if the configuration doesn't exist: -1, 0?



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java (line 
819)


nit - extra 'r' at the end


- Kapil Rastogi


On June 6, 2016, 6:19 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48233/
> ---
> 
> (Updated June 6, 2016, 6:19 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Naveen Gangam.
> 
> 
> Bugs: HIVE-13884
> https://issues.apache.org/jira/browse/HIVE-13884
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch verifies the # of partitions a table has before fetching any from 
> the metastore. It reads that limit from 'hive.limit.query.max.table.partition'.
> 
> A limitation added here is that the variable must be set in hive-site.xml in 
> order to work; it cannot be set through beeline, because HiveMetaStore.java 
> does not read the variables set through beeline. I think it is better to keep 
> it this way, to avoid users changing the value on the fly and crashing the 
> metastore.
> 
> Another change is that EXPLAIN commands won't be executed either. EXPLAIN 
> commands need to fetch partitions in order to create the operator tree. If we 
> allow EXPLAIN to do that, then we may have the same OOM situations for large 
> partitions.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 94dd72e6624d13d2503f68d2fd2d2a84859a4500 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> 8e0bba60cc73890c1566e0f5df965f0f0bcfe0ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> b6d5276e49356f30147cb4f10262a2730ba99566 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
> a6d3f5385b33b8a4e31ee20ca5cb8f58c97c8702 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 31f0d7b89670b8a749bbe8a7ff2b4ff9f059a8e2 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  3152e77c3c7152ac4dbe7e779ce35f28044fe3c9 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86a243609b23e2ca9bb8849f0da863a95e477d5c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> c3d903b8cc8197ba8bea17145bec1444ed14eb22 
> 
> Diff: https://reviews.apache.org/r/48233/diff/
> 
> 
> Testing
> ---
> 
> Waiting for HiveQA.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>
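The pre-check described in the quoted patch description can be sketched generically as follows. This is a minimal illustration with hypothetical names (the real change lives in HiveMetaStore.java and uses the metastore's own count call), not the actual patch:

```java
// Hypothetical sketch: count partitions first and fail fast, before
// materializing any partition objects from the metastore.
public class PartitionLimitCheck {
    // Stand-in for the metastore call that counts matching partitions.
    interface PartitionCounter {
        int countPartitions(String dbName, String tblName, String filter);
    }

    // Throws if the query would fetch more partitions than the configured
    // limit. A limit <= 0 is treated here as "unlimited" (an assumption,
    // mirroring common Hive config conventions).
    static void checkPartitionLimit(PartitionCounter counter, String dbName,
                                    String tblName, String filter, int limit) {
        if (limit <= 0) {
            return; // limit disabled
        }
        int partitionCount = counter.countPartitions(dbName, tblName, filter);
        if (partitionCount > limit) {
            throw new IllegalStateException("Query would fetch " + partitionCount
                + " partitions of " + dbName + "." + tblName
                + ", which exceeds the configured limit of " + limit);
        }
    }
}
```

The point of counting first is that the count query is cheap, while fetching the partition objects themselves is what causes the OOM the description mentions.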



[jira] [Created] (HIVE-13991) Union All on view fail with no valid permission on underneath table

2016-06-09 Thread Yongzhi Chen (JIRA)
Yongzhi Chen created HIVE-13991:
---

 Summary: Union All on view fail with no valid permission on 
underneath table
 Key: HIVE-13991
 URL: https://issues.apache.org/jira/browse/HIVE-13991
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen


When Sentry is enabled:
create view V as select * from T;
When the user has read permission on view V but does not have read permission 
on table T,

select * from V union all select * from V 

fails with:
{noformat}
0: jdbc:hive2://> select * from s07view union all select * from s07view 
limit 1;
Error: Error while compiling statement: FAILED: SemanticException No valid 
privileges
 Required privileges for this query: 
Server=server1->Db=default->Table=sample_07->action=select; 
(state=42000,code=4)
{noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 48502: HIVE-13731 LLAP: return LLAP token with the splits

2016-06-09 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48502/#review136949
---




itests/hive-unit/src/test/java/org/apache/hadoop/hive/llap/ext/TestLlapInputSplit.java
 (line 85)


Should we also be checking the token bytes here?



llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 
(line 135)


Looks like a null tokenBytes is serialized/deserialized as a 0-byte array. 
During token verification does this get treated the same way as a null token?
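The serialization behavior the comment asks about can be demonstrated with a plain DataOutput round-trip. This is a generic sketch of the length-prefixed pattern, not the actual LlapInputSplit code: a null array written as length 0 comes back as an empty, non-null array, so the null-ness is lost and downstream verification must decide whether the two mean the same thing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class TokenBytesRoundTrip {
    // Write tokenBytes as <length><bytes>; null is written as length 0.
    static void writeTokenBytes(DataOutput out, byte[] tokenBytes) throws IOException {
        if (tokenBytes == null) {
            out.writeInt(0);
        } else {
            out.writeInt(tokenBytes.length);
            out.write(tokenBytes);
        }
    }

    // Reads back an empty array for length 0 -- the original null is not recoverable.
    static byte[] readTokenBytes(DataInput in) throws IOException {
        int len = in.readInt();
        byte[] buf = new byte[len];
        in.readFully(buf);
        return buf;
    }

    static byte[] roundTrip(byte[] tokenBytes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeTokenBytes(new DataOutputStream(bos), tokenBytes);
        DataInputStream dis =
            new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return readTokenBytes(dis);
    }
}
```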


- Jason Dere


On June 9, 2016, 7:07 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48502/
> ---
> 
> (Updated June 9, 2016, 7:07 p.m.)
> 
> 
> Review request for hive, Jason Dere and Siddharth Seth.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> .
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/llap/ext/TestLlapInputSplit.java
>  7491222 
>   llap-client/src/java/org/apache/hadoop/hive/llap/LlapInputSplit.java 
> ab11926 
>   
> llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
>  3ebae4a 
>   
> llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 
> d1748cb 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
> cff5ee1 
> 
> Diff: https://reviews.apache.org/r/48502/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



[jira] [Created] (HIVE-13990) Client should not check dfs.namenode.acls.enabled to determine if extended ACLs are supported

2016-06-09 Thread Chris Drome (JIRA)
Chris Drome created HIVE-13990:
--

 Summary: Client should not check dfs.namenode.acls.enabled to 
determine if extended ACLs are supported
 Key: HIVE-13990
 URL: https://issues.apache.org/jira/browse/HIVE-13990
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.1
Reporter: Chris Drome


dfs.namenode.acls.enabled is a server-side configuration, and the client should 
not presume to know how the server is configured. Barring a method for querying 
the NN whether ACLs are supported, the client should try the operation and catch 
the appropriate exception.
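The suggested try-and-catch approach can be sketched generically. AclSetter below is a hypothetical stand-in for a call like Hadoop's FileSystem.setAcl, which fails at runtime when the server rejects ACLs; the exact exception type depends on the Hadoop version and filesystem, so UnsupportedOperationException here is an illustrative assumption:

```java
// Hedged sketch of "attempt the operation and fall back" instead of reading
// a server-side config that the client cannot reliably see.
public class ExtendedAclSupport {
    interface AclSetter {
        void setAcl(); // may throw if the server/filesystem rejects ACLs
    }

    // Returns true if extended ACLs were applied, false if unsupported.
    // Never pre-checks any server-side configuration key.
    static boolean trySetAcl(AclSetter setter) {
        try {
            setter.setAcl();
            return true;
        } catch (UnsupportedOperationException e) {
            // ACLs unsupported; fall back to plain permissions.
            return false;
        }
    }
}
```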



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13989) Extended ACLs are not handled according to specification

2016-06-09 Thread Chris Drome (JIRA)
Chris Drome created HIVE-13989:
--

 Summary: Extended ACLs are not handled according to specification
 Key: HIVE-13989
 URL: https://issues.apache.org/jira/browse/HIVE-13989
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.1
Reporter: Chris Drome
Assignee: Chris Drome






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 48520: Use multi-threaded approach to listing files for msck

2016-06-09 Thread Hari Sankar Sivarama Subramaniyan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48520/#review136931
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 
379)


nit: is it possible to make allDirs a synchronized Set so that someone 
doesn't misuse it in the future?



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 
385)


Can you please update this parameter description in HiveConf.



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 
390)


nit: fine to use a Void return type and return null instead of always 
returning true.



ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java (line 
394)


This will effectively be a serial operation if we have a skewed directory 
structure (very rare, or possibly no real scenarios?).

Another thing I remembered is that HIVE_MOVE_FILES_THREAD_COUNT does 
support a value of 0, which runs the entire thing in serial mode. So if you are 
reusing that configuration, you will have to keep the serial code path, or else 
you need to introduce a new param. Otherwise there will be a conflict.
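Both nits (a synchronized set for the shared result, and Callable&lt;Void&gt; instead of a meaningless Boolean) can be illustrated together. This is a generic sketch of the pattern, not the actual HiveMetaStoreChecker code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelListingSketch {
    // Wrapped so accidental unsynchronized use elsewhere is still thread-safe.
    static final Set<String> allDirs =
        Collections.synchronizedSet(new HashSet<String>());

    // Callable<Void>: callers only need completion (or a propagated
    // exception), not an always-true Boolean.
    static Callable<Void> listTask(final String dir) {
        return () -> {
            allDirs.add(dir); // stand-in for the real directory listing work
            return null;
        };
    }

    static void runAll(List<String> dirs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Void>> futures = new ArrayList<>();
        for (String d : dirs) {
            futures.add(pool.submit(listTask(d)));
        }
        for (Future<Void> f : futures) {
            f.get(); // propagate any exception thrown by a worker
        }
        pool.shutdown();
    }
}
```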


- Hari Sankar Sivarama Subramaniyan


On June 9, 2016, 11:16 p.m., pengcheng xiong wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48520/
> ---
> 
> (Updated June 9, 2016, 11:16 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13984
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 
> 10fa561 
> 
> Diff: https://reviews.apache.org/r/48520/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>



[jira] [Created] (HIVE-13988) zero length file is being created for empty bucket in tez mode

2016-06-09 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-13988:
--

 Summary: zero length file is being created for empty bucket in tez 
mode
 Key: HIVE-13988
 URL: https://issues.apache.org/jira/browse/HIVE-13988
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


Even though the bucket is empty, a zero-length file is created in Tez mode.

Steps to reproduce the issue:
{noformat}
hive> set hive.execution.engine;
hive.execution.engine=tez
hive> drop table if exists emptybucket_orc;
OK
Time taken: 5.416 seconds
hive> create table emptybucket_orc(age int) clustered by (age) sorted by (age) 
into 99 buckets stored as orc;
OK
Time taken: 0.493 seconds
hive> insert into table emptybucket_orc select distinct(age) from studenttab10k 
limit 0;
Query ID = hrt_qa_20160523231955_8b981be7-68c4-4416-8a48-5f8c7ff551c3
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id 
application_1464045121842_0002)

--
VERTICES      MODE    STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--
Map 1 ....    llap  SUCCEEDED        1          1        0        0       0       0
Reducer 2 ..  llap  SUCCEEDED        1          1        0        0       0       0
Reducer 3 ..  llap  SUCCEEDED        1          1        0        0       0       0
Reducer 4 ..  llap  SUCCEEDED       99         99        0        0       0       0
--
VERTICES: 04/04  [==>>] 100%  ELAPSED TIME: 11.00 s
--
Loading data to table default.emptybucket_orc
OK
Time taken: 16.907 seconds
hive> dfs -ls /apps/hive/warehouse/emptybucket_orc;
Found 99 items
-rwxrwxrwx   3 hrt_qa hdfs  0 2016-05-23 23:20 
/apps/hive/warehouse/emptybucket_orc/00_0
-rwxrwxrwx   3 hrt_qa hdfs  0 2016-05-23 23:20 
/apps/hive/warehouse/emptybucket_orc/01_0
..
{noformat}

Expected behavior:
In Tez mode, a zero-length file shouldn't be created on HDFS if the bucket is empty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 48520: Use multi-threaded approach to listing files for msck

2016-06-09 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48520/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

HIVE-13984


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java 
10fa561 

Diff: https://reviews.apache.org/r/48520/diff/


Testing
---


Thanks,

pengcheng xiong



Re: Moving Hive Jenkins jobs to builds.apache.org

2016-06-09 Thread Wei Zheng
Hi Sergio,

Looks like builds.apache.org is still using Java 7?

Thanks,
Wei







On 6/5/16, 12:30, "Sergio Pena"  wrote:

>Hi All,
>
>I just finished moving all Jenkins jobs to https://builds.apache.org. I
>will keep monitoring the jobs during the week to make sure they run fine.
>There is a view in Jenkins where you can see all Hive jobs,
>https://builds.apache.org/view/H-L/view/Hive/
>
>There are a couple of jobs that I don't know if they are still being used,
>"hive-trunk" and "hive-0.14".
>Please let me know if someone is using those jobs, or if I can destroy them
>so that we can keep this view clean.
>
>Please let me know If you have problems with Jenkins.
>
>- Sergio
>
>On Sun, May 29, 2016 at 9:40 PM, Sergio Pena 
>wrote:
>
>> Hi All,
>>
>> Just to let you know that I will plan to move our Jenkins jobs and
>> configurations to the builds.apache.org site during this week.  ASF has a
>> good infrastructure for Jenkins, and have all jobs backed up in case of
>> server failures.
>>
>> Also, some people in the Hive community are showing interest in helping on
>> Jenkins, so I would like to give that benefit by moving it to ASF Jenkins.
>>
>> For now, only Jenkins will be moved, but the cluster will still be
>> private. However, I will continue working on making this more open too.
>>
>> Just one thing about ASF jenkins is that it has too many jobs from all
>> Apache projects. Hopefully you don't get lost on it when looking for Hive
>> :P.
>>
>> I'll let you know when all this happens.
>>
>> - Sergio
>>


[jira] [Created] (HIVE-13987) Clarify current error shown when HS2 is down

2016-06-09 Thread Abdullah Yousufi (JIRA)
Abdullah Yousufi created HIVE-13987:
---

 Summary: Clarify current error shown when HS2 is down
 Key: HIVE-13987
 URL: https://issues.apache.org/jira/browse/HIVE-13987
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 2.0.1
Reporter: Abdullah Yousufi
Assignee: Abdullah Yousufi
Priority: Minor
 Fix For: 2.2.0


When HS2 is down and a query is run, the following error is shown in beeline:
{code}
0: jdbc:hive2://localhost:1> show tables;
Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)
{code}

It may be more helpful to also indicate that the reason for this is that HS2 is 
down, such as:
{code}
0: jdbc:hive2://localhost:1> show tables;
HS2 may be unavailable, check server status
Error: org.apache.thrift.transport.TTransportException (state=08S01,code=0)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13986) LLAP: kill Tez AM on token errors from plugin

2016-06-09 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-13986:
---

 Summary: LLAP: kill Tez AM on token errors from plugin
 Key: HIVE-13986
 URL: https://issues.apache.org/jira/browse/HIVE-13986
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13985) ORC improvements for reducing the file system calls in task side

2016-06-09 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-13985:


 Summary: ORC improvements for reducing the file system calls in 
task side
 Key: HIVE-13985
 URL: https://issues.apache.org/jira/browse/HIVE-13985
 Project: Hive
  Issue Type: Bug
  Components: ORC
Affects Versions: 1.3.0, 2.2.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


HIVE-13840 fixed some issues with additional file system invocations during split 
generation. Similarly, this jira will fix issues with additional file system 
invocations on the task side. To avoid reading footers on the task side, users 
can set hive.orc.splits.include.file.footer to true which will serialize the 
orc footers on the splits. But this has issues with serializing unwanted 
information like column statistics and other metadata which are not really 
required for reading orc split on the task side. We can reduce the payload on 
the orc splits by serializing only the minimum required information (stripe 
information, types, compression details). This will decrease the payload on the 
orc splits and can potentially avoid OOMs in application master (AM) during 
split generation. This jira also addresses other issues concerning the AM cache. 
The local cache used by the AM is a soft-reference cache. This can introduce 
unpredictability across multiple runs of the same query. We can cache the 
serialized footer in the local cache and also use strong reference cache which 
should avoid memory pressure and will have better predictability.

One other improvement that we can make: when 
hive.orc.splits.include.file.footer is set to false, on the task side we make 
one additional file system call to learn the size of the file. If we serialize 
the file length in the orc split, this can be avoided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13984) Use multi-threaded approach to listing files for msck

2016-06-09 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-13984:
--

 Summary: Use multi-threaded approach to listing files for msck
 Key: HIVE-13984
 URL: https://issues.apache.org/jira/browse/HIVE-13984
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13983) Unicode insert issue

2016-06-09 Thread eugene liu (JIRA)
eugene liu created HIVE-13983:
-

 Summary: Unicode insert issue
 Key: HIVE-13983
 URL: https://issues.apache.org/jira/browse/HIVE-13983
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.2.0
 Environment: failed in beeline and in odbc
Reporter: eugene liu


Unicode characters in UTF-8 on the wire: ¿=C2 BF  «=C2 AB  é=C3 A9
Characters inserted using the INSERT SELECT form are stored correctly.
Characters inserted using the INSERT VALUES form are stored incorrectly. 
Below is what I did in beeline:

DROP TABLE testch3;
CREATE TABLE testch3(col0 int, col1 CHAR(10), col2 VARCHAR(10), col3 string);
Insert into table testch3 select  1,'¿','«','é' from (select count(*) from 
testch3) qaz;
Insert into table testch3 values (2,'¿','«','é');
select * from testch3;

+---+---+---+---+
| testch3.col0  | testch3.col1  | testch3.col2  | testch3.col3  |
+---+---+---+---+
| 1 | ¿ | « | é |
| 2 | � | � | � |
+---+---+---+---+
2 rows selected (0.251 seconds)
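The per-character byte claims above can be verified directly with plain Java, independent of Hive:

```java
import java.nio.charset.StandardCharsets;

public class Utf8Bytes {
    // Returns the UTF-8 encoding of a string as uppercase hex, e.g. "C2BF".
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }
}
```

If the INSERT VALUES path re-decodes these two-byte sequences as Latin-1 (one character per byte) and re-encodes them, each character becomes two garbled characters, which matches the `�` output shown above.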




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 48502: HIVE-13731 LLAP: return LLAP token with the splits

2016-06-09 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48502/
---

Review request for hive, Jason Dere and Siddharth Seth.


Repository: hive-git


Description
---

.


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/llap/ext/TestLlapInputSplit.java
 7491222 
  llap-client/src/java/org/apache/hadoop/hive/llap/LlapInputSplit.java ab11926 
  
llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
 3ebae4a 
  llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 
d1748cb 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java 
cff5ee1 

Diff: https://reviews.apache.org/r/48502/diff/


Testing
---


Thanks,

Sergey Shelukhin



Hive Table Creation failure on Postgres

2016-06-09 Thread Siddhi Mehta
Hello Everyone,

We are using postgres for hive persistent store.

We are making use of the schematool to create hive schema and our hive
configs have table and column validation enabled.

While trying to create a simple hive table we ran into the following error.

Error: Error while processing statement: FAILED: Execution Error, return
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:javax.jdo.JDODataStoreException: Wrong precision for
column "*COLUMNS_V2"."COMMENT*" : was 4000 (according to the JDBC driver)
but should be 256 (based on the MetaData definition for field
org.apache.hadoop.hive.metastore.model.MFieldSchema.comment).

Looks like the Hive Metastore validation expects it to be 256, but when I
looked at the metastore script for Postgres, it creates the column with
precision 4000.

Interesting thing is that mysql scripts for the same hive version create
the column with precision 255.

Is there a config to tell the Hive MetaStore validation layer what column
precision to expect based on the underlying persistent store, or is the
known workaround to turn off validation when using Postgres as the
persistent store?

Thanks,
Siddhi
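One commonly cited workaround for this class of mismatch is to disable DataNucleus column validation in hive-site.xml. This is an assumption to verify against your Hive/DataNucleus version — the property name has varied across releases (e.g. datanucleus.schema.validateColumns in newer ones) — and it suppresses the check rather than fixing the precision mismatch:

```xml
<!-- hive-site.xml: hypothetical workaround; confirm the property name for
     your Hive/DataNucleus version before relying on it. -->
<property>
  <name>datanucleus.validateColumns</name>
  <value>false</value>
</property>
```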


Review Request 48500: HIVE-13982

2016-06-09 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48500/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-13982
https://issues.apache.org/jira/browse/HIVE-13982


Repository: hive-git


Description
---

HIVE-13982


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
 353d8db41af10512c94c0700a9bb06a07d660190 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
 1c3eb8155defa99a223ccf4ee4b072abb40a 
  ql/src/test/queries/clientpositive/limit_pushdown2.q PRE-CREATION 
  ql/src/test/results/clientpositive/limit_pushdown2.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/48500/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



[jira] [Created] (HIVE-13982) Extension to limit push down through order by & group by

2016-06-09 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-13982:
--

 Summary: Extension to limit push down through order by & group by
 Key: HIVE-13982
 URL: https://issues.apache.org/jira/browse/HIVE-13982
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Affects Versions: 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-13982.patch

Queries of the following forms are not optimal with map-side aggregation, 
because Map 1 does not have TopN in the reduce sink.

These queries shuffle 100% of the aggregate in cases where the reduce de-dup 
does not kick in. 

As input data grows, it falls off a cliff of performance after 4 reducers.

{code}
select state, city, sum(sales) from table
group by state, city
order by state, city
limit 10;
{code}

{code}
select state, city, sum(sales) from table
group by city, state
order by state, city
limit 10;
{code}

{code}
select state, city, sum(sales) from table
group by city, state
order by state desc, city
limit 10;
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13981) Operation.toSQLException eats full exception stack

2016-06-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-13981:
-

 Summary: Operation.toSQLException eats full exception stack
 Key: HIVE-13981
 URL: https://issues.apache.org/jira/browse/HIVE-13981
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Operation.toSQLException eats half of the exception stack and makes debugging 
hard. For example, we saw an exception:
{code}
org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: NullPointerException null
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:336)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:113)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:182)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:278)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:421)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:408)
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:505)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:562)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
{code}
The real stack causing the NPE is lost.
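The general fix direction can be sketched without Hive at all: when wrapping an exception, pass the original as the cause so its stack stays reachable. The names below are hypothetical stand-ins (the real method is Operation.toSQLException building a HiveSQLException), not the actual Hive fix:

```java
public class WrapWithCause {
    static class SqlLikeException extends RuntimeException {
        SqlLikeException(String msg, Throwable cause) {
            super(msg, cause); // cause (and its stack) remains reachable
        }
    }

    // Bad: message-only wrapping drops the original stack entirely,
    // which is the failure mode described in this JIRA.
    static SqlLikeException wrapLossy(Exception e) {
        return new SqlLikeException(
            "Error while compiling statement: " + e.getMessage(), null);
    }

    // Good: the original exception rides along as the cause.
    static SqlLikeException wrapWithCause(Exception e) {
        return new SqlLikeException(
            "Error while compiling statement: " + e.getMessage(), e);
    }
}
```

With the cause attached, the log shows the full "Caused by:" chain down to the line that actually threw the NPE.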



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13980) create table as select should acquire X lock on target table

2016-06-09 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-13980:
-

 Summary: create table as select should acquire X lock on target 
table
 Key: HIVE-13980
 URL: https://issues.apache.org/jira/browse/HIVE-13980
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman


hive> create table test.dummy as select * from oraclehadoop.dummy;
This acquires SHARED_READ on oraclehadoop.dummy table and SHARED_READ on _test_ 
database.

The effect is that you can drop _test.dummy_ from another session while the 
insert is still in progress.

This operation is a bit odd in that it combines a DDL operation which is not 
transactional with a DML operation which is.

If it were to fail in the middle, the target table would remain.  This can't be 
fixed easily but we should get an X lock on _test.dummy_.

The workaround is to split this into 2 commands
1. create table
2. perform insert
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 48233: HIVE-13884: Disallow queries fetching more than a configured number of partitions in PartitionPruner

2016-06-09 Thread Reuben Kuhnert

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48233/#review136810
---


Fix it, then Ship it!




Mostly minor cleanup nitpicks. Might make sense in the future to refactor this 
into a separate class that handles this sort of check, but this is fine for 
now. Fix then ship.


metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (line 
3131)


Can we create a 'public static final String' for this instead of using a 
comment?



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 2523)


Nit: Strange extra space at the end, is that needed?



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java (line 2830)


Maybe StringUtils.isEmpty? I think it will do both of these checks for you.
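For reference, commons-lang's StringUtils.isEmpty covers exactly the two checks (null and zero-length). A plain-Java restatement of that semantics — a sketch of the behavior, not the library source:

```java
public class EmptyCheck {
    // Mirrors org.apache.commons.lang.StringUtils.isEmpty semantics:
    // true for null and for the zero-length string, false otherwise
    // (whitespace-only strings are NOT empty; that would be isBlank).
    static boolean isEmpty(String s) {
        return s == null || s.length() == 0;
    }
}
```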


- Reuben Kuhnert


On June 6, 2016, 6:19 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48233/
> ---
> 
> (Updated June 6, 2016, 6:19 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Naveen Gangam.
> 
> 
> Bugs: HIVE-13884
> https://issues.apache.org/jira/browse/HIVE-13884
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch verifies the # of partitions a table has before fetching any from 
> the metastore. It reads that limit from 'hive.limit.query.max.table.partition'.
> 
> A limitation added here is that the variable must be set in hive-site.xml in 
> order to work; it cannot be set through beeline, because HiveMetaStore.java 
> does not read the variables set through beeline. I think it is better to keep 
> it this way, to avoid users changing the value on the fly and crashing the 
> metastore.
> 
> Another change is that EXPLAIN commands won't be executed either. EXPLAIN 
> commands need to fetch partitions in order to create the operator tree. If we 
> allow EXPLAIN to do that, then we may have the same OOM situations for large 
> partitions.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 94dd72e6624d13d2503f68d2fd2d2a84859a4500 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> 8e0bba60cc73890c1566e0f5df965f0f0bcfe0ec 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> b6d5276e49356f30147cb4f10262a2730ba99566 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
> a6d3f5385b33b8a4e31ee20ca5cb8f58c97c8702 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 31f0d7b89670b8a749bbe8a7ff2b4ff9f059a8e2 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  3152e77c3c7152ac4dbe7e779ce35f28044fe3c9 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86a243609b23e2ca9bb8849f0da863a95e477d5c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> c3d903b8cc8197ba8bea17145bec1444ed14eb22 
> 
> Diff: https://reviews.apache.org/r/48233/diff/
> 
> 
> Testing
> ---
> 
> Waiting for HiveQA.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



[jira] [Created] (HIVE-13979) MSCK throws exceptions when unpartitioned table has nested directories

2016-06-09 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-13979:
---

 Summary: MSCK throws exceptions when unpartitioned table has 
nested directories
 Key: HIVE-13979
 URL: https://issues.apache.org/jira/browse/HIVE-13979
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Rajesh Balamohan
Priority: Minor


When nested directories are present in an unpartitioned dataset, msck throws 
exceptions. Queries run without issues, though.

{noformat}
DataSet arranged as follows

tpch_flat_orc_1000.db/lineitem_random/a
tpch_flat_orc_1000.db/lineitem_random/b
tpch_flat_orc_1000.db/lineitem_random/c
tpch_flat_orc_1000.db/lineitem_random/d
tpch_flat_orc_1000.db/lineitem_random/e
tpch_flat_orc_1000.db/lineitem_random/f
tpch_flat_orc_1000.db/lineitem_random/g
tpch_flat_orc_1000.db/lineitem_random/h
tpch_flat_orc_1000.db/lineitem_random/i
tpch_flat_orc_1000.db/lineitem_random/j
tpch_flat_orc_1000.db/lineitem_random/k
tpch_flat_orc_1000.db/lineitem_random/l
tpch_flat_orc_1000.db/lineitem_random/m
tpch_flat_orc_1000.db/lineitem_random/n
tpch_flat_orc_1000.db/lineitem_random/o
tpch_flat_orc_1000.db/lineitem_random/p
tpch_flat_orc_1000.db/lineitem_random/q
tpch_flat_orc_1000.db/lineitem_random/r
tpch_flat_orc_1000.db/lineitem_random/s
tpch_flat_orc_1000.db/lineitem_random/t
tpch_flat_orc_1000.db/lineitem_random/u
tpch_flat_orc_1000.db/lineitem_random/v
tpch_flat_orc_1000.db/lineitem_random/w
tpch_flat_orc_1000.db/lineitem_random/x
tpch_flat_orc_1000.db/lineitem_random/y
tpch_flat_orc_1000.db/lineitem_random/z


CREATE EXTERNAL TABLE `lineitem_random`(
  `l_orderkey` bigint,
  `l_partkey` bigint,
  `l_suppkey` bigint,
  `l_linenumber` int,
  `l_quantity` double,
  `l_extendedprice` double,
  `l_discount` double,
  `l_tax` double,
  `l_returnflag` string,
  `l_linestatus` string,
  `l_shipdate` string,
  `l_commitdate` string,
  `l_receiptdate` string,
  `l_shipinstruct` string,
  `l_shipmode` string,
  `l_comment` string) stored as ORC LOCATION 
'.../tpch_flat_orc_1000.db/lineitem_random'

hive> SELECT sum(l_extendedprice * l_discount) AS revenue
> FROM lineitem_random
> WHERE l_shipdate >= '1993-01-01'
>   AND l_shipdate < '1994-01-01'
>   AND l_discount BETWEEN 0.06 - 0.01 AND 0.06 + 0.01
>   AND l_quantity < 25;


Status: DAG finished successfully

...
...
msck repair table lineitem_random;
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask


2016-06-09 11:14:55,102 WARN  [main]: exec.DDLTask (DDLTask.java:msck(1787)) - 
Failed to run metacheck:
org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:Unexpected component a)
at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1753)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:375)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1728)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1485)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1126)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: MetaException(message:Unexpected component a)
at 
org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:390)
at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1751)
... 20 more  

{noformat}
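The "Unexpected component a" failure fits the directory listing above: `lineitem_random` contains plain subdirectories `a`..`z`, while msck expects every path component under the table location to be in `key=value` partition form. A minimal sketch of that kind of validation (a hypothetical helper, not Hive's actual `Warehouse.makeValsFromName` implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionPathCheck {
    // Splits a partition path like "dt=20151124/msisdn_last_digit=2" into its
    // values. A component not in key=value form (such as the plain "a".."z"
    // subdirectories above) fails, mirroring the error in the stack trace.
    static List<String> makeVals(String path) {
        List<String> vals = new ArrayList<>();
        for (String component : path.split("/")) {
            int eq = component.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Unexpected component " + component);
            }
            vals.add(component.substring(eq + 1));
        }
        return vals;
    }

    public static void main(String[] args) {
        System.out.println(makeVals("dt=20151124/msisdn_last_digit=2")); // [20151124, 2]
        try {
            makeVals("a"); // non-partition subdirectory, as under lineitem_random
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Unexpected component a
        }
    }
}
```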






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13978) Exclusive lock for delete partition doesn't block subsequent read locks.

2016-06-09 Thread Igor Kuzmenko (JIRA)
Igor Kuzmenko created HIVE-13978:


 Summary: Exclusive lock for delete partition doesn't block 
subsequent read locks.
 Key: HIVE-13978
 URL: https://issues.apache.org/jira/browse/HIVE-13978
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 1.2.1
Reporter: Igor Kuzmenko


Bug based on this mail: 
http://mail-archives.apache.org/mod_mbox/hive-user/201606.mbox/%3CD37DAA17.5AB3F%25ekoifman%40hortonworks.com%3E

I'm using Hive 1.2.1 with Hive JDBC driver 1.2.1, and I perform a simple test on a 
transactional table:
{quote}
asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections 
where dt=20151124 and msisdn_last_digit=2", 1);
Thread.sleep(3000);
asyncExecute("alter table mobile_connections drop if exists partition 
(dt=20151124, msisdn_last_digit=2) purge", 2);
Thread.sleep(3000);
asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections 
where dt=20151124 and msisdn_last_digit=2", 3);
Thread.sleep(3000);
asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections 
where dt=20151124 and msisdn_last_digit=2", 4);
{quote}

Full code: http://pastebin.com/LsktC0sx
I create several threads, each executing a query asynchronously. The first queries a 
partition, the second drops that partition, and the rest repeat the first query. The 
first query takes about 10-15 seconds to complete, so the "alter table" query starts 
before the first query completes.

As a result I get:
First query - completes successfully
Second query - completes successfully
Third query - completes successfully
Fourth query - throws an exception.
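The surprise here is the third query: its SHARED_READ lock is granted even though the drop's EXCLUSIVE lock is still WAITING. With fair queuing, a request would wait behind any earlier incompatible request, acquired or not. A minimal sketch of that expected ordering (hypothetical, not Hive's DbLockManager):

```java
import java.util.ArrayList;
import java.util.List;

public class FairLockQueue {
    enum Type { SHARED_READ, EXCLUSIVE }
    enum State { ACQUIRED, WAITING }

    private final List<Type> queue = new ArrayList<>(); // requests in arrival order

    // Grants a request only if it is compatible with every earlier request,
    // whether acquired or still waiting; a WAITING EXCLUSIVE therefore blocks
    // later SHARED_READ requests instead of letting them jump the queue.
    State request(Type t) {
        State state = State.ACQUIRED;
        for (Type earlier : queue) {
            if (earlier == Type.EXCLUSIVE || t == Type.EXCLUSIVE) {
                state = State.WAITING;
                break;
            }
        }
        queue.add(t);
        return state;
    }

    public static void main(String[] args) {
        FairLockQueue q = new FairLockQueue();
        System.out.println(q.request(Type.SHARED_READ)); // ACQUIRED (first select)
        System.out.println(q.request(Type.EXCLUSIVE));   // WAITING  (drop partition)
        System.out.println(q.request(Type.SHARED_READ)); // WAITING  (queues behind the drop)
    }
}
```

In the HiveServer2 log below, the third request comes back ACQUIRED instead, which is the reported bug.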

Here's output:
{quote}
Wed Jun 08 16:36:02 MSK 2016 Start thread 1
Wed Jun 08 16:36:05 MSK 2016 Start thread 2
Wed Jun 08 16:36:08 MSK 2016 Start thread 3
Wed Jun 08 16:36:11 MSK 2016 Start thread 4
Wed Jun 08 16:36:17 MSK 2016 Finish thread 1
Wed Jun 08 16:36:17 MSK 2016 Thread 1 result: '344186'
Wed Jun 08 16:36:17 MSK 2016 Thread 1 completed in 14443 ms

Wed Jun 08 16:36:19 MSK 2016 Finished 2
Wed Jun 08 16:36:19 MSK 2016 Thread 2 completed in 13967 ms

Wed Jun 08 16:36:20 MSK 2016 Finish thread 3
Wed Jun 08 16:36:20 MSK 2016 Thread 3 result: '344186'
Wed Jun 08 16:36:20 MSK 2016 Thread 3 completed in 11737 ms

java.sql.SQLException: Error while processing statement: FAILED: Execution 
Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex 
failed, vertexName=Map 1, vertexId=vertex_1461923723503_0931_1_00, 
diagnostics=[Vertex vertex_1461923723503_0931_1_00 [Map 1] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: mobile_connections initializer 
failed, vertex=vertex_1461923723503_0931_1_00 [Map 1], 
java.lang.RuntimeException: serious problem
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)
{quote}

HiveServer2 log contains this:
{quote}
Line 1043: 2016-06-08 16:36:04,546 INFO  [HiveServer2-Background-Pool: 
Thread-42]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response to 
queryId=hive_20160608163602_542056d9-c524-4df4-af18-6aa5e906284f 
LockResponse(lockid:179728, state:ACQUIRED)
Line 1349: 2016-06-08 16:36:05,214 INFO  [HiveServer2-Background-Pool: 
Thread-50]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) - Requesting: 
queryId=hive_20160608163604_832abbff-6199-497e-b969-fd8ac1465abc 
LockRequest(component:[LockComponent(type:EXCLUSIVE, level:PARTITION, 
dbname:default, tablename:mobile_connections, 
partitionname:dt=20151123/msisdn_last_digit=3)], txnid:0, user:hdfs, 
hostname:mercury)
Line 1390: 2016-06-08 16:36:05,270 INFO  [HiveServer2-Background-Pool: 
Thread-50]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response to 
queryId=hive_20160608163604_832abbff-6199-497e-b969-fd8ac1465abc 
LockResponse(lockid:179729, state:WAITING)
Line 2346: 2016-06-08 16:36:08,028 INFO  [HiveServer2-Background-Pool: 
Thread-68]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) - Requesting: 
queryId=hive_20160608163607_7b18da12-6f86-41c9-b4b1-be45252c18c2 
LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, 
dbname:default, tablename:mobile_connections), LockComponent(type:SHARED_READ, 
level:PARTITION, dbname:default, tablename:mobile_connections, 
partitionname:dt=20151123/msisdn_last_digit=3)], txnid:0, user:hdfs, 
hostname:mercury)
Line 2370: 2016-06-08 16:36:08,069 INFO  [HiveServer2-Background-Pool: 
Thread-68]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response to 
queryId=hive_20160608163607_7b18da12-6f86-41c9-b4b1-be45252c18c2 
LockResponse(lockid:179730, state:ACQUIRED)
Line 3561: 2016-06-08 16:36:11,000 INFO  [HiveServer2-Background-Pool: 
Thread-91]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) - Requesting: 
queryId=hive_20160608163610_b78a201b-ae6d-4040-9115-f92118d5b629 
LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, 
dbname:defa

[jira] [Created] (HIVE-13977) nvl function not working after left outer join

2016-06-09 Thread balaswamy vaddeman (JIRA)
balaswamy vaddeman created HIVE-13977:
-

 Summary: nvl function not working after left outer join 
 Key: HIVE-13977
 URL: https://issues.apache.org/jira/browse/HIVE-13977
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: balaswamy vaddeman


Recreating the problem:

1) Create a table with sample data.

create table tabletest (n bigint, t string); 
insert into tabletest values (1, 'one'); 
insert into tabletest values(2, 'two'); 

2) Run a left outer join query on the single table.

select a.n as leftHandN 
, b.n as rightHandN 
, b.t as rightHandT 
, nvl(b.t,"empty") as rightHandTnvl -- Expected empty --> received empty
, nvl(b.n,-1) as rightHandNnvl -- Expected -1 --> received 1 
from 
(
select *
from tabletest 
where n=1
) a
left outer join
(
select *
from tabletest 
where 1=2
) b
on a.n = b.n;

nvl(b.n,-1) should return -1 but returns 1.

I have found that b.n always returns the value of a.n: if a.n is 1, b.n returns 1, and 
if a.n is 2, b.n likewise returns 2.
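For reference, the expected semantics: subquery b matches no rows (where 1=2), so b's columns should be NULL on every joined row, and nvl should substitute the default. A plain-Java sketch of that behavior (a hypothetical helper, not Hive's nvl UDF):

```java
public class NvlSketch {
    // nvl returns the default when the value is null, as HiveQL's nvl should.
    static Object nvl(Object value, Object dflt) {
        return value == null ? dflt : value;
    }

    public static void main(String[] args) {
        Long rightHandN = null; // b.n for an unmatched left-join row
        System.out.println(nvl(rightHandN, -1L));   // -1, the expected result
        System.out.println(nvl("two", "empty"));    // two, non-null passes through
    }
}
```

The bug report shows Hive returning 1 (the value of a.n) where this sketch returns -1.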




