[ 
https://issues.apache.org/jira/browse/HIVE-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15852218#comment-15852218
 ] 

Vihang Karajgaonkar commented on HIVE-15803:
--------------------------------------------

I agree with [~hagleitn]. The issue cannot be solved just by changing the 
setting to the "right" number of threads. It will always happen when the number 
of partition keys (the depth of the recursion tree) is greater than the number 
of threads. For instance, I could reproduce the problem by setting the number 
of threads to 3 and creating 4 partition keys.

{noformat}
0: jdbc:hive2://localhost:10000/> set hive.mv.files.thread=3;
0: jdbc:hive2://localhost:10000/> create table repairtable3(col string) partitioned by (p1 string, p2 string, p3 string, p4 string);
0: jdbc:hive2://localhost:10000/> dfs -mkdir -p /user/hive/warehouse/repairtable3/p1=a/p2=b/p3=c/p4=d;
0: jdbc:hive2://localhost:10000/> dfs -touchz /user/hive/warehouse/repairtable3/p1=a/p2=b/p3=c/p4=d/datafile;
0: jdbc:hive2://localhost:10000/> msck repair table repairtable3;
{noformat}

The issue happens because each thread that processes a path X blocks until some 
other thread from the pool has processed the child paths of X. If the recursion 
is deep enough, every thread in the pool ends up blocked waiting on its 
children, and no thread is left to process the remaining child paths.
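
To illustrate the starvation outside of Hive, here is a minimal standalone 
sketch. It is not the metastore checker code: the class name 
PoolStarvationSketch and the methods visit and completes are made up for this 
example, and a single chain of nested directories is modeled as a plain level 
counter. Each task submits its child level to the same fixed-size pool and then 
blocks on the child's future, which is the pattern described above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in Hive.
public class PoolStarvationSketch {

    // Runs on a pool thread. Visits one directory level, hands the child
    // level to the same pool, and blocks until the child has finished.
    static int visit(ExecutorService pool, int level, int depth) throws Exception {
        if (level == depth) {
            return 1; // deepest partition directory: nothing left to wait for
        }
        Future<Integer> child = pool.submit(() -> visit(pool, level + 1, depth));
        // The parent is parked here while still occupying its pool thread.
        return 1 + child.get();
    }

    // Returns true if the walk finished, false if it was still stuck
    // after timeoutMillis (treated here as a hang).
    static boolean completes(int threads, int depth, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Future<Integer> root = pool.submit(() -> visit(pool, 1, depth));
            root.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } finally {
            pool.shutdownNow(); // interrupt any threads still parked in get()
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("3 threads, 4 levels completes: " + completes(3, 4, 1000));
        System.out.println("4 threads, 4 levels completes: " + completes(4, 4, 1000));
    }
}
```

With 3 threads and 4 levels, the first three levels each hold a thread while 
waiting, so the task for the fourth level stays in the queue and the walk times 
out. With 4 threads the same walk finishes. Raising the thread count only moves 
the threshold; one more level of nesting brings the hang back.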

> msck can hang when nested partitions are present
> ------------------------------------------------
>
>                 Key: HIVE-15803
>                 URL: https://issues.apache.org/jira/browse/HIVE-15803
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>
> Steps to reproduce. 
> {noformat}
> CREATE TABLE `repairtable`( `col` string) PARTITIONED BY (  `p1` string,  `p2` string)
> hive> dfs -mkdir -p /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b;
> hive> dfs -touchz /apps/hive/warehouse/test.db/repairtable/p1=c/p2=a/p3=b/datafile;
> hive> set hive.mv.files.thread;
> hive.mv.files.thread=15
> hive> set hive.mv.files.thread=1;
> hive> MSCK TABLE repairtable;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)