[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2010-01-03 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796105#action_12796105
 ] 

Ning Zhang commented on HIVE-900:
-

Closing this issue since HDFS-767 is committed.

> Map-side join failed if there are large number of mappers
> ----------------------------------------------------------
>
>         Key: HIVE-900
>         URL: https://issues.apache.org/jira/browse/HIVE-900
>     Project: Hadoop Hive
>  Issue Type: Improvement
>    Reporter: Ning Zhang
>    Assignee: Ning Zhang
>
> Map-side join is efficient when joining a huge table with a small table,
> since each mapper can read the small table into main memory and perform the
> join locally. However, if too many mappers are generated for the map join, a
> large number of mappers will simultaneously send requests to read the same
> block of the small table. Hadoop currently has an upper limit on the number
> of requests for the same block (250?). If that limit is reached, a
> BlockMissingException is thrown, which causes a lot of mappers to be killed.
> Retrying doesn't solve the problem but worsens it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-12-02 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784995#action_12784995
 ] 

Ning Zhang commented on HIVE-900:
-

Tried different approaches in Hive, and it turns out none of them is perfect.
Ultimately the solution should be on the HDFS side; HDFS-767 is trying to
solve that.


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-29 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771569#action_12771569
 ] 

Ning Zhang commented on HIVE-900:
-

@Prasad, yes, that's definitely a good idea for scaling out mapjoin with a
large number of mappers. Dhruba also suggested increasing the replication
factor for the small file. But as you mentioned, we need to reliably revert
the replication factor when the mapjoin finishes or when any exception is
caught. I'll also investigate that.


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-29 Thread Prasad Chakka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771551#action_12771551
 ] 

Prasad Chakka commented on HIVE-900:


Just an off-the-wall idea: temporarily increase the replication factor for
this block so that it is available on more racks, which reduces the network
cost and also avoids the BlockMissingException. Of course, we need to find a
way to reliably set the replication factor back to the original setting.
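
Roughly, the client-side mechanics could look like the following illustrative
Java sketch against the FileSystem API (not existing Hive code; the class and
method names are made up): raise the replication factor of the small table's
file before the job, and restore it in a finally block so the revert happens
even if the map join fails.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationBooster {
      // Boost the replication factor of the small table's file for the
      // duration of the job, and restore the original factor in a finally
      // block so the revert happens even when the map join throws.
      public static void runWithBoostedReplication(Configuration conf,
          Path smallTableFile, short boostedReplication, Runnable job)
          throws Exception {
        FileSystem fs = smallTableFile.getFileSystem(conf);
        short original = fs.getFileStatus(smallTableFile).getReplication();
        fs.setReplication(smallTableFile, boostedReplication);
        try {
          job.run();  // e.g. submit the map-join query here
        } finally {
          // Best effort: put the replication factor back to what it was.
          fs.setReplication(smallTableFile, original);
        }
      }
    }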


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-29 Thread Prasad Chakka (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771553#action_12771553
 ] 

Prasad Chakka commented on HIVE-900:


@Venky, maybe you can unblock your work by manually increasing the replication
factor to a very high value and then issuing the query?


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-29 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771544#action_12771544
 ] 

Ning Zhang commented on HIVE-900:
-

The essential problem is that too many mappers are trying to access the same
block at the same time, exceeding the threshold for accesses to the same
block, and thus the BlockMissingException is thrown.

Discussed with Namit and Dhruba offline. These are the proposed solutions:

1) Make HDFS fault tolerant to this issue. Dhruba mentioned that retry logic
already exists in the DFS client code: if a BlockMissingException is thrown,
it waits about 400 ms and retries; if there are still exceptions it waits
800 ms, and so on, up to 5 unsuccessful retries. This mechanism works for
uncorrelated simultaneous requests for the same block. In this case, however,
almost all the mappers request the same block at the same time, so their
retries will also happen at about the same time. It would be better to
introduce a random factor into the wait time (a sketch of such jittered
backoff follows this list). Dhruba will look into the DFS code and work on
that. This will solve a broader class of issues beyond the map-side join.

2) Another, orthogonal issue brought up by Namit for map-side join is that if
there are too many mappers and each of them requests the same small table,
there is a cost of transferring the small file to all of these mappers. Even
if the BlockMissingException is resolved, that cost is still there, and it is
proportional to the number of mappers. In this respect it would be better to
reduce the number of mappers. But that comes with the cost that each mapper
then has to process a larger portion of the large table. So we have to trade
off the network cost of the small table against the processing cost of the
large table (see the illustrative cost model after this list). Will come up
with a heuristic for tuning the parameters that decide the number of mappers
for map join.
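
As a rough illustration of the jittered retry in 1), here is a minimal sketch
(not the actual DFSClient code; BlockReader below is a hypothetical stand-in
for whatever performs the block read):

    import java.io.IOException;
    import java.util.Random;

    public class JitteredRetry {
      private static final Random RANDOM = new Random();

      // Hypothetical stand-in for whatever actually reads the block.
      public interface BlockReader {
        byte[] read() throws IOException;
      }

      // Retry the read with exponentially growing waits (400 ms, 800 ms, ...)
      // plus a random jitter, so that thousands of mappers that fail together
      // do not all retry together as well.
      public static byte[] readWithRetry(BlockReader reader) throws IOException {
        long baseWaitMs = 400;
        IOException last = null;
        for (int attempt = 0; attempt < 5; attempt++) {
          try {
            return reader.read();
          } catch (IOException e) {  // e.g. BlockMissingException
            last = e;
            long jitter = (long) (RANDOM.nextDouble() * baseWaitMs);
            try {
              Thread.sleep(baseWaitMs + jitter);
            } catch (InterruptedException ie) {
              Thread.currentThread().interrupt();
              throw new IOException("interrupted while backing off");
            }
            baseWaitMs *= 2;  // exponential backoff
          }
        }
        throw last;
      }
    }

And for the tradeoff in 2), an illustrative back-of-the-envelope cost model
(the cost constants are placeholders for whatever a real heuristic would
measure): shipping the small table costs roughly m * smallTableBytes of
network traffic, while each of the m mappers scans about largeTableBytes / m,
so the total cost is minimized near the value computed below.

    public class MapJoinMapperHeuristic {
      // cost(m) ~= m * smallTableBytes * networkCostPerByte
      //          + (largeTableBytes / m) * processingCostPerByte,
      // minimized at m = sqrt(largeTableBytes * processingCostPerByte
      //                       / (smallTableBytes * networkCostPerByte)).
      public static long suggestNumMappers(long smallTableBytes,
          long largeTableBytes, double networkCostPerByte,
          double processingCostPerByte) {
        double optimal = Math.sqrt(
            (largeTableBytes * processingCostPerByte)
                / (smallTableBytes * networkCostPerByte));
        return Math.max(1, Math.round(optimal));
      }
    }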


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770135#action_12770135
 ] 

Namit Jain commented on HIVE-900:
-

Instead of relying on the mapper to copy each file to the distributed cache,
can we rely on the Hive client (ExecDriver) to do that?
From the work, the client knows which tasks need to be executed on the map
side. Before submitting the job, execute that work on the client. The
ExecMapper then needs to change to pick up only its portion of the work
instead of executing the whole work.
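
The client-side part of that could be as simple as the following sketch,
using the old org.apache.hadoop.filecache.DistributedCache API
(SmallTableStager and stageSmallTable are illustrative names, not the actual
ExecDriver change):

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class SmallTableStager {
      // Register the small table's file with the distributed cache before
      // the job is submitted, so the framework localizes it on the task
      // nodes instead of every mapper opening the same HDFS block directly.
      public static void stageSmallTable(JobConf job, Path smallTableFile) {
        URI uri = smallTableFile.toUri();
        DistributedCache.addCacheFile(uri, job);
        // Mappers would then look up the localized copy via
        // DistributedCache.getLocalCacheFiles(job) rather than reading HDFS.
      }
    }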


[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-24 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769602#action_12769602
 ] 

Ning Zhang commented on HIVE-900:
-

Distributed cache is definitely one option. It seems it also works for
copying files from an hdfs: URI in addition to a local directory. However,
based on the distributed cache documentation, the cached file is copied at
the beginning of the mapper task. This may hit the same inbound network
congestion issue if 3000 mappers try to copy the same file at the same time.
Or does the distributed cache use a smarter copying mechanism (hierarchical
rather than 1:ALL)? Otherwise distributing the jar file would face the same
issue.


Re: [jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-23 Thread Venky Iyer
Yeah, I can do that, but my large table in the JOIN is really large  
and I'd like to avoid having to do that.


On Oct 23, 2009, at 8:09 PM, Ning Zhang wrote:

> Yes, that's the plan. You can also try the workaround to remove
> mapjoin hints.
>
> Ning
>
> On Oct 23, 2009, at 7:52 PM, Venky Iyer (JIRA) wrote:
>
>>
>>   [ 
>> https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769573#action_12769573
>> ]
>>
>> Venky Iyer commented on HIVE-900:
>> -
>>
>> This is a high-priority bug for me, blocking me on fairly important
>> stuff . The workaround that Dhruba had, of downloading data to the
>> client and adding to the distributedcache is a pretty good solution.
>

--
Venky Iyer
vi...@facebook.com






Re: [jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-23 Thread Ning Zhang
Yes, that's the plan. You can also try the workaround to remove  
mapjoin hints.

Ning

On Oct 23, 2009, at 7:52 PM, Venky Iyer (JIRA) wrote:

>
>[ 
> https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769573#action_12769573
>  
>  ]
>
> Venky Iyer commented on HIVE-900:
> -
>
> This is a high-priority bug for me, blocking me on fairly important  
> stuff . The workaround that Dhruba had, of downloading data to the  
> client and adding to the distributedcache is a pretty good solution.
>



[jira] Commented: (HIVE-900) Map-side join failed if there are large number of mappers

2009-10-23 Thread Venky Iyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769573#action_12769573
 ] 

Venky Iyer commented on HIVE-900:
-

This is a high-priority bug for me, blocking me on fairly important stuff.
The workaround that Dhruba had, of downloading the data to the client and
adding it to the distributed cache, is a pretty good solution.
