Map-side join fails if there is a large number of mappers
---------------------------------------------------------

                 Key: HIVE-900
                 URL: https://issues.apache.org/jira/browse/HIVE-900
             Project: Hadoop Hive
          Issue Type: Improvement
            Reporter: Ning Zhang
            Assignee: Ning Zhang


A map-side join is efficient when joining a huge table with a small table, 
because each mapper can read the small table into main memory and perform the 
join locally. However, if too many mappers are generated for the map join, a 
large number of them will simultaneously send requests to read the same block 
of the small table. Currently Hadoop has an upper limit on the number of 
requests for the same block (250?). If that limit is reached, a 
BlockMissingException is thrown, which causes many mappers to be killed. 
Retrying does not solve the problem; it only makes it worse.
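
To make the access pattern concrete, here is a minimal sketch in plain Python 
(not Hive or Hadoop code) of what each mapper does in a map-side join. All 
names (load_small_table, map_side_join, small, splits) are hypothetical and 
exist only for illustration. It assumes every mapper reads the whole small 
table once and builds an in-memory hash table from it.

```python
from collections import defaultdict


def load_small_table(rows):
    """Build an in-memory hash table keyed on the join key.

    In a real cluster this is the step where every mapper reads
    the same block(s) of the small table.
    """
    table = defaultdict(list)
    for key, value in rows:
        table[key].append(value)
    return table


def map_side_join(big_split, small_rows):
    """Join one mapper's split of the big table against the small table."""
    lookup = load_small_table(small_rows)
    joined = []
    for key, big_value in big_split:
        # Probe the hash table; keys absent from the small table yield nothing.
        for small_value in lookup.get(key, []):
            joined.append((key, big_value, small_value))
    return joined


# Hypothetical data: one small table, and a big table split across two mappers.
small = [(1, "a"), (2, "b")]
splits = [[(1, "x"), (3, "y")], [(2, "z"), (1, "w")]]

results = [map_side_join(split, small) for split in splits]

# Each mapper loads the small table once, so the number of reads of the
# small table grows with the number of mappers.
small_table_reads = len(splits)
```

In this sketch the small table is read once per split, so thousands of 
mappers would mean thousands of near-simultaneous reads of the same block, 
which is the situation described above.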

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
