Yes, that's the plan. You can also try the workaround of removing the mapjoin hints.
Ning

On Oct 23, 2009, at 7:52 PM, Venky Iyer (JIRA) wrote:

> [ https://issues.apache.org/jira/browse/HIVE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769573#action_12769573 ]
>
> Venky Iyer commented on HIVE-900:
> ---------------------------------
>
> This is a high-priority bug for me, blocking me on fairly important
> stuff. The workaround that Dhruba had, of downloading the data to the
> client and adding it to the DistributedCache, is a pretty good solution.
>
>> Map-side join failed if there are a large number of mappers
>> -----------------------------------------------------------
>>
>> Key: HIVE-900
>> URL: https://issues.apache.org/jira/browse/HIVE-900
>> Project: Hadoop Hive
>> Issue Type: Improvement
>> Reporter: Ning Zhang
>> Assignee: Ning Zhang
>>
>> A map-side join is efficient when joining a huge table with a small
>> table: each mapper reads the small table into main memory and performs
>> the join locally. However, if too many mappers are generated for the
>> map join, a large number of them will simultaneously request the same
>> block of the small table. Hadoop currently has an upper limit on the
>> number of concurrent requests for the same block (250?). If that limit
>> is reached, a BlockMissingException is thrown, causing many mappers to
>> be killed. Retrying does not solve the problem but worsens it.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
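For anyone following along, here is a minimal HiveQL sketch of the two workarounds discussed above. The table names, alias, and file path are hypothetical, not taken from the issue:

```sql
-- Workaround 1: remove the MAPJOIN hint so Hive falls back to a common
-- (reduce-side) join instead of having every mapper read the small table.
-- With the hint:
SELECT /*+ MAPJOIN(small) */ big.key, small.value
FROM big_table big JOIN small_table small ON (big.key = small.key);
-- Workaround: run the same query without the /*+ MAPJOIN(small) */ hint.

-- Workaround 2 (Dhruba's): download the small table to the client first,
-- then register the local copy with ADD FILE, which distributes it to the
-- mappers via the DistributedCache rather than via concurrent HDFS reads
-- of the same block.
ADD FILE /tmp/small_table_local_copy.txt;
```

Both sidestep the BlockMissingException by avoiding the many-mappers-one-block read pattern, at the cost of either a slower join (workaround 1) or an extra client-side copy step (workaround 2).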