[ 
https://issues.apache.org/jira/browse/MAHOUT-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677185#comment-13677185
 ] 

Grant Ingersoll commented on MAHOUT-992:
----------------------------------------

[~ssc] or [~robin.a...@gmail.com] 

I see this in several places:
{code}
    Path[] files = DistributedCache.getLocalCacheFiles(conf);
    if (files == null) {
      throw new IOException("Cannot read Frequency list from Distributed Cache");
    }
    if (files.length != 1) {
      throw new IOException("Cannot read Frequency list from Distributed Cache (" + files.length + ')');
    }
    FileSystem fs = FileSystem.getLocal(conf);
    Path fListLocalPath = fs.makeQualified(files[0]);
    // Fallback if we are running locally.
    if (!fs.exists(fListLocalPath)) {
      URI[] filesURIs = DistributedCache.getCacheFiles(conf);
      if (filesURIs == null) {
        throw new IOException("Cannot read Frequency list from Distributed Cache");
      }
      if (filesURIs.length != 1) {
        // Report the length of the array that was actually checked.
        throw new IOException("Cannot read Frequency list from Distributed Cache (" + filesURIs.length + ')');
      }
      fListLocalPath = new Path(filesURIs[0].getPath());
    }
{code}

I don't really follow the "Fallback if running locally" comment.  The first 
part of the code is looking in the local file system.  Doesn't (or shouldn't?) 
Hadoop handle this seamlessly?
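
Since the same block shows up in several places, the audit could fold it into one helper. Below is a rough sketch only, not a proposed patch: the class name CachedFileResolver, the method resolveSingle, and the "what" parameter are all made up for illustration, and plain JDK Path/Files stand in for Hadoop's Path and FileSystem so the logic can be read on its own. In real code the two arrays would come from DistributedCache.getLocalCacheFiles(conf) and DistributedCache.getCacheFiles(conf). It keeps the fallback as it is today; whether the fallback is needed at all is still the open question.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper; JDK types stand in for Hadoop's Path and FileSystem. */
final class CachedFileResolver {

  private CachedFileResolver() {}

  /**
   * Resolves the single cached file. Prefers the localized copy and falls back
   * to the path component of the registered URI when that copy does not exist.
   *
   * @param localFiles stand-in for the result of getLocalCacheFiles(conf)
   * @param cacheUris  stand-in for the result of getCacheFiles(conf)
   * @param what       label used in error messages, e.g. "Frequency list"
   */
  static Path resolveSingle(Path[] localFiles, URI[] cacheUris, String what)
      throws IOException {
    Path local = onlyElement(localFiles, what);
    if (Files.exists(local)) {
      return local;
    }
    // Fallback: the localized copy is missing, so use the original URI's path.
    URI uri = onlyElement(cacheUris, what);
    return Paths.get(uri.getPath());
  }

  /** Returns the sole element, or fails with the same messages as the quoted code. */
  private static <T> T onlyElement(T[] items, String what) throws IOException {
    if (items == null) {
      throw new IOException("Cannot read " + what + " from Distributed Cache");
    }
    if (items.length != 1) {
      throw new IOException(
          "Cannot read " + what + " from Distributed Cache (" + items.length + ')');
    }
    return items[0];
  }
}
```

Each call site would then shrink to a single resolveSingle(...) call, and the error messages would always report the length of the array that was actually checked.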

                
> Audit DistributedCache use to support EMR
> -----------------------------------------
>
>                 Key: MAHOUT-992
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-992
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: tom pierce
>            Assignee: Grant Ingersoll
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.8
>
>
> Apparently some of our DistributedCache use is not EMR-safe.  It would be 
> great if someone could audit our uses of DC, and fix up this problem where it 
> exists.
> For an example of problematic usage (and the fix), see MAHOUT-980.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
