[
https://issues.apache.org/jira/browse/MAHOUT-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084368#comment-14084368
]
ASF GitHub Bot commented on MAHOUT-1594:
----------------------------------------
GitHub user roengram opened a pull request:
https://github.com/apache/mahout/pull/38
MAHOUT-1594
Detail: https://issues.apache.org/jira/browse/MAHOUT-1594
Factorization example doesn't work correctly with Hadoop version:
2.4.0.2.1.1.0-385. The reason is that the example uses local for working
directories. I've changed all references to local dirs to HDFS ones, change
Linux shell commands to hadoop equivalents.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/roengram/mahout MAHOUT.1594
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/38.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #38
----
commit 43a4060d8fccc22c65295d00f32a8458ff19391f
Author: roengram <[email protected]>
Date: 2014-08-04T06:38:56Z
Use HDFS instead of local dir
----
> Example factorize-movielens-1M.sh does not use HDFS
> ---------------------------------------------------
>
> Key: MAHOUT-1594
> URL: https://issues.apache.org/jira/browse/MAHOUT-1594
> Project: Mahout
> Issue Type: Bug
> Components: Examples
> Affects Versions: 0.9
> Environment: Hadoop version: 2.4.0.2.1.1.0-385
> Git hash: 2b65475c3ab682ebd47cffdc6b502698799cd2c8 (trunk)
> Reporter: jaehoon ko
> Priority: Minor
> Labels: newbie, patch
> Fix For: 1.0
>
> Attachments: MAHOUT-1594.patch
>
>
> It seems that factorize-movielens-1M.sh does not use HDFS at all. All paths
> look local paths, not HDFS. So the example crashes immeidately because it
> cannot find input data from HDFS:
> {code}
> Exception in thread "main"
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does
> not exist: /tmp/mahout-work-hoseog.lee/movielens/ratings.csv
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:320)
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:263)
> at
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:375)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:493)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
> at
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
> at
> org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.run(DatasetSplitter.java:94)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at
> org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.main(DatasetSplitter.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
--
This message was sent by Atlassian JIRA
(v6.2#6252)