Hmm, your parameters look correct. You could try removing the quotation
marks around /data/temp and checking for stray whitespace characters.
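For example, the invocation might look like this with the quotes dropped (this is just a sketch; the jar name, paths, and option values are copied from your earlier mail, so adjust them to your setup):

```shell
# Same RowSimilarityJob invocation, but with the unquoted --tempDir value
hadoop jar core/target/mahout-core-0.4-job.jar \
  org.apache.mahout.math.hadoop.similarity.RowSimilarityJob \
  -Dmapred.input.dir=/data/temp/itemUserMatrix \
  -Dmapred.output.dir=/data/temp/similarityMatrix \
  --numberOfColumns 6040 \
  --similarityClassname SIMILARITY_COOCCURRENCE \
  --maxSimilaritiesPerRow 100 \
  --tempDir /data/temp
```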
--sebastian
On 21.12.2010 04:12, Gayatri Rao wrote:
Hi,
Thanks, I checked and you may be right. It looks like the RowSimilarityJob
did not run, which is why there was no similarityMatrix directory.
I tried running only the RowSimilarityJob using the command below:

hadoop jar core/target/mahout-core-0.4-job.jar \
  org.apache.mahout.math.hadoop.similarity.RowSimilarityJob \
  -Dmapred.input.dir=/data/temp/itemUserMatrix \
  -Dmapred.output.dir=/data/temp/similarityMatrix \
  --numberOfColumns 6040 \
  --similarityClassname SIMILARITY_COOCCURRENCE \
  --maxSimilaritiesPerRow 100 \
  --tempDir "/data/temp"
Then I get the error below. It says the usage is not correct, but I am
not sure what might be wrong.
10/12/21 08:36:23 ERROR common.AbstractJob: Unexpected /data/temp while
processing Job-Specific Options:

usage: <command> [Generic Options] [Job-Specific Options]

Generic Options:
 -archives <paths>             comma separated archives to be unarchived
                               on the compute machines.
 -conf <configuration file>    specify an application configuration file
 -D <property=value>           use value for given property
 -files <paths>                comma separated files to be copied to the
                               map reduce cluster
 -fs <local|namenode:port>     specify a namenode
 -jt <local|jobtracker:port>   specify a job tracker
 -libjars <paths>              comma separated jar files to include in the
                               classpath.

Job-Specific Options:
  --input (-i) input                                  Path to job input directory.
  --output (-o) output                                The directory pathname for output.
  --numberOfColumns (-r) numberOfColumns              Number of columns in the input matrix
  --similarityClassname (-s) similarityClassname      Name of distributed similarity class to
                                                      instantiate, alternatively use one of
                                                      the predefined similarities
                                                      ([SIMILARITY_COOCCURRENCE,
                                                      SIMILARITY_EUCLIDEAN_DISTANCE,
                                                      SIMILARITY_LOGLIKELIHOOD,
                                                      SIMILARITY_PEARSON_CORRELATION,
                                                      SIMILARITY_TANIMOTO_COEFFICIENT,
                                                      SIMILARITY_UNCENTERED_COSINE,
                                                      SIMILARITY_UNCENTERED_ZERO_ASSUMING_COSINE])
  --maxSimilaritiesPerRow (-m) maxSimilaritiesPerRow  Number of maximum similarities per row
                                                      (default: 100)
  --help (-h)                                         Print out help
  --tempDir tempDir                                   Intermediate output directory
  --startPhase startPhase                             First phase to run
  --endPhase endPhase                                 Last phase to run
Thanks,
Gayatri
On Mon, Dec 20, 2010 at 1:31 PM, Sebastian Schelter<[email protected]> wrote:
Hi,
can you post the exact parameters you used to call the job? And please have
another look at your error logs: I suspect that something else already went
wrong before the exception you posted occurred. Could you check that too?
--sebastian
On 20.12.2010 08:50, Gayatri Rao wrote:
Hi,
I have been trying to run the Hadoop Item Based Collaborative Filtering Job
as described in
https://cwiki.apache.org/confluence/display/MAHOUT/TasteCommandLine
A few MR jobs run successfully
(RecommenderJob-ItemIDIndexMapper-ItemIDIndexReduce,
RecommenderJob-ToItemPrefsMapper-ToUserVectorReduc,
RecommenderJob-CountUsersMapper-CountUsersReducer,
RecommenderJob-MaybePruneRowsMapper-ToItemVectorsR),
after which the job dies with an exception:
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /data/temp/similarityMatrix
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:234)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:328)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I find the following files in /data/temp
[gaya...@e1aeu046110d mahout-distribution-0.4]$ hadoop dfs -ls /data/temp
Found 4 items
drwxr-xr-x - gayatri supergroup 0 2010-12-17 16:53
/data/temp/countUsers
drwxr-xr-x - gayatri supergroup 0 2010-12-17 16:51
/data/temp/itemIDIndex
drwxr-xr-x - gayatri supergroup 0 2010-12-17 16:54
/data/temp/itemUserMatrix
drwxr-xr-x - gayatri supergroup 0 2010-12-17 16:52
/data/temp/userVectors
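Since the exception complains about a missing /data/temp/similarityMatrix, one quick way to confirm it was never created (a sketch using the same `hadoop dfs` CLI as the listing above) is:

```shell
# Exit status 0 means the path exists and is a directory; non-zero means it is missing
hadoop dfs -test -d /data/temp/similarityMatrix && echo "exists" || echo "missing"
```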
Is this a configuration issue? I am not able to understand what the error
might be.
Thanks
Gayatri