Re: How to execute RecommenderJob without preference value

2013-05-11 Thread 万代豊
Not sure unless I intentionally reproduce this situation, but Mahout
recommendation seems to be sensitive to the carriage return placed at the end
of your final input data record.
For instance, if your final data record ends like

5 105
5 106

and has no succeeding records, I believe Mahout will blindly read the empty
record and eventually fail with an ArrayIndexOutOfBoundsException.
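
To illustrate the mechanism (a minimal, hypothetical sketch of split-based
record parsing, not Mahout's actual mapper code): an empty trailing line
splits into a one-element array, so reading token [1] throws exactly the
exception reported below.

// Hypothetical sketch: an empty trailing record breaks split-based parsing.
public class BlankLineDemo {
  public static void main(String[] args) {
    String record = "5 106";                  // a normal record
    String blank = "";                        // an empty trailing record
    System.out.println(record.split(" ")[1]); // prints "106"
    System.out.println(blank.split(" ")[1]);  // ArrayIndexOutOfBoundsException: 1
  }
}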

Regards,
Y.Mandai




2013/5/11 滝口倫理 

> I tried giving the exact path, but I see the same error.
>
>
> [hadoop@localhost ~]$ hadoop jar
> /usr/lib/mahout/mahout-core-0.7-cdh4.2.1-job.jar
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob
> -i /user/hadoop/recommend2_in -o /user/hadoop/rec_out -s
> SIMILARITY_LOGLIKELIHOOD -b true
>
> [Error]
> 
>
> 13/05/11 09:14:09 INFO mapreduce.Job: Task Id :
> attempt_1368231064557_0001_m_00_0, Status : FAILED
> Error: java.lang.ArrayIndexOutOfBoundsException: 1
> at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
> at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
>
> 13/05/11 09:14:33 INFO mapreduce.Job: Counters: 6
> Job Counters
> Failed map tasks=4
> Launched map tasks=4
> Other local map tasks=3
> Rack-local map tasks=1
> Total time spent by all maps in occupied slots (ms)=26278
> Total time spent by all reduces in occupied slots (ms)=0
> Exception in thread "main" java.io.FileNotFoundException: File does not
> exist: /user/hadoop/temp/preparePreferenceMatrix/numUsers.bin
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1312)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1258)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1231)
> 
>
>
>
>
>
> 2013/5/10 Sebastian Schelter 
>
> > I think there is some problem with the paths to the files which you
> > supply. You should try to give absolute paths to the files.
> >
> > Best,
> > Sebastian
> >
> >
> > On 10.05.2013 13:36, 滝口倫理 wrote:
> > > I would like to get recommended items by using RecommenderJob.
> > > The input data I made is as below. There are no preference values, on
> > > purpose.
> > >
> > > When I run RecommenderJob, I get some errors.
> > >
> > > Does it mean I have to put preference values in the input file?
> > > I want to run RecommenderJob without preference values.
> > >
> > > Regards
> > > Takiguchi
> > >
> > >
> > > [mahout command]
> > > ===
> > >
> > > [hadoop@localhost test]$ cat rere2
> > > 1 101
> > > 1 102
> > > 1 103
> > > 2 101
> > > 2 102
> > > 2 103
> > > 2 104
> > > 3 101
> > > 3 104
> > > 3 105
> > > 3 107
> > > 4 101
> > > 4 103
> > > 4 104
> > > 4 106
> > > 5 101
> > > 5 102
> > > 5 103
> > > 5 104
> > > 5 105
> > > 5 106
> > >
> > > [hadoop@localhost test]$ hadoop fs -mkdir recommend2_in
> > >
> > > [hadoop@localhost test]$ hadoop fs -put rere2 recommend2_in
> > >
> > > [hadoop@localhost test]$ hadoop jar
> > > /usr/lib/mahout/mahout-core-0.7-cdh4.2.1-job.jar \
> > > org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
> > > -i recommend2_in -o rec_out -s SIMILARITY_LOGLIKELIHOOD \
> > > -b true
> > > ===
> > >
> > >
> > > [Error]
> > >
> >
> =
> > >
> > >
> > > 13/05/10 20:15:54 INFO mapreduce.Job: Task Id :
> > > attempt_1368183830239_0002_m_00_0, Status : FAILED
> > > Error: java.lang.ArrayIndexOutOfBoundsException: 1
> > > at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
> > > at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
> > >
> > > 13/05/10 20:16:02 INFO mapreduce.Job: Task Id :
> > > attempt_1368183830239_0002_m_00_1, Status : FAILED
> > > Error: java.lang.ArrayIndexOutOfBoundsException: 1
> > >
> > > 13/05/10 20:16:10 INFO mapreduce.Job: Task Id :
> > > attempt_1368183830239_0002_m_00_2, Status : FAILED
> > > Error: java.lang.ArrayIndexOutOfBoundsException: 1
> > >
> > > 13/05/10 20:16:18 INFO mapreduce.Job: Counters: 6
> > > Job Counters
> > > Failed map tasks=4
> > > Launched map tasks=4
> > > Other local map tasks=3
> > > Rack-local map tasks=1
> > > Total time spent by all maps in occupied slots (ms)=26266
> > > Total time spent by all reduces in occupied slots (ms)=0
> > > Exception in thread "main" java.io.FileNotFoundException: File does not
> > > exis

Re: How to execute RecommenderJob without preference value

2013-05-11 Thread Sean Owen
You can't have a blank line, if that's what you mean, yes. That's not
a valid record. A terminal newline is fine.
But the error seems to be something else:

java.io.FileNotFoundException: File does not exist:
/user/hadoop/temp/preparePreferenceMatrix/numUsers.bin


Re: Statistical machine learning with Gaussian distributions

2013-05-11 Thread Matthew McClain
In k-means clustering, the clusters are characterized by their mean
vectors, and data samples belong to clusters according to the distance to
these means. If distance is measured using the L-2 norm (Euclidean
distance), assigning data samples to clusters is equivalent to using
maximum likelihood, where the clusters are characterized by multivariate
Gaussian distributions - the distribution means are the same as the cluster
means and the covariance matrices are all equal to the identity matrix. In
the same way, using a Mahalanobis distance measure is like using a
different covariance matrix in the distributions, but all of the covariance
matrices are still the same for all clusters. This constraint can be
removed by characterizing each cluster by the mean and covariance of its
samples, and using maximum likelihood in place of the distance measurement
for assigning samples to clusters.
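
In symbols (a sketch; x a d-dimensional sample, \mu_k the mean of cluster k):

\[
\log \mathcal{N}(x;\, \mu_k,\, I) \;=\; -\tfrac{1}{2}\,\lVert x - \mu_k \rVert^2 \;-\; \tfrac{d}{2}\log(2\pi)
\]

so

\[
\arg\max_k \, \log \mathcal{N}(x;\, \mu_k,\, I) \;=\; \arg\min_k \, \lVert x - \mu_k \rVert^2 ,
\]

and swapping I for a shared covariance \Sigma turns the quadratic term into
the squared Mahalanobis distance (x - \mu_k)^\top \Sigma^{-1} (x - \mu_k).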

Matt


On Fri, May 10, 2013 at 6:41 PM, Ted Dunning  wrote:

> K-means uses Gaussian errors.  The Dirichlet clustering can be configured
> to use Gaussian errors.
>
> SVD uses Gaussian errors.  QR decomposition can be used to solve problems
> with Gaussian errors.
>
> I think I don't understand what you are asking about.
>
>
> On Fri, May 10, 2013 at 1:10 PM, Matthew McClain wrote:
>
> > I'm pretty new to Mahout, but it looks like there aren't any statistical
> > machine learning algorithms that use Gaussian distributions.
> Specifically,
> > I'm thinking of clustering algorithms that use Gaussian distributions to
> > model clusters and hidden Markov models that use Gaussian distributions.
> > Can someone tell me if these are in Mahout somewhere, and if not, has this
> > been discussed at all?
> >
> > Thanks,
> > Matt
> >
>


Re: Class Not Found from 0.8-SNAPSHOT for org.apache.lucene.analysis.WhitespaceAnalyzer

2013-05-11 Thread 万代豊
Well, my Mahout-0.8-SNAPSHOT is now fine with the analyzer option
"org.apache.lucene.analysis.core.WhitespaceAnalyzer", but there are still
some steps to get through...
Could this be a Hadoop version incompatibility issue, and if so, what
should be the right/minimum Hadoop version? (At least "ClusterDump" with
Mahout-0.8-SNAPSHOT worked fine against an existing K-means result
previously done in 0.7.)
I've been with Hadoop-0.20.203 (pseudo-distributed) and Mahout-0.7 for
some time and have just recently upgraded the Mahout side to 0.8-SNAPSHOT.

$MAHOUT_HOME/bin/mahout seq2sparse --namedVector -i NHTSA-seqfile01/ -o
NHTSA-namedVector -ow -a org.apache.lucene.analysis.core.WhitespaceAnalyzer
-chunk 200 -wt tfidf -s 5 -md 3 -x 90 -ng 2 -ml 50 -seq -n 2
Running on hadoop, using /usr/local/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB:
/usr/local/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum
n-gram size is: 2
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum
LLR value: 50.0
13/05/12 01:45:48 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of
reduce tasks: 1
13/05/12 01:45:48 WARN hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

 at org.apache.hadoop.ipc.Client.call(Client.java:1030)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
 at $Proxy1.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy1.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)

13/05/12 01:45:48 WARN hdfs.DFSClient: Error Recovery for block null bad
datanode[0] nodes == null
13/05/12 01:45:48 WARN hdfs.DFSClient: Could not get block locations.
Source file
"/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar"
- Aborting...
13/05/12 01:45:48 INFO mapred.JobClient: Cleaning up the staging area
hdfs://localhost:9000/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001
Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
java.io.IOException: File
/home/hadoop/mapred/staging/hadoop/.staging/job_201305120144_0001/job.jar
could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

 at org.apache.hadoo

Re: Class Not Found from 0.8-SNAPSHOT for org.apache.lucene.analysis.WhitespaceAnalyzer

2013-05-11 Thread Suneel Marthi
It's definitely not a Mahout-Hadoop compatibility issue; it's more to do with
your Hadoop setup.

Check this link:

http://stackoverflow.com/questions/15585630/file-jobtracker-info-could-only-be-replicated-to-0-nodes-instead-of-1

Re: Statistical machine learning with Gaussian distributions

2013-05-11 Thread Ted Dunning
On Sat, May 11, 2013 at 9:43 AM, Matthew McClain wrote:

> This constraint can be
> removed by characterizing each cluster by the mean and covariance of its
> samples, and using maximum likelihood in place of the distance measurement
> for assigning samples to clusters.
>

Just a note that ordinary k-means doesn't work well with variable
covariance.  You need some form of regularization.  The Dirichlet
clustering in Mahout provides one such method.
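
For concreteness, one common regularization is to shrink each cluster's
sample covariance S_k toward a scaled identity (an illustration of the
general idea, not necessarily the scheme Mahout's Dirichlet clustering uses):

\[
\hat{\Sigma}_k \;=\; (1 - \lambda)\, S_k \;+\; \lambda\, \sigma^2 I, \qquad 0 < \lambda \le 1,
\]

which keeps \hat{\Sigma}_k invertible and well-conditioned even for clusters
with few samples.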


Re: How to execute RecommenderJob without preference value

2013-05-11 Thread Tomo Taki
I've found the problem: the separator should be a comma. When I used a space
as the separator, I got those errors.
Thanks everyone for helping me.
I will pay attention to the separator next time.
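
For anyone hitting the same thing: if I read the Mahout 0.7 source right,
input lines are split on comma or tab (TasteHadoopUtils), so a space-separated
line comes back as a single token and index 1 is out of bounds. A minimal
sketch of that behavior (my own illustration, not Mahout's code):

// Illustration only: splitting on comma/tab, as Mahout appears to do,
// leaves a space-separated record as a single token.
public class SeparatorDemo {
  public static void main(String[] args) {
    System.out.println("1,101".split("[,\t]")[1]); // prints "101"
    System.out.println("1 101".split("[,\t]")[1]); // ArrayIndexOutOfBoundsException: 1
  }
}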


[Successful Log]

===

[hadoop@localhost test]$ cat rec5

1,101
1,102
1,103
2,101
2,102
2,103
2,104
3,101
3,104
3,105
3,107
4,101
4,103
4,104
4,106
5,101
5,102
5,103
5,104
5,105
5,106

[hadoop@localhost test]$ hadoop fs -cat rec5/rec5

1,101
1,102
1,103
2,101
2,102
2,103
2,104
3,101
3,104
3,105
3,107
4,101
4,103
4,104
4,106
5,101
5,102
5,103
5,104
5,105
5,106

[hadoop@localhost test]$ hadoop jar
/usr/lib/mahout/mahout-core-0.7-cdh4.2.1-job.jar
org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -i rec5 -o
rec_result5 -s SIMILARITY_LOGLIKELIHOOD

[hadoop@localhost test]$ hadoop fs -ls rec_result5
Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2013-05-12 14:01 rec_result5/_SUCCESS
-rw-r--r--   1 hadoop supergroup        108 2013-05-12 14:01 rec_result5/part-r-0
[hadoop@localhost test]$ hadoop fs -cat rec_result5/part-r-0

1   [106:1.0,105:1.0,104:1.0]
2   [106:1.0,105:1.0]
3   [106:1.0,103:1.0,102:1.0]
4   [105:1.0,102:1.0]
5   [107:1.0]



===



