Hello,
Sorting is done by the SortingComparator, which sorts records based on the
value of the key. A possible solution would be the following:
You could write a custom class which implements WritableComparable (let's
call it MyCompositeFieldWritableComparable), that will
store
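
A minimal sketch of what such a composite key could look like, for illustration only. The message above is cut off before it says which fields the class stores, so the two fields here (group and order) are hypothetical. To keep the sketch compilable without Hadoop on the classpath, it uses a local stand-in interface that mirrors the write/readFields/compareTo contract of org.apache.hadoop.io.WritableComparable; in a real job the class would implement the Hadoop interface instead.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Objects;

// Stand-in for org.apache.hadoop.io.WritableComparable (assumption: used only so
// this sketch compiles without Hadoop; the method signatures mirror the real one).
interface WritableComparableSketch<T> extends Comparable<T> {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Composite key holding two hypothetical fields: a primary "group" and a
// secondary "order". Keys sort by group first, then by order.
class MyCompositeFieldWritableComparable
        implements WritableComparableSketch<MyCompositeFieldWritableComparable> {

    private String group = "";
    private long order;

    // A no-argument constructor is needed so the key can be created empty
    // and then filled in by readFields().
    MyCompositeFieldWritableComparable() {}

    MyCompositeFieldWritableComparable(String group, long order) {
        this.group = group;
        this.order = order;
    }

    // Serialize the fields in a fixed order.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(group);
        out.writeLong(order);
    }

    // Deserialize the fields in the same order they were written.
    @Override
    public void readFields(DataInput in) throws IOException {
        group = in.readUTF();
        order = in.readLong();
    }

    // Defines the sort order of the keys: group first, then order.
    @Override
    public int compareTo(MyCompositeFieldWritableComparable other) {
        int byGroup = group.compareTo(other.group);
        return byGroup != 0 ? byGroup : Long.compare(order, other.order);
    }

    // equals and hashCode are kept consistent with compareTo.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MyCompositeFieldWritableComparable)) {
            return false;
        }
        MyCompositeFieldWritableComparable other = (MyCompositeFieldWritableComparable) o;
        return group.equals(other.group) && order == other.order;
    }

    @Override
    public int hashCode() {
        return Objects.hash(group, order);
    }
}
```

With a key like this, the ordering the reducers see follows compareTo unless a different comparator is registered on the job.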
Hello Douglass,
you could take a look at the Mahout and Myrrix projects. These are two projects
that provide implementations of recommendation machine learning algorithms.
There are MapReduce implementations as well, to support massive datasets.
In addition, these systems provide client
Good morning!
Check the name and path of your jar file again. I guess it is not spelled
correctly in the command you typed, so hadoop cannot find it, as indicated by
this message:
Error opening job jar: hadoop-examples-0.20.203.0.jar
Good luck
Sofia
Good morning!
I would be grateful if anyone could help me with a serious problem that I'm
facing.
I am trying to run a Hadoop job on a 12-node cluster (with a capacity of 48
tasks), and I have problems when dealing with big input data (10-20GB), which
get worse when I increase the number of reducers.
common-user@hadoop.apache.org; Sofia
Georgiakaki geosofie_...@yahoo.com
Sent: Friday, September 23, 2011 4:28 PM
Subject: Re: many killed tasks, long execution time
Can you include the complete stack trace of the IOException you are seeing?
--Bobby Evans
On 9/23/11 2:15 AM, Sofia Georgiakaki
Good evening,
this topic seems very interesting.
To be sure I understood the case: do you mean that I can write a simple Java
program and access a file stored in HDFS from within the Java application?
Assuming that I have e.g. 10 files of size 30GB each stored on HDFS on a
cluster of 15
: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Friday, August 12, 2011 6:28 AM
To: common-user@hadoop.apache.org; Sofia Georgiakaki
Subject: Re: Hadoop--store a sequence file in distributed cache?
You can use any kind of format for files in the distributed cache, so
yes you can use sequence
-user@hadoop.apache.org; Sofia Georgiakaki geosofie_...@yahoo.com
Sent: Friday, August 12, 2011 11:30 AM
Subject: Re: Hadoop--store a sequence file in distributed cache?
Hi Sofia,
I assume that the output of the first job is stored on HDFS. In that case I
would read the file directly from the Mappers without
thesis, and I don't know
whom I should ask for help.
Thank you very much in advance,
Sofia Georgiakaki
undergraduate student
department of Electronic Computer Engineering
Technical University of Crete, Greece
is possible that
it will be updated to Hadoop 0.20.203. Will I have a problem using the old API
then?
Hadoop is confusing, I say.
Thank you,
Sofia Georgiakaki
can set the different
InputFormats...
Could someone give me a helping hand please?
Thank you in advance,
Sofia Georgiakaki
Good afternoon,
while writing a MapReduce job, I need to get the value of some configuration
settings.
For instance, I need to get the value of dfs.write.packet.size inside the
reducer, so I write, using the context of the reducer:
Configuration
Good evening,
I have built an R-tree on HDFS, in order to improve the query performance of
high-selectivity spatial queries.
The R-tree is composed of a number of HDFS files (each one created by one
Reducer, so that the number of files is equal to the number of reducers),
where each file