Hi all,
I just inherited a Hadoop/EC2/Nutch project that has been running for a few
weeks, and lo and behold, the task magically entered a state that is
neither running, nor complete, nor failed.
I'd love to figure out why it is in this state, as well as how to resume it
without losing what it's
Hi all,
I am interested in saving the output of both the mapper and the
reducer in HDFS; is there an efficient way of doing this?
Of course I could just run the mapper followed by the identity reducer,
and then an identity mapper with my reducer. However,
it seems like a waste to run the
Hello,
Is it possible to add slaves whose IP addresses are not known in advance
to a Hadoop cluster while a computation is going on?
And the reverse capability: is it possible to cleanly and permanently
remove a slave node from the Hadoop cluster?
Thank you,
François.
Hi,
By setting isSplitable to false, we prevent the files from splitting.
We can check that from the number of map tasks.
But how do we check whether the records are proper?
Chandravadana S
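A minimal sketch of the override being discussed, assuming the old mapred API as in Hadoop 0.18 (the class name here is illustrative, not from the thread):

```java
// Returning false from isSplitable forces one InputSplit per input
// file, so each file is processed by exactly one map task -- which is
// why the map-task count can be used as a check.
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        // Never split: one file, one split, one map task.
        return false;
    }
}
```

Whether each record inside the file is read correctly still has to be checked separately, e.g. by counting or inspecting records in the mapper.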
Enis Soztutar wrote:
Nope, not right now. But this has come up before. Perhaps you will
contribute one?
Owen O'Malley wrote:
On Sep 24, 2008, at 1:50 AM, Trinh Tuan Cuong wrote:
We are developing a project and we intend to use Hadoop to handle
the processing of vast amounts of data. But to convince our customers
about using Hadoop in our project, we must show them the
advantages (and
Gerardo Velez wrote:
Hi everybody!
I'm a newbie to Hadoop, and after following some Hadoop examples and
studying them, I will start my own application, but I have a question.
Is there any way I could debug my own Hadoop application?
Actually I've been working in the IntelliJ IDE, but I'm feeling
Hello, all,
I'm trying to create a custom Hadoop image for EC2. The scripts in
src/contrib/ec2 work fine up to the point of uploading the image to
S3. Here are the last few lines of output:
Uploaded hadoop-0.18.1-i386.part.17 to
Hello Stuart
I had a comparable problem a few minutes ago and fixed it by adding *--url
http://s3.amazonaws.com* to the ec2-upload-bundle command in the script *
create-hadoop-image-remote*.
See
http://developer.amazonwebservices.com/connect/thread.jspa?threadID=16543
Maybe try connecting to
Mikhail Yakshin wrote:
On Wed, Sep 24, 2008 at 9:24 PM, Elia Mazzawi wrote:
I got these errors and I don't know what they mean; any help is appreciated.
I suspect that either it's a hardware error or the cluster is out of space to
store intermediate results?
there is still lots of free space left on
Hi,
We have a requirement to essentially expire temporary files that are no longer
needed in an HDFS share. I have noticed some traffic on this very issue
and was wondering how best to approach the problem and/or contribute.
Basically, we need to remove a user-specified subset of files from
Hello all,
I am getting some odd behavior from Hadoop which seems like a bug. I
have created a custom input format, and I am observing that my
getSplits method is being called twice. Each call is on a different
instance of the input format. The job, however, is only run once,
using the
Does HDFS guarantee that all the blocks of a particular compressed file exist
on the same datanode? (same question for any of its replicas)
This is important because in Hadoop, each map task is required to process the
entire compressed file. If the blocks of the compressed
file exist on different
Hi,
I'm trying to build an index using the index contrib in Hadoop
0.18.0, but the reduce tasks are consistently failing.
In the output from the hadoop jar command, I see messages like this:
08/09/25 14:12:11 INFO mapred.JobClient: map 27% reduce 4%
08/09/25 14:12:23 INFO mapred.JobClient:
On Sep 25, 2008, at 2:26 PM, Joe Shaw wrote:
Hi,
I'm trying to build an index using the index contrib in Hadoop
0.18.0, but the reduce tasks are consistently failing.
What did the logs for the task-attempt
'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job
succeed?
Hi,
On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:
What did the logs for the task-attempt
'attempt_200809180916_0027_r_07_2' look like? Did the TIP/Job succeed?
You mean inside userlogs/attempt_blah_blah/syslog? I didn't know
about this log file before, thanks!
Hi again,
Ugh, sorry about the butchered output.
On Thu, Sep 25, 2008 at 5:42 PM, Joe Shaw [EMAIL PROTECTED] wrote:
Hi,
On Thu, Sep 25, 2008 at 5:32 PM, Arun C Murthy [EMAIL PROTECTED] wrote:
What did the logs for the task-attempt
'attempt_200809180916_0027_r_07_2' look like? Did the
Hi all,
I'm trying to set up a small cluster with 3 machines. I'd like to have one
machine serve as the namenode and the jobtracker, while all 3 serve as
datanodes and tasktrackers.
After following the set up instructions, I got an exception running
$HADOOP_HOME/bin/start-dfs.sh:
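For reference, a sketch of the configuration that topology implies; the hostnames ("node1" etc.) and ports are hypothetical, not from the thread:

```xml
<!-- conf/hadoop-site.xml on every node: point all nodes at the
     master host (here the hypothetical "node1", which runs both
     the namenode and the jobtracker). -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>node1:9001</value>
  </property>
</configuration>
```

conf/slaves on the master would then list all three hostnames (node1, node2, node3), one per line, so start-dfs.sh and start-mapred.sh launch a datanode and tasktracker on each.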
Thanks, Lohit.
I took a look at the Task_Counter.properties file and figured out that I
would like to use REDUCE_INPUT_RECORDS.
I want to access this within my reduce function, just to check the value.
In order to do this, I tried to include
import org.apache.hadoop.mapred.Task;
and I had
I like katta!
2008/9/24 Stefan Groschupf [EMAIL PROTECTED]
Hi All,
thanks a lot for your interest.
Both my katta and the hadoop survey slides can be found here:
http://find23.net/2008/09/23/hadoop-user-group-slides/
If you have a chance please give katta a test drive and give us some
Thanks!!!
On Wed, Sep 24, 2008 at 11:11 AM, Stefan Groschupf [EMAIL PROTECTED] wrote:
Hi All,
thanks a lot for your interest.
Both my katta and the hadoop survey slides can be found here:
http://find23.net/2008/09/23/hadoop-user-group-slides/
If you have a chance please give katta a test
The counter for reduce input records is updated as the reduces consume
records. Processing it in a reduce is meaningless. Relying on counters
for job correctness (particularly from the same job) is risky at best.
Neither maps nor reduces are given a set number of records, unless
this is
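A sketch of what the point above means in code, using the old mapred API; the reducer and counter names here are illustrative:

```java
// Framework counters such as REDUCE_INPUT_RECORDS are updated as the
// reduce consumes records, so any value read inside reduce() is only a
// running partial count. Counters are reliable only when read from the
// completed job (e.g. via RunningJob.getCounters()).
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class CountingReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
    // A custom counter, incremented as records are consumed; its final
    // value is only meaningful once the job has finished.
    enum MyCounters { VALUES_SEEN }

    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
            reporter.incrCounter(MyCounters.VALUES_SEEN, 1);
        }
        output.collect(key, new IntWritable(sum));
    }
}
```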
The decision making system seems interesting to me. :)
The question I want to ask is whether it is possible to perform statistical
analysis on the data using Hadoop and MapReduce.
I'm sure Hadoop could do it. FYI, the Hama project is an easy-to-use
approach to matrix algebra and its uses in
I think you can try org.apache.hadoop.mapred.lib.MultipleOutputs; it
will be released in 0.19, but you can apply the patch now.
Just my idea; I'm not sure whether it's efficient.
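Roughly, the idea would look like this (a sketch against the mapred API as released in 0.19; the class and output names are illustrative):

```java
// Save the map output as a named side output in HDFS while still
// sending the same records on to the reducer, avoiding a second job.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.lib.MultipleOutputs;

public class SideOutputMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {
    private MultipleOutputs mos;

    public void configure(JobConf conf) {
        // Job setup (elsewhere) must declare the named output first:
        // MultipleOutputs.addNamedOutput(conf, "mapside",
        //     TextOutputFormat.class, Text.class, Text.class);
        mos = new MultipleOutputs(conf);
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output,
                    Reporter reporter) throws IOException {
        Text outKey = new Text(key.toString());
        // Normal path: feed the reducer.
        output.collect(outKey, value);
        // Side path: also persist the map output in HDFS.
        mos.getCollector("mapside", reporter).collect(outKey, value);
    }

    public void close() throws IOException {
        mos.close();
    }
}
```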
2008/9/25 Christian Ulrik Søttrup [EMAIL PROTECTED]:
Hi all,
I am interested in saving the output of both the mapper