Aaron,
Does this have to do with the cloud we got at SoftLayer or is it something
at the university like you said?
I think I provisioned 6-10 with 1G connections; let me log in and check
really quick... no, they're all correct:
Speed: 1000Mb/s
On Wed, Oct 29, 2008 at 9:49 PM, Aaron
Hey all,
I have a problem with invalid blocks which Hadoop doesn't seem to
realize are invalid.
For some reason, a lot of our blocks got truncated to 192KB
(HADOOP-4543). When I try to drain off nodes, Hadoop tries to
replicate these blocks with the truncated blocks as the source.
Thank you so much!
Owen O'Malley wrote:
I uploaded a patch that does a secondary sort. Take a look at:
https://issues.apache.org/jira/browse/HADOOP-4545
It reads input with two numbers per line, such as:
-1 -4
-3 23
5 10
-1 -2
-1 300
-1 10
4 1
4 2
4 10
4 -1
4 -10
10 20
10
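For context, here is a hedged illustration of what "secondary sort" means for input like the sample above: records are grouped by the first number, and within each group the second numbers come out in sorted order. This plain-Java demo only mirrors that ordering; the actual patch, as I understand it, achieves it inside MapReduce with a composite key, a partitioner on the first number, and a grouping comparator.

```java
import java.util.*;

// Minimal illustration of secondary-sort semantics: group pairs by their
// first number, and order the values within each group by the second.
public class SecondarySortDemo {
    public static SortedMap<Integer, List<Integer>> group(int[][] pairs) {
        SortedMap<Integer, List<Integer>> grouped = new TreeMap<>();
        for (int[] p : pairs) {
            grouped.computeIfAbsent(p[0], k -> new ArrayList<>()).add(p[1]);
        }
        // Sort each group's values (the "secondary" sort).
        for (List<Integer> values : grouped.values()) {
            Collections.sort(values);
        }
        return grouped;
    }

    public static void main(String[] args) {
        int[][] input = {
            {-1, -4}, {-3, 23}, {5, 10}, {-1, -2}, {-1, 300}, {-1, 10},
            {4, 1}, {4, 2}, {4, 10}, {4, -1}, {4, -10}, {10, 20}
        };
        for (Map.Entry<Integer, List<Integer>> e : group(input).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

So the key -1, for instance, would come out with its values ordered -4, -2, 10, 300.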
Hi folks;
I'm writing a Hadoop Pipes application, and I need to generate a bunch
of integers that are unique across all map tasks. If each map task has
a unique integer ID, I can make sure my integers are unique by including
that integer ID. I have this theory that each map task has a unique
On Oct 30, 2008, at 9:03 AM, Joel Welling wrote:
I'm writing a Hadoop Pipes application, and I need to generate a bunch
of integers that are unique across all map tasks. If each map task has
a unique integer ID, I can make sure my integers are unique by including
that integer ID. I
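If it helps, here is a hedged sketch of that idea in plain Java. The assumed source of the per-task ID is the task's index within the job (I believe 0.18-era Hadoop exposes this to each task as the mapred.task.partition config property, but verify against your version); the sketch just shows how to combine it with a local counter so IDs never collide across tasks.

```java
// Sketch: make globally unique 64-bit IDs by packing a per-job-unique task
// index (assumed to come from something like "mapred.task.partition" in the
// task-side configuration) into the high bits, and a per-task counter into
// the low bits. Two different tasks can never collide because their high
// 32 bits differ.
public class UniqueIdGenerator {
    private final long taskId;   // unique per map task across the job
    private long counter = 0;    // unique within this task

    public UniqueIdGenerator(int taskId) {
        this.taskId = taskId;
    }

    public long next() {
        // High 32 bits: task ID; low 32 bits: local counter.
        return (taskId << 32) | (counter++ & 0xFFFFFFFFL);
    }
}
```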
Devaraj Das wrote:
I wrote a patch to address the NPE in JobTracker.killJob() and compiled
it against TRUNK. I've put this on the cluster and it's now been holding
steady for the last hour or so.. so that plus whatever other differences
there are between 18.1 and TRUNK may have fixed things.
I'm growing very frustrated with a simple cluster setup. I can get
the cluster set up on two machines, but have trouble when trying to
extend the installation to 3 or more boxes. I keep seeing the errors
below. It seems the reduce tasks can't get access to the data.
I can't seem to
Arun gave a great talk about debugging and tuning at the Rapleaf event.
Take a look:
http://www.vimeo.com/2085477
Alex
On Thu, Oct 30, 2008 at 6:20 AM, Malcolm Matalka
[EMAIL PROTECTED] wrote:
I'm not sure of the correct way, but when I need to log a job I have it
print out with some unique
Hi,
Can you please explain the difference between fs.default.name and
dfs.http.address (i.e., how and when does the SecondaryNameNode use
fs.default.name, and how/when dfs.http.address)? I have set them both to
the same (namenode's) hostname:port. Is this correct, or does dfs.http.address
need some other
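For what it's worth, my understanding of the 0.18-era configs: fs.default.name is the NameNode's client RPC endpoint (an hdfs:// URI), while dfs.http.address is the NameNode's embedded HTTP server, which the SecondaryNameNode contacts to fetch the image/edits over HTTP, so the two should be different ports. A hadoop-site.xml sketch using the usual defaults (the hostname is a placeholder):

```xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:9000</value>
</property>
<property>
  <name>dfs.http.address</name>
  <value>namenode.example.com:50070</value>
</property>
```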
So it's not just at 16%, but depends on the task:
2008-10-30 13:58:29,702 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_200810301345_0001_r_00_0 0.25675678% reduce copy (57 of
74 at 13.58 MB/s)
2008-10-30 13:58:29,357 WARN org.apache.hadoop.mapred.TaskTracker:
Hi Scott,
Your reducer classes are unable to read map outputs. You may check the
mapred.local.dir property in your conf/hadoop-default.xml and
conf/hadoop-site.xml. These directories should be valid directories on
your slaves. You can give multiple folders as comma-separated values.
- Prasad.
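As a sketch of the comma-separated form Prasad describes (the paths are illustrative), in conf/hadoop-site.xml:

```xml
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
```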
So, Philippe reports that the problem goes away with 0.20-dev
(trunk?): http://mahout.markmail.org/message/swmzreg6fnzf6icv We
aren't totally clear on the structure of SVN for Hadoop, but it seems
like it is not fixed by this patch.
On Oct 29, 2008, at 10:28 AM, Grant Ingersoll wrote:
Thanks for the answer. It looks like the values are set up correctly.
I see the /mapred/local directory created successfully as well. Do I need
to explicitly define a value in hadoop-site.xml?
hadoop-default.xml:
<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>
I'd like my log messages to display the hostname of the node they were
output on. Sure, this information can be grabbed from the log filename,
but I would like each log message to also include the hostname. I don't think
log4j provides built-in support for including the hostname in a log, so I've tried
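For what it's worth, one approach I believe works with log4j 1.2: PatternLayout has no built-in hostname conversion, but it can read MDC values via %X{key}, so the application can put the hostname into the MDC once at startup (e.g. MDC.put("hostname", InetAddress.getLocalHost().getHostName())) and reference it from the pattern. A log4j.properties sketch (the appender name is illustrative):

```
# Assumes the application has called
#   MDC.put("hostname", InetAddress.getLocalHost().getHostName())
# at startup; %X{hostname} then pulls it from the MDC on every log line.
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} [%X{hostname}] %p %c: %m%n
```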
Is the presentation online as well? (Hard to see some of the slides
in the video)
On Oct 30, 2008, at 1:34 PM, Alex Loddengaard wrote:
Arun gave a great talk about debugging and tuning at the Rapleaf
event.
Take a look:
http://www.vimeo.com/2085477
Alex
On Thu, Oct 30, 2008 at 6:20 AM,
I don't think so, unfortunately. I couldn't find them. I remember someone
mentioning that the slides would be posted somewhere, though.
Alex
On Thu, Oct 30, 2008 at 1:16 PM, Scott Whitecross [EMAIL PROTECTED] wrote:
Is the presentation online as well? (Hard to see some of the slides in the
Stefan Will wrote:
Hi Raghu,
Each DN machine has 3 partitions, e.g.:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G  8.0G   11G  44% /
/dev/sda3       1.4T  756G  508G  60% /data
tmpfs           3.9G     0  3.9G   0% /dev/shm
All of the paths in
Seems that the slides for each of the 3 Rapleaf talks are posted in the
descriptions:
The Collector - A Tool to Have Multi-Writer Appends into HDFS
http://docs.google.com/Present?docid=dgz78tv5_10gpjhnvg9
Katta - Distributed Lucene Index in Production
Hi!
Is there a way to use the value read in configure() in the Map or
Reduce phase?
Erik
On Thu, Oct 23, 2008 at 2:40 AM, Aaron Kimball [EMAIL PROTECTED] wrote:
See Configuration.setInt() in the API. (JobConf inherits from
Configuration). You can read it back in the configure() method
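A hedged sketch of that pattern against the 0.18-era mapred API (the property name "my.threshold" and the filtering logic are made up for illustration): the submitter stores a value with conf.setInt("my.threshold", n), and the task reads it back in configure() into an instance field that map() can then use.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ThresholdMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

    private int threshold;  // job-wide value, read once per task

    public void configure(JobConf conf) {
        // Read back what the submitter stored with conf.setInt("my.threshold", n).
        threshold = conf.getInt("my.threshold", 0);
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> out, Reporter reporter)
            throws IOException {
        // Hypothetical use of the value: only emit lines longer than the threshold.
        if (value.getLength() > threshold) {
            out.collect(value, key);
        }
    }
}
```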
Raghu Angadi wrote:
Devaraj fwded the stacks that Aaron sent. As he suspected there is a
deadlock in RPC server. I will file a blocker for 0.18 and above. This
deadlock is more likely on a busy network.
Aaron,
Could you try the patch attached to
Thanks Otis.
I need to correlate the log data and the database data. What I was hoping I
could do is write MapReduce jobs for this, but it seems that I can't
unless I write a MySQL input format for Hadoop. Is there already an
implementation out there for this? Any other way to do this? If
Please check your classpath entries.
It looks like the hadoop-core jars from before you shut down the cluster and
after you changed hadoop-env.sh are different.
-Sagar
Songting Chen wrote:
Hi,
I modified the classpath in hadoop-env.sh on the namenode and datanodes before shutting down the cluster. Then problem