Re: Does hadoop support append option?

2011-10-17 Thread Uma Maheswara Rao G 72686
AFAIK, the append option is available in the 20Append branch. It mainly supports sync, but there are some issues with it. The same code has been merged to the 20.205 branch and will be released soon (rc2 is available). Many bugs have also been fixed in this branch. As per our basic testing, it is pretty good as of now. Need to

Hadoop archive

2011-10-17 Thread Jonas Hartwig
Hi, I'm new to the community. I'd like to create an archive, but I get the error: Exception in archives null. I'm using Hadoop 0.20.204.0. The issue was tracked under MAPREDUCE-1399 https://issues.apache.org/jira/browse/MAPREDUCE-1399 and solved. How do I combine my Hadoop version with a new

Re: Is there a good way to see how full hdfs is

2011-10-17 Thread Uma Maheswara Rao G 72686
You can write a simple program that calls this API. Make sure the Hadoop jars are present in your classpath. Just for more clarification: the DNs send their stats as part of their heartbeats, so the NN maintains all the statistics about disk space usage for the complete filesystem, etc...

Re: Is there a good way to see how full hdfs is

2011-10-17 Thread Harsh J
Uma/Ivan, The DistributedFileSystem class is explicitly _not_ meant for public consumption; it is an internal one. Additionally, that method has been deprecated. What you need is FileSystem#getStatus() if you want the summarized report via code. A job, that possibly runs du or df, is a good

Re: Is there a good way to see how full hdfs is

2011-10-17 Thread Ivan.Novick
Hi Harsh, I need access to the data programmatically for system automation, and hence I do not want a monitoring tool but access to the raw data. I am more than happy to use an exposed function or client program and not an internal API. So I am still a bit confused... What is the simplest way to

Re: Is there a good way to see how full hdfs is

2011-10-17 Thread Uma Maheswara Rao G 72686
Yes, that was deprecated in trunk. If you want to use it programmatically, this will be the better option as well. /** {@inheritDoc} */ @Override public FsStatus getStatus(Path p) throws IOException { statistics.incrementReadOps(1); return dfs.getDiskStatus(); } This should work for
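As a rough illustration of consuming the summarized report from code, here is a minimal sketch. It assumes the byte counts come from the FsStatus object returned by FileSystem#getStatus() (through its getCapacity() and getUsed() accessors). The class name HdfsFullness, its methods, and the sample numbers are hypothetical, and the Hadoop calls appear only in comments so that the block compiles without the Hadoop jars on the classpath.

```java
// Hypothetical helper: turns the capacity/used byte counts of an FsStatus
// into a fullness figure that an automation script could act on.
class HdfsFullness {

    /** Percentage of capacity in use (0-100); returns 0 when capacity is unknown. */
    static double percentUsed(long capacityBytes, long usedBytes) {
        if (capacityBytes <= 0) {
            return 0.0;
        }
        return 100.0 * usedBytes / capacityBytes;
    }

    /** True when usage has reached the given threshold (in percent). */
    static boolean isAbove(long capacityBytes, long usedBytes, double thresholdPercent) {
        return percentUsed(capacityBytes, usedBytes) >= thresholdPercent;
    }

    public static void main(String[] args) {
        // With the Hadoop jars available, the values would come from:
        //   FileSystem fs = FileSystem.get(new Configuration());
        //   FsStatus status = fs.getStatus();
        //   long capacity = status.getCapacity();
        //   long used = status.getUsed();
        long capacity = 1000L; // made-up sample value
        long used = 250L;      // made-up sample value
        System.out.println("percent used: " + percentUsed(capacity, used));
        System.out.println("above 85%: " + isAbove(capacity, used, 85.0));
    }
}
```

Keeping the arithmetic separate from the Hadoop call makes the threshold logic easy to check on its own; only the commented lines depend on the cluster.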

How do I connect Java Visual VM to a remote task?

2011-10-17 Thread W.P. McNeill
I'm investigating a bug where my mapper and reducer tasks run out of memory. It only reproduces when I run on large data sets, so the best way to dig in is to launch my job with sufficiently large inputs on the cluster and monitor the memory characteristics of the failing JVMs remotely. Java

Re: How do I connect Java Visual VM to a remote task?

2011-10-17 Thread Rahul Jain
The easy way to debug such problems, in our experience, is to use 'jmap' to take a few snapshots of one of the tasktrackers (child tasks) and analyze them under a profiler tool such as JProfiler, YourKit, etc. This should give you a pretty good indication of the objects that are using up the most heap memory.

Re: Jira Assignment

2011-10-17 Thread Arun C Murthy
Done. On Oct 16, 2011, at 10:00 AM, Mahadev Konar wrote: Arun, This was fixed a week ago or so. Here's the infra ticket. https://issues.apache.org/jira/browse/INFRA-3960 You should be able to add new contributors now. thanks mahadev On Sun, Oct 16, 2011 at 9:36 AM, Arun C Murthy

Re: Jira Assignment

2011-10-17 Thread lcrawfordmills
Please remove me from this email list. Thank you --Original Message-- From: Arun C Murthy To: common-user@hadoop.apache.org ReplyTo: common-user@hadoop.apache.org Subject: Re: Jira Assignment Sent: Oct 17, 2011 3:08 PM Done. On Oct 16, 2011, at 10:00 AM, Mahadev Konar wrote: Arun,

Hadoop node disk failure - reinstall question

2011-10-17 Thread Mayuran Yogarajah
One of our nodes died today; it looks like the disk containing the OS expired. I will need to reinstall the machine. Are there any known issues with using the same hostname / IP again, or is it better to give it a new IP / hostname? The second disk on the machine is still operational and

Re: Hadoop node disk failure - reinstall question

2011-10-17 Thread patrick sang
This is what I would think: Are there any known issues with using the same hostname / IP again, or is it better to give it a new IP / hostname? As far as I understand, it is actually good to keep the same hostname/IP, because otherwise the TTL in either DNS or the client library would bite you. The second

Building and adding new Datanode

2011-10-17 Thread Gauthier, Alexander
Hi guys, noob questions: what do I need to install on a new node that will soon be added to a cluster, and how do I add it? I'm using the CDH3 distribution. Thank you!! Alex Gauthier Engineering Manager Teradata Corp. Mobile: 510-427-5447 Office: 858-485-2144 fax: 858-485-2581 www.teradata.com

Is there a way to get the version of a remote hadoop instance

2011-10-17 Thread Tao.Zhang2
Hi, Is there a way to get the version of a remote Hadoop instance through the Java API? Suppose there are two machines: A and B. I deploy the Hadoop instance on machine A, while my application is deployed on machine B. Before starting my application, I want to check whether the Hadoop instance

Re: Is there a good way to see how full hdfs is

2011-10-17 Thread Rajiv Chittajallu
If you are running 0.20.204: http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo ivan.nov...@emc.com wrote on 10/17/11 at 09:18:20 -0700: Hi Harsh, I need access to the data programmatically for system automation, and hence I do not want a monitoring tool
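The JMX URL in this reply can also be read from a program. The following is a minimal sketch, assuming the endpoint returns JSON and that the NameNodeInfo bean exposes numeric attributes such as Total, Used and PercentUsed; those attribute names, the class name NameNodeJmx, the host name, and the sample response are assumptions for illustration, not taken from the thread. It uses a regular expression instead of a JSON library only to stay dependency-free.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading numeric attributes from the NameNode /jmx output.
class NameNodeJmx {

    /** Builds the query URL in the same form as the one quoted in the thread. */
    static String queryUrl(String host, int port) {
        return "http://" + host + ":" + port
                + "/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo";
    }

    /** Extracts the first numeric value stored under the given JSON key, if any. */
    static OptionalDouble numericAttribute(String json, String name) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*(-?\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)");
        Matcher m = p.matcher(json);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // Made-up response body; a real one would be fetched over HTTP from queryUrl(...).
        String sample = "{\"beans\":[{\"name\":\"Hadoop:service=NameNode,name=NameNodeInfo\","
                + "\"Total\":1000,\"Used\":250,\"PercentUsed\":25.0}]}";
        System.out.println(queryUrl("namenode.example.com", 50070));
        System.out.println(numericAttribute(sample, "PercentUsed"));
    }
}
```

For anything beyond a quick check, a proper JSON parser would be a safer choice than the regular expression used here.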

Re: SimpleKMeansClustering - Failed to set permissions of path to 0700

2011-10-17 Thread Raj Vishwanathan
Can you run any map/reduce jobs such as word count? Raj Sent from my iPad Please excuse the typos. On Oct 17, 2011, at 5:18 PM, robpd robpodol...@yahoo.co.uk wrote: Hi I am new to Mahout and Hadoop. I'm currently trying to get the SimpleKMeansClustering example from the Mahout in Action

Re: Hadoop node disk failure - reinstall question

2011-10-17 Thread Uma Maheswara Rao G 72686
- Original Message - From: Mayuran Yogarajah mayuran.yogara...@casalemedia.com Date: Tuesday, October 18, 2011 4:24 am Subject: Hadoop node disk failure - reinstall question To: common-user@hadoop.apache.org common-user@hadoop.apache.org One of our nodes died today, it looks like the

Re: Building and adding new Datanode

2011-10-17 Thread Harsh J
Hey Alexander, Just install the DataNode packages on the machine and configure it (copy over a config from existing DN perhaps, but make sure to check the dfs.data.dir properties before you start), and start it up. It will join the cluster as long as the configuration points to the right

Re: How do I connect Java Visual VM to a remote task?

2011-10-17 Thread Harsh J
Hello, (Inline) On Tue, Oct 18, 2011 at 12:04 AM, W.P. McNeill bill...@gmail.com wrote: snip 1. *Turn on JMX remote for the tasks*...I added the following options to mapred.child.java.opts: com.sun.management.jmxremote,