Re: how can I decommission nodes on-the-fly?

2008-11-26 Thread lohit
As Amareshwari said, you can almost safely stop the TaskTracker process on a node. Task(s) running on it would be considered failed and would be re-executed by the JobTracker on another node. The reason we decommission a DataNode is to protect against data loss. A DataNode stores HDFS blocks, by
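
For reference, the usual DataNode decommissioning procedure in this era of Hadoop is to list the host in an exclude file named by dfs.hosts.exclude and ask the NameNode to re-read it; the NameNode then re-replicates that node's blocks before marking it decommissioned. A minimal sketch, with the exclude-file path as a placeholder:

  <property>
    <name>dfs.hosts.exclude</name>
    <value>/path/to/excludes</value>
  </property>

  # add the hostname to /path/to/excludes, then:
  bin/hadoop dfsadmin -refreshNodes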

Re: How to retrieve rack ID of a datanode

2008-11-26 Thread lohit
I take that back. I forgot about the changes in the new version of HDFS. If you are testing this, take a look at TestReplication.java. Lohit - Original Message From: Ramya R [EMAIL PROTECTED] To: core-user@hadoop.apache.org Cc: [EMAIL PROTECTED] Sent: Tuesday, November 25, 2008 11:15:28 PM
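
A rough sketch of retrieving rack IDs from the client side, assuming DistributedFileSystem.getDataNodeStats() and DatanodeInfo.getNetworkLocation() as in the 0.19 API (in 0.18 the same classes live under org.apache.hadoop.dfs):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.hdfs.DistributedFileSystem;
  import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

  // Print the network location (rack ID) reported for each DataNode.
  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.get(conf);
  if (fs instanceof DistributedFileSystem) {
    DatanodeInfo[] nodes = ((DistributedFileSystem) fs).getDataNodeStats();
    for (DatanodeInfo node : nodes) {
      System.out.println(node.getName() + " -> " + node.getNetworkLocation());
    }
  }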

output in memory

2008-11-26 Thread ZhiHong Fu
Hello, I have a MapReduce job whose result I don't want stored in a file; I just need it to be shown to users. How can I write an OutputFormat for that? For example, the job will read a large amount of data from the database, and then I will process the data
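
An OutputFormat that keeps results in memory only really works when the job runs in a single JVM (the local job runner), since reduce tasks normally run in separate child JVMs on other machines. A minimal sketch against the old mapred API, with the collecting list as a hypothetical static field:

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.OutputFormat;
  import org.apache.hadoop.mapred.RecordWriter;
  import org.apache.hadoop.mapred.Reporter;
  import org.apache.hadoop.util.Progressable;

  // Collects key/value pairs into an in-JVM list instead of writing files.
  public class InMemoryOutputFormat implements OutputFormat<Object, Object> {
    public static final List<String> RESULTS = new ArrayList<String>();

    public RecordWriter<Object, Object> getRecordWriter(FileSystem ignored,
        JobConf job, String name, Progressable progress) throws IOException {
      return new RecordWriter<Object, Object>() {
        public void write(Object key, Object value) {
          RESULTS.add(key + "\t" + value);
        }
        public void close(Reporter reporter) {
        }
      };
    }

    public void checkOutputSpecs(FileSystem ignored, JobConf job) throws IOException {
      // Nothing to check: no output directory is used.
    }
  }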

Re: how can I decommission nodes on-the-fly?

2008-11-26 Thread Steve Loughran
lohit wrote: As Amareshwari said, you can almost safely stop the TaskTracker process on a node. Task(s) running on it would be considered failed and would be re-executed by the JobTracker on another node. The reason we decommission a DataNode is to protect against data loss. A DataNode stores HDFS blocks, by

Memory allocation - please confirm

2008-11-26 Thread tim robertson
Hi, Could you please sanity check this: In hadoop-site.xml I add: <property> <name>mapred.child.java.opts</name> <value>-Xmx1G</value> <description>Increasing the size of the heap to allow for large in memory index of polygons</description> </property> Is this all required to increase the -Xmx for

Namenode BlocksMap on Disk

2008-11-26 Thread Dennis Kubes
From time to time a message pops up on the mailing list about OOM errors for the namenode because of too many files. Most recently there was a 1.7 million file installation that was failing. I know the simple solution to this is to have a larger java heap for the namenode. But the

Re: Memory allocation - please confirm

2008-11-26 Thread tim robertson
Thanks! Just making sure that this was the only parameter that needed setting. Cheers Tim On Wed, Nov 26, 2008 at 1:20 PM, Dennis Kubes [EMAIL PROTECTED] wrote: I have always seen -Xmx set in megabytes versus gigabytes. It does work for me on Ubuntu as G but it may depend on the JVM and OS,

Re: HDFS directory listing from the Java API?

2008-11-26 Thread Jürgen Broß
Hi Shane, I think what you are looking for is the following: Path dirPath = new Path("path to dir"); FileStatus[] files = FileSystem.get(conf).listStatus(dirPath); Each FileStatus entry in the above array contains a Path reference (files[i].getPath()) to the file or directory contained in
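
Put together as a small, self-contained example (the directory path is a placeholder):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  // List the entries of an HDFS directory and print each path.
  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.get(conf);
  FileStatus[] files = fs.listStatus(new Path("/user/shane/input"));
  for (FileStatus status : files) {
    System.out.println(status.getPath() + (status.isDir() ? " (dir)" : ""));
  }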

Re: output in memory

2008-11-26 Thread tim robertson
I would still store the result in a file, and then write a user interface that renders the output file as required... How would you know the user is still on the other end waiting to view the result? If you are sure, then perhaps the thing that launches the job could block until it is finished,

Re: How we get old versions of Hadoop

2008-11-26 Thread Dennis Kubes
For currently active releases: http://www.apache.org/dist/hadoop/core/ For older releases and branches not actively maintained: http://archive.apache.org/dist/hadoop/core/ Dennis Rashid Ahmad wrote: Dear Friends, how do we get old versions of Hadoop?

How to let Reducer know on which partition it is working

2008-11-26 Thread Jürgen Broß
Hi all, my Reducers need to load a huge HashMap from data present in HDFS. This data has been partitioned by a previous map/reduce job. The complete data would not fit into the main memory of a Reducer machine. It would suffice to load only the correct partition of the data. The problem is

RE: how can I decommission nodes on-the-fly?

2008-11-26 Thread Koji Noguchi
+1 Created Jira. https://issues.apache.org/jira/browse/HADOOP-4733 Koji Steve Loughran wrote: At some point in the future, I could imagine it being handy to have the ability to decommission a task tracker, which would tell it to stop accepting new work, and run the rest down. This would

Highly dynamic Hadoop Cluster

2008-11-26 Thread Ricky Ho
Does Hadoop support an environment where nodes join and leave without a preconfigured file like hadoop-site.xml? The characteristic is that none of the IP addresses or node names of the machines are stable; they will change after a machine is rebooted after a crash. Before that, I use a

Re: Highly dynamic Hadoop Cluster

2008-11-26 Thread Steve Loughran
Ricky Ho wrote: Does Hadoop support an environment where nodes join and leave without a preconfigured file like hadoop-site.xml? The characteristic is that none of the IP addresses or node names of the machines are stable; they will change after a machine is rebooted after a crash. Before

Re: How to let Reducer know on which partition it is working

2008-11-26 Thread Owen O'Malley
On Nov 26, 2008, at 4:35 AM, Jürgen Broß wrote: I'm not sure how to let a Reducer know in its configure() method which partition it will get from the Partitioner, From: http://hadoop.apache.org/core/docs/r0.19.0/mapred_tutorial.html#Task+JVM+Reuse look for mapred.task.partition, which is
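
A minimal sketch of reading it in configure() with the old mapred API (the reduce() method itself is omitted; mapred.task.partition is assumed to be set per reduce task as the tutorial describes):

  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;

  public class PartitionAwareReducer extends MapReduceBase {
    private int partition;

    // Called once per task; mapred.task.partition holds this task's partition number.
    public void configure(JobConf job) {
      partition = job.getInt("mapred.task.partition", -1);
      // e.g. load only the HashMap slice corresponding to this partition
    }
  }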

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Billy Pearson
I would like to see something like this also. I run 32-bit servers, so I am limited in how much memory I can use for heap. Besides just storing to disk, I would like to see some sort of cache, like a block cache, that caches parts of the BlocksMap; this would help reduce the hits to disk for lookups

Re: Auto-shutdown for EC2 clusters

2008-11-26 Thread Tom White
I've just created a basic script to do something similar for running a benchmark on EC2. See https://issues.apache.org/jira/browse/HADOOP-4382. As it stands the code for detecting when Hadoop is ready to accept jobs is simplistic, to say the least, so any ideas for improvement would be great.
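
One possible refinement, sketched here with the old JobClient API (treating the cluster as ready once at least a minimum number of TaskTrackers have registered; the method name and threshold are just illustrative):

  import org.apache.hadoop.mapred.ClusterStatus;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;

  // Poll the JobTracker until enough TaskTrackers have checked in.
  public static void waitForCluster(JobConf conf, int minTrackers) throws Exception {
    JobClient client = new JobClient(conf);
    while (true) {
      ClusterStatus status = client.getClusterStatus();
      if (status.getTaskTrackers() >= minTrackers) {
        return;
      }
      Thread.sleep(10 * 1000);   // wait 10 seconds before asking again
    }
  }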

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Sagar Naik
We can also try to mount the particular dir on ramfs and reduce the performance degradation. -Sagar Billy Pearson wrote: I would like to see something like this also. I run 32-bit servers, so I am limited in how much memory I can use for heap. Besides just storing to disk, I would like to see some

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Doug Cutting
Dennis Kubes wrote: 2) Besides possible slight degradation in performance, is there a reason why the BlocksMap shouldn't or couldn't be stored on disk? I think the assumption is that it would be considerably more than slight degradation. I've seen the namenode benchmarked at over 50,000

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Raghu Angadi
Dennis Kubes wrote: From time to time a message pops up on the mailing list about OOM errors for the namenode because of too many files. Most recently there was a 1.7 million file installation that was failing. I know the simple solution to this is to have a larger java heap for the

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Brian Bockelman
On Nov 26, 2008, at 12:08 PM, Doug Cutting wrote: Dennis Kubes wrote: 2) Besides possible slight degradation in performance, is there a reason why the BlocksMap shouldn't or couldn't be stored on disk? I think the assumption is that it would be considerably more than slight degradation.

Re: s3n exceptions

2008-11-26 Thread Per Jacobsson
Are you using 0.18? I know that copying HDFS to s3n isn't supported there yet. I think there's a fix in 0.19. / Per On Mon, Nov 24, 2008 at 2:11 AM, Alexander Aristov [EMAIL PROTECTED] wrote: Hi all, I am testing the s3n file system facilities and trying to copy from HDFS to S3 in the original format

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Doug Cutting
Brian Bockelman wrote: Do you have any graphs you can share showing 50k opens / second (could be publicly or privately)? The more external benchmarking data I have, the more I can encourage adoption amongst my university... The 50k opens/second is from some internal benchmarks run at Y!

Re: s3n exceptions

2008-11-26 Thread Alexander Aristov
Yes, I am trying 0.18.2, but according to the Hadoop wiki it is supported: http://wiki.apache.org/hadoop/AmazonS3 and http://issues.apache.org/jira/browse/HADOOP-930 Alexander 2008/11/26 Per Jacobsson [EMAIL PROTECTED] Are you using 0.18? I know that copying HDFS to s3n isn't supported there
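
For reference, the kind of command under discussion (bucket name, paths, and the credentials embedded in the URI are placeholders; the keys can also go into fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey in hadoop-site.xml):

  # copy a directory from HDFS into S3 using the native s3n filesystem
  bin/hadoop distcp hdfs://namenode:9000/user/alex/data \
      s3n://ACCESS_KEY:SECRET_KEY@my-bucket/data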

Namenode reporting empty files for originally non-empty files

2008-11-26 Thread Christian Kunz
On a cluster running hadoop-0.17.2: We have a handful of files that originally were not empty but are now reported with 0 size. I checked the corresponding blocks on the DataNodes and they are indeed non-empty. I restarted the namenode, with no success. With the non-empty blocks I can re-create

Re: s3n exceptions

2008-11-26 Thread Per Jacobsson
I'm thinking of this bug: http://issues.apache.org/jira/browse/HADOOP-3361 It was the cause of problems I hit trying to copy from HDFS to S3N with 0.18.1. / Per On Wed, Nov 26, 2008 at 12:57 PM, Alexander Aristov [EMAIL PROTECTED] wrote: yes, I am trying 0.18.2 But according to hadoop wiki

Re: HDFS directory listing from the Java API?

2008-11-26 Thread Shane Butler
Got it! Thanks Jürgen and Lohit! On Wed, Nov 26, 2008 at 11:10 PM, Jürgen Broß [EMAIL PROTECTED] wrote: Hi Shane, I think what you are looking for is the following: Path dirPath = new Path("path to dir"); FileStatus[] files = FileSystem.get(conf).listStatus(dirPath); Each FileStatus entry

Re: Cannot run program bash: java.io.IOException: error=12, Cannot allocate memory

2008-11-26 Thread Comfuzed
I found that in my fstab I had accidentally disabled my swap partition: typing 'free', I saw I had no swap space. Then I followed this guide http://www.linux.com/feature/121916 and all was well. hth m

Re: mbox archive files for hadoop mailing lists.

2008-11-26 Thread Jeff Hammerbacher
Hey Stefan, Check out http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200712.mbox for an example. Replace [2007] in the URL with the desired year and [12] with the desired month, and [hadoop-core-user] with the desired mailing list. Later, Jeff On Wed, Nov 26, 2008 at 5:39 PM, Stefan

Re: Filesystem closed errors

2008-11-26 Thread Bryan Duxbury
My app isn't a map/reduce job. On Nov 25, 2008, at 9:07 PM, David B. Ritch wrote: Do you have speculative execution enabled? I've seen error messages like this caused by speculative execution. David Bryan Duxbury wrote: I have an app that runs for a long time with no problems, but when I

Re: Filesystem closed errors

2008-11-26 Thread Bryan Duxbury
I'm fairly certain that I'm not closing the FileSystem anywhere. That said, the issue you pointed at could somehow be connected. On Nov 25, 2008, at 9:11 PM, Hong Tang wrote: Does your code ever call fs.close()? If so, https://issues.apache.org/jira/browse/HADOOP-4655 might be relevant to
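
The likely connection is the FileSystem cache: FileSystem.get(conf) hands back a shared, cached instance (keyed by the filesystem URI), so a close() anywhere in the JVM, including in library or shutdown-hook code, closes it for every other user and produces the "Filesystem closed" error later. A small illustration, assuming the default caching behavior:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  Configuration conf = new Configuration();
  FileSystem fs1 = FileSystem.get(conf);
  FileSystem fs2 = FileSystem.get(conf);   // same cached instance as fs1
  fs1.close();                             // closes the shared instance...
  fs2.exists(new Path("/"));               // ...so this call now fails with "Filesystem closed"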

Hadoop Tutorial Workshop in South Korea

2008-11-26 Thread Jaesun Han
Hi all, the Korea Hadoop Community is hosting a half-day Hadoop Tutorial Workshop on November 28 (Friday) in Seoul, South Korea. You can check out and register for the workshop on our website: http://www.hadoop.or.kr/?document_srl=1945 Time: Friday, November 28, 14:00 ~ 18:00 Location: Seoul National University