Re: [ANNOUNCE] Hadoop release 0.19.1 available

2009-02-26 Thread Nigel Daley
Sorry for the delay. Should now be fixed. Please confirm. Thx, Nige On Feb 25, 2009, at 2:27 AM, Aviad sela wrote: Nigel, The SVN tag http://svn.apache.org/repos/asf/core/tags/release-0.19.1 also includes the branch folder branch-0.19, which results in a double-size project that is not necessa

Re: Atomicity of file operations?

2009-02-26 Thread Brian Long
Thanks Brian. I will go with the copy to tmp and flip with rename model. -B On Thu, Feb 26, 2009 at 3:49 PM, Brian Bockelman wrote: > On Feb 26, 2009, at 4:14 PM, Brian Long wrote: >> What kind of atomicity/visibility claims are made regarding the various operations on a FileSystem? >> I h
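For readers who want to follow the same approach, here is a minimal sketch of the copy-to-tmp-then-rename pattern using the standard org.apache.hadoop.fs.FileSystem API. The paths below are placeholders; the copy itself is not atomic, but the final rename is what exposes the file to consumers in a single step.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class StageAndRename {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());

            // Placeholder paths: stage the upload outside the directory the job scans...
            Path local = new Path("file:///data/part-00000.seq");
            Path staged = new Path("/incoming/_tmp/part-00000.seq");
            Path live = new Path("/incoming/part-00000.seq");

            fs.copyFromLocalFile(local, staged);   // the copy itself is not atomic

            // ...then flip it into place, so the job only ever sees complete files.
            if (!fs.rename(staged, live)) {
                throw new IOException("rename failed: " + staged + " -> " + live);
            }
        }
    }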

How to deal with HDFS failures properly

2009-02-26 Thread Brian Long
I'm wondering what the proper actions to take in light of a NameNode or DataNode failure are in an application which is holding a reference to a FileSystem object. * Does the FileSystem handle all of this itself (e.g. reconnect logic)? * Do I need to get a new FileSystem using .get(Configuration)?
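To make the question concrete, one defensive pattern is sketched below (the retry count and sleep are arbitrary placeholders): catch the IOException, drop the cached client, and ask FileSystem.get(Configuration) for a fresh handle. Whether this is needed on top of the client's own retry logic is exactly what the thread is asking.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListWithRetry {
        // Retry an HDFS call, re-obtaining the FileSystem handle between attempts.
        public static FileStatus[] list(Configuration conf, Path dir)
                throws IOException, InterruptedException {
            IOException last = null;
            for (int attempt = 0; attempt < 3; attempt++) {
                try {
                    FileSystem fs = FileSystem.get(conf);
                    return fs.listStatus(dir);
                } catch (IOException e) {
                    last = e;
                    FileSystem.closeAll();   // drop cached clients so the next get() reconnects
                    Thread.sleep(5000L);
                }
            }
            throw last;
        }
    }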

Re: HDFS architecture based on GFS?

2009-02-26 Thread kang_min82
Hello Matei, Which Tasktracker did you mean here? I don't understand that. In general we have many Tasktrackers, and each of them runs on one separate Datanode. Why doesn't the JobTracker talk directly to the Namenode for a list of Datanodes and then perform the MapReduce tasks there? Any he

Re: Shuffle phase

2009-02-26 Thread Owen O'Malley
On Feb 26, 2009, at 2:35 PM, Nathan Marz wrote: Interesting. Is there a JIRA for this? Two. HADOOP-5223 subsumes HADOOP-1338. -- Owen

Re: Atomicity of file operations?

2009-02-26 Thread Brian Bockelman
On Feb 26, 2009, at 4:14 PM, Brian Long wrote: What kind of atomicity/visibility claims are made regarding the various operations on a FileSystem? I have multiple processes that write into local sequence files, then upload them into a remote directory in HDFS. A map/reduce job runs which

Re: Shuffle phase

2009-02-26 Thread Nathan Marz
Interesting. Is there a JIRA for this? On Feb 26, 2009, at 2:07 PM, Owen O'Malley wrote: On Feb 26, 2009, at 2:03 PM, Nathan Marz wrote: Do the reducers batch copy map outputs from a machine? That is, if a machine M has 15 intermediate map outputs destined for machine R, will machine R c

Announcing CloudBase-1.2 release

2009-02-26 Thread Tarandeep Singh
Hi, We have released 1.2 version of CloudBase on sourceforge- http://cloudbase.sourceforge.net/ [ CloudBase is a data warehouse system built on top of Hadoop's Map-Reduce architecture. It uses ANSI SQL as its query language and comes with a JDBC driver. It is developed by Business.com and is rele
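Since the announcement mentions the JDBC driver, a rough sketch of issuing a query through it from Java follows. The driver class name, JDBC URL, and table name below are hypothetical placeholders, not taken from the CloudBase documentation; check the project site for the real values.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CloudBaseQuery {
        public static void main(String[] args) throws Exception {
            // Hypothetical driver class and URL -- see the CloudBase docs for the real ones.
            Class.forName("com.example.cloudbase.jdbc.Driver");
            Connection conn = DriverManager.getConnection("jdbc:cloudbase://host:port/default");
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM access_log");
            while (rs.next()) {
                System.out.println(rs.getLong(1));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }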

Atomicity of file operations?

2009-02-26 Thread Brian Long
What kind of atomicity/visibility claims are made regarding the various operations on a FileSystem? I have multiple processes that write into local sequence files, then upload them into a remote directory in HDFS. A map/reduce job runs which operates on whatever is in the directory. The processes

Re: Shuffle phase

2009-02-26 Thread Owen O'Malley
On Feb 26, 2009, at 2:03 PM, Nathan Marz wrote: Do the reducers batch copy map outputs from a machine? That is, if a machine M has 15 intermediate map outputs destined for machine R, will machine R copy the intermediate outputs one at a time or all at once? Currently, one at a time. In 0

Re: Eclipse plugin

2009-02-26 Thread Iman
Hi John, When I created the hadoop location, the hadoop.job.ugi did not appear in the advanced parameters. But when I later edited it, it was there. I don't know how that was fixed :) Also, to get it to work, I had to edit fs.default.name and mapred.job.tracker in hadoop/conf/hadoop-site.xml. I ad
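For anyone following along, the edit Iman describes boils down to pointing those two properties at the cluster in hadoop/conf/hadoop-site.xml; the host names and ports below are placeholders.

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://namenode-host:9000</value>
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>jobtracker-host:9001</value>
      </property>
    </configuration>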

Shuffle phase

2009-02-26 Thread Nathan Marz
Do the reducers batch copy map outputs from a machine? That is, if a machine M has 15 intermediate map outputs destined for machine R, will machine R copy the intermediate outputs one at a time or all at once?

Re: Eclipse plugin

2009-02-26 Thread John Livingstone
Iman-4, I have encountered the same problem you describe: not being able to access HDFS on my Hadoop VMware Linux server (using the Hadoop Yahoo tutorial) and not seeing "hadoop.job.ugi" in my Eclipse Europa 3.3.2 list of parameters. What did you have to do or change to get it to wor

Re: OutOfMemory error processing large amounts of gz files

2009-02-26 Thread bzheng
Arun C Murthy-2 wrote: > On Feb 24, 2009, at 4:03 PM, bzheng wrote: >> 2009-02-23 14:27:50,902 INFO org.apache.hadoop.mapred.TaskTracker: java.lang.OutOfMemoryError: Java heap space > That tells you that your TaskTracker is running out of memory, not your reduce tasks.

Re: Unable to Decommission Node

2009-02-26 Thread Roger Donahue
For anyone who runs into this in the future, I seem to have figured out what caused it. When I added the hostname to the exclude file, it duplicated my node on the web report page, with one entry in IP form and the other in hostname form when I moused over. When I added the IP (not hostname) to the exclu
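For context, a sketch of the pieces involved in decommissioning (the path and IP are placeholders): the exclude file is the one named by dfs.hosts.exclude, and per Roger's finding its entries should match the form the datanodes registered under.

    <!-- hadoop-site.xml -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/path/to/conf/excludes</value>
    </property>

    # conf/excludes: use the IP form if that is how the node registered
    10.0.0.42

    # then have the namenode re-read the file
    bin/hadoop dfsadmin -refreshNodes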

RE: Orange Labs is hosting an event about recommendation engines - March 3rd

2009-02-26 Thread Brian MacKay
Jeremy, I won't be able to attend the event... but if you could, please consider podcasting the discussions and posting them along with any presentations online. I'd like to hear and read about what was covered. Thanks, Brian -Original Message- From: Adam Rose [mailto:a...@tubemo

Re: Can anyone verify Hadoop FS shell command return codes?

2009-02-26 Thread Mikhail Yakshin
On Mon, Feb 23, 2009 at 4:02 PM, S D wrote: > I'm attempting to use Hadoop FS shell (http://hadoop.apache.org/core/docs/current/hdfs_shell.html) within a Ruby script. My challenge is that I'm unable to get the function return value of the commands I'm invoking. As an example, I try to run ge
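One way to make the question concrete: the FS shell generally signals success or failure through the process exit status rather than anything it prints, so capturing that status (in Ruby, $? after a system or backtick call) is the closest thing to a return value. A minimal Java sketch of the same idea, with a placeholder path:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class FsShellStatus {
        public static void main(String[] args) throws Exception {
            // Placeholder path; assumes the hadoop launcher script is on the PATH.
            ProcessBuilder pb = new ProcessBuilder("hadoop", "fs", "-test", "-e", "/user/sd/input");
            pb.redirectErrorStream(true);
            Process p = pb.start();

            // Drain the child's output so it cannot block on a full pipe.
            BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
            for (String line; (line = r.readLine()) != null; ) {
                System.out.println(line);
            }

            int exit = p.waitFor();   // 0 = path exists; non-zero = missing or error
            System.out.println("exit status: " + exit);
        }
    }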

Re: OutOfMemory error processing large amounts of gz files

2009-02-26 Thread Arun C Murthy
On Feb 24, 2009, at 4:03 PM, bzheng wrote: 2009-02-23 14:27:50,902 INFO org.apache.hadoop.mapred.TaskTracker: java.lang.OutOfMemoryError: Java heap space That tells you that your TaskTracker is running out of memory, not your reduce tasks. I think you are hitting http://issues.apache
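(Not necessarily the fix the JIRA above prescribes: the TaskTracker daemon's heap is controlled by HADOOP_HEAPSIZE in conf/hadoop-env.sh, so raising it is a common stopgap while the underlying issue is addressed. The value below is only an example.)

    # conf/hadoop-env.sh -- heap size (in MB) for the Hadoop daemons, including the TaskTracker
    export HADOOP_HEAPSIZE=2000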