Re: finalize upgrade

2007-12-16 Thread Torsten Curdt
On 14.12.2007, at 23:35, Konstantin Shvachko wrote: Well, from the output it looks like that has been run. At least I cannot see any sign telling me I still need to run it ...still was the previous directory on the name node. The way it works in pre 0.16 is that you start the cluster, and

Re: finalize upgrade

2007-12-14 Thread Torsten Curdt
On 14.12.2007, at 19:41, Konstantin Shvachko wrote: Sorry, it looks like the UI and report feature will appear only in 0.16. It is related to HADOOP-1604. In general you are not supposed to remove any directories manually. That's why I am so careful :) You should just use finalizeUpgrade.

Re: finalize upgrade

2007-12-14 Thread Torsten Curdt
Can anyone confirm? On 13.12.2007, at 09:46, Torsten Curdt wrote: No sign of 'upgrade still needs to be finalized' or something ...so I assume removing the 'previous' dir is safe then? On 12.12.2007, at 21:18, Konstantin Shvachko wrote: 2) Is there a way of finding

Re: finalize upgrade

2007-12-13 Thread Torsten Curdt
No sign of 'upgrade still needs to be finalized' or something ...so I assume removing the 'previous' dir is safe then? On 12.12.2007, at 21:18, Konstantin Shvachko wrote: 2) Is there a way of finding out whether finalize still needs to be run? Yes, you can see it on the name-node web UI, a

finalize upgrade

2007-12-12 Thread Torsten Curdt
Hey guys, triggered by a post on the mailing list I also checked our 0.14 cluster and although we really thought we did the finalize after the upgrade we also have a big "previous" dir there. A couple of things I am wondering here... 1) I thought that the data is actually not duplicated ..

Re: rename dir while writing

2007-11-05 Thread Torsten Curdt
s it, then the original writer will get an IO exception either when it finishes writing to the current block or when it closes the file. Thanks, dhruba -----Original Message----- From: Torsten Curdt [mailto:[EMAIL PROTECTED] Sent: Monday, November 05, 2007 2:05 AM To: hadoop-user@lucene.apache.o

Re: rename dir while writing

2007-11-05 Thread Torsten Curdt
ssing something here? -jorgenj -----Original Message----- From: Owen O'Malley [mailto:[EMAIL PROTECTED] Sent: Monday, November 05, 2007 9:24 AM To: hadoop-user@lucene.apache.org Subject: Re: rename dir while writing On Nov 5, 2007, at 2:05 AM, Torsten Curdt wrote: Is there anywhere documented

rename dir while writing

2007-11-05 Thread Torsten Curdt
Is the expected behavior of concurrent changes in the filesystem documented anywhere? As an example: HDFS client C1 is slowly writing to "/path/a/file". Now another HDFS client C2 renames "/path/a" to "/path/b". What happens? Will C1 continue to write but the file will be in "/path/b

Re: jdk6 on darwin

2007-10-12 Thread Torsten Curdt
On 12.10.2007, at 20:10, Doug Cutting wrote: Michael Bieniosek wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my dev

Re: jdk6 on darwin (was: 14.1 to 14.2)

2007-10-12 Thread Torsten Curdt
On 12.10.2007, at 19:34, Michael Bieniosek wrote: Does anybody know if there is a jdk6 available for Mac? I checked the apple developer site, and there doesn't seem to be one available, despite blogs from last year claiming apple was distributing it. Since I do my development work on a

Re: A couple of usability problems

2007-09-26 Thread Torsten Curdt
Something we noticed too - that has changed with our upgrade to 0.14.0 On 25.09.2007, at 21:36, Ted Dunning wrote: My jobs seem to do that. I am surprised yours do not. What version of hadoop are you running? I am using 0.13.1 On 9/25/07 10:30 AM, "Nathan Wang" <[EMAIL PROTECTED]> wrote:

Re: Hadoop User Get Together SF Bay Area

2007-09-25 Thread Torsten Curdt
Bugger ...there are some occasions where it sucks to be based in Europe :) Have fun! Guys, I was wondering if people would be interested in an informal Hadoop Get Together. Could be just a simple format with people meeting someplace in the SF Bay Area to exchange ideas. Let me know if there i

Re: Namenode can't connect with Datanode during upgrade from 0.13.1 to 0.14.1

2007-09-12 Thread Torsten Curdt
Well, as long as only one has an IP assigned you are good. There are still multi-home problems (even with 0.14.1) which is why I was asking. (jira issue still to be opened) cheers -- Torsten

Re: Namenode can't connect with Datanode during upgrade from 0.13.1 to 0.14.1

2007-09-12 Thread Torsten Curdt
I was trying to upgrade hadoop 0.13.1 to 0.14.1, but when I follow the instruction at http://wiki.apache.org/lucene-hadoop/Hadoop_0.14_Upgrade, running "./start-dfs.sh -upgrade", I found no progress with the upgrading process. This takes a while before you see progress. Give it some time.

Re: Anybody using HDFS as a long term storage solution?

2007-09-06 Thread Torsten Curdt
datacenter-awareness. I don't think there is such a rack-awareness ability for the DFSClient or TaskTracker though. -Michael On 9/6/07 3:10 PM, "Torsten Curdt" <[EMAIL PROTECTED]> wrote: Another big question: Has anybody tried using HADOOP / HDFS across multiple geogr

Re: Anybody using HDFS as a long term storage solution?

2007-09-06 Thread Torsten Curdt
Another big question: Has anybody tried using HADOOP / HDFS across multiple geographic sites? That's actually a biggy I would be very much interested in, too cheers -- Torsten

Re: Overhead of Java?

2007-09-06 Thread Torsten Curdt
On 06.09.2007, at 09:56, Pietu Pohjalainen wrote: Jeroen Verhagen wrote: On 9/5/07, Steve Schlosser <[EMAIL PROTECTED]> wrote: question, but I was wondering if anyone has a reasonable qualitative answer that I can pass on when people ask. Is this question really relevant since Hadoop is de

not reducing

2007-08-28 Thread Torsten Curdt
We came across an issue where our jobs failed to report back to the tracker. (https://issues.apache.org/jira/browse/HADOOP-1790) Now we are getting a little bit further and the map-phase is working just fine but the reduce seems to be just stuck at 0%. We are seeing the following in the logs:

restoring state via map reduce

2007-08-20 Thread Torsten Curdt
I am wondering what the most efficient way would be to handle the following scenario with map reduce in hadoop. Let's say we have the following data: time=1, ip=1, a=1 time=2, ip=2, a=2 time=3, ip=2, b=4 time=2, ip=1, b=2 time=4, ip=1, a=4 time=5, ip=2, a=7 time=6, ip=1, c=9 time=

Re: Hadoop suitable for production

2007-03-14 Thread Torsten Curdt
Thanks Timothy for your short answer, I guess I have to be a bit more specific! Actually I'm interested in the distributed FS rather than in the Map/Reduce features. Has HDFS changed very much since it was moved out of the Nutch project? As far as I can tell the version of hadoop is a very

Re: running new jobs

2007-01-07 Thread Torsten Curdt
Thanks for the answer ...what about the second part ...about python? On 08.01.2007, at 02:11, 张茂森 wrote: Hadoop will distribute your classes written in Java automatically, and the only thing you have to do is submit your job. -----Original Message----- From: Torsten Curdt [mailto:[EMAIL PROTECTED] 发

running new jobs

2007-01-05 Thread Torsten Curdt
A few things that aren't really clear to me yet ...hadoop is deployed and I want to schedule a new job. Let's say it is written in java. Will hadoop distribute the classes so the job is available on all nodes? Or do I have to make sure it is deployed everywhere? Also: there are python examples a