Re: permission problems in hadoop MapFile (0.12)

2007-08-24 Thread Albert Chern
It looks like you're trying to write the output of a job to /root. I'm not familiar with Nutch, but I just looked at the source and I think you might be launching the crawl from /root, and Nutch is trying to create a temporary linkdb there. Try launching the crawl from a directory you have permis

Re: Detailed steps to run Hadoop in distributed system...

2007-03-02 Thread Albert Chern
> > > > in the last line "<" is missing when i was copying and pasting in the > hadoop-site.xml file its written correctly > > I can ping the machines from each other No problem in it. > > Thanks for ur reply... please try to find any other

Re: MapReduce

2007-03-02 Thread Albert Chern
there is no key/value pairs... but he is telling that we can have MapReduce without key/value pairs Albert Chern wrote: > > Sometimes you need to do a little work to fit a problem into map reduce. > You are correct; in this problem, there really are no key/value pairs, so > you w

Re: Detailed steps to run Hadoop in distributed system...

2007-03-02 Thread Albert Chern
"<" is missing when i was copying and pasting in the hadoop-site.xml file its written correctly I can ping the machines from each other No problem in it. Thanks for ur reply... please try to find any other mistakes.. Albert Chern wrote: > > Your hadoop-site.xml is missing

Re: MapReduce

2007-03-02 Thread Albert Chern
Sometimes you need to do a little work to fit a problem into map reduce. You are correct; in this problem, there really are no key/value pairs, so you would use a dummy value. For example, we could just use 0 as a key, so our test scores are: (0, 95) (0,100) (0, 70) and so on... Each map gets o
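
The dummy-key idea above can be sketched as a small in-memory simulation. This is not Hadoop code; the function names are hypothetical, and the aggregate computed here (an average) is an assumption, since the message is cut off before it states what is being computed.

```python
from collections import defaultdict


def map_score(score):
    # Emit every test score under the same dummy key, 0,
    # because the input has no natural key.
    yield (0, score)


def reduce_scores(key, scores):
    # ASSUMPTION: the goal is an average; the original message
    # is truncated before it names the actual computation.
    scores = list(scores)
    yield (key, sum(scores) / len(scores))


def run_job(records):
    # Group map output by key, which is what the shuffle phase does,
    # then hand each group to the reducer.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_score(record):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reduce_scores(key, groups[key]))
    return results


result = run_job([95, 100, 70])
```

Because every record shares key 0, all values arrive at a single reduce call, so the reducer sees the whole data set at once.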

Re: Detailed steps to run Hadoop in distributed system...

2007-03-02 Thread Albert Chern
Your hadoop-site.xml is missing a "<" in the last line, but it looks like you're having a network problem. Can you ping the machines from each other? On 3/1/07, jaylac <[EMAIL PROTECTED]> wrote: Hi Hadoop-Users. Have anyone successfully tried running hadoop in two systems? I've tried ru

Re: RE: Corrupt DFS edits-file

2006-12-12 Thread Albert Chern
to automatically create a directory equivalent to /lost+found? While EditLog processing, if the parent directory of a file does not exist, the file can go into /lost+found? Thanks, dhruba -Original Message- From: Albert Chern [mailto:[EMAIL PROTECTED] Sent: Friday, December 08, 20

Re: Corrupt DFS edits-file

2006-12-08 Thread Albert Chern
This happened to me too, but the problem was the OP_MKDIR instructions were in the wrong order. That is, in the edits file the parent directory was created after the child. Maybe you should check to see if that's the case. I fixed it by using vi in combination with xxd. When you have the file

Re: Re: Re: MapFile.get() has a bug?

2006-11-28 Thread Albert Chern
so on. some idea? Feng On 11/28/06, Albert Chern <[EMAIL PROTECTED]> wrote: > > Well, I looked at the source and I can tell you WHY it happens, but > I'm not sure if the behavior is correct or not. Basically the MapFile > keeps an index of where each key is; this index is ho

Re: Re: MapFile.get() has a bug?

2006-11-27 Thread Albert Chern
Well, I looked at the source and I can tell you WHY it happens, but I'm not sure if the behavior is correct or not. Basically the MapFile keeps an index of where each key is; this index is how the MapFile seeks quickly to the correct record. However, there is a parameter called the index interva
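
A sparse index of the kind described above can be sketched roughly as follows. This is a generic illustration, not the MapFile source: the names `build_index` and `lookup` and the interval value are hypothetical, and it does not attempt to reproduce the behavior under discussion, which the truncated message does not fully describe.

```python
import bisect

INDEX_INTERVAL = 3  # hypothetical value chosen for illustration


def build_index(sorted_keys, interval=INDEX_INTERVAL):
    # Record (key, position) for every interval-th record only,
    # so the index stays much smaller than the data.
    return [(sorted_keys[i], i) for i in range(0, len(sorted_keys), interval)]


def lookup(sorted_keys, values, index, key):
    index_keys = [k for k, _ in index]
    # Find the last index entry whose key is <= the wanted key.
    slot = bisect.bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first record
    pos = index[slot][1]
    # Scan forward from the indexed position to the wanted key.
    while pos < len(sorted_keys) and sorted_keys[pos] < key:
        pos += 1
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return values[pos]
    return None


keys = ["a", "b", "c", "d", "e", "f", "g"]
values = [1, 2, 3, 4, 5, 6, 7]
index = build_index(keys)
```

With an interval of 3, only "a", "d" and "g" are indexed; a lookup for "e" seeks to "d" and scans one record forward.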

Re: Re: What does "Lost Task Tracker" Mean?

2006-11-27 Thread Albert Chern
Hey Doug, Thanks for the help. I figured out what was causing it, though. It was because some of the machines were running out of temporary disk space for the reduce operation. This was with version 0.7. On 11/27/06, Doug Cutting <[EMAIL PROTECTED]> wrote: Albert Chern wrote: > On

What does "Lost Task Tracker" Mean?

2006-11-23 Thread Albert Chern
Hello, One error I seem to be getting a lot is "Lost Task Tracker". The two reduce tasks that a particular machine is working on will fail with this error, which then propagates back to all the map tasks that it has completed, causing them to reexecute. What exactly does this message mean? I w

Re: One map task freezing

2006-11-17 Thread Albert Chern
Just wanted to chime in and say that I have experienced this often too. The task eventually completes, but after an inordinate amount of time. The last lone task gets stuck in the initialization stage. It's usually faster to restart the job. On 11/17/06, Johan Oskarsson <[EMAIL PROTECTED]> wro

Re: Re: JAR packaging

2006-10-30 Thread Albert Chern
ed lib in the Job JAR or do you mean by putting them in the lib directory where Hadoop runs? From the looks of RunJar.java I think you mean the first option (of course, the second option works, too) -Grant On Oct 30, 2006, at 6:29 AM, Vetle Roeim wrote: > On Sat, 28 Oct 2006 22:13:35 +0200

Re: JAR packaging

2006-10-28 Thread Albert Chern
I'm not sure if the first option works. If it does let me know. One of the developers taught me to use option 2 by creating a jar with your dependencies in lib/. The tasktrackers will automatically include everything in lib/ on their classpaths. On 10/28/06, Grant Ingersoll <[EMAIL PROTECTED]>
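
The jar layout described above (dependencies placed under lib/ inside the job jar) can be sketched with Python's standard zipfile module, since a jar is a zip archive. The helper name `build_job_jar` and all file names are hypothetical; in practice the `jar` tool or a build script would normally do this.

```python
import os
import zipfile


def build_job_jar(jar_path, class_files, dependency_jars):
    """Write a job jar whose dependencies sit under lib/.

    class_files maps a path inside the archive to a local file path;
    dependency_jars is a list of local jar paths.
    """
    with zipfile.ZipFile(jar_path, "w") as jar:
        # The job's own classes keep their package paths.
        for arcname, local_path in class_files.items():
            jar.write(local_path, arcname)
        # Each dependency jar is stored as lib/<file name>.
        for dep in dependency_jars:
            jar.write(dep, "lib/" + os.path.basename(dep))
    return jar_path
```

A call such as `build_job_jar("job.jar", {"com/example/MyJob.class": "build/com/example/MyJob.class"}, ["deps/dep.jar"])` would yield an archive containing `com/example/MyJob.class` and `lib/dep.jar`.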

Re: Re: Statistical clustering MapReduce example?

2006-10-27 Thread Albert Chern
Thank you for the paper Stefan. I was surprised at how well the K-Means algorithm fits into MapReduce. I actually never thought of writing the output back to the FS to test for convergence. On 10/27/06, David Pollak <[EMAIL PROTECTED]> wrote: Stefan, This is most excellent stuff! Thanks for

Re: Chaining MapReduce operations

2006-10-25 Thread Albert Chern
There is something similar in src/examples/org/apache/hadoop/examples/Grep.java. Instead of outputting pure word counts, it counts occurrences of a regex. Switch the first map reduce job with the one in WordCount.java in the same directory and you're golden. On 10/25/06, David Pollak <[EMAIL PR

Re: RE: Classes Not Loading Correctly in MapRed 0.7

2006-10-17 Thread Albert Chern
AIL PROTECTED]> wrote: Hi Albert, There has been a critical bug in the 0.7 release. The bug and the patch are available at https://issues.apache.org/jira/browse/HADOOP-607. There should be a release soon fixing this problem. Regards Mahadev > -Original Message----- > From: Alber

Classes Not Loading Correctly in MapRed 0.7

2006-10-12 Thread Albert Chern
Hello, After upgrading from 0.6 to 0.7, we have experienced a strange error with MapReduce jobs. Jobs that ran perfectly before now seem to have trouble finding the classes from our code. The stack trace is long, but the errors happen in MapTask.run() when it tries to get the map output key cla

MapReduce and Jar Files

2006-10-06 Thread Albert Chern
Hello, I have run into a problem when attempting to run jobs that require code from multiple jars. The JobConf only allows me to set one jar file, so the other jars are not copied to the task trackers, and hence the jobs fail due to the dependencies. Besides merging the jars or manually copying

Reducer and Keys

2006-09-30 Thread Albert Chern
Hello, I have noticed that the Reducer does not enforce that the key is kept unaltered after the reduce operation. What would happen if output with a different key was collected? I realize that the output would no longer be sorted, but would it cause the job to fail? Thanks, Albert

Re: Re: Setting Up a Backup/Failover Namenode

2006-09-11 Thread Albert Chern
Just one NameNode for now. This feature is not implemented yet, sorry. How were you planning to use the second name node? --Konstantin Albert Chern wrote: > Dear Hadoop Experts, > > The opening comments of the NameNode class state: > > "There is a single NameNode running in

Setting Up a Backup/Failover Namenode

2006-09-11 Thread Albert Chern
ure this second NameNode. Can anyone give me some pointers? Thanks in advance, Albert Chern