How can the reducer be invoked lazily?

2007-12-14 Thread Rui Shi
Hi, How can we specify that the reducers should be invoked lazily? For instance, I know there are no partitions in the range of 200-300. How can I let Hadoop know that there is no need to invoke reduce tasks for those partitions? Thanks, Rui

Re: finalize upgrade

2007-12-14 Thread Konstantin Shvachko
Well, from the output it looks like that has been run. At least I cannot see any sign telling me I still need to run it ...still was the previous directory on the name node. The way it works in pre 0.16 is that you start the cluster, and issue: hadoop dfsadmin -finalizeUpgrade I've just ru

Re: finalize upgrade

2007-12-14 Thread Torsten Curdt
On 14.12.2007, at 19:41, Konstantin Shvachko wrote: Sorry, it looks like the UI and report feature will appear only in 0.16. It is related to HADOOP-1604. In general you are not supposed to remove any directories manually. That's why I am so careful :) You should just use finalizeUpgrade.

RE: advanced map/reduce tutorials?

2007-12-14 Thread Joydeep Sen Sarma
Brute force: let the input be splittable. In each map job, open the original file and, for each line in the split, iterate over all preceding lines in the input file. This will at least get you the parallelism. But a better approach would be to try to cast your problem as a sorting/grouping problem. d
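The brute-force idea above can be sketched in plain Python, outside Hadoop. This is only an illustration under stated assumptions: the in-memory list `lines` stands in for the original input file, the `(start, end)` tuples stand in for input splits, and `map_split` is a hypothetical helper name, not part of any Hadoop API.

```python
from itertools import chain


def map_split(lines, start, end):
    """Hypothetical stand-in for one map task.

    For each line in the split lines[start:end], iterate over all
    preceding lines of the whole input and emit (earlier, current) pairs.
    """
    for i in range(start, end):
        for j in range(i):
            yield (lines[j], lines[i])


# Assumed toy input: four lines divided into two splits.
lines = ["a", "b", "c", "d"]
splits = [(0, 2), (2, 4)]

# Each split can be processed independently, which is where the
# parallelism comes from; here the "tasks" simply run one after another.
pairs = list(chain.from_iterable(map_split(lines, s, e) for s, e in splits))
```

Because every line is paired only with the lines before it, each unordered pair is produced exactly once across all splits. Later splits do more work than earlier ones, so the load is uneven.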

Re: finalize upgrade

2007-12-14 Thread Konstantin Shvachko
Sorry, it looks like the UI and report feature will appear only in 0.16. It is related to HADOOP-1604. In general you are not supposed to remove any directories manually. You should just use finalizeUpgrade. The way it works in pre 0.16 is that you start the cluster, and issue: hadoop dfsadmin -fi

advanced map/reduce tutorials?

2007-12-14 Thread Chris Fellows
Hello, The map/reduce tutorials in the hadoop src are great for getting started. Are there any similar tutorials for more advanced use cases? Especially complicated ones that might involve subclassing RecordReader, InputFormat, and others. In particular I want to write a job that does a cartes

Re: Amazon SimpleDB

2007-12-14 Thread Toby DiPasquale
On Dec 14, 2007 1:04 PM, Billy <[EMAIL PROTECTED]> wrote: > Is this based on HBase? It doesn't appear so, but it does appear to be an interesting cross-set of BigTable and CouchDB. Worth checking out, but each "domain" (table) is currently limited to 10GB in size, so this becomes problematic with

Re: Amazon SimpleDB

2007-12-14 Thread jeff . fedor
No :) Sent from my BlackBerry device on the Rogers Wireless Network -Original Message- From: "Billy" <[EMAIL PROTECTED]> Date: Fri, 14 Dec 2007 12:04:06 To: hadoop-user@lucene.apache.org Subject: Re: Amazon SimpleDB Is this based on HBase? Billy "Garth Patil" <[EMAIL PROTECTED]>

Re: Amazon SimpleDB

2007-12-14 Thread Billy
Is this based on HBase? Billy "Garth Patil" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > http://www.amazon.com/gp/browse.html?node=342335011 > Sorry for the OT post, but I thought this might be interesting to the > HBase users. > Best, > Garth

Re: api about hbase?

2007-12-14 Thread stack
See first item in the FAQ: http://wiki.apache.org/lucene-hadoop/Hbase/FAQ. St.Ack ma qiang wrote: Hi colleague, After reading the API docs about HBase, I don't know how to manipulate HBase using the Java API. Would you please send me some examples? Thank you! Ma Qiang Departmen

RE: api about hbase?

2007-12-14 Thread Jim Kellerman
The only examples we have at this point are the unit tests. They are rather contrived but it's what we have. You can find them in the source tree under src/contrib/hbase/src/test/org/apache/hadoop/hbase --- Jim Kellerman, Senior Engineer; Powerset > -Original Message- > From: ma qiang [

Re: commodity vs. high perf machines: which would you rather

2007-12-14 Thread Chris Fellows
Ok, added MachineScaling and linked to it from an added FAQ. It could probably use a going over by someone more experienced in hadoop. Also it'd be great to expand it to include network considerations and scaling suggestions for the namenode if it's different than datanodes. Thanks again -

Re: finalize upgrade

2007-12-14 Thread Torsten Curdt
Can anyone confirm? On 13.12.2007, at 09:46, Torsten Curdt wrote: No sign of 'upgrade still needs to be finalized' or something ...so I assume removing the 'previous' dir is safe then? On 12.12.2007, at 21:18, Konstantin Shvachko wrote: 2) Is there a way of finding out whether finalize stil

RE: map/reduce and Lucene integration question

2007-12-14 Thread Butler, Mark (Labs)
Hi team, First off, I would like to express that I am very impressed with Hadoop and very grateful to everyone who has contributed to it and provided this software open source. re: Lucene and Hadoop I am in the process of implementing a Lucene distributed index (DLucene), based on the design