You can also use Dryad + DryadLINQ on Windows machines and write map-reduce programs using their API.
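
On the disk space question quoted below: Java 1.6 added getTotalSpace(), getFreeSpace() and getUsableSpace() to java.io.File, so the numbers that DF reports can in principle be read without bash. Here is a minimal sketch of that idea. The DiskSpace class and its method names are made up for illustration; this is not Hadoop code and not the patch attached to HADOOP-4998.

```java
import java.io.File;

/**
 * Hypothetical helper (not part of Hadoop): reads partition sizes through
 * java.io.File instead of exec'ing "df" under bash. Needs Java 1.6 or later.
 */
final class DiskSpace {
    private DiskSpace() {}

    /** Total size in bytes of the partition holding path, or 0 if unknown. */
    static long totalBytes(File path) {
        return path.getTotalSpace();
    }

    /** Bytes this JVM may write to the partition holding path, or 0 if unknown. */
    static long usableBytes(File path) {
        return path.getUsableSpace();
    }

    /** Percentage of the partition in use, or -1 if its size is unknown. */
    static int percentUsed(File path) {
        long total = path.getTotalSpace();
        if (total <= 0) {
            return -1; // path does not name a partition, or the OS refused to say
        }
        long used = total - path.getFreeSpace();
        return (int) Math.round(100.0 * used / total);
    }
}
```

Something like DiskSpace.usableBytes(new File("/some/data/dir")) would then stand in for the capacity check. Note that df also prints the filesystem name and mount point, which these File methods do not expose, so this covers only the size figures.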

On Fri, Sep 18, 2009 at 11:05 AM, Hong Tang <ht...@yahoo-inc.com> wrote:

> http://issues.apache.org/jira/browse/HADOOP-4998 was opened for the purpose
> of substituting bash calls with library calls. It has been there for 8
> months now and looks like it could use some help from Hadoop contributors.
> :)
>
> -Hong
>
> On Sep 17, 2009, at 7:29 PM, Harish Mallipeddi wrote:
>
>> MySpace recently released their map-reduce implementation as open source
>> (it's .NET based). MySpace, as you might know, is one of the few big
>> websites that run on Windows.
>>
>> http://code.google.com/p/qizmt/
>>
>> On Thu, Sep 17, 2009 at 10:42 PM, Steve Loughran <ste...@apache.org>
>> wrote:
>>
>>> Bill Habermaas wrote:
>>>
>>>> It's interesting that Hadoop, being written entirely in Java, has such a
>>>> spotty reputation running on different platforms. I had to patch it to
>>>> run on AIX and need cygwin (gack!) so it will run on Windows. I'm
>>>> surprised nobody has thought about removing its use of bash to run
>>>> system commands (which is NOT especially portable). Now that Hadoop
>>>> comes only in a Java 1.6 flavor, why can't it figure out disk space
>>>> using the native Java runtime instead of executing the DF command under
>>>> bash? Of course it runs other system commands as well, which in my
>>>> opinion isn't too cool.
>>>>
>>>
>>> It is run at scale on big Linux systems, and they are the ones that
>>> encounter problems with 16GB heaps and exec(), various other JVM quirks
>>> that lead the developers to say Linux + Sun JVM only. You are free to use
>>> other operating systems and even JVMs (I've used JRockit with some minor
>>> logging problems in test runs), but you get to encounter the problems.
>>> You can and should submit patches back, but if you diverge from the
>>> approved standard, you get to retest at scale, because nobody else is
>>> going to do it for you.
>>>
>>> Supporting different Unix versions is much easier than supporting
>>> Windows + Linux/Unix, especially if you are trying to do high
>>> availability stuff, integrate with management tools, etc. I think it
>>> would be nice if Hadoop would build and run standalone on Windows without
>>> cygwin, but for all other actions, a more ruthless "Unix-ish only" would
>>> be harsh but make it easier to manage problems.
>>>
>>> Even in a Linux-only world, you are left with the "which distro"
>>> question - were there to be official Apache Hadoop RPMs and .deb files,
>>> there'd be discussions about which platforms to support. RHEL + CentOS
>>> 5.X would be the obvious choice, but what else?
>>>
>>> -steve
>>>
>>
>> --
>> Harish Mallipeddi
>> http://blog.poundbang.in