On 06/13/2011 07:52 AM, Loughran, Steve wrote:
>> On 06/10/2011 03:23 PM, Bible, Landy wrote:
>> I'm currently running HDFS on Windows 7 desktops. I had to create a
>> hadoop.bat that provided the same functionality as the shell scripts, and
>> some Java Service Wrapper configs to run the DataNodes and NameNode as
>> Windows services. Once I get my system more functional I plan to do a
>> write-up about how I did it, but it wasn't too difficult. I'd also like
>> to see Hadoop become less platform dependent.
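For the curious, the DataNode service config looks roughly like this — a minimal sketch of a Tanuki Java Service Wrapper conf; all the paths, the heap size, and the service names are placeholders for whatever your install uses:

```ini
# Sketch of a Java Service Wrapper config for the DataNode
# (paths and heap size are placeholders -- adjust to your install).
wrapper.java.command=C:\Program Files\Java\jre6\bin\java.exe
wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperSimpleApp
wrapper.java.classpath.1=C:\hadoop\*.jar
wrapper.java.classpath.2=C:\hadoop\lib\*.jar
wrapper.java.classpath.3=C:\hadoop\conf
wrapper.java.additional.1=-Xmx1024m
# WrapperSimpleApp runs this class's main() inside the wrapped JVM
wrapper.app.parameter.1=org.apache.hadoop.hdfs.server.datanode.DataNode
# Register the JVM as a Windows service
wrapper.ntservice.name=hadoop-datanode
wrapper.ntservice.displayname=Hadoop DataNode
wrapper.ntservice.starttype=AUTO_START
```

The NameNode config is the same shape with a different main class and service name.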
> why? Do you plan to bring up a real Windows server datacenter to test it on?

Not a datacenter, but a large-ish cluster of desktops, yes.

> Whether you like it or not, all the big Hadoop clusters run on Linux

I realize that; I use Linux wherever possible, much to the annoyance of my
Windows-only co-workers. However, for my current project I'm using all the
Windows 7 and Vista desktops at my site as a storage cluster. The first idea
was to run Hadoop on Linux in a VM in the background on each desktop, but
that seemed like overkill. The point here is to use the resources we have
but aren't using, rather than buy new ones. Academia is funny like that.

>> So far, I've been unable to make MapReduce work correctly. The services
>> run, but things don't work; however, I suspect that this is due to DNS
>> not working correctly in my environment.

> yes, that's part of what you have to fix. Edit the host tables so that
> DNS and reverse DNS appear to work. That's
> c:\windows\system32\drivers\etc\hosts, unless on a win64 box it moves.

Why does Hadoop even care about DNS? Every node checks in with the NameNode
and JobTracker, so they know where they are; why not just go pure IP based
and forget DNS? Managing the hosts file is a pain... even when you automate
it, it just seems unneeded.
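For anyone else hitting this wall: the two lookups in question can be checked outside Hadoop entirely with plain java.net.InetAddress, which is what the daemons use under the hood. The class name DnsCheck is just for illustration — the point is that the forward lookup (name to IP) and the reverse lookup (IP back to a name) have to agree, which is what the hosts-file edits are faking:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of the lookups Hadoop relies on: a daemon resolves its own
// hostname (forward), and the NameNode/JobTracker resolve a caller's IP
// back to a name (reverse). On a desktop fleet without working DNS the
// two directions often disagree, and registration/scheduling misbehaves.
public class DnsCheck {

    // Forward lookup: hostname (or dotted quad) -> IP address string.
    public static String forward(String host) throws UnknownHostException {
        return InetAddress.getByName(host).getHostAddress();
    }

    // Reverse lookup: IP -> canonical name. Falls back to the address
    // itself when no PTR record or hosts-file entry exists.
    public static String reverse(String ip) throws UnknownHostException {
        return InetAddress.getByName(ip).getCanonicalHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        String host = args.length > 0 ? args[0] : "localhost";
        String ip = forward(host);
        System.out.println("forward: " + host + " -> " + ip);
        System.out.println("reverse: " + ip + " -> " + reverse(ip));
    }
}
```

Run it on each node; if the reverse line prints a bare IP or a name that doesn't forward-resolve back to the same address, that node needs a hosts entry.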