Bill Habermaas wrote:
It's interesting that Hadoop, being written entirely in Java, has such a
spotty reputation running on different platforms. I had to patch it to run
on AIX and need cygwin (gack!) so it will run on Windows. I'm surprised
nobody has thought about removing its use of bash to run system commands
(which is NOT especially portable). Now that Hadoop comes only in a
Java 1.6 flavor, why can't it figure out disk space using the native Java
runtime instead of executing the df command under bash? Of course it runs
other system commands as well which in my opinion isn't too cool.
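
For reference, the disk-space question above can be sketched with the partition-size methods that java.io.File gained in Java 6. This is only an illustration, not how Hadoop does it: the class name DiskSpace and the query helper are made up for the example, and the numbers are partition-level totals, so this covers what df reports but not per-directory usage.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop: reads partition sizes
// through java.io.File (Java 6+) instead of exec'ing "df" under bash.
public class DiskSpace {

    /** Returns {total, free, usable} in bytes for the partition holding path. */
    static long[] query(File path) {
        return new long[] {
            path.getTotalSpace(),   // size of the partition
            path.getFreeSpace(),    // unallocated bytes (a hint, not a guarantee)
            path.getUsableSpace()   // bytes available to this JVM, after permission checks where the platform supports them
        };
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        long[] s = query(dir);
        System.out.println("total=" + s[0] + " free=" + s[1] + " usable=" + s[2]);
    }
}
```

All three methods return 0 if the path does not name a partition, so a caller would still need to handle that case.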


Hadoop is run at scale on big Linux systems, and those are the ones that hit problems with 16GB heaps and exec(), plus various other JVM quirks, which is what leads the developers to say Linux + Sun JVM only. You are free to use other operating systems and even other JVMs (I've used JRockit in test runs, with some minor logging problems), but then you get to encounter the problems yourself. You can and should submit patches back, but if you diverge from the approved standard, you get to retest at scale, because nobody else is going to do it for you.

Supporting different Unix versions is much easier than supporting Windows plus Linux/Unix, especially if you are trying to do high availability stuff, integrate with management tools, etc. I think it would be nice if Hadoop would build and run standalone on Windows without cygwin, but for everything else, a more ruthless "Unix-ish only" policy would be harsh but would make it easier to manage problems.

Even in a Linux-only world, you are left with the "which distro" question: were there to be official Apache Hadoop RPMs and .deb files, there'd be discussions about which platforms to support. RHEL + CentOS 5.x would be the obvious choice, but what else?

-steve
