On Tue 19 Aug 2014 05:34:40 AM EDT, J. Roeleveld wrote:
> On Monday, August 18, 2014 10:53:51 AM Alec Ten Harmsel wrote:
>> On Mon 18 Aug 2014 10:50:23 AM EDT, Rich Freeman wrote:
>>> Hadoop is a very specialized tool. It does what it does very well,
>>> but if you want to use it for something other than map/reduce then
>>> consider carefully whether it is the right tool for the job.
>>
>> Agreed; unless you have decent hardware and can comfortably measure
>> your data in TB, it'll be quicker to use something else once you factor
>> in the administration time and learning curve.
>
> The benefit of clustering technologies is that you don't need high-end
> hardware to start with. You can use the old hardware you found collecting dust
> in the basement.

Yes, but... if you are doing anything that *needs* to be fast (i.e. if
you're not a hobbyist), you don't need some super fancy database machine,
but you still need some decent hardware (gotta have enough RAM for that
JVM ;) ). If you'd like to take a look at our hardware, you can check out
http://caen.github.io/hadoop/hardware.html.

> The learning curve isn't as steep as it used to be. There are plenty of tools
> to make it easier to start using Hadoop.

There are plenty of great tools (Pig, Sqoop, Hive, RHadoop, etc.) that
you can use so you're not writing Java. This is all client-side; it
doesn't make the administration easier.

I agree that it's easy to start using it (it's possible to configure a
small cluster from scratch in half an hour), but it takes a lot more time
to tune your installation so it actually performs well. It's just like any
other piece of server software: serving a website with httpd is easy, but
serving it well and adding security takes a lot more time.

Rich Freeman wrote:
> As long as you're counting words and don't mind coding everything in Java. :)

We discourage researchers from writing in Java and encourage them to use
any of the tools I listed above instead, unless they really like Java.

> I found that if you want to avoid using Java, then the
> available documentation plummets

Yeah, this is still a pretty big problem. Documentation is pretty sparse.

Alec