I have 6 blade servers (2-socket, quad-core Intel running at 3.0 GHz, with 32 GB RAM) set up on a SAN for crawling. Since they all use the same SAN, what is the most efficient way to set Nutch up? Should I be using a Hadoop cluster?
I plan to incrementally build the indexes. If anyone has some docs or can
point me to some, I'd appreciate it.
Thanks in advance,
Alex
