Actually, one of the most reliable ways to kill a database is to use it as input or output for even a small Hadoop cluster. Having hundreds of processes all open connections and read at once is fairly abusive.
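
To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it (node count, task slots per node, the database's connection limit) is a made-up assumption for illustration, not a measurement or a Hadoop default, and the function names are hypothetical.

```python
def concurrent_connections(nodes, task_slots_per_node):
    """Worst case: every task slot on every node holds one open connection."""
    return nodes * task_slots_per_node


def exceeds_limit(nodes, task_slots_per_node, db_connection_limit):
    """True if the cluster could ask for more connections than the database allows."""
    return concurrent_connections(nodes, task_slots_per_node) > db_connection_limit


if __name__ == "__main__":
    # Hypothetical "small" cluster: 20 nodes with 8 task slots each.
    print(concurrent_connections(20, 8))   # 160
    # Hypothetical database configured to accept 100 connections.
    print(exceeds_limit(20, 8, 100))       # True
```

Under those assumed numbers even a 20-node cluster can ask for more simultaneous connections than the database is set up to serve, and each of those connections is also reading at the same time.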
On Thu, Nov 24, 2011 at 9:30 AM, Sean Owen <[email protected]> wrote:

> A relational database is not a common data source for Hadoop. Not that it
> couldn't be, it's just that Hadoop operates by sequentially accessing
> petabytes of potentially unstructured data. A relational database would be
> expensive overkill for just storing huge blobs.
