I'm working with large datasets and have limited hardware resources (like everyone else!).
I was wondering what people would recommend for storing my data when using Mahout. I have roughly 100 GB of data right now, and that will grow and shrink over time. If I distribute the storage, the maximum number of nodes I would have access to is three. I guess this is really a 'how long is a piece of string' question, but I would still appreciate hearing people's experiences! My main requirement is speed.

Steve
