Hello,

You're right, that's exactly what I meant, and your answer is exactly what I thought. I was just wondering whether Hadoop can distribute data to other nodes' local storage when a node's own local space is full.

Thanks

On 06/14/2012 03:38 PM, Harsh J wrote:
Ondřej,

If by processing you mean trying to write out (map outputs) > 20 GB of
data per map task, that may not be possible, as the outputs need to be
materialized and disk space is the constraint there.

Or did I misunderstand you (in thinking you are asking about
MapReduce)? Because otherwise you have ~50 GB of space available for
HDFS consumption (assuming replication = 3 for proper reliability).
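For reference, the ~50 GB figure above follows from simple capacity arithmetic; a minimal sketch (illustrative only, assuming the 8-node / 20 GB cluster described below and replication = 3):

```python
# Rough usable-HDFS-capacity estimate for the cluster in this thread.
# Assumptions: 8 nodes, 20 GB local storage each, dfs.replication = 3.
nodes = 8
local_gb_per_node = 20
replication = 3

raw_capacity_gb = nodes * local_gb_per_node          # 160 GB raw across the cluster
usable_capacity_gb = raw_capacity_gb / replication   # each block stored 3 times

print(raw_capacity_gb, round(usable_capacity_gb, 1))  # 160 53.3
```

In practice the usable figure is lower still, since the same disks also hold map-task spill files and other non-HDFS data.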

On Thu, Jun 14, 2012 at 1:25 PM, Ondřej Klimpera <klimp...@fit.cvut.cz> wrote:
Hello,

We're testing an application on 8 nodes, where each node has 20 GB of local
storage available. What we are trying to achieve is to get more than 20 GB of data
processed on this cluster.

Is there a way to distribute the data across the cluster?

There is also one shared NFS storage disk with 1 TB of available space, which
is currently unused.

Thanks for your reply.

Ondrej Klimpera
