Hi Hyunsik,

Unfortunately, you can't control which servers your blocks are placed on. Hadoop handles block allocation for you and tries its best to distribute data evenly across the cluster, subject to the constraint that the replicas of a block reside on different machines and, assuming you've made Hadoop rack-aware, on different racks.
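To make that constraint a bit more concrete, here is a toy sketch in Python. It is not Hadoop's placement code, and Hadoop's real policy is more involved. It only models the rule as I described it above: replicas go on distinct machines, preferring racks that don't hold a replica yet. The function name `place_replicas`, the `nodes_by_rack` layout, and the node and rack names are all made up for illustration.

```python
import random


def place_replicas(nodes_by_rack, replication=3, rng=None):
    """Toy model: choose `replication` distinct nodes, spreading over racks.

    nodes_by_rack maps a (hypothetical) rack name to a list of node names.
    Returns a list of (rack, node) pairs, one per replica.
    """
    rng = rng or random.Random()
    # Flatten the cluster into (rack, node) candidates.
    candidates = [(rack, node)
                  for rack, nodes in nodes_by_rack.items()
                  for node in nodes]
    if replication > len(candidates):
        raise ValueError("not enough machines for the requested replication")

    chosen = []
    used_racks = set()
    while len(chosen) < replication:
        # Never reuse a machine that already holds a replica.
        remaining = [c for c in candidates if c not in chosen]
        # Prefer racks with no replica yet; fall back to any free machine.
        fresh = [c for c in remaining if c[0] not in used_racks]
        rack, node = rng.choice(fresh or remaining)
        chosen.append((rack, node))
        used_racks.add(rack)
    return chosen


if __name__ == "__main__":
    # Hypothetical cluster: three racks with two machines each.
    cluster = {
        "rack1": ["node1", "node2"],
        "rack2": ["node3", "node4"],
        "rack3": ["node5", "node6"],
    }
    for rack, node in place_replicas(cluster):
        print(rack, node)
```

Note that the caller has no say in which machine gets picked, which is the point: placement is decided by the system, not by the client writing the data.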
Hope this clears things up.

Alex

2009/6/23 Hyunsik Choi <c0d3h...@gmail.com>

> Hi all,
>
> I would like to give data locality. In other words, I want to place
> certain data blocks on one machine. In some problems, subsets of an
> entire dataset need one another for answer. Most of the graph problems
> are good examples.
>
> Is it possible? If impossible, can you advice about that?
>
> Thank you in advance.
>
> - Hyunsik Choi -
>