Pointing -vip at the directory, i.e. -vip /user/hadoop/input, should be enough; Giraph will pick up every file under that path as vertex input.
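For example, a sketch of a full invocation (the jar name, computation class, and input/output formats below are just the stock Giraph examples; substitute the ones you actually use):

  hadoop jar giraph-examples-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.SimpleShortestPathsComputation \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /user/hadoop/input \
    -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op /user/hadoop/output/shortestpaths \
    -w 4

With -vip set to the directory, both ttt.txt and ttt2.txt (and anything else under /user/hadoop/input) become input splits, so there is no need to repeat -vip for each file.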
On Wed, Mar 5, 2014 at 5:31 PM, Suijian Zhou <suijian.z...@gmail.com> wrote:
> Hi, Experts,
>   Could anybody remind me how to load multiple input files on a Giraph
> command line? The following do not work; they only load the first input
> file:
> -vip /user/hadoop/input/ttt.txt /user/hadoop/input/ttt2.txt
> or
> -vip /user/hadoop/input/ttt.txt -vip /user/hadoop/input/ttt2.txt
>
> Best Regards,
> Suijian
>
>
> 2014-03-01 16:12 GMT-06:00 Suijian Zhou <suijian.z...@gmail.com>:
>
>> Hi,
>>   Here I'm trying to process a very big input file through Giraph, ~70GB.
>> I'm running the Giraph program on a 40-node Linux cluster, but the program
>> gets stuck after it reads in a small fraction of the input file. Although
>> each node has 16GB of memory, it looks like only one node reads the input
>> file, which is on HDFS, into its memory. As the input file is so big, is
>> there a way to scatter the input file across all the nodes so that each
>> node reads in a fraction of the file and then starts processing the graph?
>> Would it help to split the single big input file into many smaller files
>> and let each node read in one of them to process (keeping the overall
>> structure of the graph, of course)? Thanks!
>>
>> Best Regards,
>> Suijian
>>

--
Claudio Martella