Hi Experts,

Could anybody remind me how to load multiple input files on the Giraph command line? The following do not work; they only load the first input file:

-vip /user/hadoop/input/ttt.txt /user/hadoop/input/ttt2.txt

or

-vip /user/hadoop/input/ttt.txt -vip /user/hadoop/input/ttt2.txt

Best Regards,
Suijian

2014-03-01 16:12 GMT-06:00 Suijian Zhou <suijian.z...@gmail.com>:

> Hi,
>
> I'm trying to process a very big input file (~70 GB) through Giraph. I'm
> running the Giraph program on a 40-node Linux cluster, but the program
> just gets stuck after it has read in a small fraction of the input file.
> Although each node has 16 GB of memory, it looks like only one node reads
> the input file (which is on HDFS) into its memory. As the input file is
> so big, is there a way to scatter it across all the nodes, so that each
> node reads in a fraction of the file and then starts processing the graph?
> Would it help to split the single big input file into many smaller files
> and let each node read in one of them (of course, the overall structure
> of the graph should be kept)? Thanks!
>
> Best Regards,
> Suijian