Hi,

I have recently become quite confused about how Hama splits its input file. I ran the Hama PageRank example with a very simple input file (only 4 vertices and 6 edges). The file was split into 4 parts (while there were only 3 tasks) and the job failed. The BSP master log shows: "Scheduling of job pagerank could not be done successfully, killing it." After that, the ZooKeeper session timed out.

Surprisingly, the job succeeded after I changed the vertex names in the input file (call this version file1), but it failed again when I deleted one line from the file. Even when I used a file with exactly the same contents as file1 but a different name, the job still failed.

Can someone tell me how the splitting works? I am really confused.

Best, Sandy
