Re: Different num supersteps

2014-03-03 Thread Martin Neumann
Hi, I managed to fix it, even though I'm still not entirely sure what happened. The fix is to create a new Text object every time a Text is required as input (Text does not implement Cloneable). I guess it So instead of: Text candidate = e.getTargetVertexId(); ... vertex.setValue(candidate)) The
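
One plausible reading of the fix above is that a mutable object was being reused by the framework, so storing the reference rather than a copy let later writes change earlier values. The sketch below shows that aliasing effect in plain Java. `MutableText` is a hypothetical stand-in for a mutable writable such as Hadoop's `Text`; it is not Giraph or Hadoop code, and it is not taken from the original job.

```java
import java.util.ArrayList;
import java.util.List;

public class ReusedTextSketch {
    // Hypothetical stand-in for a mutable, reusable value such as Hadoop's Text.
    static final class MutableText {
        private String value;
        MutableText(String value) { this.value = value; }
        // Copy constructor: takes a snapshot of the other object's current content.
        MutableText(MutableText other) { this.value = other.value; }
        void set(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    // Stores the same reference on every iteration, so all entries alias one object.
    static List<String> storeReferences(String[] ids) {
        MutableText reused = new MutableText("");
        List<MutableText> stored = new ArrayList<>();
        for (String id : ids) {
            reused.set(id);      // the "reader" overwrites the shared object
            stored.add(reused);  // only the reference is kept
        }
        return render(stored);
    }

    // Stores a fresh copy on every iteration, so each entry keeps its own content.
    static List<String> storeCopies(String[] ids) {
        MutableText reused = new MutableText("");
        List<MutableText> stored = new ArrayList<>();
        for (String id : ids) {
            reused.set(id);
            stored.add(new MutableText(reused)); // defensive copy
        }
        return render(stored);
    }

    // Converts the stored objects to strings after the loop has finished.
    static List<String> render(List<MutableText> texts) {
        List<String> out = new ArrayList<>();
        for (MutableText t : texts) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] ids = {"a", "b", "c"};
        System.out.println(storeReferences(ids)); // prints [c, c, c]
        System.out.println(storeCopies(ids));     // prints [a, b, c]
    }
}
```

With Hadoop's own `Text`, the equivalent defensive copy would be `new Text(candidate)`, since `Text` provides a copy constructor. Whether object reuse was the actual cause here is not confirmed in the thread.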

Re: Different num supersteps

2014-03-03 Thread Sebastian Schelter
Hi Martin, I'm not sure whether we require InputFormats to be thread-safe. Can someone answer that question? Maybe that's the reason you see this behavior. --sebastian On 03/03/2014 10:05 AM, Martin Neumann wrote: I checked the input by just creating the graph and comparing it. While I can't say

Re: Different num supersteps

2014-03-03 Thread Martin Neumann
I checked the input by just creating the graph and comparing it. While I can't say the graph is correct (it's too big), it's at least consistent. So the only place the different output can come from is the connected component part (see code further down). I'm completely stumped, the code is basical

Re: Different num supersteps

2014-02-27 Thread Sebastian Schelter
Hi Martin, I don't think that there are problems with comparing and sorting Text writables, as Hadoop is basically a big external sorting system. I'm not sure I understand your edge input reader; it looks very complex, so maybe there's a bug somewhere. You could try to preprocess your data using

Re: Different num supersteps

2014-02-27 Thread Martin Neumann
Hm, I ran the job 5 times and made a diff between the outputs, and they are not the same. I can't find anything in the code that could lead to this behaviour. The only idea of where to look at the moment would be the identifier. Does anyone have experience with String identifiers? Is it possible that there are

Re: Different num supersteps

2014-02-27 Thread Martin Neumann
The data I have as input is not in a graph format, so I use an EdgeInputFormat to create a graph. It's also deterministic, so the same graph should be built from the same input. Each line in the input is a set of connected vertices. I create edges in a way that they form a star around the vertex with

Re: Different num supersteps

2014-02-27 Thread Sebastian Schelter
Hi Martin, you are right, this should not happen; your code looks correct. One way to check the output would be to simply count the number of vertices per component and see if that number stays stable. Do you supply all vertices in your input data or are some vertices created during the computa

Different num supersteps

2014-02-27 Thread Martin Neumann
Hi, I have modified the connected component example to fit my input data. I expect it to be deterministic, but when I run it multiple times it takes a different number of supersteps. This only happens on the complete dataset and not on my small test dataset. (So I cannot check the output for c