Oh, whoops! Yes, I meant I changed it to an undirected format!
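A minimal sketch of such a conversion, for anyone hitting the same problem. It assumes the input is a SNAP-style edge list (one "from to" pair per line, "#" comment lines) and that the target is the "vertex id, vertex value, neighbours" text layout used in this thread. The function names are hypothetical and this is not the actual conversion program discussed here.

```python
from collections import defaultdict


def edges_to_undirected_adjacency(lines):
    """Build an undirected adjacency map from 'src dst' edge lines."""
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Skip blank lines and SNAP-style '#' header comments.
        if not line or line.startswith("#"):
            continue
        src, dst = (int(tok) for tok in line.split()[:2])
        # Record the edge in BOTH directions so every vertex lists
        # all of its neighbours, not only the ones it points to.
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return adjacency


def format_vertex_lines(adjacency):
    """Yield 'id value neighbour...' lines, using the id as the initial value."""
    for vertex_id in sorted(adjacency):
        fields = [vertex_id, vertex_id] + sorted(adjacency[vertex_id])
        yield " ".join(str(field) for field in fields)


# Hypothetical edge list that reproduces the small test graph from this thread.
sample_edges = ["# FromNodeId ToNodeId", "0 1", "1 2", "1 3", "2 3"]
small_graph_lines = list(format_vertex_lines(edges_to_undirected_adjacency(sample_edges)))
for vertex_line in small_graph_lines:
    print(vertex_line)
```

Each edge is listed only once in the sample input, yet every vertex ends up with its full neighbour list, which is the property the undirected adjacency list needs.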
On Thu, Apr 17, 2014 at 4:11 PM, ghufran malik <ghufran1ma...@gmail.com> wrote:

> Hi Jae,
>
> Thanks so much for pointing out that it wasn't directed. I made the
> changes and made a directed graph and connected components now works :)
>
> Thanks,
> Ghufran
>
> On Wed, Apr 16, 2014 at 7:31 PM, Yu, Jaewook <jaewook...@intel.com> wrote:
>
>> Ghufran,
>>
>> The Youtube community dataset (com-youtube.ungraph.txt.gz,
>> https://snap.stanford.edu/data/bigdata/communities/com-youtube.ungraph.txt.gz)
>> [1] is formatted as a directed graph although the description says it’s
>> an undirected graph. With some minor changes in your conversion program, you
>> should be able to generate a proper undirected adjacency list.
>>
>> Hope this will help.
>>
>> Thanks,
>> Jae
>>
>> [1] https://snap.stanford.edu/data/com-Youtube.html
>>
>> *From:* Yu, Jaewook [mailto:jaewook...@intel.com]
>> *Sent:* Wednesday, April 16, 2014 11:00 AM
>> *To:* user@giraph.apache.org
>> *Subject:* RE: Running ConnectedComponents in a cluster.
>>
>> Hi Ghufran,
>>
>> Have you verified that the neighbors of each vertex actually exist? From your
>> adjacency list, for example, 278447 278447 532613, is the neighbor’s vertex
>> id 532613 valid?
>>
>> Thanks,
>> Jae
>>
>> *From:* ghufran malik [mailto:ghufran1ma...@gmail.com]
>> *Sent:* Wednesday, April 16, 2014 9:22 AM
>> *To:* user@giraph.apache.org
>> *Subject:* Running ConnectedComponents in a cluster.
>>
>> Hi,
>>
>> I have set up Giraph on my university cluster of computers (Giraph
>> 1.1.0-SNAPSHOT-for-hadoop-2.0.0-cdh4.3.1). I've successfully run the
>> connected components algorithm on a very small test dataset using 4 workers
>> and it produced the expected output.
>>
>> dataset:
>>
>> vertex id, vertex value, neighbours....
>>
>> 0 0 1
>> 1 1 0 2 3
>> 2 2 1 3
>> 3 3 1 2
>>
>> output:
>> 1 0
>> 0 0
>> 3 0
>> 2 0
>>
>> However, when I tried to run this algorithm on a larger dataset
>> (a reformatted version of com-youtube.ungraph from Stanford SNAP, made to
>> match the IntIntNullTextVertexInputFormat), it completes successfully but
>> produces incorrect output. It seems to just output each vertex id with its
>> original value (the original value I set for each vertex is its vertex id).
>>
>> A snippet of the dataset is provided:
>>
>> vertex id, vertex value, neighbours....
>> .......
>> 278447 278447 532613
>> 278449 278449 305447 324115 414238
>> 83899 83899 153460 172614 176613 211448
>> 773749 773749 845366
>> 773748 773748 960388
>> .......
>>
>> output produced:
>> .............
>> 73132 73132
>> 831308 831308
>> 199788 199788
>> 763644 763644
>> 300572 300572
>> .............
>>
>> There's not one vertex value that isn't the same as its original vertex
>> ID.
>>
>> The computation also stops after superstep 0 is done and goes no further,
>> whereas on my smaller dataset it completes 3 supersteps.
>>
>> Does anyone have an idea as to why this is?
>>
>> Kind regards,
>>
>> Ghufran
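
Jae's question about whether each neighbour actually exists can be checked mechanically before submitting a job. The following is a rough sketch, assuming the same "vertex id, vertex value, neighbours" text layout; the function name and the second input line in the example are hypothetical, added only for illustration.

```python
def find_asymmetric_edges(vertex_lines):
    """Report neighbours that have no vertex line or no edge pointing back."""
    adjacency = {}
    for line in vertex_lines:
        fields = [int(tok) for tok in line.split()]
        if len(fields) < 2:
            continue  # blank or malformed line
        # fields[1] is the vertex value; neighbours start at index 2.
        adjacency[fields[0]] = set(fields[2:])

    problems = []
    for vertex_id, neighbours in adjacency.items():
        for neighbour in sorted(neighbours):
            if neighbour not in adjacency:
                problems.append((vertex_id, neighbour, "missing vertex"))
            elif vertex_id not in adjacency[neighbour]:
                problems.append((vertex_id, neighbour, "missing reverse edge"))
    return problems


# First line is taken from the thread; the second line is a hypothetical
# entry for vertex 532613 with no neighbours listed.
directed_snippet = ["278447 278447 532613", "532613 532613"]
print(find_asymmetric_edges(directed_snippet))
```

An empty result means every listed neighbour has its own line and lists the vertex back, which is what an undirected adjacency list should look like.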