I am seeing something strange with outerJoinVertices (and with triangle count,
which relies on this API):

Here is what I am doing:
1) Created a graph with multiple edge partitions, i.e. loaded it with
GraphLoader.edgeListFile using minEdgePartitions >= 1, and then called
partitionBy(PartitionStrategy.RandomVertexCut) on the resulting graph.
Note: the vertex attribute type is Int in this case.
2) Next, built the neighborhood ids by calling collectNeighborIds, i.e. the
returned vertex attribute type is Array[VertexId] (the result is a
VertexRDD[Array[VertexId]]).
3) Then joined the neighbor ids from step 2 to the graph generated in step 1
via outerJoinVertices.
4) Created a subgraph of the joined graph from step 3 in which I keep only the
edges with ed.srcAttr != -1 && ed.dstAttr != -1, i.e. filtered out vertices
with a null attribute.
5) Finally, checked the number of edges left in the subgraph from step 4.
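
The five steps above can be sketched roughly as follows. This is only my
reading of the steps, not the exact program: the object and method names, the
path argument, the use of EdgeDirection.Either, and the null sentinel for the
array-valued attribute are all assumptions made for illustration. The
partition count is passed positionally because the parameter is called
minEdgePartitions in Spark 1.0.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx._

// Hypothetical helper, named for illustration only.
object OuterJoinRepro {
  // Returns the number of edges whose two endpoints both received a neighbor array.
  def edgesWithNeighborArrays(sc: SparkContext, path: String, edgePartitions: Int): Long = {
    // Step 1: load the edge list with the requested number of edge partitions
    // (4th positional argument), then repartition the edges.
    val graph: Graph[Int, Int] = GraphLoader
      .edgeListFile(sc, path, false, edgePartitions)
      .partitionBy(PartitionStrategy.RandomVertexCut)

    // Step 2: neighbor ids per vertex. The edge direction is an assumption.
    val nbrIds: VertexRDD[Array[VertexId]] =
      graph.collectNeighborIds(EdgeDirection.Either)

    // Step 3: the join changes the vertex attribute type from Int to
    // Array[VertexId]. Vertices with no match get null (assumed sentinel).
    val joined: Graph[Array[VertexId], Int] =
      graph.outerJoinVertices(nbrIds) { (vid, oldAttr, nbrs) => nbrs.getOrElse(null) }

    // Step 4: keep only edges whose endpoints both carry an attribute.
    val kept = joined.subgraph(epred = ed => ed.srcAttr != null && ed.dstAttr != null)

    // Step 5: count the surviving edges.
    kept.edges.count()
  }
}
```

Calling edgesWithNeighborArrays with edgePartitions = 1, 2, 3, ... on the same
file corresponds to the loop described below.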

I ran this program in a loop where minEdgePartitions is changed in each
iteration. When minEdgePartitions == 1 I see the correct number of edges. When
minEdgePartitions == 2 the result is ~1/2 the number of edges; when
minEdgePartitions == 3 the result is ~1/3 the number of edges, and so on.

It seems that outerJoinVertices is returning srcAttr (and dstAttr) = null for
many vertices, and from the numbers it seems that it might be returning null
for vertices residing on other partitions?

Environment: I am using RC5 and 22 executors.

BUT I get the correct number of edges in each iteration when I repeat the
experiment keeping the vertex attribute type Int in step 2 (i.e. keeping just
the number of neighboring vertices instead of the array of vertices), which
is the same as the vertex attribute type of the graph before the join.
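
For comparison, the Int-valued variant could look roughly like this. Again a
sketch under assumptions: the names are hypothetical, EdgeDirection.Either is
a guess, and -1 is used as the "no match" sentinel because that is the value
the step 4 filter tests against.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx._

// Hypothetical helper, named for illustration only.
object OuterJoinIntVariant {
  // Same pipeline, but the joined attribute stays Int (the neighbor count).
  def edgesWithNeighborCounts(sc: SparkContext, path: String, edgePartitions: Int): Long = {
    val graph: Graph[Int, Int] = GraphLoader
      .edgeListFile(sc, path, false, edgePartitions)
      .partitionBy(PartitionStrategy.RandomVertexCut)

    // Keep only the number of neighbors, so the attribute type remains Int.
    val nbrCounts: VertexRDD[Int] = graph
      .collectNeighborIds(EdgeDirection.Either)
      .mapValues((nbrs: Array[VertexId]) => nbrs.length)

    // Vertices with no match get the sentinel -1.
    val joined: Graph[Int, Int] =
      graph.outerJoinVertices(nbrCounts) { (vid, oldAttr, count) => count.getOrElse(-1) }

    // Same filter as in step 4, then count the surviving edges.
    joined.subgraph(epred = ed => ed.srcAttr != -1 && ed.dstAttr != -1).edges.count()
  }
}
```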

Is this a known bug that was fixed recently? Or are we supposed to set some
flags when changing the vertex attribute type?



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-1-0-outerJoinVertices-seems-to-return-null-for-vertex-attributes-when-input-was-partitioned-and-tp6799.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
