graph.vertices.collect().foreach(println)

2017-02-28 Thread balaji9058
Hi,
Does anyone have an idea about this?

When I call graph.vertices.collect() in the Spark console, I only see a
limited part of the vertex data, since I have a million records:

res34: Array[(org.apache.spark.graphx.VertexId, (Array[Double], Array[Double], Double, Double))] =
Array((8501952,(Array(-1.0720014023085627, -0.16634457650848242),Array(-0.6103788820773441, -0.07786941417473896),-1.3895885415250486,1.0)),
(8502656,(Array(0.4372264795058396, 0.013542308038655674),Array(0.8989895103168658, 0.10203414054868458),0.6711091242883567,1.0)),
(1024388,(Array(0.5121940125583068, 0.023648040456737244)


But when I print the vertices as below, the arrays are printed as object
hash codes, whereas I need the data in the format above for my further
operations:

 g.vertices.collect().foreach(println)

(8502417,([D@f068ac,[D@16825ec,0.6711091242883567,1.0))
(8502099,([D@91cd94,[D@4fb2ea,0.6711091242883567,1.0))
(1939777,([D@6b48d0,[D@17979f6,0.6368329031347298,0.0))
(8500191,([D@36063b,[D@d3339a,-0.37782997016492503,1.0))
(8500927,([D@1e0e2a8,[D@140ab89,0.6368329031347298,1.0))
(8500255,([D@bde454,[D@d96b14,0.6528450234389184,1.0))

Any ideas are appreciated.

The following does not work either:

val newG = g.vertices.collect()
newG.foreach(println)


thanks
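
A possible explanation, offered as a sketch rather than a definitive answer: println on a tuple calls toString on each element, and a JVM array's toString is only its type tag and hash code ([D@...), while the REPL uses its own pretty-printer and truncates long results. Turning each array into a string with mkString before printing gives readable output. The VertexFormat object, the format helper, and the sample values below are made up for illustration; with a real graph the same function could be applied as g.vertices.collect().map(VertexFormat.format).foreach(println).

```scala
object VertexFormat {
  // Same shape as the vertex attribute in the question
  type Attr = (Array[Double], Array[Double], Double, Double)

  // Render one (id, attr) pair, expanding the arrays instead of relying on Array.toString
  def format(v: (Long, Attr)): String = {
    val (id, (a, b, x, y)) = v
    val as = a.mkString("Array(", ", ", ")")
    val bs = b.mkString("Array(", ", ", ")")
    s"($id,($as,$bs,$x,$y))"
  }

  // Hypothetical sample standing in for g.vertices.collect()
  val sample: Array[(Long, Attr)] = Array(
    (1L, (Array(0.5, 0.25), Array(1.5, 2.0), 0.75, 1.0))
  )

  def main(args: Array[String]): Unit =
    sample.map(format).foreach(println)
}
```

For a million vertices, collecting everything to the driver may not fit in memory; applying the same formatting on the RDD and using take(n) or saveAsTextFile is likely the safer route.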




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/graph-vertices-collect-foreach-println-tp28437.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Graphx Examples for ALS

2017-02-17 Thread balaji9058
Hi,

Where can I find the ALS recommendation algorithm for large data sets?

Please feel free to share your ideas/algorithms/logic for building a
recommendation engine with Spark GraphX.

Thanks in advance.

Thanks,
Balaji



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-Examples-for-ALS-tp28401.html



Bipartite projection with Graphx

2017-02-03 Thread balaji9058
Hi,

Is bipartite projection possible with GraphX?

Rdd1
#id name
1   x1
2   x2
3   x3
4   x4
5   x5
6   x6
7   x7
8   x8

Rdd2
#id name
10001   y1
10002   y2
10003   y3
10004   y4
10005   y5
10006   y6

EdgeList
#src id Dest id
1   10001
1   10002
2   10001
2   10002
2   10004
3   10003
3   10005
4   10001
4   10004
5   10003
5   10005
6   10003
6   10006
7   10005
7   10006
8   10005

val nodes = Rdd1 ++ Rdd2
val Network = Graph(nodes, links)  // links: the edge RDD built from EdgeList

With the above network, I need to create projection graphs with weighted
edges such as x1-x2 (see the image at the wiki link below for an
example):

https://en.wikipedia.org/wiki/Bipartite_network_projection
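
One way to think about the projection, as a minimal sketch in plain Scala collections rather than GraphX itself: group the edge list by the y-side id, emit every pair of x ids that share that y, and count how often each pair occurs. The count is the simplest possible weight (number of shared neighbours), which is an assumption here; the wiki page also describes other weighting schemes. The object and method names are hypothetical. On RDDs the same idea would be a self-join of the edge list keyed by the destination id, followed by a count per pair.

```scala
object BipartiteProjection {
  // The EdgeList from the post: (x id, y id)
  val edges: Seq[(Long, Long)] = Seq(
    (1L, 10001L), (1L, 10002L), (2L, 10001L), (2L, 10002L), (2L, 10004L),
    (3L, 10003L), (3L, 10005L), (4L, 10001L), (4L, 10004L), (5L, 10003L),
    (5L, 10005L), (6L, 10003L), (6L, 10006L), (7L, 10005L), (7L, 10006L),
    (8L, 10005L)
  )

  // Project onto the x side: weight of (a, b) = number of y nodes linked to both a and b
  def project(edges: Seq[(Long, Long)]): Map[(Long, Long), Int] =
    edges
      .groupBy(_._2)                       // y id -> all edges ending at that y
      .values
      .flatMap { group =>
        val xs = group.map(_._1).distinct.sorted
        for { a <- xs; b <- xs if a < b } yield (a, b)  // each unordered pair once
      }
      .toSeq
      .groupBy(identity)
      .map { case (pair, occurrences) => pair -> occurrences.size }

  def main(args: Array[String]): Unit =
    project(edges).toSeq.sorted.foreach { case ((a, b), w) => println(s"x$a-x$b weight $w") }
}
```

With the data above, x1 and x2 share y1 and y2, so the edge x1-x2 gets weight 2. The resulting pairs could then be turned into Edge objects to build the projected graph.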





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Bipartite-projection-with-Graphx-tp28360.html



Spark Graphx with Database

2016-12-26 Thread balaji9058
Hi All,

I would like to know about Spark GraphX execution/processing with a
database. Yes, I understand that Spark GraphX does in-memory processing and
that to some extent we can manage querying with it, but I would like to do
much more complex queries or processing. Please suggest a use case or steps
for the same.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Graphx-with-Database-tp28253.html



Re: Graphx triplet comparison

2016-12-13 Thread balaji9058
Hi, thanks for the reply.

Here is my code:
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Common base type for all vertex and edge attributes
class BusStopNode(val name: String, val mode: String, val maxpasengers: Int)
  extends Serializable
case class busstop(override val name: String, override val mode: String,
    val shelterId: String, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable
case class busNodeDetails(override val name: String, override val mode: String,
    val srcId: Int, val destId: Int, val arrivalTime: Int, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable
case class routeDetails(override val name: String, override val mode: String,
    val srcId: Int, val destId: Int, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable

val busstopRDD: RDD[(VertexId, BusStopNode)] =
  sc.textFile("\\BusStopNameMini.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ","
    (row(0).toInt, new busstop(row(0), row(3), row(1) + row(0), row(2).toInt))
  }

busstopRDD.foreach(println)

val busNodeDetailsRdd: RDD[(VertexId, BusStopNode)] =
  sc.textFile("\\RouteDetails.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ","
    (row(0).toInt, new busNodeDetails(row(0), row(4), row(1).toInt, row(2).toInt, row(3).toInt, 0))
  }
busNodeDetailsRdd.foreach(println)

val detailedStats: RDD[Edge[BusStopNode]] =
  sc.textFile("\\routesEdgeNew.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ','
    Edge(row(0).toInt, row(1).toInt, new BusStopNode(row(2), row(3), 1))
  }

val busGraph = busstopRDD ++ busNodeDetailsRdd
busGraph.foreach(println)
val mainGraph = Graph(busGraph, detailedStats)
mainGraph.triplets.foreach(println)
val subGraph = mainGraph subgraph (epred = _.srcAttr.name == "101")

// Works fine
for (subTriplet <- subGraph.triplets) {
  println(subTriplet.dstAttr.name)
}

// Works fine
for (mainTriplet <- mainGraph.triplets) {
  println(mainTriplet.dstAttr.name)
}

// Causes an error when iterating over both at the same time
for (subTriplet <- subGraph.triplets) {
  for (mainTriplet <- mainGraph.triplets) { // NullPointerException is thrown here
    if (subTriplet.dstAttr.name.toString.equals(mainTriplet.dstAttr.name)) {
      // Success case: the destination names of the subgraph and main graph match
      println("hello")
    }
  }
}
}

BusStopNameMini.txt
101,bs,10,B
102,bs,10,B
103,bs,20,B
104,bs,14,B
105,bs,8,B


RouteDetails.txt

#101,102,104  4 5 6
#102,103 3 4
#103,105,104 2 3 4
#104,102,101  4 5 6
#104,1015
#105,104,102 5 6 2
1,101,104,5,R
2,102,103,5,R
3,103,104,5,R
4,102,103,5,R
5,104,101,5,R
6,105,102,5,R

routesEdgeNew.txt (contains two types of edges: bus-to-bus, where the edge
value is a distance, and bus-to-route, where the edge value is a time)
#101,102,104  4 5 6
#102,103 3 4
#103,105,104 2 3 4
#104,102,101  4 5 6
#104,1015
#105,104,102 5 6 2
101,102,4,BS
102,104,5,BS
102,103,3,BS
103,105,4,BS
105,104,3,BS
104,102,4,BS
102,101,5,BS
104,101,5,BS
105,104,5,BS
104,102,6,BS
101,1,4,R,102
101,1,4,R,103
102,2,5,R
103,3,6,R
103,3,5,R
104,4,7,R
105,5,4,Z
101,2,9,R
105,5,4,R
105,2,5,R
104,2,5,R
103,1,4,R
101,103,4,BS
101,104,4,BS
101,105,4,BS
101,103,5,BS
101,104,5,BS
101,105,5,BS
1,101,4,R







--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-comparison-tp28198p28205.html



Graphx triplet comparison

2016-12-12 Thread balaji9058
Hi,

I would like to know how to do a GraphX triplet comparison in Scala.

For example, there are two sets of triplets:

val triplet1 = mainGraph.triplets.filter(condition1)
val triplet2 = mainGraph.triplets.filter(condition2)

Now I want to compare triplet1 and triplet2 using condition3.
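
One possible approach, sketched under assumptions: pair up the two triplet RDDs with RDD.cartesian and filter the pairs by the third condition. The toy graph, the TripletCompare object, and all three conditions below are placeholders invented for illustration. Each triplet is first mapped to a plain tuple so that only simple values are paired.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}

object TripletCompare {
  // Plain summary of a triplet: (srcId, dstId, edge attribute)
  type Summary = (Long, Long, Int)

  def comparePairs(sc: SparkContext): Array[(Summary, Summary)] = {
    // Hypothetical toy graph: vertex attribute is a name, edge attribute is a weight
    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges = sc.parallelize(Seq(Edge(1L, 3L, 4), Edge(2L, 3L, 5), Edge(1L, 2L, 9)))
    val mainGraph = Graph(vertices, edges)

    // condition1 and condition2 (placeholders): weight below 5, weight of 5 or more
    val triplet1 = mainGraph.triplets.filter(_.attr < 5).map(t => (t.srcId, t.dstId, t.attr))
    val triplet2 = mainGraph.triplets.filter(_.attr >= 5).map(t => (t.srcId, t.dstId, t.attr))

    // condition3 (placeholder): both triplets end at the same destination vertex
    triplet1.cartesian(triplet2).filter { case (t1, t2) => t1._2 == t2._2 }.collect()
  }
}
```

Note that cartesian produces every combination of the two sides, so it is only practical when at least one side is small after filtering.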





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-comparison-tp28198.html



Graphx triplet loops causing null pointer exception

2016-12-12 Thread balaji9058
HI,

I am getting a NullPointerException when I execute a triplet loop inside
another triplet loop.

This works fine:
for (mainTriplet <- mainGraph.triplets) {
  println(mainTriplet.dstAttr.name)
}

This also works fine:
for (subTriplet <- subGraph.triplets) {
  println(subTriplet.dstAttr.name)
}

The code below causes the issue:
1. for (subTriplet <- subGraph.triplets) {
2.   println(subTriplet.dstAttr.name)
3.   for (mainTriplet <- mainGraph.triplets) {
4.     println(mainTriplet.dstAttr.name)
5.     // here I want to do an operation like subTriplet.dstAttr.name.toString.equals(mainTriplet.dstAttr.name)
6.   }
7. }

Line 2 prints only once, and then I immediately get the error below:

ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 10)
java.lang.NullPointerException
at org.apache.spark.graphx.impl.GraphImpl.triplets$lzycompute(GraphImpl.scala:50)
at org.apache.spark.graphx.impl.GraphImpl.triplets(GraphImpl.scala:49)


Please help, and let me know if anything more is needed in my explanation.
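
A likely explanation, hedged because the details depend on the Spark version: RDD operations cannot be nested. The body of the outer loop runs on the executors, where the captured mainGraph no longer has usable driver-side RDD references, so touching mainGraph.triplets there fails with a NullPointerException. A common workaround is to express the comparison as a single RDD operation. The sketch below keys both triplet sets by the destination name and joins them; the toy graph and the TripletJoin object are invented for illustration and are not the poster's data.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}

object TripletJoin {
  // For every pair of triplets whose destination vertices carry the same name, returns
  // ((sub srcId, sub dstId), (main srcId, main dstId)).
  def matchingPairs(sc: SparkContext): Array[((Long, Long), (Long, Long))] = {
    // Hypothetical toy graph: the vertex attribute is the stop name
    val vertices = sc.parallelize(Seq((101L, "101"), (102L, "102"), (104L, "104")))
    val edges = sc.parallelize(
      Seq(Edge(101L, 102L, "BS"), Edge(104L, 102L, "BS"), Edge(102L, 104L, "BS")))
    val mainGraph = Graph(vertices, edges)
    val subGraph = mainGraph.subgraph(epred = _.srcAttr == "101")

    // Key each triplet by its destination name, keeping only plain values
    val subByDst = subGraph.triplets.map(t => (t.dstAttr, (t.srcId, t.dstId)))
    val mainByDst = mainGraph.triplets.map(t => (t.dstAttr, (t.srcId, t.dstId)))

    // One distributed join replaces the nested loops
    subByDst.join(mainByDst).values.collect()
  }
}
```

If the subgraph is small, collecting its destination names to the driver and filtering mainGraph.triplets against that set would be another option.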



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-loops-causing-null-pointer-exception-tp28195.html