Maybe connected components is what you need?
On Oct 5, 2015 19:02, "Robineast" wrote:
> GraphX has a Shortest Paths algorithm implementation which will tell you, for
> all vertices in the graph, the shortest distance to a specific ('landmark')
> vertex. The returned
Helena,
The CassandraInputDStream sounds interesting. I can't find much about it in
the JIRA, though. Do you have more details on what it tries to achieve?
Thanks,
Anwar.
On Tue, Mar 24, 2015 at 2:39 PM, Helena Edelson helena.edel...@datastax.com
wrote:
Streaming _from_ Cassandra,
It looks similar to (but simpler than) the connected components
implementation in GraphX.
Have you checked that?
I have a question, though: in your example, the graph is a tree. What is the
behavior if it is a more general graph?
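To make the general-graph case concrete, here is a minimal sketch in plain Scala (not the GraphX code itself) that labels every vertex with the smallest vertex id in its component, treating edges as undirected. The object and method names are made up for illustration, and the example graph contains a cycle, so it is not a tree.

```scala
object ConnectedComponentsSketch {
  /** Labels every vertex with the smallest vertex id in its component.
    * Edges are treated as undirected. Hypothetical helper, not a GraphX API. */
  def connectedComponents(vertices: Set[Long], edges: Seq[(Long, Long)]): Map[Long, Long] = {
    // Include edge endpoints so that every label lookup below succeeds.
    val all = vertices ++ edges.flatMap { case (a, b) => Seq(a, b) }
    // Start with every vertex labelled by its own id.
    var labels: Map[Long, Long] = all.map(v => v -> v).toMap
    var changed = true
    // Repeat until no edge connects two different labels.
    while (changed) {
      changed = false
      for ((a, b) <- edges) {
        val m = math.min(labels(a), labels(b))
        if (labels(a) != m || labels(b) != m) {
          labels = labels.updated(a, m).updated(b, m)
          changed = true
        }
      }
    }
    labels
  }
}

object ConnectedComponentsDemo extends App {
  // A cycle 1-2-3-1, a separate edge 4-5, and an isolated vertex 6.
  val cc = ConnectedComponentsSketch.connectedComponents(
    Set(1L, 2L, 3L, 4L, 5L, 6L),
    Seq((1L, 2L), (2L, 3L), (3L, 1L), (4L, 5L)))
  cc.toSeq.sortBy(_._1).foreach { case (v, c) => println(s"vertex $v -> component $c") }
}
```

The loop stops once both ends of every edge carry the same label, so cycles do not cause problems; they only mean some edges are redundant.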
Cheers,
Anwar Rizal.
On Mon, Jan 12, 2015 at 1:02 AM, dizzy5112
why mapPartitionsWithInputSplit has the DeveloperApi
annotation? Is it possible to remove it?
Best regards,
Anwar Rizal.
On Sun, Dec 21, 2014 at 10:47 PM, Shuai Zheng szheng.c...@gmail.com wrote:
I just found a possible answer:
http://themodernlife.github.io/scala/spark/hadoop/hdfs/2014/09/28
I presume that you need access to the path of each file you are
reading.
I don't know whether there is a good way to do that for HDFS. I needed to
read the files myself, with something like:
def openWithPath(inputPath: String, sc:SparkContext) = {
val fs= (new
Can you clarify what you're trying to achieve here?
If you want only the top 10 of each RDD, why not sort and then call
take(10) on every RDD?
Or do you want the top 10 over five minutes?
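The difference between the two readings can be sketched with plain Scala collections standing in for the RDDs of the stream. This is only an illustration under assumptions: each batch is taken to hold (key, count) pairs, and the names TopTenSketch, topPerBatch and topOverWindow are hypothetical, not Spark APIs.

```scala
object TopTenSketch {
  // One batch of (key, count) pairs; stands in for a single RDD of the stream.
  type Batch = Seq[(String, Int)]

  /** Top n of a single batch: sort by count descending (ties by key), then take n. */
  def topPerBatch(batch: Batch, n: Int = 10): Batch =
    batch.sortBy { case (key, count) => (-count, key) }.take(n)

  /** Top n over several batches (e.g. five minutes' worth):
    * merge the counts per key first, then sort and take n. */
  def topOverWindow(batches: Seq[Batch], n: Int = 10): Batch = {
    val merged = batches.flatten
      .groupBy { case (key, _) => key }
      .map { case (key, pairs) => key -> pairs.map(_._2).sum }
      .toSeq
    merged.sortBy { case (key, count) => (-count, key) }.take(n)
  }
}

object TopTenDemo extends App {
  val b1 = Seq(("a", 3), ("b", 1))
  val b2 = Seq(("b", 5), ("c", 2))
  println(TopTenSketch.topPerBatch(b1, 1))            // top 1 of one batch
  println(TopTenSketch.topOverWindow(Seq(b1, b2), 2)) // top 2 over both batches
}
```

The per-batch version never sees counts from other batches, so a key that is only moderately frequent in every batch can be missing from each per-batch top 10 and still rank high over the whole window.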
Cheers,
On Thu, May 29, 2014 at 2:04 PM, nilmish nilmish@gmail.com wrote:
I have a DSTREAM