[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-03-13 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361319#comment-14361319
 ] 

Apache Spark commented on SPARK-5790:
-

User 'brennonyork' has created a pull request for this issue:
https://github.com/apache/spark/pull/5023

> VertexRDD's won't zip properly for `diff` capability
> 
>
> Key: SPARK-5790
> URL: https://issues.apache.org/jira/browse/SPARK-5790
> Project: Spark
> Issue Type: Bug
> Components: GraphX
> Reporter: Brennon York
> Assignee: Brennon York
>
> For VertexRDDs with differing numbers of partitions, one cannot run commands 
> like `diff`, as it will throw an IllegalArgumentException. The code below 
> provides an example:
> {code}
> import org.apache.spark.graphx._
> import org.apache.spark.rdd._
> val setA: VertexRDD[Int] = VertexRDD(sc.parallelize(0L until 3L).map(id => (id, id.toInt+1)))
> setA.collect.foreach(println(_))
> val setB: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)))
> setB.collect.foreach(println(_))
> val diff = setA.diff(setB)
> diff.collect.foreach(println(_))
> val setC: VertexRDD[Int] = VertexRDD(
>   sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)) ++
>   sc.parallelize(6L until 8L).map(id => (id, id.toInt+2)))
> setA.diff(setC).collect
> // java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-03-13 Thread Brennon York (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360613#comment-14360613
 ] 

Brennon York commented on SPARK-5790:
-

[~maropu] did you get those tests in a PR or into the master branch for Spark? 
I was going to close this issue, but wanted to make sure we didn't lose those 
tests! :)




[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-02-23 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334292#comment-14334292
 ] 

Takeshi Yamamuro commented on SPARK-5790:
-

Thanks for your work :)




[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-02-20 Thread Brennon York (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329252#comment-14329252
 ] 

Brennon York commented on SPARK-5790:
-

[~maropu] this looks very similar to the work I just pushed up for 
[SPARK-1955|https://github.com/apache/spark/pull/4705], which was acting as the 
overarching issue for this ticket. I didn't write tests, though, and those would 
be a major benefit. Would you be willing to refactor your patch so that it only 
includes the tests, and we can close this issue out? That would help out 
tremendously and I wouldn't want to lose that effort!




[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-02-20 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328778#comment-14328778
 ] 

Takeshi Yamamuro commented on SPARK-5790:
-

Hi,

What's the status of your work?
I fixed this bug, so if you haven't finished yet, please refer to my patch:
https://github.com/maropu/spark/commit/1f64794b2ce33e64f340e383d4e8a60639a7eb4b

Thanks.




[jira] [Commented] (SPARK-5790) VertexRDD's won't zip properly for `diff` capability

2015-02-12 Thread Brennon York (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319316#comment-14319316
 ] 

Brennon York commented on SPARK-5790:
-

FWIW this issue is a blocker for 
[SPARK-4600|https://issues.apache.org/jira/browse/SPARK-4600], which I'm working 
on. `diff` relies on `zipPartitions`, and that is what causes this error. If 
someone could assign this to me I'll continue working on this issue.
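
The `zipPartitions` constraint mentioned above can be illustrated with plain pair RDDs. This is only a minimal sketch, not the change made for this ticket: it assumes a local `SparkContext`, and the object name, the partition counts, and the `HashPartitioner` workaround are all hypothetical choices made for illustration.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import scala.util.Try

// Hypothetical standalone sketch; the names here are not from the ticket.
object ZipPartitionsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("zip-partitions-sketch")
    val sc = new SparkContext(conf)
    try {
      // Two pair RDDs with deliberately different partition counts (2 vs. 4).
      val a = sc.parallelize(0L until 3L, numSlices = 2).map(id => (id, id.toInt + 1))
      val b = sc.parallelize(2L until 4L, numSlices = 4).map(id => (id, id.toInt + 2))

      // zipPartitions pairs partition i of `a` with partition i of `b`,
      // so this attempt is expected to fail with an IllegalArgumentException.
      val attempt = Try(a.zipPartitions(b)((x, y) => x ++ y).collect())
      println(s"unequal partition counts: $attempt")

      // Assumed workaround: put both sides under the same partitioner
      // so that their partition counts match before zipping.
      val part = new HashPartitioner(2)
      val zipped = a.partitionBy(part).zipPartitions(b.partitionBy(part))((x, y) => x ++ y)
      println(zipped.collect().sorted.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Repartitioning both sides costs a shuffle, so in practice one would probably only repartition the side whose partitioner differs.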
