[GitHub] spark pull request: [SPARK-2823][GraphX]fix GraphX EdgeRDD zipPart...

2014-12-23 Thread luluorta
Github user luluorta commented on the pull request:

https://github.com/apache/spark/pull/1763#issuecomment-68027993
  
Thanks, @Earne 

Actually, we already have a way to customize the number of EdgeRDD 
partitions: `Graph.partitionBy` 
[Graph.scala#L136](https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala#L136).

I think a better name for the parameter of `coalesce(numEdgePartitions)` would 
be `maxEdgePartitions`, because it only ensures that the generated EdgeRDD has 
no more than `maxEdgePartitions` partitions.
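
To illustrate why "max" describes the parameter better than "num", here is a minimal plain-Scala sketch of coalescing without a shuffle. It does not use Spark: `CoalesceDemo` and its `coalesce` function are hypothetical stand-ins that model partitions as nested vectors, on the assumption that a shuffle-free coalesce can merge partitions but never split them.

```scala
// Hypothetical model, not Spark's API: a "partitioned dataset" is a Vector of partitions.
object CoalesceDemo {
  // Merge neighbouring partitions so that at most `maxPartitions` remain.
  // The partition count is never increased, so the argument acts as an upper bound.
  def coalesce[T](partitions: Vector[Vector[T]], maxPartitions: Int): Vector[Vector[T]] = {
    require(maxPartitions > 0, "maxPartitions must be positive")
    val target = math.min(partitions.length, maxPartitions)
    partitions.zipWithIndex
      .groupBy { case (_, i) => i * target / partitions.length } // contiguous buckets 0 until target
      .toVector
      .sortBy(_._1)
      .map { case (_, group) => group.flatMap(_._1) }
  }
}
```

Asking for fewer partitions than exist merges them, while asking for more leaves the count unchanged, which is the behaviour the name `maxEdgePartitions` is meant to convey.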


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4115][GraphX] Add overrided count for e...

2014-10-28 Thread luluorta
GitHub user luluorta opened a pull request:

https://github.com/apache/spark/pull/2975

[SPARK-4115][GraphX] Add overrided count for edge counting of EdgeRDD.

Accumulate the sizes of all the EdgePartitions, just as VertexRDD does.
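
The idea can be sketched in plain Scala, without Spark. Everything below is an assumption made for illustration: `EdgeCountDemo`, its simplified `EdgePartition` class, and both counting functions are hypothetical and are not the code in this pull request. The sketch only shows that summing the per-partition sizes gives the same answer as iterating over every edge.

```scala
// Hypothetical model, not GraphX code: each partition stores its edges in parallel arrays.
object EdgeCountDemo {
  final case class EdgePartition(srcIds: Array[Long], dstIds: Array[Long]) {
    require(srcIds.length == dstIds.length, "every edge needs a source and a destination")
    def size: Int = srcIds.length // known without touching any edge
    def iterator: Iterator[(Long, Long)] = srcIds.iterator.zip(dstIds.iterator)
  }

  // Generic counting: walk every edge of every partition.
  def countByIterating(parts: Seq[EdgePartition]): Long =
    parts.map(_.iterator.size.toLong).sum

  // Overridden counting: add up the sizes each partition already knows.
  def countBySize(parts: Seq[EdgePartition]): Long =
    parts.map(_.size.toLong).sum
}
```

The second function does one addition per partition rather than one step per edge, which is the likely motivation for overriding `count`.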

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/luluorta/spark graph-edge-count

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2975


commit 86ef0e5a820955b3d90eb0d5b6108111aa1a462d
Author: luluorta 
Date:   2014-10-28T07:34:45Z

Add overrided count for edge counting of EdgeRDD.







[GitHub] spark pull request: [SPARK-4109] Correctly deserialize Task.stageI...

2014-10-27 Thread luluorta
GitHub user luluorta opened a pull request:

https://github.com/apache/spark/pull/2971

[SPARK-4109] Correctly deserialize Task.stageId

The two subclasses of Task, ShuffleMapTask and ResultTask, do not correctly 
deserialize stageId. As a result, TaskContext.stageId always returns zero 
to the user.
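
One way a symptom like this can arise with `java.io.Externalizable` is sketched below. This is an assumption for illustration, not a copy of Spark's code: `StageIdDemo`, `BuggyTask`, `FixedTask`, and `roundTrip` are hypothetical names. In the buggy class, `readExternal` reads the value into a local variable that shadows the field, so the field keeps the default from the no-argument constructor.

```scala
import java.io._

// Hypothetical stand-ins for task classes; not Spark's actual Task hierarchy.
object StageIdDemo {
  class BuggyTask(var stageId: Int, var partitionId: Int) extends Externalizable {
    def this() = this(0, 0) // no-arg constructor, as Externalizable requires
    override def writeExternal(out: ObjectOutput): Unit = {
      out.writeInt(stageId)
      out.writeInt(partitionId)
    }
    override def readExternal(in: ObjectInput): Unit = {
      val stageId = in.readInt() // BUG: local val shadows the field, which stays 0
      partitionId = in.readInt()
    }
  }

  class FixedTask(var stageId: Int, var partitionId: Int) extends Externalizable {
    def this() = this(0, 0)
    override def writeExternal(out: ObjectOutput): Unit = {
      out.writeInt(stageId)
      out.writeInt(partitionId)
    }
    override def readExternal(in: ObjectInput): Unit = {
      stageId = in.readInt() // assign to the field
      partitionId = in.readInt()
    }
  }

  // Write `src` and read the bytes back into `dst`, a freshly constructed instance,
  // which mirrors how an Externalizable object is rebuilt from its no-arg constructor.
  def roundTrip[T <: Externalizable](src: T, dst: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    src.writeExternal(out)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    dst.readExternal(in)
    in.close()
    dst
  }
}
```

Round-tripping a `BuggyTask` created with stage 7 yields a copy whose `stageId` is 0, while the `FixedTask` copy keeps 7.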

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/luluorta/spark fix-task-ser

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2971


commit ff35ee6522743989fb85ae5382f28e040af9d8a1
Author: luluorta 
Date:   2014-10-28T05:45:08Z

correctly deserialize Task.stageId







[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...

2014-08-04 Thread luluorta
GitHub user luluorta opened a pull request:

https://github.com/apache/spark/pull/1767

[SPARK-2827][GraphX]Add degree distribution operators in GraphOps for GraphX



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/luluorta/spark graphx-degree

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1767.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1767


commit d1e053f3c7a95272676edfad485a31f69290effd
Author: luluorta 
Date:   2014-08-04T08:42:28Z

add max/min degree

commit 1c35298bfd3bea5b8eeba6bb4804b3fe74ff7fd9
Author: luluorta 
Date:   2014-08-04T08:56:29Z

add  degree distribution







[GitHub] spark pull request: fix GraphX EdgeRDD zipPartitions

2014-08-04 Thread luluorta
GitHub user luluorta opened a pull request:

https://github.com/apache/spark/pull/1763

fix GraphX EdgeRDD zipPartitions

If the user sets `spark.default.parallelism` to a value that differs from 
the number of EdgeRDD partitions, GraphX jobs throw:
java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
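
The failure mode can be modelled in plain Scala, without Spark. In this sketch `ZipPartitionsDemo` and its `zipPartitions` function are hypothetical: they treat a dataset as a vector of partitions and pair partitions by index, which is why the two inputs must have the same number of partitions. Only the exception message is taken from the report above.

```scala
// Hypothetical model, not Spark's RDD API: a dataset is a Vector of partitions.
object ZipPartitionsDemo {
  // Combine the i-th partition of `left` with the i-th partition of `right`.
  def zipPartitions[A, B, C](left: Vector[Vector[A]], right: Vector[Vector[B]])(
      f: (Iterator[A], Iterator[B]) => Iterator[C]): Vector[Vector[C]] = {
    if (left.length != right.length) {
      throw new IllegalArgumentException("Can't zip RDDs with unequal numbers of partitions")
    }
    left.zip(right).map { case (l, r) => f(l.iterator, r.iterator).toVector }
  }
}
```

If one side is built with a partition count taken from a default parallelism setting and the other keeps its own count, the lengths differ and the check fails. Making both sides agree on the partition count avoids the exception.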

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/luluorta/spark fix-graph-zip

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1763.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1763


commit 83389614959fb2c84b947362af1e0babbfe767d5
Author: luluorta 
Date:   2014-08-04T07:03:17Z

fix GraphX EdgeRDD zipPartitions



