[GitHub] spark pull request: [SPARK-4884]: Improve Partition docs
Github user msiddalingaiah commented on the pull request: https://github.com/apache/spark/pull/3722#issuecomment-67496889 @ash211 Not a problem, I created a JIRA ticket and updated the title/description. Thanks!! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [DOC]: Improve Partition docs
GitHub user msiddalingaiah opened a pull request: https://github.com/apache/spark/pull/3722
[DOC]: Improve Partition docs
Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/msiddalingaiah/spark master
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3722.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3722
commit 0fc12d712a7544ec9a522a7fe7d53e697a3f91f2 Author: Madhu Siddalingaiah Date: 2014-11-20T20:13:43Z Documentation: add description for repartitionAndSortWithinPartitions
commit cd2b05a02cf46e4103c24ece346e2f6013aeacc2 Author: Madhu Siddalingaiah Date: 2014-12-01T13:44:51Z Merge remote-tracking branch 'upstream/master'
commit 332f7a29b3e8ad47dd5a083e6509bdb98e2e555f Author: Madhu Siddalingaiah Date: 2014-12-01T13:47:16Z Documentation: replace with
commit cbccbfe8b3872ca6885960d33e4a7b19046e6e0c Author: Madhu Siddalingaiah Date: 2014-12-01T13:51:16Z Documentation: replace with (again)
commit 38faca4ba7ad89392ff5c12737f5bcbc86709d81 Author: Madhu Siddalingaiah Date: 2014-12-01T19:33:43Z Merge remote-tracking branch 'upstream/master'
commit 51d14b91e494c817211ce0eddc56e7a610543d90 Author: Madhu Siddalingaiah Date: 2014-12-17T14:51:25Z Merge remote-tracking branch 'upstream/master'
commit 79e679fd4aee8284c774267e4d21e764a39f62cb Author: Madhu Siddalingaiah Date: 2014-12-17T15:00:02Z [DOC]: improve documentation
[GitHub] spark pull request: Documentation: add description for repartition...
GitHub user msiddalingaiah commented on the pull request: https://github.com/apache/spark/pull/3390#issuecomment-65067064 OK, done. Please review.
[GitHub] spark pull request: Documentation: add description for repartition...
GitHub user msiddalingaiah opened a pull request: https://github.com/apache/spark/pull/3390
Documentation: add description for repartitionAndSortWithinPartitions
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/msiddalingaiah/spark master
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3390.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3390
commit 0fc12d712a7544ec9a522a7fe7d53e697a3f91f2 Author: Madhu Siddalingaiah Date: 2014-11-20T20:13:43Z Documentation: add description for repartitionAndSortWithinPartitions
[GitHub] spark pull request: [SPARK-983] Support external sorting
GitHub user msiddalingaiah commented on the pull request: https://github.com/apache/spark/pull/1090#issuecomment-46497060 @jerryshao @andrewor14 @xiajunluan I'm confused. Does the PR mentioned above also address SPARK-983? SPARK-983 was assigned to me some time ago. Please advise.
[GitHub] spark pull request: [SPARK-983] Support external sorting
GitHub user msiddalingaiah commented on the pull request: https://github.com/apache/spark/pull/1090#issuecomment-46218706 Thanks. It's not clear to me if it can use ExternalAppendOnlyMap as the underlying buffer either. There was some discussion in JIRA about how to handle memory management. There was no consensus at the time, so there was duplication. Some code can be factored into a common class, but I chose not to change too much at once. I'm tied up in the near future. When does this have to be resolved?
[GitHub] spark pull request: [SPARK-983] Support external sorting
GitHub user msiddalingaiah opened a pull request: https://github.com/apache/spark/pull/1090
[SPARK-983] Support external sorting
This satisfies SPARK-983: Support external sorting for RDD#sortByKey(). It also adds a general sortPartitions() method to RDD.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/msiddalingaiah/spark master
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1090.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1090
commit a4f7404ac28a913b4b5cc390e3f321ae6627b941 Author: msiddalingaiah Date: 2014-05-05T03:23:46Z Address SPARK-1717
commit db47ac52f293b32af277fb7dfdaa6f484e7dde93 Author: Siddalingaiah Madhukesh Date: 2014-05-05T14:55:57Z Merge remote-tracking branch 'upstream/master'
commit eb0d39f7603627a86ca4248a260a59fd041d7068 Author: msiddalingaiah Date: 2014-06-01T03:01:54Z Merge remote-tracking branch 'upstream/master'
commit a76b3a15bba7d23fb4d3cdd5cc57f0c8bef8efca Author: msiddalingaiah Date: 2014-06-01T18:16:32Z Add sortPartitions(...)
commit f7dde2ecd38e099cdfc97336230d8bfc473c8179 Author: msiddalingaiah Date: 2014-06-11T00:50:29Z Merge remote-tracking branch 'upstream/master'
commit 63825ba3ff21a72a1308e97c9b5fe3368a86c68e Author: msiddalingaiah Date: 2014-06-11T02:33:44Z Basic merge sert
commit 1d7709e5f013fc76bca92a97e7108efbc38071ec Author: msiddalingaiah Date: 2014-06-12T00:30:08Z [SPARK-983][WIP] Add DiskBuffer/DiskBufferIterator
commit 5c2b55ab314b9c961a41910076d46820eef5e2c1 Author: msiddalingaiah Date: 2014-06-12T01:39:11Z [SPARK-983][WIP] refactor code/spill sublists
commit 38a82c4be1e5d99fce9ed50660c2062f80af3584 Author: msiddalingaiah Date: 2014-06-14T03:35:52Z Merge remote-tracking branch 'upstream/master'
commit 3dd4be624a082214113fbf3f7b2ab655cd14bf92 Author: msiddalingaiah Date: 2014-06-14T03:57:33Z [SPARK-983][WIP] fetch upstream/refactor code
commit 5ffdf96780e5a48c2542c3bb10c4459c9fbaea00 Author: msiddalingaiah Date: 2014-06-14T18:47:21Z [SPARK-983][WIP] add SizeEstimator
commit ad1a1a8b1a2e28d3bd900a0075c7a0e2ff741a20 Author: msiddalingaiah Date: 2014-06-15T01:51:58Z [SPARK-983] complete disk spill logic
commit 929955744fc401808a7c5ad72e581cf5a62ade0e Author: msiddalingaiah Date: 2014-06-15T01:53:05Z Merge remote-tracking branch 'upstream/master'
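For readers unfamiliar with the term, the general idea behind external sorting (sort bounded chunks in memory, spill each sorted run to disk, then merge the runs) can be sketched roughly as below. This is an illustrative Python sketch only, not the code from the pull request: the names `external_sort`, `_spill`, `_read_run`, and `max_in_memory` are hypothetical, and a simple record-count threshold is assumed here in place of any memory-size estimate.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_run):
    """Write one sorted run to an anonymous temporary file and rewind it."""
    f = tempfile.TemporaryFile()
    for record in sorted_run:
        pickle.dump(record, f)
    f.seek(0)
    return f


def _read_run(f):
    """Lazily read records back from a spilled run until end of file."""
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        f.close()


def external_sort(records, max_in_memory=1000, key=None):
    """Return an iterator over `records` in sorted order.

    Hypothetical helper: at most `max_in_memory` records are buffered
    before the buffer is sorted and spilled to disk as one run.
    """
    runs = []
    buffer = []
    for record in records:
        buffer.append(record)
        if len(buffer) >= max_in_memory:
            buffer.sort(key=key)
            runs.append(_spill(buffer))
            buffer = []
    buffer.sort(key=key)  # the last, possibly partial, run stays in memory
    iterators = [_read_run(f) for f in runs] + [iter(buffer)]
    # k-way merge of the sorted runs, reading one record per run at a time
    return heapq.merge(*iterators, key=key)
```

A call such as `external_sort(pairs, max_in_memory=8, key=lambda kv: kv[0])` would yield key-value pairs ordered by key while holding only one chunk, plus one record per spilled run, in memory at a time.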
[GitHub] spark pull request: Address SPARK-1717
GitHub user msiddalingaiah opened a pull request: https://github.com/apache/spark/pull/641
Address SPARK-1717
I tested the change locally with Spark 0.9.1, but I could not test with 1.0.0 because there was no AMI for it at the time. It's a trivial fix, so it should not cause any problems.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/msiddalingaiah/spark master
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/641.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #641
commit a4f7404ac28a913b4b5cc390e3f321ae6627b941 Author: msiddalingaiah Date: 2014-05-05T03:23:46Z Address SPARK-1717