[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-11-06 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 @hhbyyh Thanks for your suggestions. Will try to incorporate these in a day or so. --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-11-06 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r149215757 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -105,6 +106,56 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-11 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r144124146 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143573574 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143572643 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143572667 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143572179 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143572149 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143570173 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143570164 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143569334 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143569449 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143567888 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143567788 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143567807 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143566804 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529694 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -245,5 +508,28 @@ class Word2VecSuite extends SparkFunSuite

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529286 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -189,6 +305,136 @@ class Word2VecSuite extends SparkFunSuite

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528339 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528173 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516772 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516595 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516496 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516384 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-10-05 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 Thanks for your comments/suggestions @MLnick and @sethah . Working on incorporating these.

[GitHub] spark pull request #18123: [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative ...

2017-05-26 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/18123 [SPARK-20903] [ML] Word2Vec Skip-Gram + Negative Sampling ## What changes were proposed in this pull request? This enhances [CBOW + Negative Sampling](https://github.com/apache

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-05-18 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 Code-review comments/suggestions so far have been incorporated. Thanks for looking into the code. Happy to incorporate more suggestions and feedback. --- If your project is set up

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-05-05 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r115009247 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -36,7 +36,10 @@ import org.apache.spark.util.{Utils, VersionUtils

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-05-04 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 @MLnick I half expected that. No worries. I have incorporated some of your feedback in the meantime and also added subsampling. Thanks for looking into the code.

[GitHub] spark issue #17519: [SPARK-15352][Doc] follow-up: add configuration docs for...

2017-05-04 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17519 @lins05 Apologies for the delay in responding and thanks for adding the docs. LGTM.

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-04-28 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 @Krimit _Can you provide some information about the practical differences between CBOW and skip-grams?_ ![Model Architectures](https://cloud.githubusercontent.com/assets/6588487

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-04-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 @Krimit @MLnick @hhbyyh I am working on getting your earlier queries answered. @Krimit Thanks for looking into the code, I will try to get the code-review feedback incorporated

[GitHub] spark issue #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Words mode...

2017-04-19 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17673 The [original paper](https://arxiv.org/abs/1301.3781) proposed two model architectures for generating word embeddings: the Continuous Skip-Gram model and the Continuous Bag-of-Words model. Spark ML

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-04-18 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/17673 [SPARK-20372] [ML] Word2Vec Continuous Bag of Words model ## What changes were proposed in this pull request? This adds Continuous Bag of Words implementation to Word2Vec

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-03-28 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 Rebased to master

[GitHub] spark pull request #17325: [SPARK-19803][CORE][TEST] Proactive replication t...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17325#discussion_r108261401 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -481,27 +481,39 @@ class

[GitHub] spark issue #17325: [SPARK-19803][CORE][TEST] Proactive replication test fai...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17325 Thanks @kayousterhout !

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 retest this please

[GitHub] spark pull request #17325: [SPARK-19803][CORE][TEST] Proactive replication t...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17325#discussion_r108241606 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1179,9 +1179,13 @@ private[spark] class BlockManager

[GitHub] spark pull request #13932: [SPARK-15354] [CORE] Topology aware block replica...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13932#discussion_r108204165 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPolicy.scala --- @@ -53,6 +53,46 @@ trait BlockReplicationPolicy

[GitHub] spark pull request #13932: [SPARK-15354] [CORE] Topology aware block replica...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13932#discussion_r108203064 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockReplicationPolicySuite.scala --- @@ -68,7 +68,60 @@ class BlockReplicationPolicySuite

[GitHub] spark pull request #13932: [SPARK-15354] [CORE] Topology aware block replica...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13932#discussion_r108202578 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPolicy.scala --- @@ -88,26 +129,96 @@ class RandomBlockReplicationPolicy

[GitHub] spark pull request #17325: [SPARK-19803][CORE][TEST] Proactive replication t...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17325#discussion_r108198851 --- Diff: core/src/test/scala/org/apache/spark/storage/BlockManagerReplicationSuite.scala --- @@ -481,27 +481,39 @@ class

[GitHub] spark issue #17325: [SPARK-19803][CORE][TEST] Proactive replication test fai...

2017-03-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/17325 Elaborating a little more on how replication happens and what the code change here does: Spark executors cache a list of peers that is refreshed every 60s by default. When replicating

[GitHub] spark pull request #17325: [SPARK-19803][CORE][TEST] Proactive replication t...

2017-03-16 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/17325 [SPARK-19803][CORE][TEST] Proactive replication test failures ## What changes were proposed in this pull request? Executors cache a list of their peers that is refreshed by default every

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-28 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 test this please

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 test this please

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 Rebased to resolve merge conflicts.

[GitHub] spark issue #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-23 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/14412 "spark.storage.exceptionOnPinLeak" based check only works if executors are created. I put in an assertion check using logic similar to it in the testProactiveReplication tests.

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101854267 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -195,17 +198,39 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101851236 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -195,17 +198,39 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101847693 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1131,14 +1131,43 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101847672 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1131,14 +1131,43 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101843543 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -195,17 +198,39 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-17 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r101843045 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1131,14 +1131,47 @@ private[spark] class BlockManager

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-03 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 jenkins ok to test

[GitHub] spark issue #13932: [SPARK-15354] [CORE] Topology aware block replication st...

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 No test errors. Looks like the test process was killed midway. Tests added as a part of this PR took less than 7s, so couldn't have caused the delay.

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99237148 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -188,24 +189,45 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99190414 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala --- @@ -188,24 +189,45 @@ class BlockManagerMasterEndpoint

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99189219 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1152,20 +1185,25 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99185105 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1152,20 +1185,25 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99174290 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1131,14 +1131,47 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] Proactive block replication

2017-02-02 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r99174354 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1131,14 +1131,47 @@ private[spark] class BlockManager

[GitHub] spark issue #13932: [SPARK-15354] [CORE] [WIP] Topology aware block replicat...

2016-12-14 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 Rebased to master to resolve merge conflict

[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...

2016-09-20 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13152 Rebased to master to resolve merge conflicts

[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...

2016-08-29 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13152 Thanks for the suggestions. I have corrected the style check errors and verified that locally, so hopefully there are no more style errors. I have also done a couple of modifications per

[GitHub] spark pull request #13152: [SPARK-15353] [CORE] Making peer selection for bl...

2016-08-29 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r7684 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockReplicationPolicy.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #13152: [SPARK-15353] [CORE] Making peer selection for bl...

2016-08-12 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r74650320 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1088,109 +1108,88 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] [WIP] Proactive block replic...

2016-08-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/14412#discussion_r74088323 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala --- @@ -37,10 +37,11 @@ import org.apache.spark.util.Utils class

[GitHub] spark pull request #13152: [SPARK-15353] [CORE] Making peer selection for bl...

2016-08-05 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r73751665 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1088,109 +1108,88 @@ private[spark] class BlockManager

[GitHub] spark pull request #13152: [SPARK-15353] [CORE] Making peer selection for bl...

2016-08-04 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r73593762 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -159,8 +162,27 @@ private[spark] class BlockManager

[GitHub] spark pull request #14412: [SPARK-15355] [CORE] [WIP] Proactive block replic...

2016-07-29 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/14412 [SPARK-15355] [CORE] [WIP] Proactive block replication ## What changes were proposed in this pull request? We are proposing the addition of proactive block replication in case

[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...

2016-07-27 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13152 The state being managed inside getRandomPeer() is also modified in a couple of other places, so it won't be a very clean change to move some of it out of getRandomPeer(). Even if that is done

[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...

2016-07-26 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13152 The topology info is only queried when the executor starts up and is assumed to stay the same throughout the life of the executor. Depending on the cluster manager being used, I am assuming

[GitHub] spark issue #13932: [SPARK-15354] [CORE] [WIP] Topology aware block replicat...

2016-07-22 Thread shubhamchopra
Github user shubhamchopra commented on the issue: https://github.com/apache/spark/pull/13932 Based on feedback from @rxin, added a Basic Strategy that replicates HDFS behavior as a simpler alternative to the constraint solver. I also ran some performance tests on the constraint

[GitHub] spark pull request #13932: [SPARK-15354] [CORE] [WIP] Topology aware block r...

2016-06-27 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/13932 [SPARK-15354] [CORE] [WIP] Topology aware block replication strategies ## What changes were proposed in this pull request? Implementations of strategies for resilient block

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-18 Thread shubhamchopra
Github user shubhamchopra commented on the pull request: https://github.com/apache/spark/pull/13152#issuecomment-220156087 Fixed style issues pointed out by @HyukjinKwon

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-18 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/13152#discussion_r63780828 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1079,109 +1103,97 @@ private[spark] class BlockManager

[GitHub] spark pull request: [SPARK-15353] [CORE] Making peer selection for...

2016-05-17 Thread shubhamchopra
GitHub user shubhamchopra opened a pull request: https://github.com/apache/spark/pull/13152 [SPARK-15353] [CORE] Making peer selection for block replication pluggable ## What changes were proposed in this pull request? This PR makes block replication strategies pluggable