[GitHub] spark issue #23146: [SPARK-26173] [MLlib] Prior regularization for Logistic ...

2018-11-29 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/23146 cc: @kiszk @viirya @yanboliang @srowen Could you please review this PR? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22806: [SPARK-25250] : On successful completion of a task attem...

2018-11-29 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/22806 cc: @jiangxb1987 @cloud-fan @srowen Could you please review this PR?

[GitHub] spark pull request #22806: [SPARK-25250] : On successful completion of a tas...

2018-11-29 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/22806#discussion_r237447252 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -1091,6 +1091,10 @@ private[spark] class TaskSetManager( def

[GitHub] spark issue #23146: [SPARK-26173] [MLlib] Prior regularization for Logistic ...

2018-11-29 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/23146 Hi @elfausto, could you update the ticket with a detailed description to help out the reviewers? Also, please mention the test classes added/updated in this PR in the description. Thank

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-11-28 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r236992453 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -1058,31 +1064,37 @@ private class

[GitHub] spark issue #21942: [SPARK-24283][ML] Make ml.StandardScaler skip conversion...

2018-08-23 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/21942 Hi @mengxr, @jkbradley, @MLnick, @holdenk, @viirya, could you please review this PR?

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r211695230 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -1058,31 +1064,37 @@ private class

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r211579406 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -1099,7 +,7 @@ private class

[GitHub] spark issue #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Join in ca...

2018-08-21 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/22168 Hi @tejasapatil, @viirya, @hvanhovell, & @kiszk, can you please review this pull request?

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread sujithjay
GitHub user sujithjay opened a pull request: https://github.com/apache/spark/pull/22168 [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Join in case of data skew ## What issue does this pull request address ? JIRA: [https://issues.apache.org/jira/browse/SPARK-24985](https

[GitHub] spark issue #21942: [SPARK-24283][ML] Make ml.StandardScaler skip conversion...

2018-08-13 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/21942 Hi @mengxr, @jkbradley, @MLnick and @holdenk, could you please review this PR? Thank you.

[GitHub] spark issue #21942: [SPARK-24283][ML] Make ml.StandardScaler skip conversion...

2018-08-13 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/21942 OK to test

[GitHub] spark issue #21942: [SPARK-24283][ML] Make ml.StandardScaler skip conversion...

2018-08-02 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/21942 @hhbyyh I, personally, like the idea of moving the unit tests to ML. However, are you suggesting I move some of them as part of this PR? Or were you suggesting that it is a task we need to take

[GitHub] spark issue #16909: [SPARK-13450] Introduce ExternalAppendOnlyUnsafeRowArray...

2018-08-01 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/16909 Hi @sheperdh , the PR author does make a brief note about 'Full Outer Joins' in the PR description. > NOTE: I have not changed FULL OUTER JOIN to use this new array implementation. Chang

[GitHub] spark issue #21942: [SPARK-24283][ML] Make ml.StandardScaler skip conversion...

2018-08-01 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/21942 @holdenk Could you also please take a look at this PR?

[GitHub] spark pull request #21942: [SPARK-24283][ML] Make ml.StandardScaler skip con...

2018-08-01 Thread sujithjay
GitHub user sujithjay opened a pull request: https://github.com/apache/spark/pull/21942 [SPARK-24283][ML] Make ml.StandardScaler skip conversion of Spark ml vectors to mllib vectors ## What changes were proposed in this pull request? Currently, ml.StandardScaler

[GitHub] spark issue #20821: [SPARK-23678][GraphX] a more efficient partition strateg...

2018-04-03 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20821 Hi @weiwee , I have a few questions about this contribution: 1. What are the use cases for which this partitioning strategy is more suitable, compared to the existing strategies? 2. You

[GitHub] spark pull request #20881: Add a note about jobs running in FIFO order in th...

2018-04-03 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20881#discussion_r178729284 --- Diff: docs/job-scheduling.md --- @@ -215,6 +215,9 @@ pool), but inside each pool, jobs run in FIFO order. For example, if you create means

[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...

2018-04-02 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20959#discussion_r178567646 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -2127,4 +2127,39 @@ class JsonSuite extends

[GitHub] spark issue #20919: Feature/apply func to rdd

2018-04-02 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20919 Hi @gianmarcodonetti , can you edit the description of this PR to include an explanation of the changes made? Also, excluding very minor changes, you should first discuss the changes you

[GitHub] spark issue #20931: [SPARK-23815][Core]Spark writer dynamic partition overwr...

2018-04-02 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20931 Hi @fangshil, can you try to add test cases to verify the changes introduced in this patch?

[GitHub] spark pull request #20956: [SPARK-23841][ML] NodeIdCache should unpersist th...

2018-04-02 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20956#discussion_r178517357 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/NodeIdCache.scala --- @@ -95,7 +95,7 @@ private[spark] class NodeIdCache( splits

[GitHub] spark pull request #20956: [SPARK-23841][ML] NodeIdCache should unpersist th...

2018-04-02 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20956#discussion_r178518257 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/NodeIdCache.scala --- @@ -166,9 +166,13 @@ private[spark] class NodeIdCache

[GitHub] spark issue #19373: [SPARK-22150][CORE] PeriodicCheckpointer fails in case o...

2018-03-26 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/19373 cc: @felixcheung @jkbradley @mengxr Could you please review this PR?

[GitHub] spark issue #19373: [SPARK-22150][CORE] PeriodicCheckpointer fails in case o...

2018-03-26 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/19373 Hi @szhem, you could consider identifying contributors who have worked on the code being changed, and reach out to them for review

[GitHub] spark pull request #19373: [SPARK-22150][CORE] PeriodicCheckpointer fails in...

2018-03-26 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/19373#discussion_r177014127 --- Diff: core/src/main/scala/org/apache/spark/rdd/util/PeriodicRDDCheckpointer.scala --- @@ -73,8 +76,6 @@ import

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-24 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 The failed unit test (in HistoryServerSuite.scala) seems unrelated to this PR.

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Thank you, @mridulm, for reviewing this PR. I have addressed the latest review comments.

[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-23 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158586350 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Thank you, @HyukjinKwon. I will try again after the hotfix is merged to master.

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Scala style tests are failing on the file 'SparkHiveExample.scala', which is unrelated to this PR. Will rebase to master and try again

[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-21 Thread sujithjay
Github user sujithjay commented on a diff in the pull request: https://github.com/apache/spark/pull/20002#discussion_r158267365 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -67,6 +71,16 @@ object Partitioner

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-20 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 @tgravescs Thank you for keeping me informed. I look forward to receiving your review. Happy holidays!

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-20 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 @tgravescs, could you please take a look when you have some time?

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-16 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Thank you, @HyukjinKwon. The tests passed after rebasing.

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-16 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 cc: @tgravescs @codlife Could you please review this?

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-16 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 Hi @HyukjinKwon, can you please help me with these SparkR test failures? They seem unrelated to me.

[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-16 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 SparkR test failure seems unrelated to this PR. Any ideas what's wrong?

[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-16 Thread sujithjay
GitHub user sujithjay opened a pull request: https://github.com/apache/spark/pull/20002 [SPARK-22465][Core][WIP] Add a safety-check to RDD defaultPartitioner ## What changes were proposed in this pull request? In choosing a Partitioner to use for a cogroup-like operation between

[GitHub] spark issue #18254: Fixed typo in sql.functions

2017-06-09 Thread sujithjay
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/18254 Sure. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #18254: Fixed typo in sql.functions

2017-06-09 Thread sujithjay
GitHub user sujithjay opened a pull request: https://github.com/apache/spark/pull/18254 Fixed typo in sql.functions ## What changes were proposed in this pull request? I fixed a typo in the Scaladoc for the method `def struct(cols: Column*): Column`. 'retained