[GitHub] spark issue #19190: [SPARK-21976][DOC] Fix wrong documentation for Mean Abso...
Github user FavioVazquez commented on the issue: https://github.com/apache/spark/pull/19190 Thanks to Carlos Munguia, Jared Romero and Christhian Flores :). @montactuaria @jared275 @chris122flores --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19190: [SPARK-21976][DOC]
GitHub user FavioVazquez opened a pull request:

https://github.com/apache/spark/pull/19190

[SPARK-21976][DOC]

## What changes were proposed in this pull request?

Fixed the wrong documentation for Mean Absolute Error. The code for the MAE is correct:

```scala
@Since("1.2.0")
def meanAbsoluteError: Double = {
  summary.normL1(1) / summary.count
}
```

In the documentation, however, the division by N is missing.

## How was this patch tested?

All of the Spark tests were run.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FavioVazquez/spark mae-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19190.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #19190

commit b2e2f8caef4f43d62cb48fc09a0fe17e71f3f4dd
Author: FavioVazquez
Date: 2015-04-30T19:46:40Z
Merge remote-tracking branch 'upstream/master'

commit edab1ef07312cf44e26fccb7fca8f4a4977ad3ee
Author: FavioVazquez
Date: 2015-05-05T14:16:02Z
Merge remote-tracking branch 'upstream/master'

commit 9af7074235b6c13001924e037772195b640115b8
Author: FavioVazquez
Date: 2015-05-15T13:58:04Z
Merge remote-tracking branch 'upstream/master'

commit f27a20b9f53b643d5c963729f56ee548d0c8e263
Author: FavioVazquez
Date: 2015-06-04T16:10:00Z
Merge remote-tracking branch 'upstream/master'

commit ad882a378e24458832b961fd97eb4b7662203ef9
Author: FavioVazquez
Date: 2015-06-09T12:47:59Z
Merge remote-tracking branch 'upstream/master'

commit 424a92853f44c34f310e0a9e8dd927d246bd9171
Author: FavioVazquez
Date: 2015-06-17T14:56:35Z
Merge remote-tracking branch 'upstream/master'

commit 5311719db61454eee5e1715f44507c294509ec1c
Author: FavioVazquez
Date: 2015-09-21T06:43:29Z
Merge remote-tracking branch 'upstream/master'

commit d6b551b1d25cef07096b7e1fc22b659ed753d9dc
Author: faviovazquez
Date: 2017-09-11T13:36:27Z
Merge remote-tracking branch 'upstream/master'

commit a95cfc6c5e88f44429319b52462b537ae0bc1857
Author: Favio André Vázquez
Date: 2017-09-11T13:49:03Z
Fix doc for MAE
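To make the point of the fix concrete, here is a minimal sketch of the difference between the sum of absolute errors (what a formula without the division by N describes) and the mean absolute error, MAE = (1/N) * sum of |label - prediction|. It uses plain Scala collections rather than Spark; the object `MaeSketch` and its method names are hypothetical and exist only for illustration, they are not part of Spark's API.

```scala
// Minimal sketch with plain Scala collections (no Spark dependency).
// MaeSketch and its method names are hypothetical, for illustration only.
object MaeSketch {
  /** Sum of absolute errors: the L1 norm of the residuals, with no division. */
  def sumAbsoluteError(pairs: Seq[(Double, Double)]): Double =
    pairs.map { case (prediction, label) => math.abs(label - prediction) }.sum

  /** Mean absolute error: the L1 norm divided by the number of observations N. */
  def meanAbsoluteError(pairs: Seq[(Double, Double)]): Double = {
    require(pairs.nonEmpty, "MAE is undefined for an empty sample")
    sumAbsoluteError(pairs) / pairs.size
  }

  def main(args: Array[String]): Unit = {
    // (prediction, label) pairs; the absolute errors are 0.5, 0.5, 0.0 and 1.0.
    val pairs = Seq((2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0))
    println(sumAbsoluteError(pairs))  // L1 norm of the residuals
    println(meanAbsoluteError(pairs)) // L1 norm divided by N = 4
  }
}
```

For the four pairs in `main`, the sum of absolute errors is 2.0 and the MAE is 0.5, which mirrors the `normL1 / count` structure of the quoted method.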
[GitHub] spark pull request: [SPARK-8274][Documentation-MLlib] Fix wrong UR...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/6722#issuecomment-110371513 Thanks @srowen! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-8274][Documentation-MLlib] Fix wrong UR...
GitHub user FavioVazquez opened a pull request:

https://github.com/apache/spark/pull/6722

[SPARK-8274][Documentation-MLlib] Fix wrong URLs in MLlib Frequent Pattern Mining Documentation

There is a mistake in the URLs of the Scala section of FP-Growth in the MLlib Frequent Pattern Mining documentation. The URL points to https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/fpm/FPGrowth.html, which is the Java API; the link should point to the Scala API: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.fpm.FPGrowth

There is another mistake for FPGrowthModel in the same section: the link points, again, to the Java API, https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/fpm/FPGrowthModel.html; it should point to the Scala API: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.fpm.FPGrowthModel

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FavioVazquez/spark fix-wrog-urls-mllib-fpgrowth

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6722.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #6722

commit b2e2f8caef4f43d62cb48fc09a0fe17e71f3f4dd
Author: FavioVazquez
Date: 2015-04-30T19:46:40Z
Merge remote-tracking branch 'upstream/master'

commit edab1ef07312cf44e26fccb7fca8f4a4977ad3ee
Author: FavioVazquez
Date: 2015-05-05T14:16:02Z
Merge remote-tracking branch 'upstream/master'

commit 9af7074235b6c13001924e037772195b640115b8
Author: FavioVazquez
Date: 2015-05-15T13:58:04Z
Merge remote-tracking branch 'upstream/master'

commit f27a20b9f53b643d5c963729f56ee548d0c8e263
Author: FavioVazquez
Date: 2015-06-04T16:10:00Z
Merge remote-tracking branch 'upstream/master'

commit ad882a378e24458832b961fd97eb4b7662203ef9
Author: FavioVazquez
Date: 2015-06-09T12:47:59Z
Merge remote-tracking branch 'upstream/master'

commit e1ca54dfa69179758166e059177e11155f0310b3
Author: FavioVazquez
Date: 2015-06-09T12:55:50Z
- Fixed wrong URLs in MLlib Frequent Pattern Mining, FP-Growth Scala section
[GitHub] spark pull request: [SPARK-7671] Fix wrong URLs in MLlib Data Type...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/6196#issuecomment-102553184 Happy to help @jkbradley
[GitHub] spark pull request: [SPARK-7671] Fix wrong URLs in MLlib Data Type...
GitHub user FavioVazquez opened a pull request:

https://github.com/apache/spark/pull/6196

[SPARK-7671] Fix wrong URLs in MLlib Data Types Documentation

There is a mistake in the URL of Matrices in the MLlib Data Types documentation (Local matrix, Scala section). The URL points to https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.linalg.Matrices, which is a mistake, since Matrices is an object that implements factory methods for Matrix and does not have a companion class. The correct link should point to https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.linalg.Matrices$

There is another mistake in the Local vector section, in Scala, Java and Python.

In the Scala section the URL of Vectors points to the trait Vector (https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.linalg.Vector) and not to the factory methods implemented in Vectors. The correct link should be: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.linalg.Vectors$

In the Java section the URL of Vectors points to the interface Vector (https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html) and not to the class Vectors. The correct link should be: https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vectors.html

In the Python section the URL of Vectors points to the class Vector (https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html#pyspark.mllib.linalg.Vector) and not to the class Vectors. The correct link should be: https://spark.apache.org/docs/latest/api/python/pyspark.mllib.html#pyspark.mllib.linalg.Vectors

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FavioVazquez/spark fix-typo-matrices-mllib-datatypes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6196.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #6196

commit b2e2f8caef4f43d62cb48fc09a0fe17e71f3f4dd
Author: FavioVazquez
Date: 2015-04-30T19:46:40Z
Merge remote-tracking branch 'upstream/master'

commit edab1ef07312cf44e26fccb7fca8f4a4977ad3ee
Author: FavioVazquez
Date: 2015-05-05T14:16:02Z
Merge remote-tracking branch 'upstream/master'

commit 9af7074235b6c13001924e037772195b640115b8
Author: FavioVazquez
Date: 2015-05-15T13:58:04Z
Merge remote-tracking branch 'upstream/master'

commit 3e9efd56b8d6436f2985db24e2074a2662f3ed89
Author: FavioVazquez
Date: 2015-05-15T18:31:49Z
- Fixed wrong URLs in the MLlib Data Types Documentation
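The distinction this pull request draws, between a trait (Vector, Matrix) and a standalone object of factory methods (Vectors, Matrices), can be sketched in plain Scala. The types below are simplified stand-ins written for this illustration and are not Spark's real classes. The sketch also shows that the JVM class backing a Scala object has a name ending in `$`; that the `$` suffix in the corrected Scaladoc links follows the same convention is an assumption here, not something the message itself states.

```scala
// Minimal sketch in plain Scala; these are stand-ins, NOT Spark's real classes.
object FactoryObjectSketch {
  // A trait playing the role of mllib's Vector (hypothetical simplification).
  trait Vector {
    def size: Int
    def toArray: Array[Double]
  }

  // A standalone object of factory methods, playing the role of Vectors.
  // There is no class or trait named Vectors next to it.
  object Vectors {
    def dense(values: Double*): Vector = new Vector {
      val size: Int = values.length
      def toArray: Array[Double] = values.toArray
    }

    def zeros(n: Int): Vector = dense(Seq.fill(n)(0.0): _*)
  }

  def main(args: Array[String]): Unit = {
    // Callers build instances through the object, but the value's type is the trait.
    val v: Vector = Vectors.dense(1.0, 0.0, 3.0)
    println(v.size)
    // The JVM class generated for a Scala object carries a trailing '$'.
    println(Vectors.getClass.getName.endsWith("$"))
  }
}
```

A reader looking for `dense` or `zeros` therefore needs the documentation page of the object, not of the trait, which is the mismatch the corrected links address.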
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-102049341 All tests passed, @srowen. It was, as expected, an unrelated error. Is everything now set to merge this PR?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-102002721 This has happened before, @srowen; I think this is again an unrelated failure. Could you ask Jenkins to retest this, please?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101981241 I'm happy to help, give me a sec and I'll push the changes
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101980296 Are you sure, Sean? I could make the change and push it, but if it is easier to make the change during the merge, just tell me.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101601409 I hope that this is the correct way of making all the changes you suggested. Please check this, and thank you @srowen, @vanzin and @pwendell. Let me know if there is something else that could be done, or if this finishes the patch.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101565704 Great, I'm familiar with the process, @srowen. Thank you guys for all the suggestions; I'm making the changes and will push them soon.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101511508 I see, @pwendell. I'll push the changes tomorrow; it is a little late here in Venezuela. Greetings and thanks.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101478579 And @srowen, you said some days ago that you knew the places where this PR needed a rebase; could you point them out to me, please?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101475218 In summary, I should add the line that @pwendell suggested. But I'm not sure about the default profiles: should I erase the hadoop-1 profile? Will there be no default Hadoop version now? Thanks
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101474934 I will make the suggested changes and push them.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101474903 Sorry, I closed it by accident.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez closed the pull request at: https://github.com/apache/spark/pull/5786
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
GitHub user FavioVazquez reopened a pull request:

https://github.com/apache/spark/pull/5786

[SPARK-7249] Updated Hadoop dependencies due to inconsistency in the versions

Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons.

These changes were proposed by @vanzin as a result of the previous pull request https://github.com/apache/spark/pull/5783, which did not fix the problem correctly.

Please let me know if this is the correct way of doing this; the comments of @vanzin are in the pull request mentioned.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/FavioVazquez/spark update-hadoop-dependencies

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5786.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #5786

commit ec91ce3c405123818a4c56ef361d9cc82951677d
Author: FavioVazquez
Date: 2015-04-29T17:58:09Z
- Updated protobuf-java version of com.google.protobuf dependancy to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version)

commit 660decce9d3c2300aee493b605da0da8a74b3ea6
Author: FavioVazquez
Date: 2015-04-29T19:16:04Z
- Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons

commit 7e9955df29b5d5c9cda950636d51da753e6d17ea
Author: FavioVazquez
Date: 2015-04-29T19:35:08Z
- Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons

commit 6b4bfafbe4f98c92ac2fe7aeb5f36a37d27a9678
Author: FavioVazquez
Date: 2015-04-30T21:41:08Z
- Cleanup in hadoop-2.x profiles since they contained mostly redundant stuff.

commit 13542929c9cb3ddfec31bbb794e490b44c273df4
Author: FavioVazquez
Date: 2015-04-30T22:13:50Z
- Fixed hadoop-1 version to match jenkins build profile in hadoop1.0 tests and documentation

commit 287fa2ffc31bb0c9eaf5daf80825ff0093f3f20d
Author: FavioVazquez
Date: 2015-04-30T22:17:44Z
- Updated documentation about specifying the hadoop version in building-spark. Now is clear that Spark will build against Hadoop 2.2.0 by default.
- Added Cloudera CDH 5.3.3 without MapReduce example in the building-spark doc.

commit 70b8344dcad8f6de71bd6356cd6eec375211fdb3
Author: FavioVazquez
Date: 2015-04-30T22:57:16Z
- Fixed typo in the make-distribution.sh file and added hadoop-1 in the Related profiles

commit 88a8b88a13a02cbde04792cb63e3c6a81407d915
Author: FavioVazquez
Date: 2015-05-01T16:48:27Z
- Simplified Hadoop profiles due to new setting of global properties in the pom.xml file
- Added comment to specify that the hadoop-2.2 profile is now the default hadoop profile in the pom.xml file
- Erased hadoop-2.2 from related hadoop profiles now that is a no-op in the make-distribution.sh file

commit 199f40b1733015a414eb928b2090f3bf4d0b7a7e
Author: FavioVazquez
Date: 2015-05-01T20:44:30Z
- Erased unnecessary CDH5-specific note in docs/building-spark.md
- Remove example of instance -Phadoop-2.2 -Dhadoop.version=2.2.0 in docs/building-spark.md
- Enabled hadoop-2.2 profile when the Hadoop version is 2.2.0, which is now the default .Added comment in the yarn/pom.xml to specify that.

commit a6507792cc12fc03139be825357f22329773c823
Author: FavioVazquez
Date: 2015-05-01T20:50:46Z
- Default value of avro.mapred.classifier has been set to hadoop2 in pom.xml
- Cleaned up hadoop-2.3 and 2.4 profiles due to change in the default set in avro.mapred.classifier in pom.xml

commit 0470587ad7af93041e25dcb07954b835d9508a10
Author: FavioVazquez
Date: 2015-05-01T21:06:52Z
- Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in create-release.sh
- Updated how the releases are made in the create-release.sh no that the default hadoop version is the 2.2.0
- Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in scalastyle
- Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in run-tests
- Better example given in the hadoop-third-party-distributions.md now that the default hadoop version is 2.2.0

commit fda6a51986ed4a656d37539502cd4684c46c8cfe
Author: FavioVazquez
Date: 2015-05-01T21:42:04Z
- Updated hadoop1 releases in create-release.sh due to changes in the default hadoop version set
- Erased unnecessary instance of -Dyarn.version=2.2.0 in create-release.sh
- Prettify
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101474861 Perfect, I've been watching all of your conversations. I will make th
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101134292 I think that you guys are all right; you have suggested some great changes, and I think I'll let @srowen and @vanzin, together with you, @pwendell, decide the future of this PR. In my humble opinion it could be good, but it is all up to you guys. I'll keep an eye on the comments on this PR; please let me know if there is something I could help with, whether making this patch better or fixing these issues in another way. Thanks for teaching me great stuff.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-101126703 I think @srowen saw this PR as a cleanup of old dependencies and an update of Spark's defaults to a currently used Hadoop version. It started as a minor fix for inconsistencies in the Hadoop defaults when using the latest CDH5 distribution, and grew into an upgrade of the default Hadoop version, an update of the docs, and a cleanup of YARN's POM and the main POM. I still face problems when building Spark for CDH5 without these changes, and I think it would be helpful to update the versions, since Hadoop-1 is really old, and I really believe this brings Spark up to date with the newest technologies. I'm no expert in this field, but I think this PR could be interesting and useful for a lot of people who are starting with these technologies and would like to build Spark with the newest Hadoop version. I have to remark that if you use the current build process and main POM, you'll get errors when you try to connect to Cloudera's newest HDFS; you can see that at the beginning of the PR. It's really awkward to build Spark with lots of ad hoc and in situ dependencies just to keep old versions; I don't know, maybe it's just me. I really appreciate the help of @srowen and @vanzin with this, and would like to know if you think this is the right track for Spark 1.4.0, @pwendell. I'm up for making any more changes and updates if you think they are necessary, and I repeat, I think this could be a good refresh of Spark's dependencies. I know this is really a minor change, but it could grow to be an even better update. Thanks for your comments; I'll wait for your replies.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-99705433 Hello @srowen, any advances in the coordination (if/when) of merging this PR? Thanks
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98835084 OK, I see, @srowen. Thank you.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98831666 Hello @srowen, I'm not sure about the next steps you mentioned. Could you please explain to me what's going to happen now with the PR? Thanks.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98498096 Great, @srowen. Please let me know if I can help with something else in this patch.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98394731 @srowen @vanzin everything seems fine and the tests passed.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98371065 Hello @srowen, I was noticing some of those things. Thank you for making it easy for me to change them. I just pushed your suggested changes.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98277843 @vanzin @srowen thank you for helping me a lot with this patch. Let me know if this finishes the patch please.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98268024 Hope that's what you have been talking about @srowen @vanzin
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98262625 So should I erase the yarn profile from the root POM and move the entire profile into the yarn/POM?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98261176 You are right @vanzin, I just pushed those changes. I'll wait for @srowen's comments.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98260176 @vanzin I think that's what you meant. Please check if everything is OK now.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98259020 I see. I'll do that right away
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98258799 OK @vanzin, so I should change the yarn config in the main POM and leave the yarn/pom.xml as it was?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98257656 Great @srowen. Let's wait for @vanzin's comments on this.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98251739 Please let me know if this completes the changes in the patch. Thank you.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98244148 Yes, I figured it out by looking at the other implementations a second before you told me. Thanks. I'll be pushing the changes in a second.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98242483 Great. My confusion is with create-release: what has to be changed in there? Should the names of the releases be kept and only the implementation changed? I'm not quite sure what you meant there @srowen
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98242063 Yes I saw the hadoop1.jar there also. Should I keep it like this or change the hadoop-1 profile to the hadoop2 also?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98239441 Please check if these are the changes that you suggested @vanzin @srowen. Thanks.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98212186 Great, I'll do that right away
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-98009452 Are these changes what is needed to complete the patch?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-97959873 Oh I see. Thanks. Should I keep the PR open then?
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-97934646 Hi, now that the tests have passed, what happens next? I've read the documentation in https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-PullRequest but I'm not sure: will this be merged to master? Should I do something else? Keep the PR open? Thanks.
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-97618915 In the link https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31337/, there is no failure in the tests, but in the console output the error thrown was this:

    FAIL: test_count_by_value_and_window (__main__.WindowFunctionTests)
    --
    Traceback (most recent call last):
      File "pyspark/streaming/tests.py", line 418, in test_count_by_value_and_window
        self._test_func(input, func, expected)
      File "pyspark/streaming/tests.py", line 133, in _test_func
        self.assertEqual(expected, result)
    AssertionError: Lists differ: [[1], [2], [3], [4], [5], [6], [6], [6], [6], [6]] != [[1], [2], [3], [4], [5], [6], [6], [6], [6]]
    First list contains 1 additional elements.
    First extra element 9:
    [6]
    - [[1], [2], [3], [4], [5], [6], [6], [6], [6], [6]]
    ? -
    + [[1], [2], [3], [4], [5], [6], [6], [6], [6]]
    --
    Ran 40 tests in 134.429s
    FAILED (failures=1)
    ('timeout after', 20)
    ('timeout after', 20)
    ('timeout after', 20)
    ('timeout after', 5)
    Had test failures; see logs.
    [error] Got a return code of 255 on line 240 of the run-tests script.
    Archiving unit tests logs...

So I'm not sure what happened.
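For anyone reading the console output above, the following is a minimal sketch of what the assertion is reporting. The two lists are copied from the quoted output; `describe_mismatch` is a hypothetical helper written for this note and is not part of `pyspark/streaming/tests.py`.

```python
# Values copied from the AssertionError in the quoted console output.
expected = [[1], [2], [3], [4], [5], [6], [6], [6], [6], [6]]
result = [[1], [2], [3], [4], [5], [6], [6], [6], [6]]


def describe_mismatch(expected, result):
    """Summarize how two lists differ (hypothetical helper, for illustration only)."""
    common = min(len(expected), len(result))
    # Index of the first position where the shared prefix disagrees, if any.
    first_diff = next(
        (i for i in range(common) if expected[i] != result[i]), None
    )
    return {
        "expected_len": len(expected),
        "result_len": len(result),
        "first_differing_index": first_diff,
        # Elements present in `expected` beyond the end of `result`.
        "missing_from_result": expected[len(result):],
    }


summary = describe_mismatch(expected, result)
print(summary)
```

Under this reading, the nine values that were produced all match the expected ones, and only the tenth expected entry is absent from the result. The output alone does not say why the last entry is missing.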
[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5786#issuecomment-97618173 Can someone please explain to me what failed? I'm kind of new to this, and I'm not sure what went wrong. I'd like to know so I can fix it and maybe create another pull request. Thanks
[GitHub] spark pull request: Updated Hadoop dependencies due to inconsisten...
GitHub user FavioVazquez opened a pull request: https://github.com/apache/spark/pull/5786 Updated Hadoop dependencies due to inconsistency in the versions Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons. Changes proposed by @vanzin resulting from the previous pull request https://github.com/apache/spark/pull/5783, which did not fix the problem correctly. Please let me know if this is the correct way of doing this; the comments of @vanzin are in the pull request mentioned. You can merge this pull request into a Git repository by running: $ git pull https://github.com/FavioVazquez/spark update-hadoop-dependencies Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/5786.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5786 commit ec91ce3c405123818a4c56ef361d9cc82951677d Author: FavioVazquez Date: 2015-04-29T17:58:09Z - Updated protobuf-java version of com.google.protobuf dependancy to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version) commit 660decce9d3c2300aee493b605da0da8a74b3ea6 Author: FavioVazquez Date: 2015-04-29T19:16:04Z - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons commit 7e9955df29b5d5c9cda950636d51da753e6d17ea Author: FavioVazquez Date: 2015-04-29T19:35:08Z - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons
[GitHub] spark pull request: [SPARK-7238] Update protobuf-java version of c...
Github user FavioVazquez closed the pull request at: https://github.com/apache/spark/pull/5783
[GitHub] spark pull request: [SPARK-7238] Update protobuf-java version of c...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5783#issuecomment-97545038 Thank you for clearing that up for me. I'm making the changes that you suggested and will open a pull request soon.
[GitHub] spark pull request: [SPARK-7238] Update protobuf-java version of c...
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5783#issuecomment-97529157 I see. So it should be done on the command line when building? The problem is that the pre-compiled version for CDH doesn't work with the 2.5.0-cdh5.3.3 version, because the protobuf version is inherited from the global property, which is fixed at 2.4.1, and that throws an error. With this change it worked.
[GitHub] spark pull request: [SPARK-7238] Update protobuf-java version of c...
GitHub user FavioVazquez opened a pull request: https://github.com/apache/spark/pull/5783 [SPARK-7238] Update protobuf-java version of com.google.protobuf dependancy This upgrade is needed when building spark for CDH5 2.5.0-cdh5.3.3 due to incompatibilities in the protobuf version used by com.google.protobuf and the one used in hadoop. The default version of protobuf is set to 2.4.1 in the global properties, and this is stated in the pom.xml file: So this upgrade will only be affecting the com.google.protobuf version of java-protobuf. Tested for the Cloudera distribution 2.5.0-cdh5.3.3 using Mesos 0.22.0 in cluster mode. You can merge this pull request into a Git repository by running: $ git pull https://github.com/FavioVazquez/spark upgrade-protobuf-version Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/5783.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #5783 commit ec91ce3c405123818a4c56ef361d9cc82951677d Author: FavioVazquez Date: 2015-04-29T17:58:09Z - Updated protobuf-java version of com.google.protobuf dependancy to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version)
[GitHub] spark pull request: [SPARK-6972][SQL] Add Coalesce to DataFrame
Github user FavioVazquez commented on the pull request: https://github.com/apache/spark/pull/5545#issuecomment-97453694 How can I use this? I've built Spark from the newest version of master, which contains this code, but IntelliJ still doesn't recognize coalesce(1).