Re: PSA: Maven 3.3.3 now required to build
Yeah, the best bet is to use ./build/mvn --force (otherwise we'll still use your system Maven). - Patrick

On Mon, Aug 3, 2015 at 1:26 PM, Sean Owen so...@cloudera.com wrote:

That statement is true for Spark 1.4.x. But you've reminded me that I failed to update this doc for 1.5 to say Maven 3.3.3 is required. Patch coming up.

On Mon, Aug 3, 2015 at 9:12 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. The reason I asked is that in the Building Spark documentation for 1.4.1 I still see this: https://spark.apache.org/docs/latest/building-spark.html "Building Spark using Maven requires Maven 3.0.4 or newer and Java 6+." But I noticed the following warnings from a build of Spark version 1.5.0-SNAPSHOT, so I was wondering whether the changes you mentioned apply only to newer versions of Spark or to 1.4.1 as well.

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message: Detected JDK Version: 1.6.0-36 is not in the allowed range 1.7.

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 2:38 PM, Sean Owen so...@cloudera.com wrote:

Using ./build/mvn should always be fine. Your local mvn is fine too if it's 3.3.3 or later (3.3.3 is the latest). That's what any brew users on OS X out there will have, by the way.

On Mon, Aug 3, 2015 at 8:37 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
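For anyone following along, a minimal usage sketch of the advice above; the -DskipTests clean package goals are the usual Spark build invocation, shown only as an illustration:

    # build/mvn downloads a suitable Maven if one is needed; --force makes it
    # use that downloaded Maven even when a system mvn is on PATH.
    ./build/mvn --force -DskipTests clean package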
Make ML Developer APIs public (post-1.4)
Hello, in developing new third-party pipeline components for Spark ML 1.4 (see dl4j-spark-ml), I encountered a few gaps in the earlier effort to make the ML Developer APIs public (SPARK-5995). I plan to file issues after we discuss on this thread. Below is a list of types that are presently private but might best be made public:

- VectorUDT. To define a relation with a vector field, VectorUDT must be instantiated.
- SchemaUtils. Third-party pipeline components need to check column types and append columns.
- Identifiable trait. The trait generates a unique identifier for the associated pipeline component. It would be nice to have a consistent format by reusing the trait.
- ProbabilisticClassifier. Third-party components should leverage the complex logic around computing only selected columns.
- Shared Params (HasLabel, HasFeatures). This is covered in SPARK-7146, but reiterating it here.

Thanks, Eron Wright
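To make the gap concrete, here is a minimal sketch of the kind of code a third-party component would write if these types were public. The class name, uid prefix, and column names are hypothetical; today both SchemaUtils and VectorUDT are private to Spark, which is exactly the problem being reported.

    // Hypothetical third-party Transformer; names are illustrative only.
    import org.apache.spark.ml.Transformer
    import org.apache.spark.ml.param.ParamMap
    import org.apache.spark.ml.util.{Identifiable, SchemaUtils}
    import org.apache.spark.mllib.linalg.VectorUDT
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.types.StructType

    class MyFeaturizer extends Transformer {
      // Reusing Identifiable keeps uid formats consistent across components.
      override val uid: String = Identifiable.randomUID("myFeaturizer")

      override def transformSchema(schema: StructType): StructType = {
        // The column type-checking and appending that third parties must
        // currently reimplement by hand:
        SchemaUtils.checkColumnType(schema, "features", new VectorUDT)
        SchemaUtils.appendColumn(schema, "output", new VectorUDT)
      }

      override def transform(dataset: DataFrame): DataFrame = dataset // elided

      override def copy(extra: ParamMap): Transformer = new MyFeaturizer()
    }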
Consistent recommendation for submitting Spark apps to YARN: --master yarn --deploy-mode x vs --master yarn-x
Hi, I was looking at the spark-submit and spark-shell --help output on two versions (Spark 1.3.1 and Spark 1.5-SNAPSHOT) and at the Spark documentation for submitting Spark applications to YARN. There seems to be a mismatch between the preferred syntax and the documentation.

The Spark documentation (http://spark.apache.org/docs/latest/submitting-applications.html#master-urls) says that we need to specify either yarn-cluster or yarn-client to connect to a YARN cluster:

yarn-client: Connect to a YARN cluster in client mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.
yarn-cluster: Connect to a YARN cluster in cluster mode. The cluster location will be found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.

The spark-submit --help output, on the other hand, says to use --master yarn with --deploy-mode cluster or client:

Usage: spark-submit [options] app jar | python file [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Options:
  --master MASTER_URL        spark://host:port, mesos://host:port, yarn, or local.
  --deploy-mode DEPLOY_MODE  Whether to launch the driver program locally (client) or on one of the worker machines inside the cluster (cluster) (Default: client).

I want to bring this to your attention as it is a bit confusing for someone running Spark on YARN. For example, they look at the spark-submit help output and start using that syntax, but when they look at the online documentation or the user mailing list, they see different spark-submit syntax. From a quick discussion with other engineers at Cloudera, it seems --deploy-mode is preferred, as it is more consistent with the way things are done with other cluster managers; i.e., there are no standalone-cluster or standalone-client masters. This applies to Mesos as well.

Either syntax works, but I would like to propose using '--master yarn --deploy-mode x' instead of '--master yarn-cluster' or '--master yarn-client', as it is consistent with other cluster managers. This would require updating all Spark pages related to submitting Spark applications to YARN. So far I've identified the following pages:

1) http://spark.apache.org/docs/latest/running-on-yarn.html
2) http://spark.apache.org/docs/latest/submitting-applications.html#master-urls

There is a JIRA to track progress on this as well: https://issues.apache.org/jira/browse/SPARK-9570

The option we choose dictates whether we update the documentation or the spark-submit and spark-shell help pages. Any thoughts on which direction we should go?

Guru Medasani gdm...@gmail.com
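To illustrate the two equivalent invocations under discussion (the application class and jar below are placeholders, not from the original message):

    # Proposed form, consistent with other cluster managers:
    spark-submit --master yarn --deploy-mode cluster \
      --class com.example.MyApp my-app.jar

    # Older equivalent form using a combined master URL:
    spark-submit --master yarn-cluster \
      --class com.example.MyApp my-app.jar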
Unsubscribe
Please drop me from this list.

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
*Fortunate is he, who is able to know the causes of things. -Virgil*
Re: [ANNOUNCE] Spark branch-1.5
Are these about the right rules of engagement for now, until the release candidate?

- Don't merge new features or improvements into 1.5 unless they're Important and Have Been Discussed
- Docs and tests are OK to merge into 1.5
- Bug fixes can be merged into 1.5, with increasing conservativeness as the release candidate approaches

FWIW there are now 331 JIRAs targeted at 1.5.0. Would it be reasonable to start un-targeting non-bug, non-blocker issues? Like, would anyone yell if I started doing that? That would leave ~100 JIRAs, which still seems like more than can actually go into the release. And anyone can re-target as desired. I'm interested in using this to communicate about release planning so we can actually see how things are moving along and decide whether 1.5 has to be pushed back; otherwise it seems pretty unpredictable what's coming, what's going in, and when the process stops and outputs a release.

On Mon, Aug 3, 2015 at 7:11 PM, Reynold Xin r...@databricks.com wrote:

Hi Devs, Just an announcement that I've cut Spark's branch-1.5 to form the basis of the 1.5 release. Other than a few stragglers, this represents the end of active feature development for Spark 1.5. If committers are merging any features (outside of alpha modules), please shoot me an email so I can help coordinate. Any new commits will need to be explicitly merged into branch-1.5. In the next few days, we should come up with testing plans for the release and create umbrella JIRAs for testing various components and changes. I plan to cut a preview package for 1.5 towards the end of this week or early next week. - R
Re: [ANNOUNCE] Spark branch-1.5
I agree that it's high time to start changing/removing target versions, especially if component maintainers have a good idea of what is not needed for 1.5. I'll start doing that on ML.

On Mon, Aug 3, 2015 at 12:05 PM, Sean Owen so...@cloudera.com wrote:

Are these about the right rules of engagement for now, until the release candidate?

- Don't merge new features or improvements into 1.5 unless they're Important and Have Been Discussed
- Docs and tests are OK to merge into 1.5
- Bug fixes can be merged into 1.5, with increasing conservativeness as the release candidate approaches

FWIW there are now 331 JIRAs targeted at 1.5.0. Would it be reasonable to start un-targeting non-bug, non-blocker issues? Like, would anyone yell if I started doing that? That would leave ~100 JIRAs, which still seems like more than can actually go into the release. And anyone can re-target as desired. I'm interested in using this to communicate about release planning so we can actually see how things are moving along and decide whether 1.5 has to be pushed back; otherwise it seems pretty unpredictable what's coming, what's going in, and when the process stops and outputs a release.

On Mon, Aug 3, 2015 at 7:11 PM, Reynold Xin r...@databricks.com wrote:

Hi Devs, Just an announcement that I've cut Spark's branch-1.5 to form the basis of the 1.5 release. Other than a few stragglers, this represents the end of active feature development for Spark 1.5. If committers are merging any features (outside of alpha modules), please shoot me an email so I can help coordinate. Any new commits will need to be explicitly merged into branch-1.5. In the next few days, we should come up with testing plans for the release and create umbrella JIRAs for testing various components and changes. I plan to cut a preview package for 1.5 towards the end of this week or early next week. - R
Re: Should spark-ec2 get its own repo?
I sent a note to the Mesos developers and created https://github.com/apache/spark/pull/7899 to change the repository pointer. There are 3-4 open PRs right now in the mesos/spark-ec2 repository and I'll work on migrating them to amplab/spark-ec2 later today.

My thought on moving the Python script is that we should have a wrapper shell script that just fetches the latest version of spark_ec2.py for the corresponding Spark branch. We already have separate branches in our spark-ec2 repository for different Spark versions, so it can just be a call to `wget https://github.com/amplab/spark-ec2/tree/spark-version/driver/spark_ec2.py`.

Thanks Shivaram

On Sun, Aug 2, 2015 at 11:34 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote:

On Sat, Aug 1, 2015 at 1:09 PM Matt Goodman meawo...@gmail.com wrote:

I am considering porting some of this to a more general spark-cloud launcher, including google/aliyun/rackspace. It shouldn't be hard at all given the current approach for setup/install.

FWIW, there are already some tools for launching Spark clusters on GCE and Azure: http://spark-packages.org/?q=tags%3A%22Deployment%22 Nick
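A minimal sketch of such a wrapper, assuming the per-branch layout described above; the raw-file URL pattern and branch name below are assumptions for illustration, not confirmed paths:

    #!/usr/bin/env bash
    # Hypothetical wrapper: fetch the spark_ec2.py matching this Spark
    # branch, then delegate all arguments to it.
    set -e
    BRANCH="branch-1.5"   # would be pinned per Spark release
    wget -q -O spark_ec2.py \
      "https://raw.githubusercontent.com/amplab/spark-ec2/${BRANCH}/driver/spark_ec2.py"
    exec python spark_ec2.py "$@"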
Re: Package Release Announcement: Spark SQL on HBase Astro
When I tried to compile against HBase 1.1.1, I got:

[ERROR] /home/hbase/ssoh/src/main/scala/org/apache/spark/sql/hbase/SparkSqlRegionObserver.scala:124: overloaded method next needs result type
[ERROR] override def next(result: java.util.List[Cell], limit: Int) = next(result)

Is there a plan to support HBase 1.x? Thanks

On Wed, Jul 22, 2015 at 4:53 PM, Bing Xiao (Bing) bing.x...@huawei.com wrote:

We are happy to announce the availability of the Spark SQL on HBase 1.0.0 release. http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase

The main features in this package, dubbed “Astro”, include:

- Systematic and powerful handling of data pruning and intelligent scan, based on a partial evaluation technique
- HBase pushdown capabilities like custom filters and coprocessors to support ultra-low-latency processing
- SQL and DataFrame support
- More SQL capabilities made possible (secondary index, Bloom filter, primary key, bulk load, update)
- Joins with data from other sources
- Python/Java/Scala support
- Support for the latest Spark 1.4.0 release

The tests by the Huawei team and community contributors covered these areas: bulk load; projection pruning; partition pruning; partial evaluation; code generation; coprocessors; custom filtering; DML; complex filtering on keys and non-keys; join/union with non-HBase data; DataFrame; multi-column-family tests. We will post the test results, including performance tests, in the middle of August.

You are very welcome to try out or deploy the package, and to help improve the integration tests with various combinations of the settings, extensive DataFrame tests, complex join/union tests, and extensive performance tests. Please use the “Issues” and “Pull Requests” links at the package homepage if you want to report bugs or request improvements or features.

Special thanks to project owner and technical leader Yan Zhou, the Huawei global team, community contributors, and Databricks. Databricks has provided great assistance from the design to the release.

“Astro”, the Spark SQL on HBase package, will be useful for ultra-low-latency query and analytics of large-scale data sets in vertical enterprises. We will continue to work with the community to develop new features and improve the code base. Your comments and suggestions are greatly appreciated.

Yan Zhou / Bing Xiao Huawei Big Data team
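For what it's worth, this particular compile error comes from Scala's rule that an overloaded method whose body calls another overload must declare its result type explicitly. A hedged sketch of the likely one-line fix, assuming the HBase 1.x InternalScanner contract (whose next methods return boolean):

    // Adding the explicit result type resolves the
    // "overloaded method next needs result type" error:
    override def next(result: java.util.List[Cell], limit: Int): Boolean =
      next(result)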
Re: Unsubscribe
The way to do that is to follow the Unsubscribe link for dev@spark here: http://spark.apache.org/community.html

We can't drop you. You have to do it yourself. Nick

On Mon, Aug 3, 2015 at 1:54 PM Trevor Grant trevor.d.gr...@gmail.com wrote:

Please drop me from this list.

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
*Fortunate is he, who is able to know the causes of things. -Virgil*
Moving spark-ec2 to amplab github organization
Hi Mesos developers,

The Apache Spark project has been using https://github.com/mesos/spark-ec2 as a supporting repository for some of our EC2 scripts. This is a remnant from the days when the Spark project itself was hosted at github.com/mesos/spark. Based on discussions on the Spark developer mailing list [1], we plan to move the repository to github.com/amplab/spark-ec2 to enable a better development workflow. As these scripts are not used by the Apache Mesos project, I don't think any action is required from the Mesos developers, but please let me know if you have any thoughts about this.

Thanks Shivaram

[1] http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Should-spark-ec2-get-its-own-repo-td13151.html
RE: Package Release Announcement: Spark SQL on HBase Astro
HBase 1.0 should work fine, even though we have not completed full tests yet. Support for 1.1 should be possible to add with minimal effort. Thanks, Yan

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Monday, August 03, 2015 10:33 AM
To: Bing Xiao (Bing)
Cc: dev@spark.apache.org; u...@spark.apache.org; Yan Zhou.sc
Subject: Re: Package Release Announcement: Spark SQL on HBase Astro

When I tried to compile against HBase 1.1.1, I got:

[ERROR] /home/hbase/ssoh/src/main/scala/org/apache/spark/sql/hbase/SparkSqlRegionObserver.scala:124: overloaded method next needs result type
[ERROR] override def next(result: java.util.List[Cell], limit: Int) = next(result)

Is there a plan to support HBase 1.x? Thanks

On Wed, Jul 22, 2015 at 4:53 PM, Bing Xiao (Bing) bing.x...@huawei.com wrote:

We are happy to announce the availability of the Spark SQL on HBase 1.0.0 release. http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase

The main features in this package, dubbed “Astro”, include:

- Systematic and powerful handling of data pruning and intelligent scan, based on a partial evaluation technique
- HBase pushdown capabilities like custom filters and coprocessors to support ultra-low-latency processing
- SQL and DataFrame support
- More SQL capabilities made possible (secondary index, Bloom filter, primary key, bulk load, update)
- Joins with data from other sources
- Python/Java/Scala support
- Support for the latest Spark 1.4.0 release

The tests by the Huawei team and community contributors covered these areas: bulk load; projection pruning; partition pruning; partial evaluation; code generation; coprocessors; custom filtering; DML; complex filtering on keys and non-keys; join/union with non-HBase data; DataFrame; multi-column-family tests. We will post the test results, including performance tests, in the middle of August.

You are very welcome to try out or deploy the package, and to help improve the integration tests with various combinations of the settings, extensive DataFrame tests, complex join/union tests, and extensive performance tests. Please use the “Issues” and “Pull Requests” links at the package homepage if you want to report bugs or request improvements or features.

Special thanks to project owner and technical leader Yan Zhou, the Huawei global team, community contributors, and Databricks. Databricks has provided great assistance from the design to the release.

“Astro”, the Spark SQL on HBase package, will be useful for ultra-low-latency query and analytics of large-scale data sets in vertical enterprises. We will continue to work with the community to develop new features and improve the code base. Your comments and suggestions are greatly appreciated.

Yan Zhou / Bing Xiao Huawei Big Data team
PSA: Maven 3.3.3 now required to build
If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
[ANNOUNCE] Spark branch-1.5
Hi Devs,

Just an announcement that I've cut Spark's branch-1.5 to form the basis of the 1.5 release. Other than a few stragglers, this represents the end of active feature development for Spark 1.5. *If committers are merging any features (outside of alpha modules), please shoot me an email so I can help coordinate. Any new commits will need to be explicitly merged into branch-1.5*.

In the next few days, we should come up with testing plans for the release and create umbrella JIRAs for testing various components and changes. I plan to cut a preview package for 1.5 towards the end of this week or early next week.

- R
Re: Came across Spark SQL hang/Error issue with Spark 1.5 Tungsten feature
Based on the latest Spark code (commit 608353c8e8e50461fafff91a2c885dca8af3aaa8), I used the same Spark SQL query to test two groups of combined configuration. From the results below, it seems the query currently does not work with the tungsten-sort shuffle manager:

*Test 1# (PASSED)*
spark.shuffle.manager=sort
spark.sql.codegen=true
spark.sql.unsafe.enabled=true

*Test 2# (FAILED)*
spark.shuffle.manager=tungsten-sort
spark.sql.codegen=true
spark.sql.unsafe.enabled=true

15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode4:50313
15/08/03 16:46:02 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 3 is 586 bytes
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode2:60490
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode2:56319
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode1:58179
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode1:32816
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode3:55840
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to bignode3:46874
15/08/03 16:46:02 WARN scheduler.TaskSetManager: Lost task 42.0 in stage 158.0 (TID 1548, bignode4): java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:118)
    at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:107)
    at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
    at org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:167)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:140)
    at org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:120)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:71)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
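For anyone trying to reproduce the two test groups, a minimal sketch of setting the failing Test 2# configuration programmatically; the app name is illustrative, and the same settings can equally go in spark-defaults.conf or on the spark-submit command line:

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of the Test 2# (FAILED) combination from the report above.
    val conf = new SparkConf()
      .setAppName("tungsten-sort-repro")
      .set("spark.shuffle.manager", "tungsten-sort")
      .set("spark.sql.codegen", "true")
      .set("spark.sql.unsafe.enabled", "true")
    val sc = new SparkContext(conf)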
Re: PSA: Maven 3.3.3 now required to build
Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
Re: PSA: Maven 3.3.3 now required to build
Thanks Sean. The reason I asked is that in the Building Spark documentation for 1.4.1 I still see this: https://spark.apache.org/docs/latest/building-spark.html "Building Spark using Maven requires Maven 3.0.4 or newer and Java 6+." But I noticed the following warnings from a build of Spark version 1.5.0-SNAPSHOT, so I was wondering whether the changes you mentioned apply only to newer versions of Spark or to 1.4.1 as well.

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message: Detected JDK Version: 1.6.0-36 is not in the allowed range 1.7.

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 2:38 PM, Sean Owen so...@cloudera.com wrote:

Using ./build/mvn should always be fine. Your local mvn is fine too if it's 3.3.3 or later (3.3.3 is the latest). That's what any brew users on OS X out there will have, by the way.

On Mon, Aug 3, 2015 at 8:37 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
Re: PSA: Maven 3.3.3 now required to build
Using ./build/mvn should always be fine. Your local mvn is fine too if it's 3.3.3 or later (3.3.3 is the latest). That's what any brew users on OS X out there will have, by the way.

On Mon, Aug 3, 2015 at 8:37 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
Re: PSA: Maven 3.3.3 now required to build
That statement is true for Spark 1.4.x. But you've reminded me that I failed to update this doc for 1.5 to say Maven 3.3.3 is required. Patch coming up.

On Mon, Aug 3, 2015 at 9:12 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. The reason I asked is that in the Building Spark documentation for 1.4.1 I still see this: https://spark.apache.org/docs/latest/building-spark.html "Building Spark using Maven requires Maven 3.0.4 or newer and Java 6+." But I noticed the following warnings from a build of Spark version 1.5.0-SNAPSHOT, so I was wondering whether the changes you mentioned apply only to newer versions of Spark or to 1.4.1 as well.

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.
[WARNING] Rule 1: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message: Detected JDK Version: 1.6.0-36 is not in the allowed range 1.7.

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 2:38 PM, Sean Owen so...@cloudera.com wrote:

Using ./build/mvn should always be fine. Your local mvn is fine too if it's 3.3.3 or later (3.3.3 is the latest). That's what any brew users on OS X out there will have, by the way.

On Mon, Aug 3, 2015 at 8:37 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean
Re: PSA: Maven 3.3.3 now required to build
Just note that if you have mvn in your path, you need to use build/mvn --force.

On Mon, Aug 3, 2015 at 12:38 PM, Sean Owen so...@cloudera.com wrote:

Using ./build/mvn should always be fine. Your local mvn is fine too if it's 3.3.3 or later (3.3.3 is the latest). That's what any brew users on OS X out there will have, by the way.

On Mon, Aug 3, 2015 at 8:37 PM, Guru Medasani gdm...@gmail.com wrote:

Thanks Sean. I noticed this one while building Spark version 1.5.0-SNAPSHOT this morning:

[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireMavenVersion failed with message: Detected Maven Version: 3.2.5 is not in the allowed range 3.3.3.

Should we be using Maven 3.3.3 locally, or build/mvn, starting from Spark 1.4.1 or from Spark 1.5?

Guru Medasani gdm...@gmail.com

On Aug 3, 2015, at 1:01 PM, Sean Owen so...@cloudera.com wrote:

If you use build/mvn or are already using Maven 3.3.3 locally (i.e. via brew on OS X), then this won't affect you, but I wanted to call attention to https://github.com/apache/spark/pull/7852 which makes Maven 3.3.3 the minimum required to build Spark. This heads off problems from some behavior differences that Patrick and I observed between 3.3 and 3.2 last week, on top of the dependency-reduced POM glitch from the 1.4.1 release window. Again, all you need to do is use build/mvn if you don't already have the latest Maven installed, and all will be well. Sean

-- Marcelo
Re: [ANNOUNCE] Spark branch-1.5
Would it be reasonable to start un-targeting non-bug, non-blocker issues? Like, would anyone yell if I started doing that? That would leave ~100 JIRAs, which still seems like more than can actually go into the release. And anyone can re-target as desired.

I think the maintainers of the various components should take care of this. Reynold and I just did a pass over SQL, and I think that by Friday there should only be blocker bugs / documentation remaining.