This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new 814b3d0  Remove links to dead orgs / meetups; fix some broken links
814b3d0 is described below

commit 814b3d05f6f3379a749ae78bd450024d594fb385
Author: Sean Owen <sean.o...@databricks.com>
AuthorDate: Tue Apr 16 09:16:37 2019 -0500

    Remove links to dead orgs / meetups; fix some broken links

    Author: Sean Owen <sean.o...@databricks.com>

    Closes #194 from srowen/BrokenLinks.
---
 community.md                   | 21 ---------------------
 developer-tools.md             | 17 ++++-------------
 documentation.md               |  2 +-
 downloads.md                   |  4 ++--
 examples.md                    |  2 +-
 index.md                       |  2 +-
 powered-by.md                  | 25 ++++---------------------
 release-process.md             |  4 ++--
 site/community.html            | 21 ---------------------
 site/developer-tools.html      | 19 ++++---------------
 site/documentation.html        |  2 +-
 site/downloads.html            |  4 ++--
 site/examples.html             |  2 +-
 site/index.html                |  2 +-
 site/powered-by.html           | 31 ++++---------------------------
 site/release-process.html      |  4 ++--
 site/third-party-projects.html |  5 ++---
 site/trademarks.html           |  2 +-
 third-party-projects.md        |  5 ++---
 trademarks.md                  |  2 +-
 20 files changed, 36 insertions(+), 140 deletions(-)

diff --git a/community.md b/community.md
index 39e1a73..58c1ee2 100644
--- a/community.md
+++ b/community.md
@@ -139,33 +139,18 @@ Spark Meetups are grass-roots events organized and hosted by individuals in the <a href="https://www.meetup.com/Spark_big_data_analytics/">Bangalore Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Berlin-Apache-Spark-Meetup/">Berlin Spark Meetup</a> - </li> - <li> - <a href="https://www.meetup.com/spark-user-beijing-Meetup/">Beijing Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/Boston-Apache-Spark-User-Group/">Boston Spark Meetup</a> </li> <li> <a href="https://www.meetup.com/Boulder-Denver-Spark-Meetup/">Boulder/Denver Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Chicago-Spark-Users/">Chicago Spark Users</a> - </li> - <li> <a 
href="https://www.meetup.com/Christchurch-Apache-Spark-Meetup/">Christchurch Apache Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Cincinnati-Apache-Spark-Meetup/">Cincinanati Apache Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/Hangzhou-Apache-Spark-Meetup/">Hangzhou Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Spark-User-Group-Hyderabad/">Hyderabad Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/israel-spark-users/">Israel Spark Users</a> </li> <li> @@ -196,9 +181,6 @@ Spark Meetups are grass-roots events organized and hosted by individuals in the <a href="https://www.meetup.com/Shenzhen-Apache-Spark-Meetup/">Shenzhen Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Toronto-Apache-Spark">Toronto Apache Spark</a> - </li> - <li> <a href="https://www.meetup.com/Tokyo-Spark-Meetup/">Tokyo Spark Meetup</a> </li> <li> @@ -207,9 +189,6 @@ Spark Meetups are grass-roots events organized and hosted by individuals in the <li> <a href="https://www.meetup.com/Washington-DC-Area-Spark-Interactive/">Washington DC Area Spark Meetup</a> </li> - <li> - <a href="https://www.meetup.com/Apache-Spark-Zagreb-Meetup/">Zagreb Spark Meetup</a> - </li> </ul> <p>If you'd like your meetup or conference added, please email <a href="mailto:u...@spark.apache.org">u...@spark.apache.org</a>.</p> diff --git a/developer-tools.md b/developer-tools.md index 00d57cd..29a9f92 100644 --- a/developer-tools.md +++ b/developer-tools.md @@ -110,7 +110,7 @@ If you'd prefer, you can run all of these commands on the command line (but this $ build/sbt "core/testOnly *DAGSchedulerSuite -- -z SPARK-12345" ``` -For more about how to run individual tests with sbt, see the [sbt documentation](http://www.scala-sbt.org/0.13/docs/Testing.html). +For more about how to run individual tests with sbt, see the [sbt documentation](https://www.scala-sbt.org/0.13/docs/Testing.html). 
<h4>Testing with Maven</h4> @@ -463,16 +463,7 @@ in the Eclipse install directory. Increase the following setting as needed: <a name="nightly-builds"></a> <h3>Nightly Builds</h3> -Packages are built regularly off of Spark's master branch and release branches. These provide -Spark developers access to the bleeding-edge of Spark master or the most recent fixes not yet -incorporated into a maintenance release. These should only be used by Spark developers, as they -may have bugs and have not undergone the same level of testing as releases. Spark nightly packages -are available at: - -- Latest master build: <a href="https://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest">https://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest</a> -- All nightly builds: <a href="https://people.apache.org/~pwendell/spark-nightly/">https://people.apache.org/~pwendell/spark-nightly/</a> - -Spark also publishes SNAPSHOT releases of its Maven artifacts for both master and maintenance +Spark publishes SNAPSHOT releases of its Maven artifacts for both master and maintenance branches on a nightly basis. To link to a SNAPSHOT you need to add the ASF snapshot repository to your build. Note that SNAPSHOT artifacts are ephemeral and may change or be removed. To use these you must add the ASF snapshot repository at @@ -480,8 +471,8 @@ be removed. 
To use these you must add the ASF snapshot repository at ``` groupId: org.apache.spark -artifactId: spark-core_2.10 -version: 1.5.0-SNAPSHOT +artifactId: spark-core_2.12 +version: 3.0.0-SNAPSHOT ``` <a name="profiling"></a> diff --git a/documentation.md b/documentation.md index 3bf3754..74f5968 100644 --- a/documentation.md +++ b/documentation.md @@ -171,7 +171,7 @@ Slides, videos and EC2-based exercises from each of these are available online: <li><a href="http://shop.oreilly.com/product/0636920028512.do">Learning Spark</a>, by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia (O'Reilly Media)</li> <li><a href="http://www.manning.com/bonaci/">Spark in Action</a>, by Marko Bonaci and Petar Zecevic (Manning)</li> <li><a href="http://shop.oreilly.com/product/0636920035091.do">Advanced Analytics with Spark</a>, by Juliet Hougland, Uri Laserson, Sean Owen, Sandy Ryza and Josh Wills (O'Reilly Media)</li> - <li><a href="http://manning.com/malak/">Spark GraphX in Action</a>, by Michael Malak (Manning)</li> + <li><a href="https://www.manning.com/books/spark-graphx-in-action">Spark GraphX in Action</a>, by Michael Malak (Manning)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/fast-data-processing-spark-second-edition">Fast Data Processing with Spark</a>, by Krishna Sankar and Holden Karau (Packt Publishing)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-spark">Machine Learning with Spark</a>, by Nick Pentreath (Packt Publishing)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/spark-cookbook">Spark Cookbook</a>, by Rishi Yadav (Packt Publishing)</li> diff --git a/downloads.md b/downloads.md index 36155ed..b5ca71e 100644 --- a/downloads.md +++ b/downloads.md @@ -27,14 +27,14 @@ $(document).ready(function() { 4. Verify this release using the <span id="sparkDownloadVerify"></span> and [project release KEYS](https://www.apache.org/dist/spark/KEYS). 
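To link against the nightly SNAPSHOT coordinates shown in the developer-tools changes above, a build needs the ASF snapshot repository as a resolver. A minimal build.sbt sketch (not part of this commit; the resolver name "apache-snapshots" is arbitrary):

```scala
// build.sbt sketch: depend on a nightly SNAPSHOT of spark-core.
// SNAPSHOT artifacts are ephemeral and may change or be removed,
// as the developer-tools page notes.
resolvers += "apache-snapshots" at "https://repository.apache.org/snapshots/"

// %% appends the Scala binary version, yielding e.g. spark-core_2.12
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.0.0-SNAPSHOT"
```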
### Link with Spark -Spark artifacts are [hosted in Maven Central](https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22). You can add a Maven dependency with the following coordinates: +Spark artifacts are [hosted in Maven Central](https://search.maven.org/search?q=g:org.apache.spark). You can add a Maven dependency with the following coordinates: groupId: org.apache.spark artifactId: spark-core_2.11 version: 2.4.1 ### Installing with PyPi -<a href="https://pypi.python.org/pypi/pyspark">PySpark</a> is now available in pypi. To install just run `pip install pyspark`. +<a href="https://pypi.org/project/pyspark/">PySpark</a> is now available in pypi. To install just run `pip install pyspark`. ### Release Notes for Stable Releases diff --git a/examples.md b/examples.md index 1bc45d0..3698758 100644 --- a/examples.md +++ b/examples.md @@ -16,7 +16,7 @@ In the RDD API, there are two types of operations: <em>transformations</em>, which define a new dataset based on previous ones, and <em>actions</em>, which kick off a job to execute on a cluster. On top of Spark’s RDD API, high level APIs are provided, e.g. -[DataFrame API](https://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes) and +[DataFrame API](https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes) and [Machine Learning API](https://spark.apache.org/docs/latest/mllib-guide.html). These high level APIs provide a concise way to conduct certain data operations. In this page, we will show examples using RDD API as well as examples using high level APIs. diff --git a/index.md b/index.md index e3a9557..a852d1b 100644 --- a/index.md +++ b/index.md @@ -115,7 +115,7 @@ df.<span style="color: #000000;">where</span><span style="color: #F78811;">( on <a href="https://mesos.apache.org">Mesos</a>, or on <a href="https://kubernetes.io/">Kubernetes</a>. 
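As an aside on the Maven coordinates listed under "Link with Spark" above: under the standard Maven repository layout, coordinates map to a predictable artifact path. A small illustrative Python sketch (the helper `maven_jar_url` is hypothetical, not part of any Spark tooling):

```python
# Illustrative only: map Maven coordinates (groupId, artifactId, version)
# to the main jar's URL under the standard Maven repository layout.
def maven_jar_url(group_id: str, artifact_id: str, version: str,
                  repo: str = "https://repo1.maven.org/maven2") -> str:
    # groupId dots become path segments; jar name is artifactId-version.jar
    group_path = group_id.replace(".", "/")
    return f"{repo}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"

# The coordinates from the downloads page:
print(maven_jar_url("org.apache.spark", "spark-core_2.11", "2.4.1"))
# -> https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.11/2.4.1/spark-core_2.11-2.4.1.jar
```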
Access data in <a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS</a>, - <a href="https://alluxio.org">Alluxio</a>, + <a href="https://www.alluxio.org/">Alluxio</a>, <a href="https://cassandra.apache.org">Apache Cassandra</a>, <a href="https://hbase.apache.org">Apache HBase</a>, <a href="https://hive.apache.org">Apache Hive</a>, diff --git a/powered-by.md b/powered-by.md index 8345ab2..6609074 100644 --- a/powered-by.md +++ b/powered-by.md @@ -47,7 +47,6 @@ initially launched Spark - <a href="http://alluxio.com/">Alluxio</a> - Alluxio, formerly Tachyon, is the world's first system that unifies disparate storage systems at memory speed. -- <a href="http://alpinenow.com/">Alpine Data Labs</a> - <a href="http://amazon.com">Amazon</a> - <a href="http://www.art.com/">Art.com</a> - Trending analytics and personalization @@ -55,8 +54,6 @@ initially launched Spark - We are using Spark Core, Streaming, MLlib and Graphx. We leverage Spark and Hadoop ecosystem to build cost effective data center solution for our customer in telco industry as well as other industrial sectors. -- <a href="http://www.atigeo.com">Atigeo</a> – integrated Spark in xPatterns, our big data -analytics platform, as a replacement for Hadoop MR - <a href="https://atp.io">atp</a> - Predictive models and learning algorithms to improve the relevance of programmatic marketing. - Components used: Spark SQL, MLLib. @@ -67,9 +64,6 @@ exploration of large datasets - <a href="http://www.bigindustries.be/">Big Industries</a> - using Spark Streaming: The Big Content Platform is a business-to-business content asset management service providing a searchable, aggregated source of live news feeds, public domain media and archives of content. 
-- <a href="http://www.bizo.com">Bizo</a> - - Check out our talk on <a href="http://www.meetup.com/spark-users/events/139804022/">Spark at Bizo</a> - at Spark user meetup - <a href="http://www.celtra.com">Celtra</a> - <a href="http://www.clearstorydata.com">ClearStory Data</a> – ClearStory's platform and integrated Data Intelligence application leverages Spark to speed analysis across internal @@ -95,7 +89,6 @@ and external data sources, driving holistic and actionable insights. to run Spark and ML applications on Amazon Web Services and Azure, as well as a comprehensive <a href="https://databricks.com/training">training program</a>. - <a href="http://dianping.com">Dianping.com</a> -- <a href="http://www.digby.com">Digby</a> - <a href="http://www.drawbrid.ge/">Drawbridge</a> - <a href="http://www.ebay.com/">eBay Inc.</a> - Using Spark core for log transaction aggregation and analytics @@ -116,11 +109,10 @@ and external data sources, driving holistic and actionable insights. - We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain activity in real time - <a href="http://www.fundacionctic.org">Fundacion CTIC</a> -- <a href="http://graphflow.com">GraphFlow, Inc.</a> - <a href="https://www.groupon.com">Groupon</a> - <a href="http://www.guavus.com/">Guavus</a> - Stream processing of network machine data -- <a href="http://www.hitachi-solutions.com/">Hitachi Solutions</a> +- <a href="http://us.hitachi-solutions.com">Hitachi Solutions</a> - <a href="http://hivedata.com/">The Hive</a> - <a href="http://www.research.ibm.com/labs/almaden/index.shtml">IBM Almaden</a> - <a href="http://www.infoobjects.com">InfoObjects</a> @@ -137,7 +129,6 @@ and external data sources, driving holistic and actionable insights. - Batch, real-time, and predictive analytics driving our mobile app analytics and marketing automation product. - Components used: Spark, Spark Streaming, MLLib. 
-- <a href="http://magine.com">Magine TV</a> - <a href="http://mediacrossing.com">MediaCrossing</a> – Digital Media Trading Experts in the New York and Boston areas - We are using Spark as a drop-in replacement for Hadoop Map/Reduce to get the right answer @@ -148,7 +139,6 @@ New York and Boston areas - Using Spark to build different recommendation systems for recipes and foods. - <a href="http://deepspace.jpl.nasa.gov/">NASA JPL - Deep Space Network</a> - <a href="http://www.163.com/">Netease</a> -- <a href="http://www.nflabs.com">NFLabs</a> - <a href="http://nsn.com">Nokia Solutions and Networks</a> - <a href="http://www.nttdata.com/global/en/">NTT DATA</a> - <a href="http://www.nubetech.co">Nube Technologies</a> @@ -170,9 +160,8 @@ across all screens - PanTera is a tool for exploring large datasets. It uses Spark to create XY and geographic scatterplots from millions to billions of datapoints. - Components we are using: Spark Core (Scala API), Spark SQL, and GraphX -- <a href="http://www.peerialism.com">Peerialism</a> - <a href="http://www.planbmedia.com">PlanBMedia</a> -- <a href="http://prediction.io/">PredicitionIo</a> +- <a href="http://predictionio.apache.org/index.html/">Apache PredictionIO</a> - PredictionIO currently offers two engine templates for Apache Spark MLlib for recommendation (MLlib ALS) and classification (MLlib Naive Bayes). With these templates, you can create a custom predictive engine for production deployment @@ -194,12 +183,11 @@ efficiently. and personalization. 
- <a href="http://www.sisa.samsung.com/">Samsung Research America</a> - <a href="http://www.shopify.com/">Shopify</a> -- <a href="http://www.simba.com/">Simba Technologies</a> +- <a href="https://www.simba.com/">Simba Technologies</a> - BI/reporting/ETL for Spark and beyond - <a href="http://www.sinnia.com">Sinnia</a> -- <a href="http://www.sktelecom.com/en/main/index.do">SK Telecom</a> +- <a href="https://www.sktelecom.com/index_en.html">SK Telecom</a> - SK Telecom analyses mobile usage patterns of customer with Spark and Shark. -- <a href="http://socialmetrix.com/">Socialmetrix</a> - <a href="http://www.sohu.com">Sohu</a> - <a href="https://dawn.cs.stanford.edu">Stanford DAWN</a> - Research lab on infrastructure for usable machine learning, with multiple research projects that run over or @@ -207,15 +195,10 @@ efficiently. - <a href="http://www.stratio.com/">Stratio</a> - Offers an open-source Big Data platform centered around Apache Spark. - <a href="https://www.taboola.com/">Taboola</a> – Powering 'Content You May Like' around the web -- <a href="http://www.techbase.com.tr">Techbase</a> - <a href="http://tencent.com/">Tencent</a> - <a href="http://www.tetraconcepts.com/">Tetra Concepts</a> - <a href="http://www.trendmicro.com/us/index.html">TrendMicro</a> - <a href="http://engineering.tripadvisor.com/using-apache-spark-for-massively-parallel-nlp/">TripAdvisor</a> -- <a href="http://truedash.io">truedash</a> - - Automatic pulling of all your data in to Spark for enterprise visualisation, predictive - analytics and data exploration at a low cost. 
-- <a href="http://www.trueffect.com">TruEffect Inc</a> - <a href="http://www.ucsc.edu">UC Santa Cruz</a> - <a href="http://missouri.edu/">University of Missouri Data Analytics and Discover Lab</a> - <a href="http://videoamp.com/">VideoAmp</a> diff --git a/release-process.md b/release-process.md index 31d395c..028fd4e 100644 --- a/release-process.md +++ b/release-process.md @@ -60,7 +60,7 @@ svn ci --username $ASF_USERNAME --password "$ASF_PASSWORD" -m"Update KEYS" The scripts to create release candidates are run through docker. You need to install docker before running these scripts. Please make sure that you can run docker as non-root users. See -<a href="https://docs.docker.com/install/linux/linux-postinstall">https://docs.docker.com/install/linux/linux-postinstall</a> +<a href="https://docs.docker.com/install/linux/linux-postinstall/">https://docs.docker.com/install/linux/linux-postinstall</a> for more details. <h2>Preparing Spark for Release</h2> @@ -159,7 +159,7 @@ You'll need the credentials for the `spark-upload` account, which can be found i <a href="https://lists.apache.org/thread.html/2789e448cd8a95361a3164b48f3f8b73a6d9d82aeb228bae2bc4dc7f@%3Cprivate.spark.apache.org%3E">this message</a> (only visible to PMC members). -The artifacts can be uploaded using <a href="https://pypi.python.org/pypi/twine">twine</a>. Just run: +The artifacts can be uploaded using <a href="https://pypi.org/project/twine/">twine</a>. 
Just run: ``` twine upload --repository-url https://upload.pypi.org/legacy/ pyspark-{version}.tar.gz pyspark-{version}.tar.gz.asc diff --git a/site/community.html b/site/community.html index 716f8ce..153152a 100644 --- a/site/community.html +++ b/site/community.html @@ -345,33 +345,18 @@ vulnerabilities, and for information on known security issues.</p> <a href="https://www.meetup.com/Spark_big_data_analytics/">Bangalore Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Berlin-Apache-Spark-Meetup/">Berlin Spark Meetup</a> - </li> - <li> - <a href="https://www.meetup.com/spark-user-beijing-Meetup/">Beijing Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/Boston-Apache-Spark-User-Group/">Boston Spark Meetup</a> </li> <li> <a href="https://www.meetup.com/Boulder-Denver-Spark-Meetup/">Boulder/Denver Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Chicago-Spark-Users/">Chicago Spark Users</a> - </li> - <li> <a href="https://www.meetup.com/Christchurch-Apache-Spark-Meetup/">Christchurch Apache Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Cincinnati-Apache-Spark-Meetup/">Cincinanati Apache Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/Hangzhou-Apache-Spark-Meetup/">Hangzhou Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Spark-User-Group-Hyderabad/">Hyderabad Spark Meetup</a> - </li> - <li> <a href="https://www.meetup.com/israel-spark-users/">Israel Spark Users</a> </li> <li> @@ -402,9 +387,6 @@ vulnerabilities, and for information on known security issues.</p> <a href="https://www.meetup.com/Shenzhen-Apache-Spark-Meetup/">Shenzhen Spark Meetup</a> </li> <li> - <a href="https://www.meetup.com/Toronto-Apache-Spark">Toronto Apache Spark</a> - </li> - <li> <a href="https://www.meetup.com/Tokyo-Spark-Meetup/">Tokyo Spark Meetup</a> </li> <li> @@ -413,9 +395,6 @@ vulnerabilities, and for information on known security issues.</p> <li> <a 
href="https://www.meetup.com/Washington-DC-Area-Spark-Interactive/">Washington DC Area Spark Meetup</a> </li> - <li> - <a href="https://www.meetup.com/Apache-Spark-Zagreb-Meetup/">Zagreb Spark Meetup</a> - </li> </ul> <p>If you'd like your meetup or conference added, please email <a href="mailto:u...@spark.apache.org">u...@spark.apache.org</a>.</p> diff --git a/site/developer-tools.html b/site/developer-tools.html index 5f6ba93..7d5b35c 100644 --- a/site/developer-tools.html +++ b/site/developer-tools.html @@ -295,7 +295,7 @@ $ build/mvn package -DskipTests -pl core <pre><code>$ build/sbt "core/testOnly *DAGSchedulerSuite -- -z SPARK-12345" </code></pre> -<p>For more about how to run individual tests with sbt, see the <a href="http://www.scala-sbt.org/0.13/docs/Testing.html">sbt documentation</a>.</p> +<p>For more about how to run individual tests with sbt, see the <a href="https://www.scala-sbt.org/0.13/docs/Testing.html">sbt documentation</a>.</p> <h4>Testing with Maven</h4> @@ -640,26 +640,15 @@ in the Eclipse install directory. Increase the following setting as needed:</p> <p><a name="nightly-builds"></a></p> <h3>Nightly Builds</h3> -<p>Packages are built regularly off of Spark’s master branch and release branches. These provide -Spark developers access to the bleeding-edge of Spark master or the most recent fixes not yet -incorporated into a maintenance release. These should only be used by Spark developers, as they -may have bugs and have not undergone the same level of testing as releases. 
Spark nightly packages -are available at:</p> - -<ul> - <li>Latest master build: <a href="https://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest">https://people.apache.org/~pwendell/spark-nightly/spark-master-bin/latest</a></li> - <li>All nightly builds: <a href="https://people.apache.org/~pwendell/spark-nightly/">https://people.apache.org/~pwendell/spark-nightly/</a></li> -</ul> - -<p>Spark also publishes SNAPSHOT releases of its Maven artifacts for both master and maintenance +<p>Spark publishes SNAPSHOT releases of its Maven artifacts for both master and maintenance branches on a nightly basis. To link to a SNAPSHOT you need to add the ASF snapshot repository to your build. Note that SNAPSHOT artifacts are ephemeral and may change or be removed. To use these you must add the ASF snapshot repository at <a href=”https://repository.apache.org/snapshots/<a>.</a></p> <pre><code>groupId: org.apache.spark -artifactId: spark-core_2.10 -version: 1.5.0-SNAPSHOT +artifactId: spark-core_2.12 +version: 3.0.0-SNAPSHOT </code></pre> <p><a name="profiling"></a></p> diff --git a/site/documentation.html b/site/documentation.html index e0a8077..9401a91 100644 --- a/site/documentation.html +++ b/site/documentation.html @@ -363,7 +363,7 @@ Slides, videos and EC2-based exercises from each of these are available online: <li><a href="http://shop.oreilly.com/product/0636920028512.do">Learning Spark</a>, by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia (O'Reilly Media)</li> <li><a href="http://www.manning.com/bonaci/">Spark in Action</a>, by Marko Bonaci and Petar Zecevic (Manning)</li> <li><a href="http://shop.oreilly.com/product/0636920035091.do">Advanced Analytics with Spark</a>, by Juliet Hougland, Uri Laserson, Sean Owen, Sandy Ryza and Josh Wills (O'Reilly Media)</li> - <li><a href="http://manning.com/malak/">Spark GraphX in Action</a>, by Michael Malak (Manning)</li> + <li><a href="https://www.manning.com/books/spark-graphx-in-action">Spark 
GraphX in Action</a>, by Michael Malak (Manning)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/fast-data-processing-spark-second-edition">Fast Data Processing with Spark</a>, by Krishna Sankar and Holden Karau (Packt Publishing)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-spark">Machine Learning with Spark</a>, by Nick Pentreath (Packt Publishing)</li> <li><a href="https://www.packtpub.com/big-data-and-business-intelligence/spark-cookbook">Spark Cookbook</a>, by Rishi Yadav (Packt Publishing)</li> diff --git a/site/downloads.html b/site/downloads.html index 538cad1..bd570f1 100644 --- a/site/downloads.html +++ b/site/downloads.html @@ -227,7 +227,7 @@ $(document).ready(function() { </ol> <h3 id="link-with-spark">Link with Spark</h3> -<p>Spark artifacts are <a href="https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22">hosted in Maven Central</a>. You can add a Maven dependency with the following coordinates:</p> +<p>Spark artifacts are <a href="https://search.maven.org/search?q=g:org.apache.spark">hosted in Maven Central</a>. You can add a Maven dependency with the following coordinates:</p> <pre><code>groupId: org.apache.spark artifactId: spark-core_2.11 @@ -235,7 +235,7 @@ version: 2.4.1 </code></pre> <h3 id="installing-with-pypi">Installing with PyPi</h3> -<p><a href="https://pypi.python.org/pypi/pyspark">PySpark</a> is now available in pypi. To install just run <code>pip install pyspark</code>.</p> +<p><a href="https://pypi.org/project/pyspark/">PySpark</a> is now available in pypi. 
To install just run <code>pip install pyspark</code>.</p> <h3 id="release-notes-for-stable-releases">Release Notes for Stable Releases</h3> diff --git a/site/examples.html b/site/examples.html index 9305db0..a59cb7f 100644 --- a/site/examples.html +++ b/site/examples.html @@ -210,7 +210,7 @@ In the RDD API, there are two types of operations: <em>transformations</em>, which define a new dataset based on previous ones, and <em>actions</em>, which kick off a job to execute on a cluster. On top of Spark’s RDD API, high level APIs are provided, e.g. -<a href="https://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes">DataFrame API</a> and +<a href="https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes">DataFrame API</a> and <a href="https://spark.apache.org/docs/latest/mllib-guide.html">Machine Learning API</a>. These high level APIs provide a concise way to conduct certain data operations. In this page, we will show examples using RDD API as well as examples using high level APIs.</p> diff --git a/site/index.html b/site/index.html index 22ba00e..9521fc5 100644 --- a/site/index.html +++ b/site/index.html @@ -302,7 +302,7 @@ df.<span style="color: #000000;">where</span><span style="color: #F78811;">( on <a href="https://mesos.apache.org">Mesos</a>, or on <a href="https://kubernetes.io/">Kubernetes</a>. 
Access data in <a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS</a>, - <a href="https://alluxio.org">Alluxio</a>, + <a href="https://www.alluxio.org/">Alluxio</a>, <a href="https://cassandra.apache.org">Apache Cassandra</a>, <a href="https://hbase.apache.org">Apache HBase</a>, <a href="https://hive.apache.org">Apache Hive</a>, diff --git a/site/powered-by.html b/site/powered-by.html index ba8ffb3..f4a1f4b 100644 --- a/site/powered-by.html +++ b/site/powered-by.html @@ -256,7 +256,6 @@ providing faster and more meaningful insights and actionable data to the operato at memory speed.</li> </ul> </li> - <li><a href="http://alpinenow.com/">Alpine Data Labs</a></li> <li><a href="http://amazon.com">Amazon</a></li> <li><a href="http://www.art.com/">Art.com</a> <ul> @@ -270,8 +269,6 @@ to build cost effective data center solution for our customer in telco industry other industrial sectors.</li> </ul> </li> - <li><a href="http://www.atigeo.com">Atigeo</a> – integrated Spark in xPatterns, our big data -analytics platform, as a replacement for Hadoop MR</li> <li><a href="https://atp.io">atp</a> <ul> <li>Predictive models and learning algorithms to improve the relevance of programmatic marketing.</li> @@ -285,12 +282,6 @@ exploration of large datasets</li> <li><a href="http://www.bigindustries.be/">Big Industries</a> - using Spark Streaming: The Big Content Platform is a business-to-business content asset management service providing a searchable, aggregated source of live news feeds, public domain media and archives of content.</li> - <li><a href="http://www.bizo.com">Bizo</a> - <ul> - <li>Check out our talk on <a href="http://www.meetup.com/spark-users/events/139804022/">Spark at Bizo</a> -at Spark user meetup</li> - </ul> - </li> <li><a href="http://www.celtra.com">Celtra</a></li> <li><a href="http://www.clearstorydata.com">ClearStory Data</a> – ClearStory’s platform and integrated Data Intelligence application leverages 
Spark to speed analysis across internal @@ -331,7 +322,6 @@ to run Spark and ML applications on Amazon Web Services and Azure, as well as a </ul> </li> <li><a href="http://dianping.com">Dianping.com</a></li> - <li><a href="http://www.digby.com">Digby</a></li> <li><a href="http://www.drawbrid.ge/">Drawbridge</a></li> <li><a href="http://www.ebay.com/">eBay Inc.</a> <ul> @@ -367,14 +357,13 @@ activity in real time</li> </ul> </li> <li><a href="http://www.fundacionctic.org">Fundacion CTIC</a></li> - <li><a href="http://graphflow.com">GraphFlow, Inc.</a></li> <li><a href="https://www.groupon.com">Groupon</a></li> <li><a href="http://www.guavus.com/">Guavus</a> <ul> <li>Stream processing of network machine data</li> </ul> </li> - <li><a href="http://www.hitachi-solutions.com/">Hitachi Solutions</a></li> + <li><a href="http://us.hitachi-solutions.com">Hitachi Solutions</a></li> <li><a href="http://hivedata.com/">The Hive</a></li> <li><a href="http://www.research.ibm.com/labs/almaden/index.shtml">IBM Almaden</a></li> <li><a href="http://www.infoobjects.com">InfoObjects</a> @@ -403,7 +392,6 @@ automation product.</li> <li>Components used: Spark, Spark Streaming, MLLib.</li> </ul> </li> - <li><a href="http://magine.com">Magine TV</a></li> <li><a href="http://mediacrossing.com">MediaCrossing</a> – Digital Media Trading Experts in the New York and Boston areas <ul> @@ -420,7 +408,6 @@ with the final goal of identifying high-quality food items.</li> </li> <li><a href="http://deepspace.jpl.nasa.gov/">NASA JPL - Deep Space Network</a></li> <li><a href="http://www.163.com/">Netease</a></li> - <li><a href="http://www.nflabs.com">NFLabs</a></li> <li><a href="http://nsn.com">Nokia Solutions and Networks</a></li> <li><a href="http://www.nttdata.com/global/en/">NTT DATA</a></li> <li><a href="http://www.nubetech.co">Nube Technologies</a> @@ -454,9 +441,8 @@ scatterplots from millions to billions of datapoints.</li> <li>Components we are using: Spark Core (Scala API), Spark SQL, and 
GraphX</li> </ul> </li> - <li><a href="http://www.peerialism.com">Peerialism</a></li> <li><a href="http://www.planbmedia.com">PlanBMedia</a></li> - <li><a href="http://prediction.io/">PredicitionIo</a> + <li><a href="http://predictionio.apache.org/index.html/">Apache PredictionIO</a> <ul> <li>PredictionIO currently offers two engine templates for Apache Spark MLlib for recommendation (MLlib ALS) and classification (MLlib Naive @@ -493,18 +479,17 @@ and personalization.</li> </li> <li><a href="http://www.sisa.samsung.com/">Samsung Research America</a></li> <li><a href="http://www.shopify.com/">Shopify</a></li> - <li><a href="http://www.simba.com/">Simba Technologies</a> + <li><a href="https://www.simba.com/">Simba Technologies</a> <ul> <li>BI/reporting/ETL for Spark and beyond</li> </ul> </li> <li><a href="http://www.sinnia.com">Sinnia</a></li> - <li><a href="http://www.sktelecom.com/en/main/index.do">SK Telecom</a> + <li><a href="https://www.sktelecom.com/index_en.html">SK Telecom</a> <ul> <li>SK Telecom analyses mobile usage patterns of customer with Spark and Shark.</li> </ul> </li> - <li><a href="http://socialmetrix.com/">Socialmetrix</a></li> <li><a href="http://www.sohu.com">Sohu</a></li> <li><a href="https://dawn.cs.stanford.edu">Stanford DAWN</a> <ul> @@ -518,18 +503,10 @@ accelerate Apache Spark.</li> </ul> </li> <li><a href="https://www.taboola.com/">Taboola</a> – Powering ‘Content You May Like’ around the web</li> - <li><a href="http://www.techbase.com.tr">Techbase</a></li> <li><a href="http://tencent.com/">Tencent</a></li> <li><a href="http://www.tetraconcepts.com/">Tetra Concepts</a></li> <li><a href="http://www.trendmicro.com/us/index.html">TrendMicro</a></li> <li><a href="http://engineering.tripadvisor.com/using-apache-spark-for-massively-parallel-nlp/">TripAdvisor</a></li> - <li><a href="http://truedash.io">truedash</a> - <ul> - <li>Automatic pulling of all your data in to Spark for enterprise visualisation, predictive -analytics and data 
exploration at a low cost.</li> - </ul> - </li> - <li><a href="http://www.trueffect.com">TruEffect Inc</a></li> <li><a href="http://www.ucsc.edu">UC Santa Cruz</a></li> <li><a href="http://missouri.edu/">University of Missouri Data Analytics and Discover Lab</a></li> <li><a href="http://videoamp.com/">VideoAmp</a> diff --git a/site/release-process.html b/site/release-process.html index 19b1b39..328f56e 100644 --- a/site/release-process.html +++ b/site/release-process.html @@ -265,7 +265,7 @@ svn ci --username $ASF_USERNAME --password "$ASF_PASSWORD" -m"Update KEYS" <p>The scripts to create release candidates are run through docker. You need to install docker before running these scripts. Please make sure that you can run docker as non-root users. See -<a href="https://docs.docker.com/install/linux/linux-postinstall">https://docs.docker.com/install/linux/linux-postinstall</a> +<a href="https://docs.docker.com/install/linux/linux-postinstall/">https://docs.docker.com/install/linux/linux-postinstall</a> for more details.</p> <h2>Preparing Spark for Release</h2> @@ -360,7 +360,7 @@ and the same under https://repository.apache.org/content/groups/maven-staging-gr <a href="https://lists.apache.org/thread.html/2789e448cd8a95361a3164b48f3f8b73a6d9d82aeb228bae2bc4dc7f@%3Cprivate.spark.apache.org%3E">this message</a> (only visible to PMC members).</p> -<p>The artifacts can be uploaded using <a href="https://pypi.python.org/pypi/twine">twine</a>. Just run:</p> +<p>The artifacts can be uploaded using <a href="https://pypi.org/project/twine/">twine</a>. 
Just run:</p> <pre><code>twine upload --repository-url https://upload.pypi.org/legacy/ pyspark-{version}.tar.gz pyspark-{version}.tar.gz.asc </code></pre> diff --git a/site/third-party-projects.html b/site/third-party-projects.html index 7ae92b7..6152c27 100644 --- a/site/third-party-projects.html +++ b/site/third-party-projects.html @@ -226,12 +226,11 @@ for details)</li> <li><a href="http://mlbase.org/">MLbase</a> - Machine Learning research project on top of Spark</li> <li><a href="https://mesos.apache.org/">Apache Mesos</a> - Cluster management system that supports running Spark</li> - <li><a href="http://alluxio.org/">Alluxio</a> (née Tachyon) - Memory speed virtual distributed + <li><a href="https://www.alluxio.org/">Alluxio</a> (née Tachyon) - Memory speed virtual distributed storage system that supports running Spark</li> <li><a href="https://github.com/filodb/FiloDB">FiloDB</a> - a Spark integrated analytical/columnar database, with in-memory option capable of sub-second concurrent queries</li> - <li><a href="http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/spark.html#spark-sql">ElasticSearch - -Spark SQL</a> Integration</li> + <li><a href="https://www.elastic.co/guide/en/elasticsearch/hadoop/7.x/spark.html">ElasticSearch Spark SQL</a> Integration</li> <li><a href="http://zeppelin-project.org/">Zeppelin</a> - Multi-purpose notebook which supports 20+ language backends, including Apache Spark</li> <li><a href="https://github.com/EclairJS/eclairjs-node">EclairJS</a> - enables Node.js developers to code diff --git a/site/trademarks.html b/site/trademarks.html index dec2242..73eb949 100644 --- a/site/trademarks.html +++ b/site/trademarks.html @@ -213,7 +213,7 @@ distinguished from third-party products.</p> <p>If you would like to provide software, services, events, or other products based on Apache Spark, please refer to the <a href="https://www.apache.org/foundation/marks/">ASF trademark policy</a> -and <a href="https://www.apache.org/foundation/marks/faq">FAQ</a>. 
+and <a href="https://www.apache.org/foundation/marks/faq/">FAQ</a>. This page summarizes the key points of the policy, but please note that the official policy always takes precedence.</p> diff --git a/third-party-projects.md b/third-party-projects.md index 607398d..5d64d92 100644 --- a/third-party-projects.md +++ b/third-party-projects.md @@ -32,12 +32,11 @@ for details) - <a href="http://mlbase.org/">MLbase</a> - Machine Learning research project on top of Spark - <a href="https://mesos.apache.org/">Apache Mesos</a> - Cluster management system that supports running Spark -- <a href="http://alluxio.org/">Alluxio</a> (née Tachyon) - Memory speed virtual distributed +- <a href="https://www.alluxio.org/">Alluxio</a> (née Tachyon) - Memory speed virtual distributed storage system that supports running Spark - <a href="https://github.com/filodb/FiloDB">FiloDB</a> - a Spark integrated analytical/columnar database, with in-memory option capable of sub-second concurrent queries -- <a href="http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/spark.html#spark-sql">ElasticSearch - -Spark SQL</a> Integration +- <a href="https://www.elastic.co/guide/en/elasticsearch/hadoop/7.x/spark.html">ElasticSearch Spark SQL</a> Integration - <a href="http://zeppelin-project.org/">Zeppelin</a> - Multi-purpose notebook which supports 20+ language backends, including Apache Spark - <a href="https://github.com/EclairJS/eclairjs-node">EclairJS</a> - enables Node.js developers to code diff --git a/trademarks.md b/trademarks.md index 69575f1..bd457b2 100644 --- a/trademarks.md +++ b/trademarks.md @@ -19,7 +19,7 @@ distinguished from third-party products. If you would like to provide software, services, events, or other products based on Apache Spark, please refer to the <a href="https://www.apache.org/foundation/marks/">ASF trademark policy</a> -and <a href="https://www.apache.org/foundation/marks/faq">FAQ</a>. +and <a href="https://www.apache.org/foundation/marks/faq/">FAQ</a>. 
This page summarizes the key points of the policy, but please note that the official policy always takes precedence. --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org