Repository: spark
Updated Branches:
  refs/heads/master c2aeddf9e -> 77988a9d0
[MINOR][DOC] Fix the link of 'Getting Started'

## What changes were proposed in this pull request?

Easy fix in the link.

## How was this patch tested?

Tested manually

Author: Mahmut CAVDAR <mahmutc...@gmail.com>

Closes #19996 from mcavdar/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/77988a9d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/77988a9d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/77988a9d

Branch: refs/heads/master
Commit: 77988a9d0d553f26034f7206e5e6314acab2dec5
Parents: c2aeddf
Author: Mahmut CAVDAR <mahmutc...@gmail.com>
Authored: Sun Dec 17 10:52:01 2017 -0600
Committer: Sean Owen <so...@cloudera.com>
Committed: Sun Dec 17 10:52:01 2017 -0600

----------------------------------------------------------------------
 docs/mllib-decision-tree.md   | 2 +-
 docs/running-on-mesos.md      | 4 ++--
 docs/spark-standalone.md      | 2 +-
 docs/sql-programming-guide.md | 1 +
 docs/tuning.md                | 2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/77988a9d/docs/mllib-decision-tree.md
----------------------------------------------------------------------
diff --git a/docs/mllib-decision-tree.md b/docs/mllib-decision-tree.md
index 0e753b8..ec13b81 100644
--- a/docs/mllib-decision-tree.md
+++ b/docs/mllib-decision-tree.md
@@ -91,7 +91,7 @@ For a categorical feature with `$M$` possible values (categories), one could com
 `$2^{M-1}-1$` split candidates. For binary (0/1) classification and regression,
 we can reduce the number of split candidates to `$M-1$` by ordering the
 categorical feature values by the average label. (See Section 9.2.4 in
-[Elements of Statistical Machine Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) for
+[Elements of Statistical Machine Learning](https://web.stanford.edu/~hastie/ElemStatLearn/) for
 details.)
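As an aside, the ordering trick described in the hunk above can be sketched in a few lines of Python. This is an illustrative helper, not Spark code: `ordered_split_candidates` and its input shape are hypothetical, chosen to mirror the A/B/C example in the surrounding context.

```python
# Order categorical feature values by the average of the binary (0/1) label.
# With M categories, considering splits only along this ordering gives
# M-1 candidates instead of 2^(M-1) - 1.
def ordered_split_candidates(labels_by_category):
    """labels_by_category: dict mapping category -> list of 0/1 labels."""
    order = sorted(
        labels_by_category,
        key=lambda c: sum(labels_by_category[c]) / len(labels_by_category[c]),
    )
    # Each proper prefix of the ordering is one split candidate
    # (the left-hand side of the split).
    return [set(order[:i]) for i in range(1, len(order))]

# Example from the doc text: proportions of label 1 are A=0.2, B=0.6, C=0.4,
# so the ordering is A, C, B and the candidates are A | C,B and A,C | B.
candidates = ordered_split_candidates({
    "A": [1, 0, 0, 0, 0],   # mean 0.2
    "B": [1, 1, 1, 0, 0],   # mean 0.6
    "C": [1, 1, 0, 0, 0],   # mean 0.4
})
```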
 For example, for a binary classification problem with one categorical feature with three categories A,
 B and C whose corresponding proportions of label 1 are 0.2, 0.6 and 0.4, the categorical
 features are ordered as A, C, B. The two split candidates are A \| C, B


http://git-wip-us.apache.org/repos/asf/spark/blob/77988a9d/docs/running-on-mesos.md
----------------------------------------------------------------------
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 19ec7c1..382cbfd 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -47,7 +47,7 @@ To install Apache Mesos from source, follow these steps:
 
 1. Download a Mesos release from a
    [mirror](http://www.apache.org/dyn/closer.lua/mesos/{{site.MESOS_VERSION}}/)
-2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
+2. Follow the Mesos [Getting Started](http://mesos.apache.org/getting-started) page for compiling and
    installing Mesos
 
 **Note:** If you want to run Mesos without installing it into the default
 paths on your system
@@ -159,7 +159,7 @@ By setting the Mesos proxy config property (requires mesos version >= 1.4), `--c
 If you like to run the `MesosClusterDispatcher` with Marathon,
 you need to run the `MesosClusterDispatcher` in the foreground (i.e: `bin/spark-class org.apache.spark.deploy.mesos.MesosClusterDispatcher`). Note that the `MesosClusterDispatcher` not yet supports multiple instances for HA.
 
 The `MesosClusterDispatcher` also supports writing recovery state into Zookeeper. This will allow the `MesosClusterDispatcher` to be able to recover all submitted and running containers on relaunch. In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env by configuring `spark.deploy.recoveryMode` and related spark.deploy.zookeeper.* configurations.
-For more information about these configurations please refer to the configurations [doc](configurations.html#deploy).
+For more information about these configurations please refer to the configurations [doc](configuration.html#deploy).
 
 You can also specify any additional jars required by the `MesosClusterDispatcher` in the classpath by
 setting the environment variable SPARK_DAEMON_CLASSPATH in spark-env.


http://git-wip-us.apache.org/repos/asf/spark/blob/77988a9d/docs/spark-standalone.md
----------------------------------------------------------------------
diff --git a/docs/spark-standalone.md b/docs/spark-standalone.md
index f51c5cc..8fa643a 100644
--- a/docs/spark-standalone.md
+++ b/docs/spark-standalone.md
@@ -364,7 +364,7 @@ By default, standalone scheduling clusters are resilient to Worker failures (ins
 
 Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected "leader" and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master's state, and then resume scheduling. The entire recovery process (from the time the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling _new_ applications -- applications that were already running during Master failover are unaffected.
 
-Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html).
+Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.org/doc/current/zookeeperStarted.html).
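Both the standalone-Master passage above and the earlier `MesosClusterDispatcher` hunk mention enabling ZooKeeper-backed recovery through `SPARK_DAEMON_JAVA_OPTS`. A minimal sketch of that configuration, with hypothetical ZooKeeper host names you would replace with your own, might look like:

```shell
# spark-env.sh -- hypothetical hosts; adjust for your cluster.
# Enables ZooKeeper-backed recovery via spark.deploy.recoveryMode and
# the related spark.deploy.zookeeper.* properties the docs refer to.
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"
```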
 
 **Configuration**


http://git-wip-us.apache.org/repos/asf/spark/blob/77988a9d/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index b76be91..f02f462 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -501,6 +501,7 @@ To load a CSV file you can use:
 </div>
 </div>
 
+
 ### Run SQL on files directly
 
 Instead of using read API to load a file into DataFrame and query it, you can also query that


http://git-wip-us.apache.org/repos/asf/spark/blob/77988a9d/docs/tuning.md
----------------------------------------------------------------------
diff --git a/docs/tuning.md b/docs/tuning.md
index 7d5f97a..fc27713 100644
--- a/docs/tuning.md
+++ b/docs/tuning.md
@@ -219,7 +219,7 @@ temporary objects created during task execution. Some steps which may be useful
 
 * Try the G1GC garbage collector with `-XX:+UseG1GC`. It can improve performance in some
   situations where garbage collection is a bottleneck. Note that with large executor heap sizes,
   it may be important to
-  increase the [G1 region size](https://blogs.oracle.com/g1gc/entry/g1_gc_tuning_a_case)
+  increase the [G1 region size](http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html)
   with `-XX:G1HeapRegionSize`
 
 * As an example, if your task is reading data from HDFS, the amount of memory
   used by the task can be estimated using


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org