Repository: spark
Updated Branches:
  refs/heads/master 6008ec14e -> 61f164d3f
Fixing a few basic typos in the Programming Guide.

Just a few minor fixes in the guide, so a new JIRA issue was not created per the guidelines.

Author: Mike Dusenberry <dusenberr...@gmail.com>

Closes #6240 from dusenberrymw/Fix_Programming_Guide_Typos and squashes the following commits:

ffa76eb [Mike Dusenberry] Fixing a few basic typos in the Programming Guide.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/61f164d3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/61f164d3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/61f164d3

Branch: refs/heads/master
Commit: 61f164d3fdd1c8dcdba8c9d66df05ff4069aa6e6
Parents: 6008ec1
Author: Mike Dusenberry <dusenberr...@gmail.com>
Authored: Tue May 19 08:59:45 2015 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue May 19 08:59:45 2015 +0100

----------------------------------------------------------------------
 docs/programming-guide.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/61f164d3/docs/programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 2781651..0c27376 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -1071,7 +1071,7 @@ for details.
 </tr>
 <tr>
   <td> <b>saveAsSequenceFile</b>(<i>path</i>) <br /> (Java and Scala) </td>
-  <td> Write the elements of the dataset as a Hadoop SequenceFile in a given path in the local filesystem, HDFS or any other Hadoop-supported file system. This is available on RDDs of key-value pairs that either implement Hadoop's Writable interface. In Scala, it is also
+  <td> Write the elements of the dataset as a Hadoop SequenceFile in a given path in the local filesystem, HDFS or any other Hadoop-supported file system. This is available on RDDs of key-value pairs that implement Hadoop's Writable interface. In Scala, it is also
   available on types that are implicitly convertible to Writable (Spark includes conversions for basic types like Int, Double, String, etc). </td>
 </tr>
 <tr>
@@ -1122,7 +1122,7 @@ ordered data following shuffle then it's possible to use:
 * `sortBy` to make a globally ordered RDD
 
 Operations which can cause a shuffle include **repartition** operations like
-[`repartition`](#RepartitionLink), and [`coalesce`](#CoalesceLink), **'ByKey** operations
+[`repartition`](#RepartitionLink) and [`coalesce`](#CoalesceLink), **'ByKey** operations
 (except for counting) like [`groupByKey`](#GroupByLink) and [`reduceByKey`](#ReduceByLink),
 and **join** operations like [`cogroup`](#CogroupLink) and [`join`](#JoinLink).
@@ -1138,7 +1138,7 @@ read the relevant sorted blocks.
 
 Certain shuffle operations can consume significant amounts of heap memory since they employ
 in-memory data structures to organize records before or after transferring them. Specifically,
-`reduceByKey` and `aggregateByKey` create these structures on the map side and `'ByKey` operations
+`reduceByKey` and `aggregateByKey` create these structures on the map side, and `'ByKey` operations
 generate these on the reduce side. When data does not fit in memory Spark will spill these
 tables to disk, incurring the additional overhead of disk I/O and increased garbage collection.
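
For context, the guide passages touched by this diff describe concrete API behavior, so a minimal Scala sketch may help illustrate them: a pair RDD of basic Scala types picks up saveAsSequenceFile through Spark's implicit conversions to Writable, and reduceByKey is one of the 'ByKey operations that triggers a shuffle with map-side in-memory structures. The application name, local[2] master, and /tmp output path below are illustrative assumptions, not part of the commit.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // implicit pair-RDD functions and Writable conversions (Spark 1.x style)

object SequenceFileSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical local setup, purely for illustration.
    val sc = new SparkContext(
      new SparkConf().setAppName("SequenceFileSketch").setMaster("local[2]"))

    // An RDD of key-value pairs of basic Scala types. In Scala, implicit
    // conversions to Writable (Int -> IntWritable, String -> Text) make
    // saveAsSequenceFile available without implementing Writable directly,
    // as the corrected table entry describes.
    val pairs = sc.parallelize(Seq((1, "a"), (2, "b"), (1, "c")))

    // reduceByKey is a 'ByKey operation that causes a shuffle: it organizes
    // records in in-memory structures on the map side, transfers them across
    // the network, and combines them on the reduce side, spilling to disk
    // when the data does not fit in memory.
    val counts = pairs.mapValues(_ => 1).reduceByKey(_ + _)

    // Write the result as a Hadoop SequenceFile; the path is hypothetical and
    // could point at the local filesystem, HDFS, or any Hadoop-supported FS.
    counts.saveAsSequenceFile("/tmp/counts-seqfile")

    sc.stop()
  }
}

Running this locally writes a SequenceFile directory of (IntWritable, IntWritable) records. Swapping reduceByKey for groupByKey would shuffle all values rather than combining them map-side first, which is why the guide singles out the 'ByKey family when discussing shuffle memory consumption.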