Repository: spark-website
Updated Branches:
  refs/heads/asf-site a78faf582 -> eee58685c


replace with valid url to rdd paper


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/eee58685
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/eee58685
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/eee58685

Branch: refs/heads/asf-site
Commit: eee58685c39269c191a921c39f1520c747a42318
Parents: a78faf5
Author: Xin Ren <iamsh...@126.com>
Authored: Fri Sep 16 16:31:23 2016 -0700
Committer: Xin Ren <iamsh...@126.com>
Committed: Fri Sep 16 16:31:23 2016 -0700

----------------------------------------------------------------------
 research.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark-website/blob/eee58685/research.md
----------------------------------------------------------------------
diff --git a/research.md b/research.md
index 41841a1..ec7dd54 100644
--- a/research.md
+++ b/research.md
@@ -27,7 +27,7 @@ Traditional MapReduce and DAG engines are suboptimal for these applications beca
 </p>
 
 <p>
-Spark offers an abstraction called <a href="http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf"><em>resilient distributed datasets (RDDs)</em></a> to support these applications efficiently. RDDs can be stored in memory between queries <em>without</em> requiring replication.  Instead, they rebuild lost data on failure using <em>lineage</em>: each RDD remembers how it was built from other datasets (by transformations like <code>map</code>, <code>join</code> or <code>groupBy</code>) to rebuild itself.  RDDs allow Spark to outperform existing models by up to 100x in multi-pass analytics. We showed that RDDs can support a wide variety of iterative algorithms, as well as interactive data mining and a highly efficient SQL engine (<a href="http://shark.cs.berkeley.edu">Shark</a>).
+Spark offers an abstraction called <a href="http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf"><em>resilient distributed datasets (RDDs)</em></a> to support these applications efficiently. RDDs can be stored in memory between queries <em>without</em> requiring replication.  Instead, they rebuild lost data on failure using <em>lineage</em>: each RDD remembers how it was built from other datasets (by transformations like <code>map</code>, <code>join</code> or <code>groupBy</code>) to rebuild itself.  RDDs allow Spark to outperform existing models by up to 100x in multi-pass analytics. We showed that RDDs can support a wide variety of iterative algorithms, as well as interactive data mining and a highly efficient SQL engine (<a href="http://shark.cs.berkeley.edu">Shark</a>).
 </p>
 
 <p class="noskip">You can find more about the research behind Spark in the following papers:</p>


