Update site for 0.8.0

Project: http://git-wip-us.apache.org/repos/asf/arrow-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow-site/commit/61e9ea7e
Tree: http://git-wip-us.apache.org/repos/asf/arrow-site/tree/61e9ea7e
Diff: http://git-wip-us.apache.org/repos/asf/arrow-site/diff/61e9ea7e

Branch: refs/heads/asf-site
Commit: 61e9ea7e23eced764d5d327f469bf513fe2f37d6
Parents: 35611f8
Author: Jacques Nadeau <[email protected]>
Authored: Sun Dec 17 22:03:06 2017 -0800
Committer: Jacques Nadeau <[email protected]>
Committed: Mon Dec 18 13:33:54 2017 -0800

----------------------------------------------------------------------
 blog/2017/05/07/0.3-release-japanese/index.html | 288 ++++++++++++
 blog/2017/05/07/0.3-release/index.html          | 364 +++++++++++++++
 blog/2017/05/22/0.4.0-release/index.html        | 225 +++++++++
 blog/2017/06/14/0.4.1-release/index.html        |   1 +
 blog/2017/06/16/turbodbc-arrow/index.html       |   1 +
 blog/2017/07/24/0.5.0-release/index.html        | 235 ++++++++++
 blog/2017/07/26/spark-arrow/index.html          |   7 +-
 .../07/plasma-in-memory-object-store/index.html | 273 +++++++++++
 blog/2017/08/15/0.6.0-release/index.html        | 234 ++++++++++
 blog/2017/09/18/0.7.0-release/index.html        | 311 ++++++++++++
 .../index.html                                  |   1 +
 blog/index.html                                 |  33 +-
 committers/index.html                           |   1 +
 css/main.css                                    |   2 +-
 docs/ipc.html                                   |  48 +-
 docs/memory_layout.html                         |  12 +-
 docs/metadata.html                              |   4 +-
 feed.xml                                        |  28 +-
 index.html                                      |   3 +-
 install/index.html                              |  57 ++-
 powered_by/index.html                           | 231 +++++++++
 release/0.1.0.html                              |   1 +
 release/0.2.0.html                              |   1 +
 release/0.3.0.html                              |   1 +
 release/0.4.0.html                              |   1 +
 release/0.4.1.html                              |   1 +
 release/0.5.0.html                              |   1 +
 release/0.6.0.html                              |   1 +
 release/0.7.0.html                              |   1 +
 release/0.7.1.html                              |   1 +
 release/0.8.0.html                              | 468 +++++++++++++++++++
 release/index.html                              |   2 +
 32 files changed, 2770 insertions(+), 68 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/05/07/0.3-release-japanese/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/05/07/0.3-release-japanese/index.html 
b/blog/2017/05/07/0.3-release-japanese/index.html
new file mode 100644
index 0000000..20aaabd
--- /dev/null
+++ b/blog/2017/05/07/0.3-release-japanese/index.html
@@ -0,0 +1,288 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+    
+    <!-- Global Site Tag (gtag.js) - Google Analytics -->
+<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script>
+<script>
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments)};
+  gtag('js', new Date());
+
+  gtag('config', 'UA-107500873-1');
+</script>
+
+    
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Apache Arrow 0.3.0リリース
+      <a href="/blog/2017/05/07/0.3-release-japanese/" class="permalink" 
title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            07 May 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p><a href="/blog/2017/05/07/0.3-release/">原文(English)</a></p>
+
+<p>Apache Arrowチームは0.3.0のリリースをアナウンスできてうれしいです。2月にリリースした0.2.0から10週間の活発な開発の結果が今回のリリースです。<a href="https://github.com/apache/arrow/graphs/contributors"><strong>23人のコントリビューター</strong></a>が<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.3.0"><strong>306個のJIRAのissueを解決</strong></a>しました。</p>
+
+<p>複数のArrowの実装にたくさんの新しい機能を追加しています。2017年、特に注力して開発するのは、インメモリー用のフォーマット、型のメタデータ、メッセージング用のプロトコルです。これは、ビッグデータアプリケーションに<strong>安定していてプロダクションで使える基盤</strong>を提供するためです。高性能IOとインメモリーデータ処理にArrowを活用するために、<a href="http://spark.apache.org">Apache Spark</a>・<a href="http://www.geomesa.org/">GeoMesa</a>コミュニティーと協力していてとてもエキサイティングです。</p>
+
+<p>それぞれのプラットフォームでArrowを使う方法は<a href="http://arrow.apache.org/install">インストールページ</a>を見てください。</p>
+
+<p>Arrowでビッグデータシステムを高速化するケースを増やすために、近いうちにApache Arrowのロードマップを公開する予定です。</p>
+
+<p>Arrowの開発に参加するコントリビューターを募集しています。すでにArrowの開発に参加しているコミュニティーからのコントリビューターもそうですし、まだ参加していないGo、R、Juliaといったコミュニティーからのコントリビューターも募集しています。</p>
+
+<h3 id="ファイルフォーマットとストリーミングフォーマットの強化">ファイルフォーマットとストリーミングフォーマットの強化</h3>
+
+<p>0.2.0では<strong>ランダムアクセス</strong>用と<strong>ストリーミング</strong>用のArrowのワイヤーフォーマットを導入しました。実装の詳細は<a href="http://arrow.apache.org/docs/ipc.html">IPC仕様</a>を見てください。ユースケースは<a href="http://wesmckinney.com/blog/arrow-streaming-columnar/">使用例を紹介したブログ</a>を見てください。これらのフォーマットを使うと低オーバーヘッド・コピーなしでArrowのレコードバッチのペイロードにアクセスできます。</p>
+
+<p>0.3.0ではこのバイナリーフォーマットの細かい詳細をたくさん固めました。Java、C++、Python間の連携のテストおよびそれぞれの言語での単体テストの整備も進めました。<a href="http://github.com/google/flatbuffers">Google Flatbuffers</a>は、前方互換性を壊さずにメタデータに新しい機能を追加するのに非常に助かりました。</p>
+
+<p>まだバイナリーフォーマットの前方互換性を必ず壊さないと約束できる状態ではありませんが(もしかしたら変更する必要があるなにかが見つかるかもしれない)、メジャーリリース間では不必要に互換性を壊さないように努力するつもりです。Apache ArrowのWebサイト、各コンポーネントのユーザー向けのドキュメントおよびAPIドキュメントへのコントリビューションを非常に歓迎します。</p>
+
+<h3 id="辞書エンコーディングのサポート">辞書エンコーディングのサポート</h3>
+
+<p><a href="http://www.geomesa.org/">GeoMesa</a>プロジェクトの<a href="https://github.com/elahrvivaz">Emilio Lahr-Vivaz</a>はJavaのArrow実装に辞書エンコード対応ベクターをコントリビュートしました。これを受けて、C++とPythonでもサポートしました。(<code class="highlighter-rouge">pandas.Categorical</code>とも連携できます。)辞書エンコーディング用のインテグレーションテスト(C++とJava間でこのデータを送受信するテスト)はまだ完成していませんが、0.4.0までには完成させたいです。</p>
+
+<p>これはカテゴリーデータ用の一般的なデータ表現テクニックです。これを使うと、複数のレコードバッチで共通の「辞書」を共有し、各レコードバッチの値はこの辞書を参照する整数になります。このデータは統計的言語(statistical language)の分野では「カテゴリー(categorical)」や「因子(factor)」と呼ばれています。Apache Parquetのようなファイルフォーマットの分野ではデータ圧縮のためだけに使われています。</p>
+
+<h3 id="日付時刻固定長型の拡張">日付、時刻、固定長型の拡張</h3>
+
+<p>0.2.0では現実に使われている日付・時刻型をインテグレーションテスト付きで完全にサポートすることを諦めました。これらは<a href="http://parquet.apache.org">Apache Parquet</a>とApache Sparkとの連携に必要な機能です。</p>
+
+<ul>
+  <li><strong>日付</strong>: 32-bit(日単位)と64-bit(ミリ秒単位)</li>
+  <li><strong>時刻</strong>: 単位付き64-bit整数(単位:秒、ミリ秒、マイクロ秒、ナノ秒)</li>
+  <li><strong>タイムスタンプ(UNIXエポックからの経過時間)</strong>: 単位付き64-bit整数のタイムゾーン付きとタイムゾーンなし</li>
+  <li><strong>固定長バイナリー</strong>: 決まったバイト数のプリミティブな値</li>
+  <li><strong>固定長リスト</strong>: 各要素が同じサイズのリスト(要素のベクターとは別にオフセットのベクターを持つ必要がない)</li>
+</ul>
+
+<p>C++のArrow実装では、<a href="https://github.com/boostorg/multiprecision">Boost.Multiprecision</a>を使ったexactな小数のサポートを実験的に追加しました。ただし、Java実装とC++実装間での小数のメモリーフォーマットはまだ固まっていません。</p>
+
+<h3 id="cとpythonのwindowsサポート">C++とPythonのWindowsサポート</h3>
+
+<p>一般的なC++とPythonでの開発用に、パッケージ周りの改良も多数入っています。0.3.0はVisual Studio(MSVC)2015と2017を使ってWindowsを完全にサポートした最初のバージョンです。AppveyorでMSVC用のCIを実行しています。Windows上でソースからビルドするためのガイドも書きました。<a href="https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md">C++</a>用と<a href="https://github.com/apache/arrow/blob/master/python/doc/source/development.rst">Python</a>用。</p>
+
+<p><a href="https://conda-forge.github.io">conda-forge</a>からWindows用のArrowのPythonライブラリーをインストールできます。</p>
+
+<div class="language-shell highlighter-rouge"><pre 
class="highlight"><code>conda install pyarrow -c conda-forge
+</code></pre>
+</div>
+
+<h3 id="cglibバインディングとrubylua他のサポート">C(GLib)バインディングとRuby・Lua・他のサポート</h3>
+
+<p><a href="http://github.com/kou">Kouhei Sutou</a>は新しいApache Arrowのコントリビューターです。Linux用の(ArrowのC++実装の)GLibを使ったCバインディングをコントリビュートしました。<a href="https://wiki.gnome.org/Projects/GObjectIntrospection">GObject Introspection</a>というCのミドルウェアを使うことでRuby、Lua、Goや<a href="https://wiki.gnome.org/Projects/GObjectIntrospection/Users">他にも様々なプログラミング言語</a>でシームレスにバインディングを使うことができます。これらのバインディングがどのように動いているか、これらのバインディングをどのように使うかを説明するブログ記事が別途必要な気がします。</p>
+
+<h3 id="pysparkを使ったapache-sparkとの連携">PySparkを使ったApache Sparkとの連携</h3>
+
+<p><a href="https://issues.apache.org/jira/browse/SPARK-13534">SPARK-13534</a>でApache Sparkコミュニティーと協力しています。PySparkでの<code class="highlighter-rouge">DataFrame.toPandas</code>をArrowを使って高速化しようとしています。効率的なデータのシリアライズにより<a href="https://github.com/apache/spark/pull/15821#issuecomment-282175163"><strong>40倍以上高速化</strong></a>できるケースがあります。</p>
+
+<p>PySparkでArrowを使うことでこれまでできなかったパフォーマンス最適化の道が開けました。特に、UDFの評価まわりでいろいろやれることがあるでしょう。(たとえば、Pythonのラムダ関数を使って<code class="highlighter-rouge">map</code>・<code class="highlighter-rouge">filter</code>を実行するケース。)</p>
+
+<h3 id="python実装での新しい機能メモリービューfeatherapache-parquetのサポート">Python実装での新しい機能:メモリービュー、Feather、Apache Parquetのサポート</h3>
+
+<p>ArrowのPythonライブラリーである<code class="highlighter-rouge">pyarrow</code>は<code class="highlighter-rouge">libarrow</code>と<code class="highlighter-rouge">libarrow_python</code>というC++ライブラリーのCythonバインディングです。<code class="highlighter-rouge">pyarrow</code>はNumPyと<a href="http://pandas.pydata.org">pandas</a>とPythonの標準ライブラリー間のシームレスな連携を実現します。</p>
+
+<p>ArrowのC++ライブラリーで最も重要なものは<code class="highlighter-rouge">arrow::Buffer</code>オブジェクトです。これはメモリービューを管理します。コピーなしの読み込みとスライスをサポートしている点が重要です。<a href="https://github.com/JeffKnupp">Jeff Knupp</a>はArrowのバッファーとPythonのバッファープロトコルとmemoryviewの連携処理をコントリビュートしました。これにより次のようなことができるようになりました。</p>
+
+<div class="language-python highlighter-rouge"><pre class="highlight"><code><span class="n">In</span> <span class="p">[</span><span class="mi">6</span><span class="p">]:</span> <span class="kn">import</span> <span class="nn">pyarrow</span> <span class="kn">as</span> <span class="nn">pa</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">7</span><span class="p">]:</span> <span class="n">buf</span> <span class="o">=</span> <span class="n">pa</span><span class="o">.</span><span class="n">frombuffer</span><span class="p">(</span><span class="n">b</span><span class="s">'foobarbaz'</span><span class="p">)</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="n">buf</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">8</span><span class="p">]:</span> <span class="o">&lt;</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">_io</span><span class="o">.</span><span class="n">Buffer</span> <span class="n">at</span> <span class="mh">0x7f6c0a84b538</span><span class="o">&gt;</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="n">memoryview</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">9</span><span class="p">]:</span> <span class="o">&lt;</span><span class="n">memory</span> <span class="n">at</span> <span class="mh">0x7f6c0a8c5e88</span><span class="o">&gt;</span>
+
+<span class="n">In</span> <span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">buf</span><span class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span>
+<span class="n">Out</span><span class="p">[</span><span class="mi">10</span><span class="p">]:</span> <span class="n">b</span><span class="s">'foobarbaz'</span>
+</code></pre>
+</div>
+
+<p>C++でのParquet実装である<a href="https://github.com/apache/parquet-cpp">parquet-cpp</a>を使うことで大幅に<a href="http://parquet.apache.org"><strong>Apache Parquet</strong></a>サポートを改良しました。たとえば、ディスク上にあるかHDFS上にあるか関係なく、パーティションされたデータセットをサポートしました。<a href="https://github.com/dask/dask/commit/68f9e417924a985c1f2e2a587126833c70a2e9f4">Daskプロジェクト</a>はArrowを使ったParquetサポートを実装した最初のプロジェクトです。Dask開発者とはpandasデータを分散処理する文脈でさらに協力できることを楽しみにしています。</p>
+
+<p>pandasを成熟させるためにArrowを改良することもあり、<a href="https://github.com/wesm/feather"><strong>Featherフォーマット</strong></a>の実装をマージしたのもその1つです。Featherフォーマットは本質的にはArrowのランダムアクセスフォーマットの特別なケースの1つです。ArrowのコードベースでFeatherの開発を続けます。たとえば、今のFeatherはArrowのPythonバインディングのレイヤーを使うことでPythonのファイルオブジェクトを読み書きできるようになっています。</p>
+
+<p><code class="highlighter-rouge">DatetimeTZ</code>や<code class="highlighter-rouge">Categorical</code>といったpandas固有のデータ型のちゃんとした(robust)サポートも実装しました。</p>
+
+<h3 id="cライブラリーでのテンソルサポート">C++ライブラリーでのテンソルサポート</h3>
+
+<p>Apache Arrowはコピーなしで共有メモリーを管理するツールという側面があります。機械学習アプリケーションの文脈でこの機能への関心が増えています。UCバークレー校の<a href="https://rise.cs.berkeley.edu/">RISELab</a>の<a href="https://github.com/ray-project/ray">Rayプロジェクト</a>が最初の例です。</p>
+
+<p>機械学習では「テンソル」とも呼ばれる多次元配列というデータ構造を扱います。このようなデータ構造はArrowのカラムフォーマットがサポートしているデータ構造の範囲を超えています。今回のケースでは、<a href="http://arrow.apache.org/docs/cpp/classarrow_1_1_tensor.html"><code class="highlighter-rouge">arrow::Tensor</code></a>というC++の型を追加で実装しました。これはArrowのコピーなしの共有メモリー機能を活用して実装しました。(メモリーの生存期間の管理に<code class="highlighter-rouge">arrow::Buffer</code>を使いました。)C++実装では、これからも、共通のIO・メモリー管理ツールとしてArrowを活用できるようにするため、追加のデータ構造を提供するつもりです。</p>
+
+<h3 id="javascripttypescript実装の開始">JavaScript(TypeScript)実装の開始</h3>
+
+<p><a href="https://github.com/TheNeuralBit">Brian Hulette</a>はNodeJSとWebブラウザー上で動くアプリケーションで使うために<a href="https://github.com/apache/arrow/tree/master/js">TypeScript</a>でのArrowの実装を始めました。FlatBuffersがJavaScriptをファーストクラスでサポートしているので実装が捗ります。</p>
+
+<h3 id="webサイトと開発者用ドキュメントの改良">Webサイトと開発者用ドキュメントの改良</h3>
+
+<p>0.2.0をリリースしてからドキュメントとブログを公開するためにWebサイトのシステムを<a href="https://jekyllrb.com">Jekyll</a>ベースで作りました。Kouhei Sutouは<a href="https://github.com/red-data-tools/jekyll-jupyter-notebook">Jekyll Jupyter Notebookプラグイン</a>を作りました。これによりArrowのWebサイトのコンテンツを作るためにJupyterを使うことができます。</p>
+
+<p>WebサイトにはC、C++、Java、PythonのAPIドキュメントを公開しました。これらの中にArrowを使い始めるための有益な情報を見つけられるでしょう。</p>
+
+<h3 id="コントリビューター">コントリビューター</h3>
+
+<p>このリリースにパッチをコントリビュートしたみなさんに感謝します。</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>$ git shortlog -sn apache-arrow-0.2.0..apache-arrow-0.3.0
+    119 Wes McKinney
+     55 Kouhei Sutou
+     18 Uwe L. Korn
+     17 Julien Le Dem
+      9 Phillip Cloud
+      6 Bryan Cutler
+      5 Philipp Moritz
+      5 Emilio Lahr-Vivaz
+      4 Max Risuhin
+      4 Johan Mabille
+      4 Jeff Knupp
+      3 Steven Phillips
+      3 Miki Tebeka
+      2 Leif Walsh
+      2 Jeff Reback
+      2 Brian Hulette
+      1 Tsuyoshi Ozawa
+      1 rvernica
+      1 Nong Li
+      1 Julien Lafaye
+      1 Itai Incze
+      1 Holden Karau
+      1 Deepak Majeti
+</code></pre>
+</div>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/05/07/0.3-release/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/05/07/0.3-release/index.html 
b/blog/2017/05/07/0.3-release/index.html
new file mode 100644
index 0000000..1b6e0f3
--- /dev/null
+++ b/blog/2017/05/07/0.3-release/index.html
@@ -0,0 +1,364 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+    
+    <!-- Global Site Tag (gtag.js) - Google Analytics -->
+<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script>
+<script>
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments)};
+  gtag('js', new Date());
+
+  gtag('config', 'UA-107500873-1');
+</script>
+
+    
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Apache Arrow 0.3.0 Release
+      <a href="/blog/2017/05/07/0.3-release/" class="permalink" 
title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            07 May 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p>Translations: <a 
href="/blog/2017/05/07/0.3-release-japanese/">日本語</a></p>
+
+<p>The Apache Arrow team is pleased to announce the 0.3.0 release of the
+project. It is the product of an intense 10 weeks of development since the
+0.2.0 release from this past February. It includes <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.3.0"><strong>306 resolved JIRAs</strong></a>
+from <a href="https://github.com/apache/arrow/graphs/contributors"><strong>23 contributors</strong></a>.</p>
+
+<p>While we have added many new features to the different Arrow implementations,
+one of the major development focuses in 2017 has been hardening the in-memory
+format, type metadata, and messaging protocol to provide a <strong>stable,
+production-ready foundation</strong> for big data applications. We are excited to be
+collaborating with the <a href="http://spark.apache.org">Apache Spark</a> and <a href="http://www.geomesa.org/">GeoMesa</a> communities on
+utilizing Arrow for high performance IO and in-memory data processing.</p>
+
+<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your platform.</p>
+
+<p>We will be publishing more information about the Apache Arrow roadmap as we
+forge ahead with using Arrow to accelerate big data systems.</p>
+
+<p>We are looking for more contributors from within our existing communities 
and
+from other communities (such as Go, R, or Julia) to get involved in Arrow
+development.</p>
+
+<h3 id="file-and-streaming-format-hardening">File and Streaming Format 
Hardening</h3>
+
+<p>The 0.2.0 release brought with it the first iterations of the <strong>random access</strong>
+and <strong>streaming</strong> Arrow wire formats. See the <a href="http://arrow.apache.org/docs/ipc.html">IPC specification</a> for
+implementation details and an <a href="http://wesmckinney.com/blog/arrow-streaming-columnar/">example blog post</a> with some use cases. These
+provide low-overhead, zero-copy access to Arrow record batch payloads.</p>
+
+<p>In 0.3.0 we have solidified a number of small details with the binary format
+and improved our integration and unit testing particularly in the Java, C++,
+and Python libraries. Using the <a href="http://github.com/google/flatbuffers">Google Flatbuffers</a> project has helped with
+adding new features to our metadata without breaking forward compatibility.</p>
+
+<p>We are not yet ready to make a firm commitment to strong forward 
compatibility
+(in case we find something needs to change) in the binary format, but we will
+make efforts between major releases to not make unnecessary
+breakages. Contributions to the website and component user and API
+documentation would also be most welcome.</p>
+
+<h3 id="dictionary-encoding-support">Dictionary Encoding Support</h3>
+
+<p><a href="https://github.com/elahrvivaz">Emilio Lahr-Vivaz</a> from the <a href="http://www.geomesa.org/">GeoMesa</a> project contributed Java support
+for dictionary-encoded Arrow vectors. We followed up with C++ and Python
+support (and <code class="highlighter-rouge">pandas.Categorical</code> integration). We have not yet implemented
+full integration tests for dictionaries (for sending this data between C++ and
+Java), but hope to achieve this in the 0.4.0 Arrow release.</p>
+
+<p>This common data representation technique for categorical data allows multiple
+record batches to share a common “dictionary”, with the values in the batches
+being represented as integers referencing the dictionary. This data is called
+“categorical” or “factor” in statistical languages, while in file formats like
+Apache Parquet it is strictly used for data compression.</p>
+
+<h3 id="expanded-date-time-and-fixed-size-types">Expanded Date, Time, and 
Fixed Size Types</h3>
+
+<p>A notable omission from the 0.2.0 release was complete and integration-tested
+support for the gamut of date and time types that occur in the wild. These are
+needed for <a href="http://parquet.apache.org">Apache Parquet</a> and Apache Spark integration.</p>
+
+<ul>
+  <li><strong>Date</strong>: 32-bit (days unit) and 64-bit (milliseconds 
unit)</li>
+  <li><strong>Time</strong>: 64-bit integer with unit (second, millisecond, 
microsecond, nanosecond)</li>
+  <li><strong>Timestamp</strong>: 64-bit integer with unit, with or without 
timezone</li>
+  <li><strong>Fixed Size Binary</strong>: Primitive values occupying certain 
number of bytes</li>
+  <li><strong>Fixed Size List</strong>: List values with constant size (no 
separate offsets vector)</li>
+</ul>
+
+<p>We have additionally added experimental support for exact decimals in C++ using
+<a href="https://github.com/boostorg/multiprecision">Boost.Multiprecision</a>, though we have not yet hardened the Decimal memory
+format between the Java and C++ implementations.</p>
+
+<h3 id="c-and-python-support-on-windows">C++ and Python Support on Windows</h3>
+
+<p>We have made many general improvements to development and packaging for
+C++ and Python. 0.3.0 is the first release to bring full C++ and
+Python support for Windows on Visual Studio (MSVC) 2015 and 2017. In addition
+to adding Appveyor continuous integration for MSVC, we have also written guides
+for building from source on Windows: <a href="https://github.com/apache/arrow/blob/master/cpp/apidoc/Windows.md">C++</a> and <a href="https://github.com/apache/arrow/blob/master/python/doc/source/development.rst">Python</a>.</p>
+
+<p>For the first time, you can install the Arrow Python library on Windows from
+<a href="https://conda-forge.github.io">conda-forge</a>:</p>
+
+<div class="language-shell highlighter-rouge"><pre 
class="highlight"><code>conda install pyarrow -c conda-forge
+</code></pre>
+</div>
+
+<h3 id="c-glib-bindings-with-support-for-ruby-lua-and-more">C (GLib) Bindings, 
with support for Ruby, Lua, and more</h3>
+
+<p><a href="http://github.com/kou">Kouhei Sutou</a> is a new Apache Arrow 
contributor and has contributed GLib C
+bindings (to the C++ libraries) for Linux. Using a C middleware framework
+called <a href="https://wiki.gnome.org/Projects/GObjectIntrospection">GObject 
Introspection</a>, it is possible to use these bindings
+seamlessly in Ruby, Lua, Go, and <a 
href="https://wiki.gnome.org/Projects/GObjectIntrospection/Users">other 
programming languages</a>. We will
+probably need to publish some follow-up blog posts explaining how these bindings
+work and how to use them.</p>
+
+<h3 id="apache-spark-integration-for-pyspark">Apache Spark Integration for 
PySpark</h3>
+
+<p>We have been collaborating with the Apache Spark community on <a 
href="https://issues.apache.org/jira/browse/SPARK-13534">SPARK-13534</a>
+to add support for using Arrow to accelerate <code 
class="highlighter-rouge">DataFrame.toPandas</code> in
+PySpark. We have observed over <a 
href="https://github.com/apache/spark/pull/15821#issuecomment-282175163"><strong>40x
 speedup</strong></a> from the more efficient
+data serialization.</p>
+
+<p>Using Arrow in PySpark opens the door to many other performance 
optimizations,
+particularly around UDF evaluation (e.g. <code 
class="highlighter-rouge">map</code> and <code 
class="highlighter-rouge">filter</code> operations with
+Python lambda functions).</p>
+
+<h3 id="new-python-feature-memory-views-feather-apache-parquet-support">New 
Python Feature: Memory Views, Feather, Apache Parquet support</h3>
+
+<p>Arrow’s Python library <code class="highlighter-rouge">pyarrow</code> is 
a Cython binding for the <code class="highlighter-rouge">libarrow</code> and
+<code class="highlighter-rouge">libarrow_python</code> C++ libraries, which 
handle interoperability with NumPy,
+<a href="http://pandas.pydata.org">pandas</a>, and the Python standard 
library.</p>
+
+<p>At the heart of Arrow’s C++ libraries is the <code 
class="highlighter-rouge">arrow::Buffer</code> object, which is a
+managed memory view supporting zero-copy reads and slices. <a 
href="https://github.com/JeffKnupp">Jeff Knupp</a>
+contributed integration between Arrow buffers and the Python buffer protocol
+and memoryviews, so now code like this is possible:</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="n">In</span> <span class="p">[</span><span 
class="mi">6</span><span class="p">]:</span> <span class="kn">import</span> 
<span class="nn">pyarrow</span> <span class="kn">as</span> <span 
class="nn">pa</span>
+
+<span class="n">In</span> <span class="p">[</span><span 
class="mi">7</span><span class="p">]:</span> <span class="n">buf</span> <span 
class="o">=</span> <span class="n">pa</span><span class="o">.</span><span 
class="n">frombuffer</span><span class="p">(</span><span 
class="n">b</span><span class="s">'foobarbaz'</span><span class="p">)</span>
+
+<span class="n">In</span> <span class="p">[</span><span 
class="mi">8</span><span class="p">]:</span> <span class="n">buf</span>
+<span class="n">Out</span><span class="p">[</span><span 
class="mi">8</span><span class="p">]:</span> <span class="o">&lt;</span><span 
class="n">pyarrow</span><span class="o">.</span><span class="n">_io</span><span 
class="o">.</span><span class="n">Buffer</span> <span class="n">at</span> <span 
class="mh">0x7f6c0a84b538</span><span class="o">&gt;</span>
+
+<span class="n">In</span> <span class="p">[</span><span 
class="mi">9</span><span class="p">]:</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">buf</span><span class="p">)</span>
+<span class="n">Out</span><span class="p">[</span><span 
class="mi">9</span><span class="p">]:</span> <span class="o">&lt;</span><span 
class="n">memory</span> <span class="n">at</span> <span 
class="mh">0x7f6c0a8c5e88</span><span class="o">&gt;</span>
+
+<span class="n">In</span> <span class="p">[</span><span 
class="mi">10</span><span class="p">]:</span> <span class="n">buf</span><span 
class="o">.</span><span class="n">to_pybytes</span><span class="p">()</span>
+<span class="n">Out</span><span class="p">[</span><span 
class="mi">10</span><span class="p">]:</span> <span class="n">b</span><span 
class="s">'foobarbaz'</span>
+</code></pre>
+</div>
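+
+<p>To make "zero-copy reads and slices" more concrete, the sketch below shows
+the same idea with the Python standard library alone. It does not use
+<code class="highlighter-rouge">pyarrow</code>; a plain <code class="highlighter-rouge">bytearray</code>
+stands in for the Arrow buffer, which is an assumption made purely for illustration.</p>
+

```python
# A bytearray stands in for an Arrow buffer (illustrative assumption only).
payload = bytearray(b"foobarbaz")

# A memoryview exposes the bytearray's memory through the buffer protocol.
view = memoryview(payload)

# Slicing a memoryview yields another view onto the same memory, not a copy.
middle = view[3:6]

# Mutating the underlying buffer is visible through the existing slice,
# which demonstrates that no bytes were duplicated when slicing.
payload[3] = ord("B")

print(bytes(middle))
print(middle.obj is payload)
```

+
+<p>Any object that implements the buffer protocol can be consumed this way,
+which is what lets NumPy and similar libraries read such memory without copying it.</p>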
+
+<p>We have significantly expanded <a 
href="http://parquet.apache.org"><strong>Apache Parquet</strong></a> support 
via the C++
+Parquet implementation <a 
href="https://github.com/apache/parquet-cpp">parquet-cpp</a>. This includes 
support for partitioned
+datasets on disk or in HDFS. We added initial Arrow-powered Parquet support <a 
href="https://github.com/dask/dask/commit/68f9e417924a985c1f2e2a587126833c70a2e9f4">in
+the Dask project</a>, and look forward to more collaborations with the Dask
+developers on distributed processing of pandas data.</p>
+
+<p>With Arrow’s support for pandas maturing, we were able to merge in the
+<a href="https://github.com/wesm/feather"><strong>Feather format</strong></a> 
implementation, which is essentially a special case of
+the Arrow random access format. We’ll be continuing Feather development 
within
+the Arrow codebase. For example, Feather can now read and write with Python
+file objects using Arrow’s Python binding layer.</p>
+
+<p>We also implemented more robust support for pandas-specific data types, like
+<code class="highlighter-rouge">DatetimeTZ</code> and <code 
class="highlighter-rouge">Categorical</code>.</p>
+
+<h3 id="support-for-tensors-and-beyond-in-c-library">Support for Tensors and 
beyond in C++ Library</h3>
+
+<p>There has been increased interest in using Apache Arrow as a tool for 
zero-copy
+shared memory management for machine learning applications. A flagship example
+is the <a href="https://github.com/ray-project/ray">Ray project</a> from the 
UC Berkeley <a href="https://rise.cs.berkeley.edu/">RISELab</a>.</p>
+
+<p>Machine learning deals in additional kinds of data structures beyond what 
the
+Arrow columnar format supports, such as multidimensional arrays (also known as 
“tensors”). As
+such, we implemented the <a 
href="http://arrow.apache.org/docs/cpp/classarrow_1_1_tensor.html"><code 
class="highlighter-rouge">arrow::Tensor</code></a> C++ type which can utilize 
the
+rest of Arrow’s zero-copy shared memory machinery (using <code 
class="highlighter-rouge">arrow::Buffer</code> for
+managing memory lifetime). In C++ in particular, we will want to provide for
+additional data structures utilizing common IO and memory management tools.</p>
+
+<h3 id="start-of-javascript-typescript-implementation">Start of JavaScript 
(TypeScript) Implementation</h3>
+
+<p><a href="https://github.com/TheNeuralBit">Brian Hulette</a> started 
developing an Arrow implementation in
+<a href="https://github.com/apache/arrow/tree/master/js">TypeScript</a> for 
use in NodeJS and browser-side applications. We are
+benefiting from Flatbuffers’ first-class support for JavaScript.</p>
+
+<h3 id="improved-website-and-developer-documentation">Improved Website and 
Developer Documentation</h3>
+
+<p>Since 0.2.0 we have implemented a new website stack for publishing
+documentation and blogs based on <a href="https://jekyllrb.com">Jekyll</a>. 
Kouhei Sutou developed a <a 
href="https://github.com/red-data-tools/jekyll-jupyter-notebook">Jekyll
+Jupyter Notebook plugin</a> so that we can use Jupyter to author content for
+the Arrow website.</p>
+
+<p>On the website, we have now published API documentation for the C, C++, 
Java,
+and Python subcomponents. Within these you will find easier-to-follow developer
+instructions for getting started.</p>
+
+<h3 id="contributors">Contributors</h3>
+
+<p>Thanks to all who contributed patches to this release.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>$ git shortlog -sn 
apache-arrow-0.2.0..apache-arrow-0.3.0
+    119 Wes McKinney
+     55 Kouhei Sutou
+     18 Uwe L. Korn
+     17 Julien Le Dem
+      9 Phillip Cloud
+      6 Bryan Cutler
+      5 Philipp Moritz
+      5 Emilio Lahr-Vivaz
+      4 Max Risuhin
+      4 Johan Mabille
+      4 Jeff Knupp
+      3 Steven Phillips
+      3 Miki Tebeka
+      2 Leif Walsh
+      2 Jeff Reback
+      2 Brian Hulette
+      1 Tsuyoshi Ozawa
+      1 rvernica
+      1 Nong Li
+      1 Julien Lafaye
+      1 Itai Incze
+      1 Holden Karau
+      1 Deepak Majeti
+</code></pre>
+</div>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/05/22/0.4.0-release/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/05/22/0.4.0-release/index.html 
b/blog/2017/05/22/0.4.0-release/index.html
new file mode 100644
index 0000000..a6d3406
--- /dev/null
+++ b/blog/2017/05/22/0.4.0-release/index.html
@@ -0,0 +1,225 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+    
+    <!-- Global Site Tag (gtag.js) - Google Analytics -->
+<script async 
src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script>
+<script>
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments)};
+  gtag('js', new Date());
+
+  gtag('config', 'UA-107500873-1');
+</script>
+
+    
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue 
Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a 
href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a 
href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Apache Arrow 0.4.0 Release
+      <a href="/blog/2017/05/22/0.4.0-release/" class="permalink" 
title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            22 May 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes 
McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p>The Apache Arrow team is pleased to announce the 0.4.0 release of the
+project. Although it arrives only 17 days after the 0.3.0 release, it includes <a 
href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.4.0"><strong>77
 resolved
+JIRAs</strong></a> with some important new features and bug fixes.</p>
+
+<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to learn 
how to get the libraries for your platform.</p>
+
+<h3 id="expanded-javascript-implementation">Expanded JavaScript 
Implementation</h3>
+
+<p>The TypeScript Arrow implementation has undergone some work since 0.3.0 and 
can
+now read a substantial portion of the Arrow streaming binary format. As this
+implementation develops, we will eventually want to include JS in the
+integration test suite along with Java and C++ to ensure wire
+cross-compatibility.</p>
+
+<h3 id="python-support-for-apache-parquet-on-windows">Python Support for 
Apache Parquet on Windows</h3>
+
+<p>With the <a 
href="https://github.com/apache/parquet-cpp/releases/tag/apache-parquet-cpp-1.1.0">1.1.0
 C++ release</a> of <a href="http://parquet.apache.org">Apache Parquet</a>, we 
have enabled the
+<code class="highlighter-rouge">pyarrow.parquet</code> extension on Windows 
for Python 3.5 and 3.6. This should
+appear in conda-forge packages and PyPI in the near future. Developers can
+follow the <a 
href="http://arrow.apache.org/docs/python/development.html">source build 
instructions</a>.</p>
+
+<h3 id="generalizing-arrow-streams">Generalizing Arrow Streams</h3>
+
+<p>In the 0.2.0 release, we defined the first version of the Arrow streaming
+binary format for low-cost messaging with columnar data. These streams presume
+that the message components are written as a continuous byte stream over a
+socket or file.</p>
+
+<p>We would like to be able to support other transport protocols, like
+<a href="http://grpc.io/">gRPC</a>, for the message components of Arrow 
streams. To that end, in C++ we
+defined an abstract stream reader interface, for which the current contiguous
+streaming format is one implementation:</p>
+
+<figure class="highlight"><pre><code class="language-cpp" 
data-lang="cpp"><span class="k">class</span> <span 
class="nc">RecordBatchReader</span> <span class="p">{</span>
+ <span class="k">public</span><span class="o">:</span>
+  <span class="k">virtual</span> <span class="n">std</span><span 
class="o">::</span><span class="n">shared_ptr</span><span 
class="o">&lt;</span><span class="n">Schema</span><span class="o">&gt;</span> 
<span class="n">schema</span><span class="p">()</span> <span 
class="k">const</span> <span class="o">=</span> <span class="mi">0</span><span 
class="p">;</span>
+  <span class="k">virtual</span> <span class="n">Status</span> <span 
class="n">GetNextRecordBatch</span><span class="p">(</span><span 
class="n">std</span><span class="o">::</span><span 
class="n">shared_ptr</span><span class="o">&lt;</span><span 
class="n">RecordBatch</span><span class="o">&gt;*</span> <span 
class="n">batch</span><span class="p">)</span> <span class="o">=</span> <span 
class="mi">0</span><span class="p">;</span>
+<span class="p">};</span></code></pre></figure>
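+
+<p>The shape of such an abstraction can be sketched in Python as follows. This
+is not the Arrow API: the class and method names simply mirror the C++
+declaration above, and <code class="highlighter-rouge">InMemoryReader</code> and
+<code class="highlighter-rouge">read_all</code> are hypothetical names introduced for illustration.
+Schemas and batches are plain Python objects here.</p>
+

```python
from abc import ABC, abstractmethod


class RecordBatchReader(ABC):
    """Abstract source of record batches, independent of the transport."""

    @property
    @abstractmethod
    def schema(self):
        """Return the schema shared by every batch in the stream."""

    @abstractmethod
    def get_next_record_batch(self):
        """Return the next batch, or None once the stream is exhausted."""


class InMemoryReader(RecordBatchReader):
    """Hypothetical implementation backed by a list instead of a socket."""

    def __init__(self, schema, batches):
        self._schema = schema
        self._batches = iter(batches)

    @property
    def schema(self):
        return self._schema

    def get_next_record_batch(self):
        # next() with a default returns None when no batches remain.
        return next(self._batches, None)


def read_all(reader):
    """Drain any reader without knowing how its batches are transported."""
    batches = []
    while True:
        batch = reader.get_next_record_batch()
        if batch is None:
            return batches
        batches.append(batch)


reader = InMemoryReader(schema=["id", "name"], batches=[[1, "a"], [2, "b"]])
print(reader.schema, read_all(reader))
```

+
+<p>Consumers written against the abstract interface, like
+<code class="highlighter-rouge">read_all</code> here, keep working when a different transport
+is plugged in behind it.</p>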
+
+<p>It would also be good to define abstract stream reader and writer 
interfaces in
+the Java implementation.</p>
+
+<p>In an upcoming blog post, we will explain in more depth how Arrow streams 
work,
+but you can learn more about them by reading the <a 
href="http://arrow.apache.org/docs/ipc.html">IPC specification</a>.</p>
+
+<h3 id="c-and-cython-api-for-python-extensions">C++ and Cython API for Python 
Extensions</h3>
+
+<p>As other Python libraries with C or C++ extensions use Apache Arrow, they 
will
+need to be able to return Python objects wrapping the underlying C++
+objects. In this release, we have implemented a prototype C++ API which enables
+Python wrapper objects to be constructed from C++ extension code:</p>
+
+<figure class="highlight"><pre><code class="language-cpp" 
data-lang="cpp"><span class="cp">#include "arrow/python/pyarrow.h"
+</span>
+<span class="k">if</span> <span class="p">(</span><span 
class="o">!</span><span class="n">arrow</span><span class="o">::</span><span 
class="n">py</span><span class="o">::</span><span 
class="n">import_pyarrow</span><span class="p">())</span> <span 
class="p">{</span>
+  <span class="c1">// Error
+</span><span class="p">}</span>
+
+<span class="n">std</span><span class="o">::</span><span 
class="n">shared_ptr</span><span class="o">&lt;</span><span 
class="n">arrow</span><span class="o">::</span><span 
class="n">RecordBatch</span><span class="o">&gt;</span> <span 
class="n">cpp_batch</span> <span class="o">=</span> <span 
class="n">GetData</span><span class="p">(...);</span>
+<span class="n">PyObject</span><span class="o">*</span> <span 
class="n">py_batch</span> <span class="o">=</span> <span 
class="n">arrow</span><span class="o">::</span><span class="n">py</span><span 
class="o">::</span><span class="n">wrap_batch</span><span 
class="p">(</span><span class="n">cpp_batch</span><span 
class="p">);</span></code></pre></figure>
+
+<p>This API is intended to be usable from Cython code as well:</p>
+
+<figure class="highlight"><pre><code class="language-cython" 
data-lang="cython">cimport pyarrow
+pyarrow.import_pyarrow()</code></pre></figure>
+
+<h3 id="python-wheel-installers-on-macos">Python Wheel Installers on macOS</h3>
+
+<p>With this release, <code class="highlighter-rouge">pip install 
pyarrow</code> works on macOS (OS X) as well as
+Linux. We are working on providing binary wheel installers for Windows as 
well.</p>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/06/14/0.4.1-release/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/06/14/0.4.1-release/index.html 
b/blog/2017/06/14/0.4.1-release/index.html
index 7bb6afa..733fcc8 100644
--- a/blog/2017/06/14/0.4.1-release/index.html
+++ b/blog/2017/06/14/0.4.1-release/index.html
@@ -64,6 +64,7 @@
             <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
             <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
             <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
           </ul>
         </li>
         <li class="dropdown">

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/06/16/turbodbc-arrow/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/06/16/turbodbc-arrow/index.html 
b/blog/2017/06/16/turbodbc-arrow/index.html
index 0578000..1cdb16e 100644
--- a/blog/2017/06/16/turbodbc-arrow/index.html
+++ b/blog/2017/06/16/turbodbc-arrow/index.html
@@ -64,6 +64,7 @@
             <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
             <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
             <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
           </ul>
         </li>
         <li class="dropdown">

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/07/24/0.5.0-release/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/07/24/0.5.0-release/index.html 
b/blog/2017/07/24/0.5.0-release/index.html
new file mode 100644
index 0000000..8e99201
--- /dev/null
+++ b/blog/2017/07/24/0.5.0-release/index.html
@@ -0,0 +1,235 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+    
+    <!-- Global Site Tag (gtag.js) - Google Analytics -->
+<script async 
src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script>
+<script>
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments)};
+  gtag('js', new Date());
+
+  gtag('config', 'UA-107500873-1');
+</script>
+
+    
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue 
Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a 
href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a 
href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Apache Arrow 0.5.0 Release
+      <a href="/blog/2017/07/24/0.5.0-release/" class="permalink" 
title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            24 Jul 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes 
McKinney (wesm)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p>The Apache Arrow team is pleased to announce the 0.5.0 release. It includes
+<a 
href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.5.0"><strong>130
 resolved JIRAs</strong></a> with some new features, expanded integration
+testing between implementations, and bug fixes. The Arrow memory format has remained
+stable since the 0.3.x and 0.4.x releases.</p>
+
+<p>See the <a href="http://arrow.apache.org/install">Install Page</a> to learn 
how to get the libraries for your
+platform. The <a href="http://arrow.apache.org/release/0.5.0.html">complete 
changelog</a> is also available.</p>
+
+<h2 id="expanded-integration-testing">Expanded Integration Testing</h2>
+
+<p>In this release, we added compatibility tests for dictionary-encoded data
+between Java and C++. This enables the distinct values (the 
<em>dictionary</em>) in a
+vector to be transmitted as part of an Arrow schema while the record batches
+contain integers which correspond to the dictionary.</p>
+
+<p>So we might have:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>data (string): 
['foo', 'bar', 'foo', 'bar']
+</code></pre>
+</div>
+
+<p>In dictionary-encoded form, this could be represented as:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>indices (int8): 
[0, 1, 0, 1]
+dictionary (string): ['foo', 'bar']
+</code></pre>
+</div>
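+
+<p>The encoding shown above can be reproduced with a few lines of plain Python.
+This is only a sketch of the idea, not the Arrow implementation, and the
+function names <code class="highlighter-rouge">dictionary_encode</code> and
+<code class="highlighter-rouge">dictionary_decode</code> are hypothetical.</p>
+

```python
def dictionary_encode(values):
    """Split values into (indices, dictionary), keeping first-seen order."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one integer per input value
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(indices, dictionary):
    """Rebuild the original values by looking up each index."""
    return [dictionary[i] for i in indices]


data = ['foo', 'bar', 'foo', 'bar']
indices, dictionary = dictionary_encode(data)
print(indices)      # [0, 1, 0, 1]
print(dictionary)   # ['foo', 'bar']
```

+
+<p>Decoding is a plain lookup, so a consumer that receives the dictionary once
+can interpret any number of index arrays that refer to it.</p>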
+
+<p>In upcoming releases, we plan to complete integration testing for the 
remaining
+data types (including some more complicated types like unions and decimals) on
+the road to a 1.0.0 release in the future.</p>
+
+<h2 id="c-activity">C++ Activity</h2>
+
+<p>We completed a number of significant pieces of work in the C++ part of 
Apache
+Arrow.</p>
+
+<h3 id="using-jemalloc-as-default-memory-allocator">Using jemalloc as default 
memory allocator</h3>
+
+<p>We decided to use <a 
href="https://github.com/jemalloc/jemalloc">jemalloc</a> as the default memory 
allocator unless it is
+explicitly disabled. This memory allocator has significant performance
+advantages in Arrow workloads over the default <code 
class="highlighter-rouge">malloc</code> implementation. We will
+publish a blog post going into more detail about this and why you might 
care.</p>
+
+<h3 id="sharing-more-c-code-with-apache-parquet">Sharing more C++ code with 
Apache Parquet</h3>
+
+<p>We imported the compression library interfaces and dictionary encoding
+algorithms from the <a href="http://github.com/apache/parquet-cpp">Apache 
Parquet C++ library</a>. The Parquet library now
+depends on this code in Arrow, and we will be able to use it more easily for
+data compression in Arrow use cases.</p>
+
+<p>As part of incorporating Parquet’s dictionary encoding utilities, we have
+developed an <code class="highlighter-rouge">arrow::DictionaryBuilder</code> 
class to enable building
+dictionary-encoded arrays iteratively. This can help save memory and yield
+better performance when interacting with databases, Parquet files, or other
+sources whose columns contain many duplicate values.</p>
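+
+<p>To give a feel for what building a dictionary-encoded array iteratively
+means, here is a toy builder in Python. It is only a sketch of the idea:
+<code class="highlighter-rouge">ToyDictionaryBuilder</code> and its methods are
+hypothetical names, and this is not the interface of
+<code class="highlighter-rouge">arrow::DictionaryBuilder</code>.</p>

```python
from array import array


class ToyDictionaryBuilder:
    """Accumulate values one at a time, storing each distinct value once."""

    def __init__(self):
        self._memo = {}              # value -> index into the dictionary
        self._dictionary = []        # distinct values, first-seen order
        self._indices = array('i')   # compact integer index per appended value

    def append(self, value):
        index = self._memo.get(value)
        if index is None:
            # First time we see this value: give it the next dictionary slot.
            index = len(self._dictionary)
            self._memo[value] = index
            self._dictionary.append(value)
        self._indices.append(index)

    def finish(self):
        """Return (indices, dictionary) for everything appended so far."""
        return list(self._indices), list(self._dictionary)


# Values can arrive in chunks, for example rows fetched from a database cursor.
builder = ToyDictionaryBuilder()
for chunk in (['foo', 'bar'], ['foo', 'foo'], ['baz', 'bar']):
    for value in chunk:
        builder.append(value)

indices, dictionary = builder.finish()
print(indices)     # [0, 1, 0, 0, 2, 1]
print(dictionary)  # ['foo', 'bar', 'baz']
```

+<p>The memory saving comes from storing each distinct string once and keeping
+only a small integer per row.</p>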
+
+<h3 id="support-for-lz4-and-zstd-compressors">Support for LZ4 and ZSTD 
compressors</h3>
+
+<p>We added LZ4 and ZSTD compression library support. In ARROW-300 and other
+planned work, we intend to add some compression features for data sent via 
RPC.</p>
+
+<h2 id="python-activity">Python Activity</h2>
+
+<p>We fixed many bugs which were affecting Parquet and Feather users and fixed
+several other rough edges with normal Arrow use. We also added some additional
+Arrow type conversions: structs, lists embedded in pandas objects, and Arrow
+time types (which deserialize to the <code 
class="highlighter-rouge">datetime.time</code> type).</p>
+
+<p>In upcoming releases we plan to continue to improve <a 
href="http://github.com/dask/dask">Dask</a> support and
+performance for distributed processing of Apache Parquet files with 
pyarrow.</p>
+
+<h2 id="the-road-ahead">The Road Ahead</h2>
+
+<p>We have much work ahead of us to build out Arrow integrations in other data
+systems to improve their processing performance and interoperability with other
+systems.</p>
+
+<p>We are discussing the roadmap to a future 1.0.0 release on the <a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">developer
+mailing list</a>. Please join the discussion there.</p>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/07/26/spark-arrow/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/07/26/spark-arrow/index.html 
b/blog/2017/07/26/spark-arrow/index.html
index 425ffbf..e43a7da 100644
--- a/blog/2017/07/26/spark-arrow/index.html
+++ b/blog/2017/07/26/spark-arrow/index.html
@@ -64,6 +64,7 @@
             <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
             <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
             <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
           </ul>
         </li>
         <li class="dropdown">
@@ -173,7 +174,7 @@ the conversion to Arrow data can be done on the JVM and 
pushed back for the Spar
 executors to perform in parallel, drastically reducing the load on the 
driver.</p>
 
 <p>As of the merging of <a 
href="https://issues.apache.org/jira/browse/SPARK-13534">SPARK-13534</a>, the 
use of Arrow when calling <code class="highlighter-rouge">toPandas()</code>
-needs to be enabled by setting the SQLConf 
“spark.sql.execution.arrow.enable” to
+needs to be enabled by setting the SQLConf 
“spark.sql.execution.arrow.enabled” to
 “true”.  Let’s look at a simple usage example.</p>
 
 <div class="highlighter-rouge"><pre class="highlight"><code>Welcome to
@@ -199,7 +200,7 @@ In [2]: %time pdf = df.toPandas()
 CPU times: user 17.4 s, sys: 792 ms, total: 18.1 s
 Wall time: 20.7 s
 
-In [3]: spark.conf.set("spark.sql.execution.arrow.enable", "true")
+In [3]: spark.conf.set("spark.sql.execution.arrow.enabled", "true")
 
 In [4]: %time pdf = df.toPandas()
 CPU times: user 40 ms, sys: 32 ms, total: 72 ms                                
 
@@ -234,7 +235,7 @@ It is planned to add pyarrow as a pyspark dependency so that
 
 <p>Currently, the controlling SQLConf is disabled by default. This can be 
enabled
 programmatically as in the example above or by adding the line
-“spark.sql.execution.arrow.enable=true” to <code 
class="highlighter-rouge">SPARK_HOME/conf/spark-defaults.conf</code>.</p>
+“spark.sql.execution.arrow.enabled=true” to <code 
class="highlighter-rouge">SPARK_HOME/conf/spark-defaults.conf</code>.</p>
 
 <p>Also, not all Spark data types are currently supported and limited to 
primitive
 types. Expanded type support is in the works and expected to also be in the 
Spark

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/61e9ea7e/blog/2017/08/07/plasma-in-memory-object-store/index.html
----------------------------------------------------------------------
diff --git a/blog/2017/08/07/plasma-in-memory-object-store/index.html 
b/blog/2017/08/07/plasma-in-memory-object-store/index.html
new file mode 100644
index 0000000..d2f25da
--- /dev/null
+++ b/blog/2017/08/07/plasma-in-memory-object-store/index.html
@@ -0,0 +1,273 @@
+<!DOCTYPE html>
+<html lang="en-US">
+  <head>
+    <meta charset="UTF-8">
+    <title>Apache Arrow Homepage</title>
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="generator" content="Jekyll v3.4.3">
+    <!-- The above 3 meta tags *must* come first in the head; any other head 
content must come *after* these tags -->
+    <link rel="icon" type="image/x-icon" href="/favicon.ico">
+
+    <link rel="stylesheet" 
href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+
+    <link href="/css/main.css" rel="stylesheet">
+    <link href="/css/syntax.css" rel="stylesheet">
+    <script src="https://code.jquery.com/jquery-3.2.1.min.js"
+            integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4="
+            crossorigin="anonymous"></script>
+    <script src="/assets/javascripts/bootstrap.min.js"></script>
+    
+    <!-- Global Site Tag (gtag.js) - Google Analytics -->
+<script async 
src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script>
+<script>
+  window.dataLayer = window.dataLayer || [];
+  function gtag(){dataLayer.push(arguments)};
+  gtag('js', new Date());
+
+  gtag('config', 'UA-107500873-1');
+</script>
+
+    
+  </head>
+
+
+
+<body class="wrap">
+  <div class="container">
+    <nav class="navbar navbar-default">
+  <div class="container-fluid">
+    <div class="navbar-header">
+      <button type="button" class="navbar-toggle" data-toggle="collapse" 
data-target="#arrow-navbar">
+        <span class="sr-only">Toggle navigation</span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+        <span class="icon-bar"></span>
+      </button>
+      <a class="navbar-brand" href="/">Apache 
Arrow&#8482;&nbsp;&nbsp;&nbsp;</a>
+    </div>
+
+    <!-- Collect the nav links, forms, and other content for toggling -->
+    <div class="collapse navbar-collapse" id="arrow-navbar">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Project Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/install/">Install</a></li>
+            <li><a href="/blog/">Blog</a></li>
+            <li><a href="/release/">Releases</a></li>
+            <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue 
Tracker</a></li>
+            <li><a href="https://github.com/apache/arrow">Source Code</a></li>
+            <li><a 
href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li>
+            <li><a href="https://apachearrowslackin.herokuapp.com">Slack 
Channel</a></li>
+            <li><a href="/committers/">Committers</a></li>
+            <li><a href="/powered_by/">Powered By</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Specification<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/memory_layout.html">Memory Layout</a></li>
+            <li><a href="/docs/metadata.html">Metadata</a></li>
+            <li><a href="/docs/ipc.html">Messaging / IPC</a></li>
+          </ul>
+        </li>
+
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">Documentation<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="/docs/python">Python</a></li>
+            <li><a href="/docs/cpp">C++ API</a></li>
+            <li><a href="/docs/java">Java API</a></li>
+            <li><a href="/docs/c_glib">C GLib API</a></li>
+          </ul>
+        </li>
+        <!-- <li><a href="/blog">Blog</a></li> -->
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown"
+             role="button" aria-haspopup="true"
+             aria-expanded="false">ASF Links<span class="caret"></span>
+          </a>
+          <ul class="dropdown-menu">
+            <li><a href="http://www.apache.org/">ASF Website</a></li>
+            <li><a href="http://www.apache.org/licenses/">License</a></li>
+            <li><a 
href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li>
+            <li><a 
href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+            <li><a href="http://www.apache.org/security/">Security</a></li>
+          </ul>
+        </li>
+      </ul>
+      <a href="http://www.apache.org/">
+        <img style="float:right;" src="/img/asf_logo.svg" width="120px"/>
+      </a>
+      </div><!-- /.navbar-collapse -->
+    </div>
+  </nav>
+
+
+    <h2>
+      Plasma In-Memory Object Store
+      <a href="/blog/2017/08/07/plasma-in-memory-object-store/" 
class="permalink" title="Permalink">∞</a>
+    </h2>
+
+    
+
+    <div class="panel">
+      <div class="panel-body">
+        <div>
+          <span class="label label-default">Published</span>
+          <span class="published">
+            <i class="fa fa-calendar"></i>
+            07 Aug 2017
+          </span>
+        </div>
+        <div>
+          <span class="label label-default">By</span>
+          <a href="http://people.apache.org/~Philipp Moritz and Robert 
Nishihara"><i class="fa fa-user"></i>  (Philipp Moritz and Robert Nishihara)</a>
+        </div>
+      </div>
+    </div>
+
+    <!--
+
+-->
+
+<p><em><a href="https://people.eecs.berkeley.edu/~pcmoritz/">Philipp 
Moritz</a> and <a href="http://www.robertnishihara.com">Robert Nishihara</a> 
are graduate students at UC
+ Berkeley.</em></p>
+
+<h2 id="plasma-a-high-performance-shared-memory-object-store">Plasma: A 
High-Performance Shared-Memory Object Store</h2>
+
+<h3 id="motivating-plasma">Motivating Plasma</h3>
+
+<p>This blog post presents Plasma, an in-memory object store that is being
+developed as part of Apache Arrow. <strong>Plasma holds immutable objects in 
shared
+memory so that they can be accessed efficiently by many clients across process
+boundaries.</strong> In light of the trend toward larger and larger multicore 
machines,
+Plasma enables critical performance optimizations in the big data regime.</p>
+
+<p>Plasma was initially developed as part of <a 
href="https://github.com/ray-project/ray">Ray</a>, and has recently been moved
+to Apache Arrow in the hopes that it will be broadly useful.</p>
+
+<p>One of the goals of Apache Arrow is to serve as a common data layer enabling
+zero-copy data exchange between multiple frameworks. A key component of this
+vision is the use of off-heap memory management (via Plasma) for storing and
+sharing Arrow-serialized objects between applications.</p>
+
+<p><strong>Expensive serialization and deserialization as well as data copying 
are a
+common performance bottleneck in distributed computing.</strong> For example, a
+Python-based execution framework that wishes to distribute computation across
+multiple Python “worker” processes and then aggregate the results in a 
single
+“driver” process may choose to serialize data using the built-in <code 
class="highlighter-rouge">pickle</code>
+library. Assuming one Python process per core, each worker process would have 
to
+copy and deserialize the data, resulting in excessive memory usage. The driver
+process would then have to deserialize results from each of the workers,
+resulting in a bottleneck.</p>
+
+<p>Using Plasma plus Arrow, the data being operated on would be placed in the
+Plasma store once, and all of the workers would read the data without copying 
or
+deserializing it (the workers would map the relevant region of memory into 
their
+own address spaces). The workers would then put the results of their 
computation
+back into the Plasma store, which the driver could then read and aggregate
+without copying or deserializing the data.</p>
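+
+<p>The underlying mechanism, several readers mapping the same region of
+memory instead of each receiving a copy, can be sketched with the Python
+standard library alone. This is not Plasma and does not use its API; it
+assumes Python 3.8 or later for
+<code class="highlighter-rouge">multiprocessing.shared_memory</code>, and the
+function name is hypothetical. For brevity the readers attach from the same
+process, whereas real workers would attach by name from separate processes.</p>

```python
from multiprocessing import shared_memory


def share_once_read_many(payload, readers=3):
    """Place payload in shared memory once, then let each reader map it."""
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        # The data is written into the shared region exactly once.
        owner.buf[:len(payload)] = payload

        results = []
        for _ in range(readers):
            # Each reader attaches to the existing region by name. This maps
            # the same memory; the payload is not re-sent or deserialized.
            reader = shared_memory.SharedMemory(name=owner.name)
            try:
                # bytes(...) copies only so that we can return a value after
                # the mapping is closed; a real reader would work on the view.
                results.append(bytes(reader.buf[:len(payload)]))
            finally:
                reader.close()
        return results
    finally:
        owner.close()
        owner.unlink()  # free the shared region


print(share_once_read_many(b'arrow', readers=2))  # [b'arrow', b'arrow']
```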
+
+<h3 id="the-plasma-api">The Plasma API</h3>
+
+<p>Below we illustrate a subset of the API. The C++ API is documented more 
fully
+<a 
href="https://github.com/apache/arrow/blob/master/cpp/apidoc/tutorials/plasma.md">here</a>,
 and the Python API is documented <a 
href="https://github.com/apache/arrow/blob/master/python/doc/source/plasma.rst">here</a>.</p>
+
+<p><strong>Object IDs:</strong> Each object is identified by an object ID, 
which is a string of bytes.</p>
+
+<p><strong>Creating an object:</strong> Objects are stored in Plasma in two 
stages. First, the
+object store <em>creates</em> the object by allocating a buffer for it. At 
this point,
+the client can write to the buffer and construct the object within the 
allocated
+buffer. When the client is done, it <em>seals</em> the buffer, making 
the object
+immutable and available to other Plasma clients.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Create an object.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="n">object_size</span> <span class="o">=</span> <span 
class="mi">1000</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">client</span><span class="o">.</span><span 
class="n">create</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">,</span> <span 
class="n">object_size</span><span class="p">))</span>
+
+<span class="c"># Write to the buffer.</span>
+<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> 
<span class="nb">range</span><span class="p">(</span><span 
class="mi">1000</span><span class="p">):</span>
+    <span class="nb">buffer</span><span class="p">[</span><span 
class="n">i</span><span class="p">]</span> <span class="o">=</span> <span 
class="mi">0</span>
+
+<span class="c"># Seal the object making it immutable and available to other 
clients.</span>
+<span class="n">client</span><span class="o">.</span><span 
class="n">seal</span><span class="p">(</span><span 
class="n">object_id</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p><strong>Getting an object:</strong> After an object has been sealed, any 
client who knows the
+object ID can get the object.</p>
+
+<div class="language-python highlighter-rouge"><pre 
class="highlight"><code><span class="c"># Get the object from the store. This 
blocks until the object has been sealed.</span>
+<span class="n">object_id</span> <span class="o">=</span> <span 
class="n">pyarrow</span><span class="o">.</span><span 
class="n">plasma</span><span class="o">.</span><span 
class="n">ObjectID</span><span class="p">(</span><span class="mi">20</span> 
<span class="o">*</span> <span class="n">b</span><span 
class="s">'a'</span><span class="p">)</span>
+<span class="p">[</span><span class="n">buff</span><span class="p">]</span> 
<span class="o">=</span> <span class="n">client</span><span 
class="o">.</span><span class="n">get</span><span class="p">([</span><span 
class="n">object_id</span><span class="p">])</span>
+<span class="nb">buffer</span> <span class="o">=</span> <span 
class="n">memoryview</span><span class="p">(</span><span 
class="n">buff</span><span class="p">)</span>
+</code></pre>
+</div>
+
+<p>If the object has not been sealed yet, then the call to <code 
class="highlighter-rouge">client.get</code> will block
+until the object has been sealed.</p>
+
+<h3 id="a-sorting-application">A sorting application</h3>
+
+<p>To illustrate the benefits of Plasma, we demonstrate an <strong>11x 
speedup</strong> (on a
+machine with 20 physical cores) for sorting a large pandas DataFrame (one
+billion entries). The baseline is the built-in pandas sort function, which 
sorts
+the DataFrame in 477 seconds. To leverage multiple cores, we implement the
+following standard distributed sorting scheme.</p>
+
+<ul>
+  <li>We assume that the data is partitioned across K pandas DataFrames and 
that
+each one already lives in the Plasma store.</li>
+  <li>We subsample the data, sort the subsampled data, and use the result to 
define
+L non-overlapping buckets.</li>
+  <li>For each of the K data partitions and each of the L buckets, we find the
+subset of the data partition that falls in the bucket, and we sort that
+subset.</li>
+  <li>For each of the L buckets, we gather all of the K sorted subsets that 
fall in
+that bucket.</li>
+  <li>For each of the L buckets, we merge the corresponding K sorted 
subsets.</li>
+  <li>We turn each bucket into a pandas DataFrame and place it in the Plasma 
store.</li>
+</ul>
+
+<p>Using this scheme, we can sort the DataFrame (the data starts and ends in 
the
+Plasma store) in 44 seconds, giving an 11x speedup over the baseline.</p>
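+
+<p>The bucketing scheme above can be sketched in plain Python. This is a
+sequential illustration on lists of numbers, not the benchmarked
+implementation: it uses neither pandas nor Plasma, and the function name and
+parameters are hypothetical. In the parallel version, the work for each
+partition and each bucket would run in separate worker processes.</p>

```python
import bisect
import heapq
import random


def sample_sort(partitions, num_buckets, per_partition_sample=50, seed=0):
    """Sort data held in K partitions using L value-range buckets.

    Returns a list of L sorted buckets; concatenating them in order gives
    the fully sorted data.
    """
    rng = random.Random(seed)

    # Subsample each partition and sort the combined sample.
    sample = sorted(
        x
        for part in partitions
        for x in rng.sample(part, min(per_partition_sample, len(part)))
    )
    if not sample:
        return [[] for _ in range(num_buckets)]

    # L - 1 splitters taken from the sorted sample define L buckets.
    splitters = [sample[(i * len(sample)) // num_buckets]
                 for i in range(1, num_buckets)]

    # For each partition and each bucket, collect the subset of the
    # partition that falls in the bucket, then sort that subset.
    subsets = [[[] for _ in partitions] for _ in range(num_buckets)]
    for k, part in enumerate(partitions):
        for x in part:
            subsets[bisect.bisect_right(splitters, x)][k].append(x)
    for bucket in subsets:
        for subset in bucket:
            subset.sort()

    # For each bucket, merge its K sorted subsets.
    return [list(heapq.merge(*bucket)) for bucket in subsets]


partitions = [[5, 3, 9], [1, 8, 2], [7, 4, 6, 0]]
buckets = sample_sort(partitions, num_buckets=3)
print([x for bucket in buckets for x in bucket])
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```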
+
+<h3 id="design">Design</h3>
+
+<p>The Plasma store runs as a separate process. It is written in C++ and is
+designed as a single-threaded event loop based on the <a 
href="https://redis.io/">Redis</a> event loop library.
+The Plasma client library can be linked into applications. Clients communicate
+with the Plasma store via messages serialized using <a 
href="https://google.github.io/flatbuffers/">Google Flatbuffers</a>.</p>
+
+<h3 id="call-for-contributions">Call for contributions</h3>
+
+<p>Plasma is a work in progress, and the API is currently unstable. Today 
Plasma is
+primarily used in <a href="https://github.com/ray-project/ray">Ray</a> as an 
in-memory cache for Arrow-serialized objects.
+We are looking for a broader set of use cases to help refine Plasma’s API. In
+addition, we are looking for contributions in a variety of areas including
+improving performance and building other language bindings. Please let us know
+if you are interested in getting involved with the project.</p>
+
+
+
+    <hr/>
+<footer class="footer">
+  <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache 
Arrow project logo are either registered trademarks or trademarks of The Apache 
Software Foundation in the United States and other countries.</p>
+  <p>&copy; 2017 Apache Software Foundation</p>
+</footer>
+
+  </div>
+</body>
+</html>
