http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3cd84682/build/blog/2017/12/18/0.8.0-release/index.html ---------------------------------------------------------------------- diff --git a/build/blog/2017/12/18/0.8.0-release/index.html b/build/blog/2017/12/18/0.8.0-release/index.html new file mode 100644 index 0000000..7ab1ec8 --- /dev/null +++ b/build/blog/2017/12/18/0.8.0-release/index.html @@ -0,0 +1,306 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + 
<span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + <li><a href="/docs/js">Javascript API</a></li> + </ul> + </li> + <!-- <li><a 
href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Apache Arrow 0.8.0 Release + <a href="/blog/2017/12/18/0.8.0-release/" class="permalink" title="Permalink">â</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 18 Dec 2017 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.8.0 release. It is the +product of 10 weeks of development and includes <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.8.0"><strong>286 resolved JIRAs</strong></a> with +many new features and bug fixes to the various language implementations. 
This +is the largest release since 0.3.0 earlier this year.</p> + +<p>As part of work towards stabilizing the Arrow format and making a 1.0.0 +release sometime in 2018, we made a series of backwards-incompatible changes to +the serialized Arrow metadata that require Arrow readers and writers (0.7.1 +and earlier) to upgrade in order to be compatible with 0.8.0 and higher. We +expect backwards-incompatible changes to be rare going forward.</p> + +<p>See the <a href="https://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. The <a href="https://arrow.apache.org/release/0.8.0.html">complete changelog</a> is also available.</p> + +<p>We discuss some highlights from the release and other project news in this +post.</p> + +<h2 id="projects-powered-by-apache-arrow">Projects “Powered By” Apache Arrow</h2> + +<p>A growing ecosystem of projects is using Arrow to solve in-memory analytics +and data interchange problems. We have added a new <a href="http://arrow.apache.org/powered_by/">Powered By</a> page to the +Arrow website where we can acknowledge open source projects and companies which +are using Arrow. 
If you would like to add your project to the list as an Arrow +user, please let us know.</p> + +<h2 id="new-arrow-committers">New Arrow committers</h2> + +<p>Since the last release, we have added 5 new Apache committers:</p> + +<ul> + <li><a href="https://github.com/cpcloud">Phillip Cloud</a>, who has mainly contributed to C++ and Python</li> + <li><a href="https://github.com/BryanCutler">Bryan Cutler</a>, who has mainly contributed to Java and Spark integration</li> + <li><a href="https://github.com/icexelloss">Li Jin</a>, who has mainly contributed to Java and Spark integration</li> + <li><a href="https://github.com/trxcllnt">Paul Taylor</a>, who has mainly contributed to JavaScript</li> + <li><a href="https://github.com/siddharthteotia">Siddharth Teotia</a>, who has mainly contributed to Java</li> +</ul> + +<p>Welcome to the Arrow team, and thank you for your contributions!</p> + +<h2 id="improved-java-vector-api-performance-improvements">Improved Java vector API, performance improvements</h2> + +<p>Siddharth Teotia led efforts to revamp the Java vector API to make things +simpler and faster. As part of this, we removed the dichotomy between nullable +and non-nullable vectors.</p> + +<p>See <a href="https://arrow.apache.org/blog/2017/12/19/java-vector-improvements/">Sidd’s blog post</a> for more about these changes.</p> + +<h2 id="decimal-support-in-c-python-consistency-with-java">Decimal support in C++, Python, consistency with Java</h2> + +<p><a href="https://github.com/cpcloud">Phillip Cloud</a> led efforts this release to harden details about exact +decimal values in the Arrow specification and ensure a consistent +implementation across Java, C++, and Python.</p> + +<p>Arrow now supports decimals represented internally as a 128-bit little-endian +integer, with a set precision and scale (as defined in many SQL-based +systems). 
As part of this work, we needed to change Java’s internal +representation from big- to little-endian.</p> + +<p>We are now integration testing decimals between Java, C++, and Python, which +will facilitate Arrow adoption in Apache Spark and other systems that use both +Java and Python.</p> + +<p>Decimal data can now be read and written by the <a href="https://github.com/apache/parquet-cpp">Apache Parquet C++ +library</a>, including via pyarrow.</p> + +<p>In the future, we may implement support for smaller-precision decimals +represented by 32- or 64-bit integers.</p> + +<h2 id="c-improvements-expanded-kernels-library-and-more">C++ improvements: expanded kernels library and more</h2> + +<p>In C++, we have continued developing the new <code class="highlighter-rouge">arrow::compute</code> submodule +consisting of native computation functions for Arrow data. New contributor +<a href="https://github.com/licht-t">Licht Takeuchi</a> helped expand the supported types for type casting in +<code class="highlighter-rouge">compute::Cast</code>. We have also implemented new kernels <code class="highlighter-rouge">Unique</code> and +<code class="highlighter-rouge">DictionaryEncode</code> for computing the distinct elements of an array and +dictionary encoding (conversion to categorical), respectively.</p> + +<p>We expect the C++ computation “kernel” library to be a major expansion area for +the project over the next year and beyond. Here, we can also implement SIMD- +and GPU-accelerated versions of basic in-memory analytics functionality.</p> + +<p>As a minor breaking API change in C++, we have made the <code class="highlighter-rouge">RecordBatch</code> and <code class="highlighter-rouge">Table</code> +APIs “virtual” or abstract interfaces, to enable different implementations of a +record batch or table which conform to the standard interface. 
This will help +enable features like lazy IO or column loading.</p> + +<p>There was significant work improving the C++ library generally and supporting +work happening in Python and C. See the change log for full details.</p> + +<h2 id="glib-c-improvements-meson-build-gpu-support">GLib C improvements: Meson build, GPU support</h2> + +<p>Development of the GLib-based C bindings has generally tracked work happening in +the C++ library. These bindings are being used to develop <a href="https://github.com/red-data-tools">data science tools +for Ruby users</a> and elsewhere.</p> + +<p>The C bindings now support the <a href="https://mesonbuild.com">Meson build system</a> in addition to +autotools, which enables them to be built on Windows.</p> + +<p>The Arrow GPU extension library is now also supported in the C bindings.</p> + +<h2 id="javascript-first-independent-release-on-npm">JavaScript: first independent release on NPM</h2> + +<p><a href="https://github.com/TheNeuralBit">Brian Hulette</a> and <a href="https://github.com/trxcllnt">Paul Taylor</a> have been continuing to drive efforts +on the TypeScript-based JavaScript implementation.</p> + +<p>Since the last release, we made a first JavaScript-only Apache release, version +0.2.0, which is <a href="http://npmjs.org/package/apache-arrow">now available on NPM</a>. 
We decided to make separate +JavaScript releases to enable the JS library to release more frequently than +the rest of the project.</p> + +<h2 id="python-improvements">Python improvements</h2> + +<p>In addition to some of the new features mentioned above, we have made a variety +of usability and performance improvements for integrations with pandas, NumPy, +Dask, and other Python projects which may make use of pyarrow, the Arrow Python +library.</p> + +<p>Some of these improvements include:</p> + +<ul> + <li><a href="http://arrow.apache.org/docs/python/ipc.html">Component-based serialization</a> for more flexible and memory-efficient +transport of large or complex Python objects</li> + <li>Substantially improved serialization performance for pandas objects when +using <code class="highlighter-rouge">pyarrow.serialize</code> and <code class="highlighter-rouge">pyarrow.deserialize</code>. This includes a special +<code class="highlighter-rouge">pyarrow.pandas_serialization_context</code> which further accelerates certain +internal details of pandas serialization</li> + <li>Support for zero-copy reads of <code class="highlighter-rouge">pandas.DataFrame</code> using <code class="highlighter-rouge">pyarrow.deserialize</code> for objects without Python +objects</li> + <li>Multithreaded conversions from <code class="highlighter-rouge">pandas.DataFrame</code> to <code class="highlighter-rouge">pyarrow.Table</code> (we +already supported multithreaded conversions from Arrow back to pandas)</li> + <li>More efficient conversion from 1-dimensional NumPy arrays to Arrow format</li> + <li>New generic buffer compression and decompression APIs <code class="highlighter-rouge">pyarrow.compress</code> and +<code class="highlighter-rouge">pyarrow.decompress</code></li> + <li>Enhanced Parquet cross-compatibility with <a href="https://github.com/dask/fastparquet">fastparquet</a> and improved Dask +support</li> + <li>Python support for accessing Parquet row group column statistics</li> 
+</ul> + +<h2 id="upcoming-roadmap">Upcoming Roadmap</h2> + +<p>The 0.8.0 release includes some API and format changes, but upcoming releases +will focus on completing and stabilizing critical functionality to move the +project closer to a 1.0.0 release.</p> + +<p>With the ecosystem of projects using Arrow expanding rapidly, we will be +working to improve and expand the libraries in support of downstream use cases.</p> + +<p>We continue to look for more JavaScript, Julia, R, Rust, and other programming +language developers to join the project and expand the available +implementations and bindings to more languages.</p> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html>
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3cd84682/build/blog/2017/12/18/java-vector-improvements/index.html ---------------------------------------------------------------------- diff --git a/build/blog/2017/12/18/java-vector-improvements/index.html b/build/blog/2017/12/18/java-vector-improvements/index.html new file mode 100644 index 0000000..8e43378 --- /dev/null +++ b/build/blog/2017/12/18/java-vector-improvements/index.html @@ -0,0 +1,234 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle 
navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + <li><a href="/docs/js">Javascript 
API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Improvements to Java Vector API in Apache Arrow 0.8.0 + <a href="/blog/2017/12/18/java-vector-improvements/" class="permalink" title="Permalink">â</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 18 Dec 2017 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://people.apache.org/~Siddharth Teotia"><i class="fa fa-user"></i> (Siddharth Teotia)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>This post gives insight into the major improvements in the Java implementation +of vectors. 
We undertook this work over the last 10 weeks since the last Arrow +release.</p> + +<h2 id="design-goals">Design Goals</h2> + +<ol> + <li>Improved maintainability and extensibility</li> + <li>Improved heap memory usage</li> + <li>No performance overhead on hot code paths</li> +</ol> + +<h2 id="background">Background</h2> + +<h3 id="improved-maintainability-and-extensibility">Improved maintainability and extensibility</h3> + +<p>We use templates in several places for compile-time Java code generation for +different vector classes, readers, writers etc. Templates are helpful as the +developers don’t have to write a lot of duplicate code.</p> + +<p>However, we realized that over a period of time some specific Java +templates became extremely complex with giant if-else blocks, poor code indentation, +and poor documentation. All this impacted the ability to easily extend these templates +for adding new functionality or improving the existing infrastructure.</p> + +<p>So we evaluated the usage of templates for compile-time code generation and +decided to replace complex templates in some places with a small amount of +duplicate code that is elegant, well documented and extensible.</p> + +<h3 id="improved-heap-usage">Improved heap usage</h3> + +<p>We did extensive memory analysis downstream in <a href="https://www.dremio.com/">Dremio</a> where Arrow is used +heavily for in-memory query execution on columnar data. The general conclusion +was that Arrow’s Java vector classes have non-negligible heap overhead and +that the volume of objects was too high. There were places in code where we were +creating objects unnecessarily and using structures that could be substituted +with better alternatives.</p> + +<h3 id="no-performance-overhead-on-hot-code-paths">No performance overhead on hot code paths</h3> + +<p>Java vectors used delegation and abstraction heavily throughout the object +hierarchy. 
The performance-critical get/set methods of vectors went through a +chain of function calls back and forth between different objects before doing +meaningful work. We also evaluated the usage of branches in vector APIs and +reimplemented some of them by avoiding branches completely.</p> + +<p>We took inspiration from how the Java memory code in <code class="highlighter-rouge">ArrowBuf</code> works. For all +the performance-critical methods, <code class="highlighter-rouge">ArrowBuf</code> bypasses all the Netty object +hierarchy, grabs the target virtual address and directly interacts with the +memory.</p> + +<p>There were cases where branches could be avoided altogether.</p> + +<p>In the case of nullable vectors, we were doing multiple checks to confirm whether +the value at a given position in the vector is null.</p> + +<h2 id="our-implementation-approach">Our implementation approach</h2> + +<ul> + <li>For scalars, the inheritance tree was simplified by writing different +abstract base classes for fixed and variable width scalars.</li> + <li>The base classes contained all the common functionality across different +types.</li> + <li>The individual subclasses implemented type-specific APIs for fixed and +variable width scalar vectors.</li> + <li>For the performance-critical methods, all the work is done either in +the vector class or the corresponding ArrowBuf. There is no delegation to any +internal object.</li> + <li>The mutator and accessor based access to vector APIs is removed. These +objects led to unnecessary heap overhead and complicated the use of APIs.</li> + <li>Both scalar and complex vectors directly interact with underlying buffers +that manage the offsets, data and validity. Earlier we were creating different +inner vectors for each vector and delegating all the functionality to inner +vectors. 
This introduced a lot of bugs in memory management, excessive heap +overhead and performance penalty due to chain of delegations.</li> + <li>We reduced the number of vector classes by removing non-nullable vectors. +In the new implementation, all vectors in Java are nullable in nature.</li> +</ul> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3cd84682/build/blog/2017/12/19/java-vector-improvements/index.html ---------------------------------------------------------------------- diff --git a/build/blog/2017/12/19/java-vector-improvements/index.html b/build/blog/2017/12/19/java-vector-improvements/index.html new file mode 100644 index 0000000..2c54256 --- /dev/null +++ b/build/blog/2017/12/19/java-vector-improvements/index.html @@ -0,0 +1,234 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script 
src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + 
</a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + <li><a href="/docs/js">Javascript API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Improvements to Java Vector API in Apache Arrow 0.8.0 + <a href="/blog/2017/12/19/java-vector-improvements/" class="permalink" title="Permalink">â</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 18 Dec 2017 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://people.apache.org/~Siddharth Teotia"><i 
class="fa fa-user"></i> (Siddharth Teotia)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>This post gives insight into the major improvements in the Java implementation +of vectors. We undertook this work over the last 10 weeks since the last Arrow +release.</p> + +<h2 id="design-goals">Design Goals</h2> + +<ol> + <li>Improved maintainability and extensibility</li> + <li>Improved heap memory usage</li> + <li>No performance overhead on hot code paths</li> +</ol> + +<h2 id="background">Background</h2> + +<h3 id="improved-maintainability-and-extensibility">Improved maintainability and extensibility</h3> + +<p>We use templates in several places for compile-time Java code generation for +different vector classes, readers, writers etc. Templates are helpful as the +developers don’t have to write a lot of duplicate code.</p> + +<p>However, we realized that over a period of time some specific Java +templates became extremely complex with giant if-else blocks, poor code indentation, +and poor documentation. All this impacted the ability to easily extend these templates +for adding new functionality or improving the existing infrastructure.</p> + +<p>So we evaluated the usage of templates for compile-time code generation and +decided to replace complex templates in some places with a small amount of +duplicate code that is elegant, well documented and extensible.</p> + +<h3 id="improved-heap-usage">Improved heap usage</h3> + +<p>We did extensive memory analysis downstream in <a href="https://www.dremio.com/">Dremio</a> where Arrow is used +heavily for in-memory query execution on columnar data. The general conclusion +was that Arrow’s Java vector classes have non-negligible heap overhead and +that the volume of objects was too high. 
There were places in code where we were +creating objects unnecessarily and using structures that could be substituted +with better alternatives.</p> + +<h3 id="no-performance-overhead-on-hot-code-paths">No performance overhead on hot code paths</h3> + +<p>Java vectors used delegation and abstraction heavily throughout the object +hierarchy. The performance-critical get/set methods of vectors went through a +chain of function calls back and forth between different objects before doing +meaningful work. We also evaluated the usage of branches in vector APIs and +reimplemented some of them by avoiding branches completely.</p> + +<p>We took inspiration from how the Java memory code in <code class="highlighter-rouge">ArrowBuf</code> works. For all +the performance-critical methods, <code class="highlighter-rouge">ArrowBuf</code> bypasses all the Netty object +hierarchy, grabs the target virtual address and directly interacts with the +memory.</p> + +<p>There were cases where branches could be avoided altogether.</p> + +<p>In the case of nullable vectors, we were doing multiple checks to confirm whether +the value at a given position in the vector is null.</p> + +<h2 id="our-implementation-approach">Our implementation approach</h2> + +<ul> + <li>For scalars, the inheritance tree was simplified by writing different +abstract base classes for fixed and variable width scalars.</li> + <li>The base classes contained all the common functionality across different +types.</li> + <li>The individual subclasses implemented type-specific APIs for fixed and +variable width scalar vectors.</li> + <li>For the performance-critical methods, all the work is done either in +the vector class or the corresponding ArrowBuf. There is no delegation to any +internal object.</li> + <li>The mutator and accessor based access to vector APIs is removed. 
These +objects led to unnecessary heap overhead and complicated the use of APIs.</li> + <li>Both scalar and complex vectors directly interact with the underlying buffers +that manage the offsets, data, and validity. Earlier we were creating different +inner vectors for each vector and delegating all the functionality to the inner +vectors. This introduced many memory-management bugs, excessive heap +overhead, and a performance penalty from the chain of delegations.</li> + <li>We reduced the number of vector classes by removing non-nullable vectors. +In the new implementation, all vectors in Java are nullable.</li> +</ul> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3cd84682/build/blog/2018/03/22/0.9.0-release/index.html ---------------------------------------------------------------------- diff --git a/build/blog/2018/03/22/0.9.0-release/index.html b/build/blog/2018/03/22/0.9.0-release/index.html new file mode 100644 index 0000000..b9f7815 --- /dev/null +++ b/build/blog/2018/03/22/0.9.0-release/index.html @@ -0,0 +1,222 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link 
href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a 
href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + <li><a href="/docs/js">Javascript API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + Apache Arrow 0.9.0 Release + <a href="/blog/2018/03/22/0.9.0-release/" class="permalink" title="Permalink">∞</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span 
class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 22 Mar 2018 + </span> + </div> + <div> + <span class="label label-default">By</span> + <a href="http://wesmckinney.com"><i class="fa fa-user"></i> Wes McKinney (wesm)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>The Apache Arrow team is pleased to announce the 0.9.0 release. It is the +product of over 3 months of development and includes <a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.9.0"><strong>260 resolved +JIRAs</strong></a>.</p> + +<p>While we made some backwards-incompatible columnar binary format changes in +last December’s 0.8.0 release, the 0.9.0 release is backwards-compatible with +0.8.0. We will be working toward a 1.0.0 release this year, which will mark +longer-term binary stability for the Arrow columnar format and metadata.</p> + +<p>See the <a href="https://arrow.apache.org/install">Install Page</a> to learn how to get the libraries for your +platform. The <a href="https://arrow.apache.org/release/0.8.0.html">complete changelog</a> is also available.</p> + +<p>We discuss some highlights from the release and other project news in this +post. Overall, this release has focused more on bug fixes, compatibility, +and stability than previous releases, which pushed more on new and +expanded features.</p> + +<h2 id="new-arrow-committers-and-pmc-members">New Arrow committers and PMC members</h2> + +<p>Since the last release, we have added 2 new Arrow committers: <a href="https://github.com/theneuralbit">Brian +Hulette</a> and <a href="https://github.com/robertnishihara">Robert Nishihara</a>. Additionally, <a href="https://github.com/cpcloud">Phillip Cloud</a> and +<a href="https://github.com/pcmoritz">Philipp Moritz</a> have been promoted from committer to PMC +member. 
Congratulations and thank you for your contributions!</p> + +<h2 id="plasma-object-store-improvements">Plasma Object Store Improvements</h2> + +<p>The Plasma Object Store now supports managing interprocess shared memory on +CUDA-enabled GPUs. We are excited to see more GPU-related functionality develop +in Apache Arrow, as this has become a key computing environment for scalable +machine learning.</p> + +<h2 id="python-improvements">Python Improvements</h2> + +<p><a href="https://github.com/pitrou">Antoine Pitrou</a> has joined the Python development efforts and helped +significantly this release with interoperability with built-in CPython data +structures and NumPy structured data types.</p> + +<ul> + <li>New experimental support for reading Apache ORC files</li> + <li><code class="highlighter-rouge">pyarrow.array</code> now accepts lists of tuples or Python dicts for creating +Arrow struct type arrays.</li> + <li>NumPy structured dtypes (which are row/record-oriented) can be directly +converted to Arrow struct (column-oriented) arrays</li> + <li>Python 3.6 <code class="highlighter-rouge">pathlib</code> objects for file paths are now accepted in many file +APIs, including for Parquet files</li> + <li>Arrow integer arrays with nulls can now be converted to NumPy object arrays +with <code class="highlighter-rouge">None</code> values</li> + <li>New <code class="highlighter-rouge">pyarrow.foreign_buffer</code> API for interacting with memory blocks located +at particular memory addresses</li> +</ul> + +<h2 id="java-improvements">Java Improvements</h2> + +<p>Java now fully supports the <code class="highlighter-rouge">FixedSizeBinary</code> data type.</p> + +<h2 id="javascript-improvements">JavaScript Improvements</h2> + +<p>The JavaScript library has been significantly refactored and expanded. 
We are +making separate Apache releases (most recently <code class="highlighter-rouge">JS-0.3.1</code>) for JavaScript, +which are being <a href="https://www.npmjs.com/package/apache-arrow">published to NPM</a>.</p> + +<h2 id="upcoming-roadmap">Upcoming Roadmap</h2> + +<p>In the coming months, we will be working to move Apache Arrow closer to a 1.0.0 +release. We will also be discussing plans to develop native Arrow-based +computational libraries within the project.</p> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html> http://git-wip-us.apache.org/repos/asf/arrow-site/blob/3cd84682/build/blog/2018/03/22/go-code-donation/index.html ---------------------------------------------------------------------- diff --git a/build/blog/2018/03/22/go-code-donation/index.html b/build/blog/2018/03/22/go-code-donation/index.html new file mode 100644 index 0000000..414ec4c --- /dev/null +++ b/build/blog/2018/03/22/go-code-donation/index.html @@ -0,0 +1,199 @@ +<!DOCTYPE html> +<html lang="en-US"> + <head> + <meta charset="UTF-8"> + <title>Apache Arrow Homepage</title> + <meta http-equiv="X-UA-Compatible" content="IE=edge"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <meta name="generator" content="Jekyll v3.4.3"> + <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags --> + <link rel="icon" type="image/x-icon" href="/favicon.ico"> + + <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900"> + + <link href="/css/main.css" rel="stylesheet"> + <link href="/css/syntax.css" rel="stylesheet"> + <script src="https://code.jquery.com/jquery-3.2.1.min.js" + 
integrity="sha256-hwg4gsxgFZhOsEEamdOYGBf13FyQuiTwlAQgxVSNgt4=" + crossorigin="anonymous"></script> + <script src="/assets/javascripts/bootstrap.min.js"></script> + + <!-- Global Site Tag (gtag.js) - Google Analytics --> +<script async src="https://www.googletagmanager.com/gtag/js?id=UA-107500873-1"></script> +<script> + window.dataLayer = window.dataLayer || []; + function gtag(){dataLayer.push(arguments)}; + gtag('js', new Date()); + + gtag('config', 'UA-107500873-1'); +</script> + + + </head> + + + +<body class="wrap"> + <div class="container"> + <nav class="navbar navbar-default"> + <div class="container-fluid"> + <div class="navbar-header"> + <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#arrow-navbar"> + <span class="sr-only">Toggle navigation</span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + <span class="icon-bar"></span> + </button> + <a class="navbar-brand" href="/">Apache Arrow™ </a> + </div> + + <!-- Collect the nav links, forms, and other content for toggling --> + <div class="collapse navbar-collapse" id="arrow-navbar"> + <ul class="nav navbar-nav"> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Project Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/install/">Install</a></li> + <li><a href="/blog/">Blog</a></li> + <li><a href="/release/">Releases</a></li> + <li><a href="https://issues.apache.org/jira/browse/ARROW">Issue Tracker</a></li> + <li><a href="https://github.com/apache/arrow">Source Code</a></li> + <li><a href="http://mail-archives.apache.org/mod_mbox/arrow-dev/">Mailing List</a></li> + <li><a href="https://apachearrowslackin.herokuapp.com">Slack Channel</a></li> + <li><a href="/committers/">Committers</a></li> + <li><a href="/powered_by/">Powered By</a></li> + </ul> + </li> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" 
data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Specification<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/memory_layout.html">Memory Layout</a></li> + <li><a href="/docs/metadata.html">Metadata</a></li> + <li><a href="/docs/ipc.html">Messaging / IPC</a></li> + </ul> + </li> + + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">Documentation<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="/docs/python">Python</a></li> + <li><a href="/docs/cpp">C++ API</a></li> + <li><a href="/docs/java">Java API</a></li> + <li><a href="/docs/c_glib">C GLib API</a></li> + <li><a href="/docs/js">Javascript API</a></li> + </ul> + </li> + <!-- <li><a href="/blog">Blog</a></li> --> + <li class="dropdown"> + <a href="#" class="dropdown-toggle" data-toggle="dropdown" + role="button" aria-haspopup="true" + aria-expanded="false">ASF Links<span class="caret"></span> + </a> + <ul class="dropdown-menu"> + <li><a href="http://www.apache.org/">ASF Website</a></li> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Donate</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </li> + </ul> + <a href="http://www.apache.org/"> + <img style="float:right;" src="/img/asf_logo.svg" width="120px"/> + </a> + </div><!-- /.navbar-collapse --> + </div> + </nav> + + + <h2> + A Native Go Library for Apache Arrow + <a href="/blog/2018/03/22/go-code-donation/" class="permalink" title="Permalink">∞</a> + </h2> + + + + <div class="panel"> + <div class="panel-body"> + <div> + <span class="label label-default">Published</span> + <span class="published"> + <i class="fa fa-calendar"></i> + 22 Mar 2018 + </span> + </div> + <div> + <span 
class="label label-default">By</span> + <a href="http://github.com/pmc"><i class="fa fa-user"></i> The Apache Arrow PMC (pmc)</a> + </div> + </div> + </div> + + <!-- + +--> + +<p>Since launching in early 2016, Apache Arrow has been growing fast. We have made +nine major releases through the efforts of over 120 distinct contributors. The +project’s scope has also expanded. We began by focusing on the development of +the standardized in-memory columnar data format, which now serves as a pillar +of the project. Since then, we have been growing into a more general +cross-language platform for in-memory data analysis through new additions to +the project like the <a href="http://arrow.apache.org/blog/2017/08/16/0.6.0-release/">Plasma shared memory object store</a>. A primary goal of +the project is to enable data system developers to process and move data fast.</p> + +<p>So far, we have officially developed native Arrow implementations in C++, Java, +and JavaScript. We have created binding layers for the C++ libraries in C +(using the GLib libraries) and Python. We have also seen efforts to develop +interfaces to the Arrow C++ libraries in Go, Lua, Ruby, and Rust. While binding +layers serve many purposes, there can be benefits to native implementations, +and so we’ve been keen to see future work on native implementations in growing +systems languages like Go and Rust.</p> + +<p>This past October, engineers <a href="https://github.com/stuartcarnie">Stuart Carnie</a>, <a href="https://github.com/nathanielc">Nathaniel Cook</a>, and +<a href="https://github.com/goller">Chris Goller</a>, employees of <a href="https://influxdata.com">InfluxData</a>, began developing a native Go +language implementation of the <a href="https://github.com/influxdata/arrow">Apache Arrow</a> in-memory columnar format for +use in Go-based database systems like InfluxDB. 
We are excited to announce that +InfluxData has <a href="https://www.businesswire.com/news/home/20180322005393/en/InfluxData-Announces-Language-Implementation-Contribution-Apache-Arrow">donated this native Go implementation to the Apache Arrow +project</a>, where it will continue to be developed. This work features +low-level integration with the Go runtime and native support for SIMD +instruction sets. We are looking forward to working more closely with the Go +community on solving in-memory analytics and data interoperability problems.</p> + +<div align="center"> +<img src="/img/native_go_implementation.png" alt="Apache Arrow implementations and bindings" width="60%" class="img-responsive" /> +</div> + +<p>One of the mantras in <a href="https://www.apache.org">The Apache Software Foundation</a> is âCommunity over +Codeâ. By building an open and collaborative development community across many +programming language ecosystems, we will be able to development better and +longer-lived solutions to the systems problems faced by data developers.</p> + +<p>We are excited for what the future holds for the Apache Arrow project. Adding +first-class support for a popular systems programming language like Go is an +important step along the way. We welcome others from the Go community to get +involved in the project. We also welcome others who wish to explore building +Arrow support for other programming languages not yet represented. Learn more +at <a href="https://arrow.apache.org">https://arrow.apache.org</a> and join the mailing list +<a href="https://lists.apache.org/[email protected]">[email protected]</a>.</p> + + + + <hr/> +<footer class="footer"> + <p>Apache Arrow, Arrow, Apache, the Apache feather logo, and the Apache Arrow project logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.</p> + <p>© 2017 Apache Software Foundation</p> +</footer> + + </div> +</body> +</html>
