[drill-site] branch asf-site updated: New blog post on HTTP streaming from the REST API.

dzamo Fri, 09 Jul 2021 07:10:46 -0700

This is an automated email from the ASF dual-hosted git repository.

dzamo pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/drill-site.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new ab4b80d  New blog post on HTTP streaming from the REST API.
ab4b80d is described below

commit ab4b80d6c6b1c47bd7f2391f1c7a31800745dcff
Author: James Turton <ja...@somecomputer.xyz>
AuthorDate: Fri Jul 9 16:10:19 2021 +0200

    New blog post on HTTP streaming from the REST API.
---
 README.md                                          |   2 +-
 .../09/streaming-data-from-the-rest-api/index.html | 235 +++++++++++++++++++++
 blog/index.html                                    |   5 +
 feed.xml                                           | 129 ++++++-----
 index.html                                         |   4 +-
 zh/README.md                                       |   2 +-
 .../09/streaming-data-from-the-rest-api/index.html | 235 +++++++++++++++++++++
 zh/blog/index.html                                 |   5 +
 zh/feed.xml                                        | 129 ++++++-----
 zh/index.html                                      |   4 +-
 10 files changed, 632 insertions(+), 118 deletions(-)

diff --git a/README.md b/README.md
index 5d87b02..e230317 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ The English versions of "site" pages such as index.html are 
stored in the root d
 
 ## Add translated collection pages
 
-The English versions of "collection" pages such as the markdown under _docs/ 
are stored in an en/ subdirectory of the collection root.  Create corresponding 
translated pages in the collection under `lang-code/` in which you translate 
both `title` and `parent` in the front matter but leave the `slug` the same as 
the English page and set `lang` to `lang-code`.
+The English versions of "collection" pages such as the markdown under _docs/ 
are stored in an en/ subdirectory of the collection root.  Create corresponding 
translated pages in the collection under `lang-code/` in which you translate 
both `title` and `parent` in the front matter but leave the `slug` the same as 
the English page and set `lang` to `lang-code`.  Once you've translated the 
`title` of a parent page, you will need to provide files for each of its 
children (which can still cont [...]
 
 # Compiling the Website
 
diff --git a/blog/2021/07/09/streaming-data-from-the-rest-api/index.html 
b/blog/2021/07/09/streaming-data-from-the-rest-api/index.html
new file mode 100644
index 0000000..04e3dcb
--- /dev/null
+++ b/blog/2021/07/09/streaming-data-from-the-rest-api/index.html
@@ -0,0 +1,235 @@
+<!DOCTYPE html>
+<html>
+
+<head>
+
+<meta charset="UTF-8">
+<meta name=viewport content="width=device-width, initial-scale=1">
+
+
+<title>Streaming data from Drill REST API - Apache Drill</title>
+
+<link 
href="https://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css"
 rel="stylesheet" type="text/css"/>
+<link href='https://fonts.googleapis.com/css?family=PT+Sans' rel='stylesheet' 
type='text/css'/>
+<link href="/css/site.css" rel="stylesheet" type="text/css"/>
+
+<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"/>
+<link rel="icon" href="/favicon.ico" type="image/x-icon"/>
+
+<script 
src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" 
language="javascript" type="text/javascript"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/jquery-easing/1.3/jquery.easing.min.js"
 language="javascript" type="text/javascript"></script>
+<script language="javascript" type="text/javascript" 
src="/js/modernizr.custom.js"></script>
+<script language="javascript" type="text/javascript" 
src="/js/script.js"></script>
+<script language="javascript" type="text/javascript" 
src="/js/drill.js"></script>
+
+</head>
+
+
+<body onResize="resized();">
+  <div class="page-wrap">
+    <div class="bui"></div>
+
+<div id="menu" class="mw">
+<ul>
+  <li class='toc-categories'>
+  <a class="expand-toc-icon" href="javascript:void(0);"><i class="fa 
fa-bars"></i></a>
+  </li>
+  <li class="logo"><a href="/"></a></li>
+  <li class='expand-menu'>
+  <a href="javascript:void(0);"><span class='menu-text'>Menu</span><span 
class='expand-icon'><i class="fa fa-bars"></i></span></a>
+  </li>
+  <li class="clear-float"></li>
+  <li class="nav">
+       <a>Language</a>
+       <ul>
+               
+               <li>
+                       <a style="font-weight: bold;" 
href="/blog/2021/07/09/streaming-data-from-the-rest-api/" >en</a>
+               </li>
+               
+               <li>
+                       <a  
href="/zh/blog/2021/07/09/streaming-data-from-the-rest-api/" >zh</a>
+               </li>
+               
+       </ul>
+  </li>
+  <li class="apache-link">
+    <a href="/apacheASF/">Apache</a>
+  </li>
+  <li class="poweredby">
+    <a href="/poweredBy">Powered By</a>
+  </li>
+  <li class="documentation-menu">
+    <a href="/docs/">Documentation</a>
+    <ul>
+      
+        <li><a href="/docs/getting-started/">Getting Started</a></li>
+      
+        <li><a href="/docs/architecture/">Architecture</a></li>
+      
+        <li><a href="/docs/tutorials/">Tutorials</a></li>
+      
+        <li><a href="/docs/drill-on-yarn/">Drill-on-YARN</a></li>
+      
+        <li><a href="/docs/install-drill/">Install Drill</a></li>
+      
+        <li><a href="/docs/configure-drill/">Configure Drill</a></li>
+      
+        <li><a href="/docs/connect-a-data-source/">Connect a Data 
Source</a></li>
+      
+        <li><a href="/docs/odbc-jdbc-interfaces/">ODBC/JDBC Interfaces</a></li>
+      
+        <li><a href="/docs/query-data/">Query Data</a></li>
+      
+        <li><a href="/docs/performance-tuning/">Performance Tuning</a></li>
+      
+        <li><a href="/docs/log-and-debug/">Log and Debug</a></li>
+      
+        <li><a href="/docs/sql-reference/">SQL Reference</a></li>
+      
+        <li><a href="/docs/data-sources-and-file-formats/">Data Sources and 
File Formats</a></li>
+      
+        <li><a href="/docs/develop-custom-functions/">Develop Custom 
Functions</a></li>
+      
+        <li><a href="/docs/troubleshooting/">Troubleshooting</a></li>
+      
+        <li><a href="/docs/developer-information/">Developer 
Information</a></li>
+      
+        <li><a href="/docs/release-notes/">Release Notes</a></li>
+      
+        <li><a href="/docs/sample-datasets/">Sample Datasets</a></li>
+      
+        <li><a href="/docs/project-bylaws/">Project Bylaws</a></li>
+      
+        <li><a href="/docs/ecosystem/">Ecosystem</a></li>
+      
+    </ul>
+  </li>
+  <li class='nav'>
+    <a href="/community-resources/">Community</a>
+    <ul>
+      <li><a href="/team/">Team</a></li>
+      <li><a href="/mailinglists/">Mailing Lists</a></li>
+      <li><a href="/community-resources/">Community Resources</a></li>
+    </ul>
+  </li>
+  <li class='nav'><a href="/faq/">FAQ</a></li>
+  <li class='nav'><a href="/blog/">Blog</a></li>
+  <li class="social-menu-item"><a href="https://twitter.com/apachedrill" 
title="apachedrill on twitter" target="_blank"><img 
src="/images/twitter_32_26_white.png" alt="twitter logo" align="center"></a> 
</li>
+  <li class="social-menu-item"><a 
href="https://join.slack.com/t/apache-drill/shared_invite/enQtNTQ4MjM1MDA3MzQ2LTJlYmUxMTRkMmUwYmQ2NTllYmFmMjU4MDk0NjYwZjBmYjg0MDZmOTE2ZDg0ZjBlYmI3Yjc4Y2I2NTQyNGVlZTc"
 title="Apache Drill Slack channels"
+      target="_blank"><img src="/images/slack-logo.svg" alt="Slack logo" 
align="center"></a> </li>
+  <li class='search-bar'>
+    <form id="drill-search-form">
+      <input type="text" placeholder="Search Apache Drill" 
id="drill-search-term" />
+      <button type="submit">
+        <i class="fa fa-search"></i>
+      </button>
+    </form>
+  </li>
+  <li class="d">
+    <a href="/download/">
+      <i class="fa fa-cloud-download"></i> Download
+    </a>
+  </li>
+</ul>
+</div>
+
+    <link href="/css/content.css" rel="stylesheet" type="text/css">
+
+<div class="post int_text">
+  <header class="post-header">
+    <div class="int_title">
+      <h1 class="post-title">Streaming data from Drill REST API</h1>
+    </div>
+    <p class="post-meta">
+    
+      
+      
+      <strong>Author:</strong> James Turton (Committer, Apache Software 
Foundation)<br />
+    
+<strong>Date:</strong> Jul 9, 2021
+</p>
+  </header>
+  <div class="addthis_sharing_toolbox"></div>
+
+  <article class="post-content">
+    <p>Anyone who’s used a UNIX pipe, or even just watched something on 
Netflix, is at least a little familiar with the idea of processing data in a 
streaming fashion.  While your data are in small volume compared to available 
memory and I/O speeds, streaming is something you can afford to dispense with.  
But when you cannot fit an entire dataset in RAM, or when you have to 
download an entire 4K movie before you can start playing it, then streaming 
data processing can make a game chan [...]
+
+<p>With the release of version 1.19, Drill will stream JSON query result data 
over an HTTP response to the client that initiated the query using the REST 
API.  And if anything can easily get big compared to your available RAM or 
network speed, it’s query results coming back from Drill.  It’s important to 
note here that JSON over HTTP is never going to be the most <em>efficient</em> 
way to move big data around<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" 
class="footnote">1</a></sup> [...]
+
+<p>Where JSON and HTTP <em>do</em> win is universality: today it’s hard to 
imagine a client hardware and software stack that doesn’t provide JSON and HTTP 
out of the box with minimal effort.  So it’s important that they work as well 
as is possible, in spite of the alternatives that exist.  The new streaming 
query results delivery on the server side means that Drill’s heap memory isn’t 
pressurised by having to buffer entire result sets before it starts to transmit 
them over the network.   [...]
+
+<p>To fully realise the benefits of streaming query result data, clients can 
<em>themselves</em> operate on the HTTP response they receive in a streaming 
fashion, thereby potentially starting to process records before Drill has even 
finished materialising them and avoiding holding the full set in memory if they 
choose.  At the transport level (when there are enough results to warrant it), 
the HTTP response headers from the Drill REST API will include a <code 
class="language-plaintext hig [...]
+
+<p>If you set out to develop a streaming HTTP client for the Drill REST API, 
do take note that the schema of the JSON query result is <em>not</em> a <a 
href="https://en.wikipedia.org/wiki/JSON_streaming">streaming JSON format</a> 
like “JSON lines”.  This means that you must be careful about which JSON 
objects you parse entirely in a single call, particularly avoiding any parent 
of the <code class="language-plaintext highlighter-rouge">rows</code> property 
in the query result which would  [...]
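+
+<p>As a rough illustration of the point above, here is a minimal Python sketch that pulls rows out of the response one at a time using only the standard library.  It is not how sqlalchemy-drill does it.  The function name <code>iter_rows</code>, the regular expression and the sample payload are all assumptions made for illustration, and the sketch assumes that every element of <code>rows</code> is a JSON object and that the chunks have already been decoded from bytes to text.</p>
+

```python
import json
import re
from typing import Any, Iterable, Iterator

# Start of the top-level "rows" array (a simplification: the first textual
# match of this pattern is assumed to be the top-level property).
_ROWS_START = re.compile(r'"rows"\s*:\s*\[')


def iter_rows(chunks: Iterable[str]) -> Iterator[Any]:
    """Yield the elements of the "rows" array one by one from text chunks."""
    decoder = json.JSONDecoder()
    buf = ""
    in_rows = False
    for chunk in chunks:
        buf += chunk
        if not in_rows:
            match = _ROWS_START.search(buf)
            if match is None:
                continue  # still inside the small header before "rows"
            buf = buf[match.end():]
            in_rows = True
        while True:
            # Drop separators between rows; raw_decode does not skip them.
            buf = buf.lstrip(" \t\r\n,")
            if not buf:
                break
            if buf[0] == "]":
                return  # end of the rows array
            try:
                row, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # row is incomplete, wait for the next chunk
            yield row
            buf = buf[end:]


if __name__ == "__main__":
    # Hypothetical payload, split into small pieces to imitate a chunked response.
    payload = '{"queryId":"q1","columns":["a"],"rows":[{"a":1},{"a":2}],"queryState":"COMPLETED"}'
    pieces = [payload[i:i + 5] for i in range(0, len(payload), 5)]
    for row in iter_rows(pieces):
        print(row)
```

+<p>Only the row currently being received is buffered.  A dedicated streaming JSON parser would be a more robust choice in practice, since this sketch re-scans the buffer each time a row spans several chunks.</p>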
+
+<p>In closing, and for a bit of fun, here’s the log from a short IPython 
session where I use sqlalchemy-drill to run <code class="language-plaintext 
highlighter-rouge">SELECT *</code> on a remote 17 billion record table over a 
0.5 Mbit/s link and start scanning through (a steady trickle of) rows in 
seconds.</p>
+
+<pre><code class="language-ipython">In [4]: r = engine.execute('select 
count(*) from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211888-cc20-fe6f-69d1-6584c5caa2df.
+INFO:drilldbapi:opened a row data stream of 1 columns.
+
+In [5]: next(r)
+Out[5]: (17437571247,)
+
+In [6]: r = engine.execute('select * from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211838-73df-1506-a74e-f5695f6b0ff5.
+INFO:drilldbapi:opened a row data stream of 21 columns.
+
+In [7]: while True:
+   ...:     _ = next(r)
+   ...:
+INFO:drilldbapi:streamed 10000 rows.
+INFO:drilldbapi:streamed 20000 rows.
+INFO:drilldbapi:streamed 30000 rows.
+INFO:drilldbapi:streamed 40000 rows.
+</code></pre>
+
+<div class="footnotes" role="doc-endnotes">
+  <ol>
+    <li id="fn:1" role="doc-endnote">
+      <p>Some mitigation is possible.  JSON representations of tabular big 
data are typically extremely compressible and you can reduce the bytes sent 
over the network by 10-20x by enabling HTTP response compression on a web 
server like Apache placed in front of the Drill REST API. <a href="#fnref:1" 
class="reversefootnote" role="doc-backlink">&#8617;</a></p>
+    </li>
+  </ol>
+</div>
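+
+<p>To get a feel for the compressibility mentioned in the footnote, the following Python sketch gzips a synthetic JSON result set and reports the size ratio.  The rows are made up, so the ratio it prints says nothing about your own data, and <code>gzip_ratio</code> is a hypothetical helper name.</p>
+

```python
import gzip
import json


def gzip_ratio(rows):
    """Return uncompressed size divided by gzip-compressed size for a JSON body."""
    raw = json.dumps({"rows": rows}).encode("utf-8")
    return len(raw) / len(gzip.compress(raw))


if __name__ == "__main__":
    # Synthetic, repetitive tabular data (an assumption, not real Drill output).
    rows = [
        {"id": i, "status": "ACTIVE", "region": "eu-west", "amount": i % 100}
        for i in range(10_000)
    ]
    print(f"gzip made the body {gzip_ratio(rows):.1f}x smaller")
```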
+
+  </article>
+ <div id="disqus_thread"></div>
+    <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE 
* * */
+        var disqus_shortname = 'drill'; // required: replace example with your 
forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 
'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+            (document.getElementsByTagName('head')[0] || 
document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+    </script>
+    <noscript>Please enable JavaScript to view the <a 
href="http://disqus.com/?ref_noscript">comments powered by 
Disqus.</a></noscript>
+    
+</div>
+<script type="text/javascript" 
src="https://s7.addthis.com/js/300/addthis_widget.js#pubid=ra-548b2caa33765e8d" 
async="async"></script>
+
+  </div>
+  <p class="push"></p>
+<div id="footer" class="mw">
+<div class="wrapper">
+Copyright © 2012-2020 The Apache Software Foundation, licensed under the 
Apache License, Version 2.0.<br>
+Apache and the Apache feather logo are trademarks of The Apache Software 
Foundation. Other names appearing on the site may be trademarks of their 
respective owners.<br/><br/>
+</div>
+</div>
+
+  <script>
+(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
+
+ga('create', 'UA-53379651-1', 'auto');
+ga('send', 'pageview');
+</script>
+<script type="text/javascript" 
src="https://s7.addthis.com/js/300/addthis_widget.js#pubid=ra-548b2caa33765e8d" 
async="async"></script>
+
+</body>
+</html>
diff --git a/blog/index.html b/blog/index.html
index ff5df94..9f136a0 100644
--- a/blog/index.html
+++ b/blog/index.html
@@ -140,6 +140,11 @@
 </div>
 
 <div class="int_text" align="left"><!-- previously: site.posts -->
+<p><a class="post-link" 
href="/blog/2021/07/09/streaming-data-from-the-rest-api/">Streaming data from 
Drill REST API</a><br/>
+<span class="post-date">Posted on Jul 9, 2021
+by James Turton</span>
+<br/>The release of Apache Drill 1.19 saw a major change under the hood of 
Drill's REST API with the introduction of a streaming data path for query 
results moving from Drill and over the network to the initiating client.  The 
result is better memory utilisation, less blocking and a more reliable API.</p>
+<!-- previously: site.posts -->
 <p><a class="post-link" href="/blog/2021/06/10/drill-1.19-released/">Drill 
1.19 Released</a><br/>
 <span class="post-date">Posted on Jun 10, 2021
 by Laurent Goujon</span>
diff --git a/feed.xml b/feed.xml
index 4e0f7d3..3520e0d 100644
--- a/feed.xml
+++ b/feed.xml
@@ -4,13 +4,64 @@
     <title>Apache Drill - Schema-free SQL for Hadoop, NoSQL and Cloud 
Storage</title>
     <description>The official user documentation for Apache Drill.
 </description>
-    <link>/</link>
-    <atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Thu, 08 Jul 2021 14:57:25 +0200</pubDate>
-    <lastBuildDate>Thu, 08 Jul 2021 14:57:25 +0200</lastBuildDate>
+    <link>http://localhost:4000/</link>
+    <atom:link href="http://localhost:4000/feed.xml" rel="self" 
type="application/rss+xml"/>
+    <pubDate>Fri, 09 Jul 2021 15:37:50 +0200</pubDate>
+    <lastBuildDate>Fri, 09 Jul 2021 15:37:50 +0200</lastBuildDate>
     <generator>Jekyll v3.9.0</generator>
     
       <item>
+        <title>Streaming data from Drill REST API</title>
+        <description>&lt;p&gt;Anyone who’s used a UNIX pipe, or even just 
watched something on Netflix, is at least a little familiar with the idea of 
processing data in a streaming fashion.  While your data are in small volume 
compared to available memory and I/O speeds, streaming is something you can 
afford to dispense with.  But if you cannot fit an entire dataset in RAM, or if 
you have to download an entire 4K movie before you can start playing it, then 
streaming can make a game chan [...]
+
+&lt;p&gt;With the release of version 1.19, Drill will stream JSON query result 
data over an HTTP response to the client that initiated the query using the 
REST API.  And if anything can easily get big compared to your available RAM or 
network speeds, it’s query results coming out of Drill.  It’s important to note 
here that JSON over HTTP is never going to be the most efficient way to move 
big data around&lt;sup id=&quot;fnref:1&quot; 
role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&q [...]
+
+&lt;p&gt;Where JSON and HTTP &lt;em&gt;do&lt;/em&gt; win is universality: 
today it’s hard to imagine a client hardware and software stack that doesn’t 
provide JSON and HTTP out of the box with minimal effort.  So it’s important 
that they work as well as is possible.  The new streaming query results 
materialisation on the server side means that Drill’s heap memory isn’t 
pressurised by having to buffer entire result sets before starting to transmit 
them over the network.  Even existing REST  [...]
+
+&lt;p&gt;To fully realise the benefits of streaming query result data, clients 
can &lt;em&gt;themselves&lt;/em&gt; operate on the HTTP response they receive 
in a streaming fashion, thereby potentially starting to process records before 
Drill has even finished materialising them.  At the transport level (when there 
are enough results to warrant it), the HTTP response headers from the Drill 
REST API will include a &lt;code class=&quot;language-plaintext 
highlighter-rouge&quot;&gt;Transfer-E [...]
+
+&lt;p&gt;Streaming HTTP client developers should note well that the schema of 
the JSON query result is &lt;em&gt;not&lt;/em&gt; a &lt;a 
href=&quot;https://en.wikipedia.org/wiki/JSON_streaming&quot;&gt;streaming JSON 
format&lt;/a&gt; like “JSON lines”.  This means that you should take care to 
avoid parsing any entire object which is a parent of the &lt;code 
class=&quot;language-plaintext highlighter-rouge&quot;&gt;rows&lt;/code&gt; 
property in the query result, thereby parsing the essentially [...]
+
+&lt;p&gt;In closing, and for a bit of fun, here’s the log from a short IPython 
session where I use sqlalchemy-drill to run &lt;code 
class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT *&lt;/code&gt; 
on a remote 17 billion record table over a 0.5 Mbit/s link and start receiving 
(a steady trickle of) rows in seconds.&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-ipython&quot;&gt;In [4]: r = 
engine.execute('select count(*) from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211888-cc20-fe6f-69d1-6584c5caa2df.
+INFO:drilldbapi:opened a row data stream of 1 columns.
+
+In [5]: next(r)
+Out[5]: (17437571247,)
+
+In [6]: r = engine.execute('select * from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211838-73df-1506-a74e-f5695f6b0ff5.
+INFO:drilldbapi:opened a row data stream of 21 columns.
+
+In [7]: while True:
+   ...:     _ = next(r)
+   ...:
+INFO:drilldbapi:streamed 10000 rows.
+INFO:drilldbapi:streamed 20000 rows.
+INFO:drilldbapi:streamed 30000 rows.
+INFO:drilldbapi:streamed 40000 rows.
+&lt;/code&gt;&lt;/pre&gt;
+
+&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
+  &lt;ol&gt;
+    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
+      &lt;p&gt;JSON representations of tabular big data are typically 
extremely compressible and you can reduce the bytes sent over the network by 
10-20x by enabling HTTP response compression on a web server like Apache placed 
in front of the Drill REST API. &lt;a href=&quot;#fnref:1&quot; 
class=&quot;reversefootnote&quot; 
role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
+    &lt;/li&gt;
+  &lt;/ol&gt;
+&lt;/div&gt;
+</description>
+        <pubDate>Fri, 09 Jul 2021 00:00:00 +0200</pubDate>
+        
<link>http://localhost:4000/blog/2021/07/09/streaming-data-from-the-rest-api/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2021/07/09/streaming-data-from-the-rest-api/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
         <title>Drill 1.19 Released</title>
         <description>&lt;p&gt;Today, we’re happy to announce the availability 
of Drill 1.19.0. You can download it &lt;a 
href=&quot;https://drill.apache.org/download/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 
@@ -40,8 +91,8 @@
 &lt;p&gt;You can find a complete list of improvements and JIRAs resolved in 
the 1.19.0 release &lt;a 
href=&quot;/docs/apache-drill-1-19-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 </description>
         <pubDate>Thu, 10 Jun 2021 00:00:00 +0200</pubDate>
-        <link>/blog/2021/06/10/drill-1.19-released/</link>
-        <guid isPermaLink="true">/blog/2021/06/10/drill-1.19-released/</guid>
+        <link>http://localhost:4000/blog/2021/06/10/drill-1.19-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2021/06/10/drill-1.19-released/</guid>
         
         
         <category>blog</category>
@@ -64,8 +115,8 @@
 &lt;p&gt;You can find a complete list of improvements and JIRAs resolved in 
the 1.18.0 release &lt;a 
href=&quot;/docs/apache-drill-1-18-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 </description>
         <pubDate>Sat, 05 Sep 2020 00:00:00 +0200</pubDate>
-        <link>/blog/2020/09/05/drill-1.18-released/</link>
-        <guid isPermaLink="true">/blog/2020/09/05/drill-1.18-released/</guid>
+        <link>http://localhost:4000/blog/2020/09/05/drill-1.18-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2020/09/05/drill-1.18-released/</guid>
         
         
         <category>blog</category>
@@ -106,8 +157,8 @@
 
 </description>
         <pubDate>Thu, 26 Dec 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/12/26/drill-1.17-released/</link>
-        <guid isPermaLink="true">/blog/2019/12/26/drill-1.17-released/</guid>
+        <link>http://localhost:4000/blog/2019/12/26/drill-1.17-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/12/26/drill-1.17-released/</guid>
         
         
         <category>blog</category>
@@ -119,8 +170,8 @@
         <description>
 </description>
         <pubDate>Thu, 02 May 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/05/02/drill-user-meetup/</link>
-        <guid isPermaLink="true">/blog/2019/05/02/drill-user-meetup/</guid>
+        <link>http://localhost:4000/blog/2019/05/02/drill-user-meetup/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/05/02/drill-user-meetup/</guid>
         
         
         <category>blog</category>
@@ -155,8 +206,8 @@
 
 </description>
         <pubDate>Thu, 02 May 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/05/02/drill-1.16-released/</link>
-        <guid isPermaLink="true">/blog/2019/05/02/drill-1.16-released/</guid>
+        <link>http://localhost:4000/blog/2019/05/02/drill-1.16-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/05/02/drill-1.16-released/</guid>
         
         
         <category>blog</category>
@@ -188,8 +239,8 @@
 
 </description>
         <pubDate>Mon, 31 Dec 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/12/31/drill-1.15-released/</link>
-        <guid isPermaLink="true">/blog/2018/12/31/drill-1.15-released/</guid>
+        <link>http://localhost:4000/blog/2018/12/31/drill-1.15-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/12/31/drill-1.15-released/</guid>
         
         
         <category>blog</category>
@@ -223,8 +274,8 @@
 &lt;/ul&gt;
 </description>
         <pubDate>Sat, 01 Dec 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/12/01/learning-apache-drill-book/</link>
-        <guid 
isPermaLink="true">/blog/2018/12/01/learning-apache-drill-book/</guid>
+        
<link>http://localhost:4000/blog/2018/12/01/learning-apache-drill-book/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/12/01/learning-apache-drill-book/</guid>
         
         
         <category>blog</category>
@@ -236,8 +287,8 @@
         <description>
 </description>
         <pubDate>Tue, 16 Oct 2018 21:18:04 +0200</pubDate>
-        <link>/blog/2018/10/16/drill-user-meetup/</link>
-        <guid isPermaLink="true">/blog/2018/10/16/drill-user-meetup/</guid>
+        <link>http://localhost:4000/blog/2018/10/16/drill-user-meetup/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/10/16/drill-user-meetup/</guid>
         
         
         <category>blog</category>
@@ -249,42 +300,8 @@
         <description>
 </description>
         <pubDate>Tue, 16 Oct 2018 21:18:04 +0200</pubDate>
-        <link>/blog/2018/10/16/drill-developer-day/</link>
-        <guid isPermaLink="true">/blog/2018/10/16/drill-developer-day/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
-        <title>Drill 1.14 Released</title>
-        <description>&lt;p&gt;Today, we’re happy to announce the availability 
of Drill 1.14.0. You can download it &lt;a 
href=&quot;https://drill.apache.org/download/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;The release provides the following bug fixes and 
improvements:&lt;/p&gt;
-
-&lt;h2 id=&quot;run-drill-in-a-docker-container-drill-6346&quot;&gt;Run Drill 
in a Docker Container (DRILL-6346)&lt;/h2&gt;
-&lt;p&gt;Running Drill in a Docker container is the simplest way to start 
using Drill; all you need is the Docker client installed on your machine. You 
simply run a Docker command, and your Docker client downloads the Drill Docker 
image from the apache-drill repository on Docker Hub and then brings up a 
container with Apache Drill running in embedded mode. See &lt;a 
href=&quot;/docs/running-drill-on-docker/&quot;&gt;Running Drill on 
Docker&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;export-and-save-storage-plugin-configurations-drill-4580&quot;&gt;Export
 and Save Storage Plugin Configurations (DRILL-4580)&lt;/h2&gt;
-&lt;p&gt;You can export and save your storage plugin configurations from the 
Storage page in the Drill Web UI. See &lt;a 
href=&quot;/docs/configuring-storage-plugins/#exporting-storage-plugin-configurations&quot;&gt;Exporting
 Storage Plugin Configurations&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;manage-storage-plugin-configurations-in-a-configuration-file-drill-6494&quot;&gt;Manage
 Storage Plugin Configurations in a Configuration File (DRILL-6494)&lt;/h2&gt;
-&lt;p&gt;You can manage storage plugin configurations in the Drill 
configuration file,  storage-plugins-override.conf. When you provide the 
storage plugin configurations in the storage-plugins-override.conf file, Drill 
reads the file and configures the plugins during start-up. See &lt;a 
href=&quot;https://drill.apache.org/docs/configuring-storage-plugins/#configuring-storage-plugins-with-the-storage-plugins-override.conf-file&quot;&gt;Configuring
 Storage Plugins with the storage-plugins- [...]
-
-&lt;h2 
id=&quot;query-metadata-in-various-image-formats-drill-4364&quot;&gt;Query 
Metadata in Various Image Formats (DRILL-4364)&lt;/h2&gt;
-&lt;p&gt;The metadata format plugin is useful for querying a large number of 
image files stored in a distributed file system. You do not have to build a 
metadata repository in advance.&lt;br /&gt;
-See &lt;a href=&quot;/docs/image-metadata-format-plugin/&quot;&gt;Image 
Metadata Format Plugin&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;set-hive-properties-at-the-session-level-drill-6575&quot;&gt;Set Hive 
Properties at the Session Level (DRILL-6575)&lt;/h2&gt;
-&lt;p&gt;The store.hive.conf.properties option enables you to specify Hive 
properties at the session level using the SET command. See &lt;a 
href=&quot;/docs/hive-storage-plugin/#setting-hive-properties&quot;&gt;Setting 
Hive Properties&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;You can find a complete list of JIRAs resolved in the 1.14.0 release 
&lt;a 
href=&quot;/docs/apache-drill-1-14-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
-
-</description>
-        <pubDate>Sun, 05 Aug 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/08/05/drill-1.14-released/</link>
-        <guid isPermaLink="true">/blog/2018/08/05/drill-1.14-released/</guid>
+        <link>http://localhost:4000/blog/2018/10/16/drill-developer-day/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/10/16/drill-developer-day/</guid>
         
         
         <category>blog</category>
diff --git a/index.html b/index.html
index da4e9b0..82bc0b6 100644
--- a/index.html
+++ b/index.html
@@ -203,9 +203,9 @@ $(document).ready(function() {
   <div class="news">News:
   </div>
   
-  <div><a href="/blog/2021/06/10/drill-1.19-released/">Drill 1.19 
Released</a><br/><span>(Laurent Goujon)</span></div>
+  <div><a href="/blog/2021/07/09/streaming-data-from-the-rest-api/">Streaming 
data from Drill REST API</a><br/><span>(James Turton)</span></div>
   
-  <div><a href="/blog/2020/09/05/drill-1.18-released/">Drill 1.18 
Released</a><br/><span>(Abhishek Girish)</span></div>
+  <div><a href="/blog/2021/06/10/drill-1.19-released/">Drill 1.19 
Released</a><br/><span>(Laurent Goujon)</span></div>
 </div>
 <div class="mw introWrapper">
   <table class="intro" cellpadding="0" cellspacing="0" align="center">
diff --git a/zh/README.md b/zh/README.md
index 5d87b02..e230317 100644
--- a/zh/README.md
+++ b/zh/README.md
@@ -144,7 +144,7 @@ The English versions of "site" pages such as index.html are 
stored in the root d
 
 ## Add translated collection pages
 
-The English versions of "collection" pages such as the markdown under _docs/ 
are stored in an en/ subdirectory of the collection root.  Create corresponding 
translated pages in the collection under `lang-code/` in which you translate 
both `title` and `parent` in the front matter but leave the `slug` the same as 
the English page and set `lang` to `lang-code`.
+The English versions of "collection" pages such as the markdown under _docs/ 
are stored in an en/ subdirectory of the collection root.  Create corresponding 
translated pages in the collection under `lang-code/` in which you translate 
both `title` and `parent` in the front matter but leave the `slug` the same as 
the English page and set `lang` to `lang-code`.  Once you've translated the 
`title` of a parent page, you will need to provide files for each of its 
children (which can still cont [...]
 
 # Compiling the Website
 
diff --git a/zh/blog/2021/07/09/streaming-data-from-the-rest-api/index.html 
b/zh/blog/2021/07/09/streaming-data-from-the-rest-api/index.html
new file mode 100644
index 0000000..209378d
--- /dev/null
+++ b/zh/blog/2021/07/09/streaming-data-from-the-rest-api/index.html
@@ -0,0 +1,235 @@
+<!DOCTYPE html>
+<html>
+
+<head>
+
+<meta charset="UTF-8">
+<meta name=viewport content="width=device-width, initial-scale=1">
+
+
+<title>Streaming data from Drill REST API - Apache Drill</title>
+
+<link 
href="https://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css"
 rel="stylesheet" type="text/css"/>
+<link href='https://fonts.googleapis.com/css?family=PT+Sans' rel='stylesheet' 
type='text/css'/>
+<link href="/css/site.css" rel="stylesheet" type="text/css"/>
+
+<link rel="shortcut icon" href="/zh/favicon.ico" type="image/x-icon"/>
+<link rel="icon" href="/zh/favicon.ico" type="image/x-icon"/>
+
+<script 
src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js" 
language="javascript" type="text/javascript"></script>
+<script 
src="https://cdnjs.cloudflare.com/ajax/libs/jquery-easing/1.3/jquery.easing.min.js"
 language="javascript" type="text/javascript"></script>
+<script language="javascript" type="text/javascript" 
src="/js/modernizr.custom.js"></script>
+<script language="javascript" type="text/javascript" 
src="/js/script.js"></script>
+<script language="javascript" type="text/javascript" 
src="/js/drill.js"></script>
+
+</head>
+
+
+<body onResize="resized();">
+  <div class="page-wrap">
+    <div class="bui"></div>
+
+<div id="menu" class="mw">
+<ul>
+  <li class='toc-categories'>
+  <a class="expand-toc-icon" href="javascript:void(0);"><i class="fa 
fa-bars"></i></a>
+  </li>
+  <li class="logo"><a href="/zh/"></a></li>
+  <li class='expand-menu'>
+  <a href="javascript:void(0);"><span class='menu-text'>Menu</span><span 
class='expand-icon'><i class="fa fa-bars"></i></span></a>
+  </li>
+  <li class="clear-float"></li>
+  <li class="nav">
+       <a>Language</a>
+       <ul>
+               
+               <li>
+                       <a  
href="/blog/2021/07/09/streaming-data-from-the-rest-api/" >en</a>
+               </li>
+               
+               <li>
+                       <a style="font-weight: bold;" 
href="/zh/blog/2021/07/09/streaming-data-from-the-rest-api/" >zh</a>
+               </li>
+               
+       </ul>
+  </li>
+  <li class="apache-link">
+    <a href="/zh/apacheASF/">Apache</a>
+  </li>
+  <li class="poweredby">
+    <a href="/zh/poweredBy">Powered By</a>
+  </li>
+  <li class="documentation-menu">
+    <a href="/zh/docs/">Documentation</a>
+    <ul>
+      
+        <li><a href="/zh/docs/getting-started/">新手开始</a></li>
+      
+        <li><a href="/zh/docs/architecture/">Architecture</a></li>
+      
+        <li><a href="/zh/docs/tutorials/">Tutorials</a></li>
+      
+        <li><a href="/zh/docs/drill-on-yarn/">Drill-on-YARN</a></li>
+      
+        <li><a href="/zh/docs/install-drill/">Install Drill</a></li>
+      
+        <li><a href="/zh/docs/configure-drill/">Configure Drill</a></li>
+      
+        <li><a href="/zh/docs/connect-a-data-source/">Connect a Data 
Source</a></li>
+      
+        <li><a href="/zh/docs/odbc-jdbc-interfaces/">ODBC/JDBC 
Interfaces</a></li>
+      
+        <li><a href="/zh/docs/query-data/">Query Data</a></li>
+      
+        <li><a href="/zh/docs/performance-tuning/">Performance Tuning</a></li>
+      
+        <li><a href="/zh/docs/log-and-debug/">Log and Debug</a></li>
+      
+        <li><a href="/zh/docs/sql-reference/">SQL Reference</a></li>
+      
+        <li><a href="/zh/docs/data-sources-and-file-formats/">Data Sources and 
File Formats</a></li>
+      
+        <li><a href="/zh/docs/develop-custom-functions/">Develop Custom 
Functions</a></li>
+      
+        <li><a href="/zh/docs/troubleshooting/">Troubleshooting</a></li>
+      
+        <li><a href="/zh/docs/developer-information/">Developer 
Information</a></li>
+      
+        <li><a href="/zh/docs/release-notes/">Release Notes</a></li>
+      
+        <li><a href="/zh/docs/sample-datasets/">Sample Datasets</a></li>
+      
+        <li><a href="/zh/docs/project-bylaws/">Project Bylaws</a></li>
+      
+        <li><a href="/zh/docs/ecosystem/">Ecosystem</a></li>
+      
+    </ul>
+  </li>
+  <li class='nav'>
+    <a href="/zh/community-resources/">Community</a>
+    <ul>
+      <li><a href="/zh/team/">Team</a></li>
+      <li><a href="/zh/mailinglists/">Mailing Lists</a></li>
+      <li><a href="/zh/community-resources/">Community Resources</a></li>
+    </ul>
+  </li>
+  <li class='nav'><a href="/zh/faq/">FAQ</a></li>
+  <li class='nav'><a href="/zh/blog/">Blog</a></li>
+  <li class="social-menu-item"><a href="https://twitter.com/apachedrill" 
title="apachedrill on twitter" target="_blank"><img 
src="/images/twitter_32_26_white.png" alt="twitter logo" align="center"></a> 
</li>
+  <li class="social-menu-item"><a 
href="https://join.slack.com/t/apache-drill/shared_invite/enQtNTQ4MjM1MDA3MzQ2LTJlYmUxMTRkMmUwYmQ2NTllYmFmMjU4MDk0NjYwZjBmYjg0MDZmOTE2ZDg0ZjBlYmI3Yjc4Y2I2NTQyNGVlZTc"
 title="Apache Drill Slack channels"
+      target="_blank"><img src="/images/slack-logo.svg" alt="Slack logo" 
align="center"></a> </li>
+  <li class='search-bar'>
+    <form id="drill-search-form">
+      <input type="text" placeholder="Search Apache Drill" 
id="drill-search-term" />
+      <button type="submit">
+        <i class="fa fa-search"></i>
+      </button>
+    </form>
+  </li>
+  <li class="d">
+    <a href="/zh/download/">
+      <i class="fa fa-cloud-download"></i> Download
+    </a>
+  </li>
+</ul>
+</div>
+
+    <link href="/css/content.css" rel="stylesheet" type="text/css">
+
+<div class="post int_text">
+  <header class="post-header">
+    <div class="int_title">
+      <h1 class="post-title">Streaming data from Drill REST API</h1>
+    </div>
+    <p class="post-meta">
+    
+      
+      
+      <strong>Author:</strong> James Turton (Committer, Apache Software 
Foundation)<br />
+    
+<strong>Date:</strong> Jul 9, 2021
+</p>
+  </header>
+  <div class="addthis_sharing_toolbox"></div>
+
+  <article class="post-content">
+    <p>Anyone who’s used a UNIX pipe, or even just watched something on 
Netflix, is at least a little familiar with the idea of processing data in a 
streaming fashion.  While your data are in small volume compared to available 
memory and I/O speeds, streaming is something you can afford to dispense with.  
But when you cannot fit an entire dataset in RAM, or when you would have to 
download an entire 4K movie before you can start playing it, then streaming 
data processing can make a game chan [...]
+
+<p>With the release of version 1.19, Drill will stream JSON query result data 
over an HTTP response to the client that initiated the query using the REST 
API.  And if anything can easily get big compared to your available RAM or 
network speed, it’s query results coming back from Drill.  It’s important to 
note here that JSON over HTTP is never going to be the most <em>efficient</em> 
way to move big data around<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" 
class="footnote">1</a></sup> [...]
+
+<p>Where JSON and HTTP <em>do</em> win is universality: today it’s hard to 
imagine a client hardware and software stack that doesn’t provide JSON and HTTP 
out of the box with minimal effort.  So it’s important that they work as well 
as is possible, in spite of the alternatives that exist.  The new streaming 
query results delivery on the server side means that Drill’s heap memory isn’t 
pressurised by having to buffer entire result sets before it starts to transmit 
them over the network.   [...]
+
+<p>To fully realise the benefits of streaming query result data, clients can 
<em>themselves</em> operate on the HTTP response they receive in a streaming 
fashion, thereby potentially starting to process records before Drill has even 
finished materialising them and avoiding holding the full set in memory if they 
choose.  At the transport level (when there are enough results to warrant it), 
the HTTP response headers from the Drill REST API will include a <code 
class="language-plaintext hig [...]
+
+<p>If you set out to develop a streaming HTTP client for the Drill REST API, 
do take note that the schema of the JSON query result is <em>not</em> a <a 
href="https://en.wikipedia.org/wiki/JSON_streaming">streaming JSON format</a> 
like “JSON lines”.  This means that you must be careful about which JSON 
objects you parse entirely in a single call, particularly avoiding any parent 
of the <code class="language-plaintext highlighter-rouge">rows</code> property 
in the query result which would  [...]
+
+<p>In closing, and for a bit of fun, here’s the log from a short IPython 
session where I use sqlalchemy-drill to run <code class="language-plaintext 
highlighter-rouge">SELECT *</code> on a remote 17 billion record table over a 
0.5 Mbit/s link and start scanning through (a steady trickle of) rows in 
seconds.</p>
+
+<pre><code class="language-ipython">In [4]: r = engine.execute('select 
count(*) from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211888-cc20-fe6f-69d1-6584c5caa2df.
+INFO:drilldbapi:opened a row data stream of 1 columns.
+
+In [5]: next(r)
+Out[5]: (17437571247,)
+
+In [6]: r = engine.execute('select * from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211838-73df-1506-a74e-f5695f6b0ff5.
+INFO:drilldbapi:opened a row data stream of 21 columns.
+
+In [7]: while True:
+   ...:     _ = next(r)
+      ...:
+      INFO:drilldbapi:streamed 10000 rows.
+      INFO:drilldbapi:streamed 20000 rows.
+      INFO:drilldbapi:streamed 30000 rows.
+      INFO:drilldbapi:streamed 40000 rows.
+</code></pre>
+
+<div class="footnotes" role="doc-endnotes">
+  <ol>
+    <li id="fn:1" role="doc-endnote">
+      <p>Some mitigation is possible.  JSON representations of tabular big 
data are typically extremely compressible and you can reduce the bytes sent 
over the network by 10-20x by enabling HTTP response compression on a web 
server like Apache placed in front of the Drill REST API. <a href="#fnref:1" 
class="reversefootnote" role="doc-backlink">&#8617;</a></p>
+    </li>
+  </ol>
+</div>
+
+  </article>
+ <div id="disqus_thread"></div>
+    <script type="text/javascript">
+        /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE 
* * */
+        var disqus_shortname = 'drill'; // required: replace example with your 
forum shortname
+
+        /* * * DON'T EDIT BELOW THIS LINE * * */
+        (function() {
+            var dsq = document.createElement('script'); dsq.type = 
'text/javascript'; dsq.async = true;
+            dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
+            (document.getElementsByTagName('head')[0] || 
document.getElementsByTagName('body')[0]).appendChild(dsq);
+        })();
+    </script>
+    <noscript>Please enable JavaScript to view the <a 
href="http://disqus.com/?ref_noscript">comments powered by 
Disqus.</a></noscript>
+    
+</div>
+<script type="text/javascript" 
src="https://s7.addthis.com/js/300/addthis_widget.js#pubid=ra-548b2caa33765e8d" 
async="async"></script>
+
+  </div>
+  <p class="push"></p>
+<div id="footer" class="mw">
+<div class="wrapper">
+Copyright © 2012-2020 The Apache Software Foundation, licensed under the 
Apache License, Version 2.0.<br>
+Apache and the Apache feather logo are trademarks of The Apache Software 
Foundation. Other names appearing on the site may be trademarks of their 
respective owners.<br/><br/>
+</div>
+</div>
+
+  <script>
+(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
+
+ga('create', 'UA-53379651-1', 'auto');
+ga('send', 'pageview');
+</script>
+<script type="text/javascript" 
src="https://s7.addthis.com/js/300/addthis_widget.js#pubid=ra-548b2caa33765e8d" 
async="async"></script>
+
+</body>
+</html>
diff --git a/zh/blog/index.html b/zh/blog/index.html
index cd8ab0a..ff02bf0 100644
--- a/zh/blog/index.html
+++ b/zh/blog/index.html
@@ -140,6 +140,11 @@
 </div>
 
 <div class="int_text" align="left"><!-- previously: site.posts -->
+<p><a class="post-link" 
href="/zh/blog/2021/07/09/streaming-data-from-the-rest-api/">Streaming data 
from Drill REST API</a><br/>
+<span class="post-date">Posted on Jul 9, 2021
+by James Turton</span>
+<br/>The release of Apache Drill 1.19 saw a major change under the hood of 
Drill's REST API with the introduction of a streaming data path for query 
results moving from Drill and over the network to the initiating client.  The 
result is better memory utilisation, less blocking and a more reliable API.</p>
+<!-- previously: site.posts -->
 <p><a class="post-link" href="/zh/blog/2021/06/10/drill-1.19-released/">Drill 
1.19 Released</a><br/>
 <span class="post-date">Posted on Jun 10, 2021
 by Laurent Goujon</span>
diff --git a/zh/feed.xml b/zh/feed.xml
index 645d54e..90de618 100644
--- a/zh/feed.xml
+++ b/zh/feed.xml
@@ -4,13 +4,64 @@
     <title>Apache Drill - Schema-free SQL for Hadoop, NoSQL and Cloud 
Storage</title>
     <description>The official user documentation for Apache Drill.
 </description>
-    <link>/</link>
-    <atom:link href="/zh/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Thu, 08 Jul 2021 14:57:25 +0200</pubDate>
-    <lastBuildDate>Thu, 08 Jul 2021 14:57:25 +0200</lastBuildDate>
+    <link>http://localhost:4000/</link>
+    <atom:link href="http://localhost:4000/zh/feed.xml" rel="self" 
type="application/rss+xml"/>
+    <pubDate>Fri, 09 Jul 2021 15:37:50 +0200</pubDate>
+    <lastBuildDate>Fri, 09 Jul 2021 15:37:50 +0200</lastBuildDate>
     <generator>Jekyll v3.9.0</generator>
     
       <item>
+        <title>Streaming data from Drill REST API</title>
+        <description>&lt;p&gt;Anyone who’s used a UNIX pipe, or even just 
watched something on Netflix, is at least a little familiar with the idea of 
processing data in a streaming fashion.  While your data are in small volume 
compared to available memory and I/O speeds, streaming is something you can 
afford to dispense with.  But if you cannot fit an entire dataset in RAM, or if 
you have to download an entire 4K movie before you can start playing it, then 
streaming can make a game chan [...]
+
+&lt;p&gt;With the release of version 1.19, Drill will stream JSON query result 
data over an HTTP response to the client that initiated the query using the 
REST API.  And if anything can easily get big compared to your available RAM or 
network speeds, it’s query results coming out of Drill.  It’s important to note 
here that JSON over HTTP is never going to be the most efficient way to move 
big data around&lt;sup id=&quot;fnref:1&quot; 
role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&q [...]
+
+&lt;p&gt;Where JSON and HTTP &lt;em&gt;do&lt;/em&gt; win is universality: 
today it’s hard to imagine a client hardware and software stack that doesn’t 
provide JSON and HTTP out of the box with minimal effort.  So it’s important 
that they work as well as is possible.  The new streaming query results 
materialisation on the server side means that Drill’s heap memory isn’t 
pressurised by having to buffer entire result sets before starting to transmit 
them over the network.  Even existing REST  [...]
+
+&lt;p&gt;To fully realise the benefits of streaming query result data, clients 
can &lt;em&gt;themselves&lt;/em&gt; operate on the HTTP response they receive 
in a streaming fashion, thereby potentially starting to process records before 
Drill has even finished materialising them.  At the transport level (when there 
are enough results to warrant it), the HTTP response headers from the Drill 
REST API will include a &lt;code class=&quot;language-plaintext 
highlighter-rouge&quot;&gt;Transfer-E [...]
+
+&lt;p&gt;Streaming HTTP client developers should note well that the schema of 
the JSON query result is &lt;em&gt;not&lt;/em&gt; a &lt;a 
href=&quot;https://en.wikipedia.org/wiki/JSON_streaming&quot;&gt;streaming JSON 
format&lt;/a&gt; like “JSON lines”.  This means that you should take care to 
avoid parsing any entire object which is a parent of the &lt;code 
class=&quot;language-plaintext highlighter-rouge&quot;&gt;rows&lt;/code&gt; 
property in the query result, thereby parsing the essentially [...]
+
+&lt;p&gt;In closing, and for a bit of fun, here’s the log from a short IPython 
session where I use sqlalchemy-drill to run &lt;code 
class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT *&lt;/code&gt; 
on a remote 17 billion record table over a 0.5 Mbit/s link and start receiving 
(a steady trickle of) rows in seconds.&lt;/p&gt;
+
+&lt;pre&gt;&lt;code class=&quot;language-ipython&quot;&gt;In [4]: r = 
engine.execute('select count(*) from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211888-cc20-fe6f-69d1-6584c5caa2df.
+INFO:drilldbapi:opened a row data stream of 1 columns.
+
+In [5]: next(r)
+Out[5]: (17437571247,)
+
+In [6]: r = engine.execute('select * from dfs.ws.big_table')
+INFO:drilldbapi:received Drill query ID 1f211838-73df-1506-a74e-f5695f6b0ff5.
+INFO:drilldbapi:opened a row data stream of 21 columns.
+
+In [7]: while True:
+   ...:     _ = next(r)
+      ...:
+      INFO:drilldbapi:streamed 10000 rows.
+      INFO:drilldbapi:streamed 20000 rows.
+      INFO:drilldbapi:streamed 30000 rows.
+      INFO:drilldbapi:streamed 40000 rows.
+&lt;/code&gt;&lt;/pre&gt;
+
+&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
+  &lt;ol&gt;
+    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
+      &lt;p&gt;JSON representations of tabular big data are typically 
extremely compressible and you can reduce the bytes sent over the network by 
10-20x by enabling HTTP response compression on a web server like Apache placed 
in front of the Drill REST API. &lt;a href=&quot;#fnref:1&quot; 
class=&quot;reversefootnote&quot; 
role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
+    &lt;/li&gt;
+  &lt;/ol&gt;
+&lt;/div&gt;
+</description>
+        <pubDate>Fri, 09 Jul 2021 00:00:00 +0200</pubDate>
+        
<link>http://localhost:4000/blog/2021/07/09/streaming-data-from-the-rest-api/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2021/07/09/streaming-data-from-the-rest-api/</guid>
+        
+        
+        <category>blog</category>
+        
+      </item>
+    
+      <item>
         <title>Drill 1.19 Released</title>
         <description>&lt;p&gt;今天, we’re happy to announce the availability of 
Drill 1.19.0. You can download it &lt;a 
href=&quot;https://drill.apache.org/download/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 
@@ -40,8 +91,8 @@
 &lt;p&gt;You can find a complete list of improvements and JIRAs resolved in 
the 1.19.0 release &lt;a 
href=&quot;/docs/apache-drill-1-19-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 </description>
         <pubDate>Thu, 10 Jun 2021 00:00:00 +0200</pubDate>
-        <link>/blog/2021/06/10/drill-1.19-released/</link>
-        <guid isPermaLink="true">/blog/2021/06/10/drill-1.19-released/</guid>
+        <link>http://localhost:4000/blog/2021/06/10/drill-1.19-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2021/06/10/drill-1.19-released/</guid>
         
         
         <category>blog</category>
@@ -64,8 +115,8 @@
 &lt;p&gt;You can find a complete list of improvements and JIRAs resolved in 
the 1.18.0 release &lt;a 
href=&quot;/docs/apache-drill-1-18-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
 </description>
         <pubDate>Sat, 05 Sep 2020 00:00:00 +0200</pubDate>
-        <link>/blog/2020/09/05/drill-1.18-released/</link>
-        <guid isPermaLink="true">/blog/2020/09/05/drill-1.18-released/</guid>
+        <link>http://localhost:4000/blog/2020/09/05/drill-1.18-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2020/09/05/drill-1.18-released/</guid>
         
         
         <category>blog</category>
@@ -106,8 +157,8 @@
 
 </description>
         <pubDate>Thu, 26 Dec 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/12/26/drill-1.17-released/</link>
-        <guid isPermaLink="true">/blog/2019/12/26/drill-1.17-released/</guid>
+        <link>http://localhost:4000/blog/2019/12/26/drill-1.17-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/12/26/drill-1.17-released/</guid>
         
         
         <category>blog</category>
@@ -119,8 +170,8 @@
         <description>
 </description>
         <pubDate>Thu, 02 May 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/05/02/drill-user-meetup/</link>
-        <guid isPermaLink="true">/blog/2019/05/02/drill-user-meetup/</guid>
+        <link>http://localhost:4000/blog/2019/05/02/drill-user-meetup/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/05/02/drill-user-meetup/</guid>
         
         
         <category>blog</category>
@@ -155,8 +206,8 @@
 
 </description>
         <pubDate>Thu, 02 May 2019 00:00:00 +0200</pubDate>
-        <link>/blog/2019/05/02/drill-1.16-released/</link>
-        <guid isPermaLink="true">/blog/2019/05/02/drill-1.16-released/</guid>
+        <link>http://localhost:4000/blog/2019/05/02/drill-1.16-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2019/05/02/drill-1.16-released/</guid>
         
         
         <category>blog</category>
@@ -188,8 +239,8 @@
 
 </description>
         <pubDate>Mon, 31 Dec 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/12/31/drill-1.15-released/</link>
-        <guid isPermaLink="true">/blog/2018/12/31/drill-1.15-released/</guid>
+        <link>http://localhost:4000/blog/2018/12/31/drill-1.15-released/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/12/31/drill-1.15-released/</guid>
         
         
         <category>blog</category>
@@ -223,8 +274,8 @@
 &lt;/ul&gt;
 </description>
         <pubDate>Sat, 01 Dec 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/12/01/learning-apache-drill-book/</link>
-        <guid 
isPermaLink="true">/blog/2018/12/01/learning-apache-drill-book/</guid>
+        
<link>http://localhost:4000/blog/2018/12/01/learning-apache-drill-book/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/12/01/learning-apache-drill-book/</guid>
         
         
         <category>blog</category>
@@ -236,8 +287,8 @@
         <description>
 </description>
         <pubDate>Tue, 16 Oct 2018 21:18:04 +0200</pubDate>
-        <link>/blog/2018/10/16/drill-user-meetup/</link>
-        <guid isPermaLink="true">/blog/2018/10/16/drill-user-meetup/</guid>
+        <link>http://localhost:4000/blog/2018/10/16/drill-user-meetup/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/10/16/drill-user-meetup/</guid>
         
         
         <category>blog</category>
@@ -249,42 +300,8 @@
         <description>
 </description>
         <pubDate>Tue, 16 Oct 2018 21:18:04 +0200</pubDate>
-        <link>/blog/2018/10/16/drill-developer-day/</link>
-        <guid isPermaLink="true">/blog/2018/10/16/drill-developer-day/</guid>
-        
-        
-        <category>blog</category>
-        
-      </item>
-    
-      <item>
-        <title>Drill 1.14 Released</title>
-        <description>&lt;p&gt;Today, we’re happy to announce the availability 
of Drill 1.14.0. You can download it &lt;a 
href=&quot;https://drill.apache.org/download/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;The release provides the following bug fixes and 
improvements:&lt;/p&gt;
-
-&lt;h2 id=&quot;run-drill-in-a-docker-container-drill-6346&quot;&gt;Run Drill 
in a Docker Container (DRILL-6346)&lt;/h2&gt;
-&lt;p&gt;Running Drill in a Docker container is the simplest way to start 
using Drill; all you need is the Docker client installed on your machine. You 
simply run a Docker command, and your Docker client downloads the Drill Docker 
image from the apache-drill repository on Docker Hub and then brings up a 
container with Apache Drill running in embedded mode. See &lt;a 
href=&quot;/docs/running-drill-on-docker/&quot;&gt;Running Drill on 
Docker&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;export-and-save-storage-plugin-configurations-drill-4580&quot;&gt;Export
 and Save Storage Plugin Configurations (DRILL-4580)&lt;/h2&gt;
-&lt;p&gt;You can export and save your storage plugin configurations from the 
Storage page in the Drill Web UI. See &lt;a 
href=&quot;/docs/configuring-storage-plugins/#exporting-storage-plugin-configurations&quot;&gt;Exporting
 Storage Plugin Configurations&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;manage-storage-plugin-configurations-in-a-configuration-file-drill-6494&quot;&gt;Manage
 Storage Plugin Configurations in a Configuration File (DRILL-6494)&lt;/h2&gt;
-&lt;p&gt;You can manage storage plugin configurations in the Drill 
configuration file,  storage-plugins-override.conf. When you provide the 
storage plugin configurations in the storage-plugins-override.conf file, Drill 
reads the file and configures the plugins during start-up. See &lt;a 
href=&quot;https://drill.apache.org/docs/configuring-storage-plugins/#configuring-storage-plugins-with-the-storage-plugins-override.conf-file&quot;&gt;Configuring
 Storage Plugins with the storage-plugins- [...]
-
-&lt;h2 
id=&quot;query-metadata-in-various-image-formats-drill-4364&quot;&gt;Query 
Metadata in Various Image Formats (DRILL-4364)&lt;/h2&gt;
-&lt;p&gt;The metadata format plugin is useful for querying a large number of 
image files stored in a distributed file system. You do not have to build a 
metadata repository in advance.&lt;br /&gt;
-See &lt;a href=&quot;/docs/image-metadata-format-plugin/&quot;&gt;Image 
Metadata Format Plugin&lt;/a&gt;.&lt;/p&gt;
-
-&lt;h2 
id=&quot;set-hive-properties-at-the-session-level-drill-6575&quot;&gt;Set Hive 
Properties at the Session Level (DRILL-6575)&lt;/h2&gt;
-&lt;p&gt;The store.hive.conf.properties option enables you to specify Hive 
properties at the session level using the SET command. See &lt;a 
href=&quot;/docs/hive-storage-plugin/#setting-hive-properties&quot;&gt;Setting 
Hive Properties&lt;/a&gt;.&lt;/p&gt;
-
-&lt;p&gt;You can find a complete list of JIRAs resolved in the 1.14.0 release 
&lt;a 
href=&quot;/docs/apache-drill-1-14-0-release-notes/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
-
-</description>
-        <pubDate>Sun, 05 Aug 2018 00:00:00 +0200</pubDate>
-        <link>/blog/2018/08/05/drill-1.14-released/</link>
-        <guid isPermaLink="true">/blog/2018/08/05/drill-1.14-released/</guid>
+        <link>http://localhost:4000/blog/2018/10/16/drill-developer-day/</link>
+        <guid 
isPermaLink="true">http://localhost:4000/blog/2018/10/16/drill-developer-day/</guid>
         
         
         <category>blog</category>
diff --git a/zh/index.html b/zh/index.html
index d23a962..51bc664 100644
--- a/zh/index.html
+++ b/zh/index.html
@@ -202,9 +202,9 @@
   <div class="news">News:
   </div>
   
-  <div><a href="/zh/blog/2021/06/10/drill-1.19-released/">Drill 1.19 
Released</a><br/><span>(Laurent Goujon)</span></div>
+  <div><a 
href="/zh/blog/2021/07/09/streaming-data-from-the-rest-api/">Streaming data 
from Drill REST API</a><br/><span>(James Turton)</span></div>
   
-  <div><a href="/zh/blog/2020/09/05/drill-1.18-released/">Drill 1.18 
Released</a><br/><span>(Abhishek Girish)</span></div>
+  <div><a href="/zh/blog/2021/06/10/drill-1.19-released/">Drill 1.19 
Released</a><br/><span>(Laurent Goujon)</span></div>
 </div>
 <div class="mw introWrapper">
   <table class="intro" cellpadding="0" cellspacing="0" align="center">
