This is an automated email from the ASF dual-hosted git repository.

planka pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 7e0e2aec7 Updated site to include documentation for Read vs Seek Optimization
7e0e2aec7 is described below

commit 7e0e2aec73e5695ccf9b73243fdb23b7adfb40d2
Author: Pavan Lanka <[email protected]>
AuthorDate: Tue Jun 14 16:07:53 2022 -0700

    Updated site to include documentation for Read vs Seek Optimization
---
 develop/design/index.html    |   1 +
 develop/design/io/index.html | 537 +++++++++++++++++++++++++++++++++++++++++++
 img/seekvsread.png           | Bin 0 -> 32924 bytes
 3 files changed, 538 insertions(+)

diff --git a/develop/design/index.html b/develop/design/index.html
index b2f5ff612..e019dfda9 100644
--- a/develop/design/index.html
+++ b/develop/design/index.html
@@ -86,6 +86,7 @@
         <h1>Design</h1>
         <ul>
   <li><a href="lazy_filter">Lazy Filters</a></li>
+  <li><a href="io">IO</a></li>
 </ul>
 
       </article>
diff --git a/develop/design/io/index.html b/develop/design/io/index.html
new file mode 100644
index 000000000..01b3eba03
--- /dev/null
+++ b/develop/design/io/index.html
@@ -0,0 +1,537 @@
+<!DOCTYPE HTML>
+<html lang="en-US">
+<head>
+  <meta charset="UTF-8">
+  <title>IO</title>
+  <meta name="viewport" content="width=device-width,initial-scale=1">
+  <meta name="generator" content="Jekyll v3.8.6">
+  <link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato:300,300italic,400,400italic,700,700italic,900">
+  <link rel="stylesheet" href="/css/screen.css">
+  <link rel="icon" type="image/x-icon" href="/favicon.ico">
+  <!--[if lt IE 9]>
+  <script src="/js/html5shiv.min.js"></script>
+  <script src="/js/respond.min.js"></script>
+  <![endif]-->
+</head>
+
+
+<body class="wrap">
+  <header role="banner">
+  <nav class="mobile-nav show-on-mobiles">
+    <ul>
+  <li class="">
+    <a href="/">Home</a>
+  </li>
+  <li class="">
+    <a href="/docs/"><span class="show-on-mobiles">Docs</span>
+                     <span class="hide-on-mobiles">Documentation</span></a>
+  </li>
+  <li class="">
+    <a href="/talks/">Talks</a>
+  </li>
+  <li class="">
+    <a href="/news/">News</a>
+  </li>
+  <li class="">
+    <a href="/help/">Help</a>
+  </li>
+  <li class="current">
+    <a href="/develop/">Develop</a>
+  </li>
+</ul>
+
+  </nav>
+  <div class="grid">
+    <div class="unit one-third center-on-mobiles">
+      <h1>
+        <a href="/">
+          <span class="sr-only">Apache ORC</span>
+          <img src="/img/logo.png" width="249" height="101" alt="ORC Logo">
+        </a>
+      </h1>
+    </div>
+    <nav class="main-nav unit two-thirds hide-on-mobiles">
+      <ul>
+  <li class="">
+    <a href="/">Home</a>
+  </li>
+  <li class="">
+    <a href="/docs/"><span class="show-on-mobiles">Docs</span>
+                     <span class="hide-on-mobiles">Documentation</span></a>
+  </li>
+  <li class="">
+    <a href="/talks/">Talks</a>
+  </li>
+  <li class="">
+    <a href="/news/">News</a>
+  </li>
+  <li class="">
+    <a href="/help/">Help</a>
+  </li>
+  <li class="current">
+    <a href="/develop/">Develop</a>
+  </li>
+</ul>
+
+    </nav>
+  </div>
+</header>
+
+
+  <section class="standalone">
+  <div class="grid">
+
+    <div class="unit whole">
+      <article>
+        <h1>IO</h1>
+        <ul>
+  <li><a href="#Background">Background</a>
+    <ul>
+      <li><a href="#SeekvsRead">Seek vs Read</a></li>
+      <li><a href="#ORCRead">ORC Read</a></li>
+    </ul>
+  </li>
+  <li><a href="#ReadOptimization">Read Optimization</a>
+    <ul>
+      <li><a href="#Approach">Approach</a></li>
+      <li><a href="#Scope">Scope</a></li>
+      <li><a href="#Benchmarks">Benchmarks</a>
+        <ul>
+          <li><a href="#LocalFS">Local FS</a></li>
+          <li><a href="#AWSS3">AWS S3</a></li>
+        </ul>
+      </li>
+      <li><a href="#Summary">Summary</a></li>
+    </ul>
+  </li>
+</ul>
+
+<h2 id="background-">Background <a id="Background"></a></h2>
+
+<p>We are moving our workloads from HDFS to AWS S3. As part of this activity we wanted to understand the performance
+characteristics and costs of using S3.</p>
+
+<h3 id="seek-vs-read-">Seek vs Read <a id="SeekvsRead"></a></h3>
+
+<p>One particular scenario that stood out in our performance testing was Seek vs Read when dealing with S3.</p>
+
+<p>In this test we read through a file as follows:</p>
+
+<ul>
+  <li>Seek to Point A in the file and read X bytes</li>
+  <li>Move to Point B in the file, where B = A + X + Y
+    <ul>
+      <li>This is accomplished either with another seek or by reading through the gap</li>
+      <li>We vary Y to determine which approach is faster at each gap size</li>
+    </ul>
+  </li>
+  <li>Read X bytes</li>
+</ul>
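+
+<p>The two ways of moving from A to B can be sketched as follows. This is an illustrative Python sketch against an in-memory file, not the benchmark code used for the measurements; the function names and the sample values of A, X and Y are hypothetical.</p>
+

```python
import io


def read_with_seek(f, a, x, y):
    """Read X bytes at A, seek past the Y-byte gap, then read X more bytes."""
    f.seek(a)
    first = f.read(x)
    f.seek(y, io.SEEK_CUR)  # skip the gap without transferring it
    second = f.read(x)
    return first, second


def read_through(f, a, x, y):
    """Read X bytes at A, read and discard the Y-byte gap, then read X more bytes."""
    f.seek(a)
    first = f.read(x)
    f.read(y)  # transfer the gap and throw it away
    second = f.read(x)
    return first, second


# Hypothetical sample values: A = 100, X = 16, Y = 1000, so B = 1116.
data = bytes(range(256)) * 64
a, x, y = 100, 16, 1000
seeked = read_with_seek(io.BytesIO(data), a, x, y)
streamed = read_through(io.BytesIO(data), a, x, y)
```

+
+<p>Both functions return the same bytes; they differ only in how the gap Y is crossed, which is the variable measured in this test.</p>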
+
+<p><img src="/img/seekvsread.png" alt="Seek vs Read" /></p>
+
+<p>Observations:</p>
+
+<ul>
+  <li>A read is clearly faster than a seek when the step/gap is smaller than 4 MB.
+    <ul>
+      <li>At 4 MB read is faster by ~ 11%</li>
+      <li>At 1 MB read is faster by ~ 20%</li>
+    </ul>
+  </li>
+  <li>Reads are also cheaper, as we perform a single GET instead of multiple GETs. Prices from <a href="https://aws.amazon.com/s3/pricing/">AWS S3 Pricing</a>:
+    <ul>
+      <li>Cost for GET: $0.0004</li>
+      <li>Cost for Data Retrieval to the same region AWS EKS: $0.0000</li>
+    </ul>
+  </li>
+</ul>
+
+<h3 id="orc-read-">ORC Read <a id="ORCRead"></a></h3>
+
+<p>Given the performance penalty of multiple seeks over small gaps observed above, we measured the performance of
+an ORC read on a file.</p>
+
+<p>File details:</p>
+
+<ul>
+  <li>Size ~ 21 MB</li>
+  <li>Column Count: ~ 400</li>
+  <li>Row Count: ~ 65K</li>
+</ul>
+
+<table>
+  <thead>
+    <tr>
+      <th style="text-align: left">Read Type</th>
+      <th style="text-align: right">Duration</th>
+      <th style="text-align: left">Unit</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: left">All Columns</td>
+      <td style="text-align: right">1.075</td>
+      <td style="text-align: left">s</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">Alternate Columns</td>
+      <td style="text-align: right">6.489</td>
+      <td style="text-align: left">s</td>
+    </tr>
+  </tbody>
+</table>
+
+<p>Observations:</p>
+
+<ul>
+  <li>We pay a significant penalty when reading alternate columns (6.489 s versus 1.075 s, roughly 6x slower), which in the current
+implementation of ORC translates to multiple GET calls on AWS S3</li>
+  <li>While the penalty will be less significant in large reads, it still incurs overhead in both time and
+cost</li>
+</ul>
+
+<h2 id="read-optimization-">Read Optimization <a id="ReadOptimization"></a></h2>
+
+<h3 id="approach-">Approach <a id="Approach"></a></h3>
+
+<p>The following optimizations are planned:</p>
+
+<ul>
+  <li><strong>orc.min.disk.seek.size</strong> is a value in bytes: when planning reads, if the gap between two reads
+is smaller than this value, they are combined into a single read.</li>
+  <li><strong>orc.min.disk.seek.size.tolerance</strong> is a fractional input: if the extra bytes read exceed this fraction of
+the required bytes, then we drop the extra bytes from memory.</li>
+  <li>We can further consider adding an optimization for the complete stripe in case the stripe size is smaller than
+<code class="highlighter-rouge">orc.min.disk.seek.size</code></li>
+</ul>
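+
+<p>How the two settings interact can be sketched as a small range-coalescing routine. This is only an illustrative Python sketch, not the ORC implementation: the name <code class="highlighter-rouge">plan_reads</code>, the tuple layout, and the assumption that the requested ranges do not overlap are all hypothetical.</p>
+

```python
def plan_reads(ranges, min_seek_size, tolerance):
    """Coalesce (offset, length) ranges into larger reads.

    Assumes the ranges do not overlap. Returns a list of
    (offset, total_length, required_bytes, drop_extra) tuples.
    """
    merged = []  # each entry: [offset, total_length, required_bytes]
    for offset, length in sorted(ranges):
        if merged:
            last = merged[-1]
            gap = offset - (last[0] + last[1])
            # A gap smaller than min_seek_size is read through instead of seeking.
            if gap < min_seek_size:
                last[1] = offset + length - last[0]
                last[2] += length
                continue
        merged.append([offset, length, length])

    plans = []
    for offset, total, required in merged:
        extra = total - required
        # Extra bytes are dropped only when they exceed tolerance * required.
        plans.append((offset, total, required, extra > tolerance * required))
    return plans


# Hypothetical ranges: two close together, one far away.
ranges = [(0, 100), (150, 100), (10000, 100)]
strict = plan_reads(ranges, 1024, 0.0)
lenient = plan_reads(ranges, 1024, 10.0)
disabled = plan_reads(ranges, 0, 0.0)
```

+
+<p>In this sketch the first two ranges are combined into one 250-byte read that carries 50 extra bytes. With a tolerance of 0.0 those bytes are marked to be dropped, with a tolerance of 10.0 they are retained, and with a minimum seek size of 0 no ranges are combined.</p>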
+
+<h3 id="scope-">Scope <a id="Scope"></a></h3>
+
+<p>Different types of IO take place in ORC today:</p>
+
+<ul>
+  <li>Reading of File Footer: Unchanged</li>
+  <li>Reading of Stripe Footer: Unchanged</li>
+  <li>Reading of Stripe Index information: Optimized</li>
+  <li>Reading of Stripe Data: Optimized</li>
+</ul>
+
+<p>Each of the above happens at a different stage of the read. The current implementation optimizes reads that happen using
+the <a href="https://github.com/apache/orc/tree/main/java/core/src/java/org/apache/orc/DataReader.java">DataReader</a> interface.</p>
+
+<p>This does not optimize:</p>
+
+<ul>
+  <li>Reads of the file/stripe footer</li>
+  <li>Reads across multiple stripes</li>
+</ul>
+
+<h3 id="benchmarks-">Benchmarks <a id="Benchmarks"></a></h3>
+
+<h4 id="local-fs-">Local FS <a id="LocalFS"></a></h4>
+
+<p>This benchmark is run on a local filesystem backed by an NVMe SSD, so its performance characteristics are very different from those of AWS
+S3.</p>
+
+<p>The purpose of this benchmark is to check whether we have added any significant overhead to the ORC code by adding
+<code class="highlighter-rouge">minSeekSize</code> and <code class="highlighter-rouge">extraByteTolerance</code>.</p>
+
+<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>java <span class="nt">-jar</span> java/bench/core/target/orc-benchmarks-core-<span class="k">*</span><span class="nt">-uber</span>.jar chunk_read
+</code></pre></div></div>
+
+<table>
+  <thead>
+    <tr>
+      <th style="text-align: left">(alt)</th>
+      <th style="text-align: right">(cols)</th>
+      <th style="text-align: right">(byteTol)</th>
+      <th style="text-align: right">(minSeek)</th>
+      <th style="text-align: left">Mode</th>
+      <th style="text-align: right">Cnt</th>
+      <th style="text-align: right">Score</th>
+      <th style="text-align: left">Sign</th>
+      <th style="text-align: right">Error</th>
+      <th style="text-align: left">Units</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: left">true</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">0</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.352</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.006</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">true</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.357</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.002</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">true</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">10.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.349</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.002</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">false</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">0</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.667</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.007</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">false</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.673</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.004</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">false</td>
+      <td style="text-align: right">128</td>
+      <td style="text-align: right">10.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">20</td>
+      <td style="text-align: right">0.671</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.005</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+  </tbody>
+</table>
+
+<p>Observations/Details:</p>
+
+<ul>
+  <li><strong>Input File details</strong>:
+    <ul>
+      <li>Rows: 65536</li>
+      <li>Columns: 128</li>
+      <li>FileSize: ~ 72 MB</li>
+    </ul>
+  </li>
+  <li>Full Read (alternate = false)
+    <ul>
+      <li>No significant difference between the options as expected</li>
+    </ul>
+  </li>
+  <li>Alternate Read (alternate = true)
+    <ul>
+      <li>No significant difference between the options, given the small file size and the performance of the local disk</li>
+      <li>This also shows that the recommended minSeekSize should be determined for each platform, e.g. HDFS, S3, etc.</li>
+    </ul>
+  </li>
+</ul>
+
+<h4 id="aws-s3-">AWS S3 <a id="AWSS3"></a></h4>
+
+<p>In this benchmark we brought up an EKS container in the same region as the AWS S3 bucket to test the performance of the
+patch.</p>
+
+<table>
+  <thead>
+    <tr>
+      <th style="text-align: left">(alternate)</th>
+      <th style="text-align: right">(byteTol)</th>
+      <th style="text-align: right">(minSeekSize)</th>
+      <th style="text-align: left">Mode</th>
+      <th style="text-align: right">Cnt</th>
+      <th style="text-align: right">Score</th>
+      <th style="text-align: left">Sign</th>
+      <th style="text-align: right">Error</th>
+      <th style="text-align: left">Units</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td style="text-align: left">FALSE</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">0</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">1.837</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.089</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">FALSE</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">1.919</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.11</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">FALSE</td>
+      <td style="text-align: right">10.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">1.895</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.191</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">TRUE</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">0</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">5.8</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">1.132</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">TRUE</td>
+      <td style="text-align: right">0.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">1.479</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.197</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+    <tr>
+      <td style="text-align: left">TRUE</td>
+      <td style="text-align: right">10.0</td>
+      <td style="text-align: right">4194304</td>
+      <td style="text-align: left">avgt</td>
+      <td style="text-align: right">5</td>
+      <td style="text-align: right">1.435</td>
+      <td style="text-align: left">±</td>
+      <td style="text-align: right">0.176</td>
+      <td style="text-align: left">s/op</td>
+    </tr>
+  </tbody>
+</table>
+
+<p>Observations/Details:</p>
+
+<ul>
+  <li><strong>Input File details</strong>:
+    <ul>
+      <li>Rows: 65536</li>
+      <li>Columns: 128</li>
+      <li>FileSize: ~ 72 MB</li>
+    </ul>
+  </li>
+  <li>Full Read (alternate = false)
+    <ul>
+      <li>No significant difference between the options as expected</li>
+    </ul>
+  </li>
+  <li>Alternate Read (alternate = true)
+    <ul>
+      <li>We get a significant boost in performance: 5.8 s without the optimization versus ~ 1.5 s with it, a time
+reduction of ~ 75%</li>
+      <li>This also gives us a cost saving, as 64 GETs (one for each column per stripe) have been replaced with a single GET</li>
+      <li>We can see a marginal improvement of ~ 3% when choosing to retain extra bytes (extraByteTolerance=10.0) as compared to
+extraByteTolerance=0.0, which performs the additional work of dropping the extra bytes from memory.</li>
+    </ul>
+  </li>
+</ul>
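+
+<p>The percentages quoted above follow directly from the scores in the table. A minimal Python sketch of the arithmetic, where the variable and function names are only illustrative:</p>
+

```python
# Scores (s/op) taken from the alternate = TRUE rows of the AWS S3 table.
baseline = 5.8      # minSeekSize=0, byteTol=0.0
drop_extra = 1.479  # minSeekSize=4194304, byteTol=0.0
keep_extra = 1.435  # minSeekSize=4194304, byteTol=10.0


def reduction_pct(before, after):
    """Percentage reduction when going from `before` to `after`."""
    return (before - after) / before * 100.0


time_saved = reduction_pct(baseline, drop_extra)     # about 74.5
retain_gain = reduction_pct(drop_extra, keep_extra)  # about 3.0
gets_saved = reduction_pct(64, 1)                    # 64 GETs replaced by 1
```

+
+<p>Note that the reported errors for the last two rows (± 0.197 and ± 0.176 s/op) are larger than the 0.044 s/op difference between them, so the ~ 3% gain from retaining extra bytes should be read as indicative only.</p>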
+
+<h3 id="summary-">Summary <a id="Summary"></a></h3>
+
+<p>Based on the benchmarks the following is recommended for ORC in AWS S3:</p>
+
+<ul>
+  <li><code class="highlighter-rouge">orc.min.disk.seek.size</code> is set to <code class="highlighter-rouge">4194304</code> (4 MB)</li>
+  <li><code class="highlighter-rouge">orc.min.disk.seek.size.tolerance</code> is set to a value that is acceptable given the memory usage constraints. When set
+to <code class="highlighter-rouge">0.0</code> it will always do the extra work of dropping the extra bytes.</li>
+</ul>
+
+
+      </article>
+    </div>
+
+    <div class="clear"></div>
+
+  </div>
+</section>
+
+
+  <footer role="contentinfo">
+  <p>The contents of this website are &copy;&nbsp;2022
+     <a href="https://www.apache.org/">Apache Software Foundation</a>
+     under the terms of the <a
+      href="https://www.apache.org/licenses/LICENSE-2.0.html">
+      Apache&nbsp;License&nbsp;v2</a>. Apache ORC and its logo are trademarks
+      of the Apache Software Foundation.</p>
+</footer>
+
+  <script>
+  var anchorForId = function (id) {
+    var anchor = document.createElement("a");
+    anchor.className = "header-link";
+    anchor.href      = "#" + id;
+    anchor.innerHTML = "<span class=\"sr-only\">Permalink</span><i class=\"fa fa-link\"></i>";
+    anchor.title = "Permalink";
+    return anchor;
+  };
+
+  var linkifyAnchors = function (level, containingElement) {
+    var headers = containingElement.getElementsByTagName("h" + level);
+    for (var h = 0; h < headers.length; h++) {
+      var header = headers[h];
+
+      if (typeof header.id !== "undefined" && header.id !== "") {
+        header.appendChild(anchorForId(header.id));
+      }
+    }
+  };
+
+  document.onreadystatechange = function () {
+    if (this.readyState === "complete") {
+      var contentBlock = document.getElementsByClassName("docs")[0] || document.getElementsByClassName("news")[0];
+      if (!contentBlock) {
+        return;
+      }
+      for (var level = 1; level <= 6; level++) {
+        linkifyAnchors(level, contentBlock);
+      }
+    }
+  };
+</script>
+
+
+</body>
+</html>
diff --git a/img/seekvsread.png b/img/seekvsread.png
new file mode 100644
index 000000000..6491a5492
Binary files /dev/null and b/img/seekvsread.png differ
