This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository
https://gitbox.apache.org/repos/asf/incubator-datasketches-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 7cb632e Automatic Site Publish by Buildbot
7cb632e is described below
commit 7cb632edee8719ff536a0674160aec1e179682b7
Author: buildbot <[email protected]>
AuthorDate: Mon Oct 19 19:05:59 2020 +0000
Automatic Site Publish by Buildbot
---
output/docs/REQ/ReqAccuracyAdversarial.html | 91 ++++++++++++++++++++++---
output/docs/REQ/ReqAccuracyRandomShuffled.html | 35 ++++------
output/docs/img/req/FlipFlopPattern.png | Bin 0 -> 21053 bytes
output/docs/img/req/RandomPattern.png | Bin 0 -> 21357 bytes
output/docs/img/req/ReversedPattern.png | Bin 0 -> 21011 bytes
output/docs/img/req/SortedPattern.png | Bin 0 -> 20843 bytes
output/docs/img/req/SqrtPattern.png | Bin 0 -> 20958 bytes
output/docs/img/req/ZoominPattern.png | Bin 0 -> 20951 bytes
output/docs/img/req/ZoomoutPattern.png | Bin 0 -> 21077 bytes
9 files changed, 96 insertions(+), 30 deletions(-)
diff --git a/output/docs/REQ/ReqAccuracyAdversarial.html
b/output/docs/REQ/ReqAccuracyAdversarial.html
index d62cff8..6cc8711 100644
--- a/output/docs/REQ/ReqAccuracyAdversarial.html
+++ b/output/docs/REQ/ReqAccuracyAdversarial.html
@@ -505,23 +505,96 @@
-->
<h1 id="reqsketch-accuracy-with-adversarial-streams">ReqSketch Accuracy with
Adversarial Streams</h1>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T12_LT_Sh.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<p>This set of tests characterizes the accuracy (or more precisely the rank
error) of the ReqSketch using specifically selected adversarial streams. The
goal of this suite of tests is to understand how the rank error of the sketch
behaves across all ranks with these specific stream patterns. All of these
tests are run with the same configuration except for the choice of the
adversarial stream pattern.</p>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T12_LT_NoSh.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<p>The design of these tests is quite different from the tests for the
<em>Random Shuffled Streams</em>. Here, each test has one pattern and running
multiple trials on the same pattern will not produce a nice distribution of
error that we can easily analyze. We would like to capture the ranks where the
pattern creates the largest error. These aberrant ranks could occur anywhere in
the stream. Instead of choosing 100 plot points where the error is exclusively
measured, we want to measur [...]
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Sorted.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<p>In this case we collect the statistics of all the errors in 100 contiguous
intervals of the stream. For a stream length of 2^20, each interval consists of
about ten thousand values. The errors from these 10K values are fed into a
standard quantile sketch as before, and we extract 3 statistical quantile points,
-3SD, median and +3SD, and plot those 3 values at each of the 100 plot
points.</p>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Reversed.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<p>As you can see, some of these patterns challenge our current a priori
calculation of the error bounds, which means we may need to adjust them
somewhat. If we do, these plots will be regenerated.</p>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Random.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<p>Those who are interested in the actual code that runs these tests can
examine the following links.</p>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<ul>
+ <li><a
href="https://github.com/apache/incubator-datasketches-characterization/blob/master/src/test/java/org/apache/datasketches/characterization/quantiles/ReqSketchAccuracyProfile2.java">Code</a>:
The code used to generate these characterization studies.</li>
+ <li><a
href="https://github.com/apache/incubator-datasketches-characterization/blob/master/src/main/resources/quantiles/ReqSketchAccuracy2Job.conf">Config</a>:
The human readable and editable configuration file that instructs the above
code with the specific properties used to run the test. These configuration
properties are different for each of the following plots and summarized below
with each plot.</li>
+</ul>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<h2 id="test-design">Test Design</h2>
+<ul>
+ <li>Stream Length (SL): 2^20</li>
+ <li>Stream Values: Natural numbers, ℕ<sub>1</sub>, from 1 to SL, expressed
as 32-bit floats.</li>
+ <li>Y-axis: The absolute error of the sketch <em>getRank(value)</em>
method.</li>
+ <li>X-axis: The normalized rank [0.0, 1.0]</li>
+ <li>Plot Points (PP): 100. Equally spaced points along the X-axis starting
at <em>1.0/SL</em> and ending at 1.0.</li>
+ <li>Trial:
+ <ul>
+ <li>The stream is generated according to the chosen adversarial
pattern.</li>
+ <li>At each Plot Point, we compute the rank errors of all ~10K points
from the preceding PP to the current PP.</li>
+ <li>These 10K error values are fed into an error quantile sketch
associated with the current PP.</li>
+ <li>3 quantile values (-3SD, Median, +3SD) are extracted from each error
sketch. These error quantiles correspond to the standard normal distribution at
the median and at +/- 3SD, where SD stands for Standard Deviation from the
mean.</li>
+ </ul>
+ </li>
+ <li>Plotting:
+ <ul>
+ <li>Each of the error quantiles is connected by lines to form contours
of the error distribution where the area between the +/- 3SD contours is the
99.7% confidence interval.</li>
+ <li>In addition to the error contours, 6 dashed contours (with colors
corresponding to the error contours) represent the a priori estimates of the
error at each of the +/- standard deviations computed from the sketch’s
<em>getRankLowerBound(double, int)</em> and <em>getRankUpperBound(double,
int)</em> methods.</li>
+ </ul>
+ </li>
+</ul>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<h2 id="specific-configurations">Specific Configurations</h2>
+<h3 id="common-configuration-for-the-following-plots">Common Configuration for
the following plots</h3>
+<ul>
+ <li>K=50: the sketch sizing & accuracy parameter</li>
+ <li>HRA: High Rank Accuracy</li>
+ <li>Crit=LT: Comparison criterion: LT = Less-Than</li>
+ <li>SL=2^20: StreamLength</li>
+ <li>Eq Spaced: Equally spaced Plot Points (PP)</li>
+ <li>PP=100: Number of plot points on the x-axis</li>
+</ul>
-<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
+<h3 id="plot-1-adversarial-pattern-sorted">Plot 1 Adversarial Pattern:
Sorted</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/SortedPattern.png"
alt="/req/SortedPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Sorted.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Sorted.png" /></p>
+
+<h3 id="plot-2-adversarial-pattern-reversed">Plot 2 Adversarial Pattern:
Reversed</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/ReversedPattern.png"
alt="/req/ReversedPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Reversed.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Reversed.png" /></p>
+
+<h3 id="plot-3-adversarial-pattern-random">Plot 3 Adversarial Pattern:
Random</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/RandomPattern.png"
alt="/req/RandomPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Random.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Random.png" /></p>
+
+<h3 id="plot-4-adversarial-pattern-zoomin">Plot 4 Adversarial Pattern:
Zoomin</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/ZoominPattern.png"
alt="/req/ZoominPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Zoomin.png" /></p>
+
+<h3 id="plot-5-adversarial-pattern-zoomout">Plot 5 Adversarial Pattern:
Zoomout</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/ZoomoutPattern.png"
alt="/req/ZoomoutPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Zoomout.png" /></p>
+
+<h3 id="plot-6-adversarial-pattern-sqrt">Plot 6 Adversarial Pattern: Sqrt</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/SqrtPattern.png"
alt="/req/SqrtPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_Sqrt.png" /></p>
+
+<h3 id="plot-6-adversarial-pattern-flipflop">Plot 7 Adversarial Pattern:
FlipFlop</h3>
+
+<p><img class="doc-img-qtr" src="/docs/img/req/FlipFlopPattern.png"
alt="/req/FlipFlopPattern.png" /></p>
+
+<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png"
alt="/req/ReqErrEqHraK50SL20T0_LT_FlipFlop.png" /></p>
</div> <!-- End content -->
</div> <!-- End row -->
diff --git a/output/docs/REQ/ReqAccuracyRandomShuffled.html
b/output/docs/REQ/ReqAccuracyRandomShuffled.html
index 837db39..d4d1cf0 100644
--- a/output/docs/REQ/ReqAccuracyRandomShuffled.html
+++ b/output/docs/REQ/ReqAccuracyRandomShuffled.html
@@ -504,7 +504,12 @@
under the License.
-->
<h1 id="reqsketch-accuracy-with-random-shuffled-streams">ReqSketch Accuracy
with Random Shuffled Streams</h1>
-<p>This set of tests characterize the accuracy of the ReqSketch using random
shuffled streams.</p>
+<p>This set of tests characterizes the accuracy (or more precisely the rank
error) of the ReqSketch using random shuffled streams. The goal of this suite
of tests is to understand how the rank error of the sketch behaves across all
ranks. All of these tests are run with the same stream length. The two major
parameters that are varied are the sketch’s <em>K</em>, which affects size and
accuracy of the sketch, and the <em>HRA / LRA</em> parameter, which selects the
region of highest accur [...]
+
+<p>These tests also confirm that the a priori predictions of the error bounds
are reasonable and relatively conservative. The computation of these bounds
is based on empirical measurements derived from tests such as these and is
subject to some tuning as we understand the sketch’s error behavior over a
wider selection of streams.</p>
+
+<p>Those who are interested in the actual code that runs these tests can
examine the following links.</p>
+
<ul>
<li><a
href="https://github.com/apache/incubator-datasketches-characterization/blob/master/src/test/java/org/apache/datasketches/characterization/quantiles/ReqSketchAccuracyProfile.java">Code</a>:
The code used to generate these characterization studies.</li>
<li><a
href="https://github.com/apache/incubator-datasketches-characterization/blob/master/src/main/resources/quantiles/ReqSketchAccuracyJob.conf">Config</a>:
The human readable and editable configuration file that instructs the above
code with the specific properties used to run the test. These configuration
properties are different for each of the following plots and summarized below
with each plot.</li>
@@ -539,31 +544,29 @@
</ul>
<h2 id="specific-configurations">Specific Configurations</h2>
-
-<h3 id="plot-1">Plot 1</h3>
+<h3 id="common-configuration-for-the-following-plots">Common Configuration for
the following plots</h3>
<ul>
- <li>K=12: the sketch sizing & accuracy parameter</li>
<li>SL=2^20: StreamLength</li>
- <li>HRA: High Rank Accuracy</li>
<li>Eq Spaced: Equally spaced Plot Points (PP)</li>
<li>PP=100: Number of plot points on the x-axis</li>
<li>LgT=12: Number of trials = 2^12</li>
- <li>Crit=LT: Comparison criterion: LT = Less-Than</li>
<li>Shuffled: Random shuffle of the input stream for each trial</li>
</ul>
+<h3 id="plot-1">Plot 1</h3>
+<ul>
+ <li>K=12: the sketch sizing & accuracy parameter</li>
+ <li>HRA: High Rank Accuracy</li>
+ <li>Crit=LT: Comparison criterion: LT = Less-Than</li>
+</ul>
+
<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK12SL20T12_LT_Sh.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
<h3 id="plot-2">Plot 2</h3>
<ul>
<li>K=12: the sketch sizing & accuracy parameter</li>
- <li>SL=2^20: StreamLength</li>
<li>LRA: Low Rank Accuracy</li>
- <li>Eq Spaced: Equally spaced Plot Points (PP)</li>
- <li>PP=100: Number of plot points on the x-axis</li>
- <li>LgT=12: Number of trials = 2^12</li>
<li>Crit=LE: Comparison criterion: LE = Less-Than or Equal</li>
- <li>Shuffled: Random shuffle of the input stream for each trial</li>
</ul>
<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqLraK12SL20T12_LE_Sh.png"
alt="/req/ReqErrEqLraK50SL20T12_LE_Sh.png" /></p>
@@ -571,13 +574,8 @@
<h3 id="plot-3">Plot 3</h3>
<ul>
<li>K=50: the sketch sizing & accuracy parameter</li>
- <li>SL=2^20: StreamLength</li>
<li>HRA: High Rank Accuracy</li>
- <li>Eq Spaced: Equally spaced Plot Points (PP)</li>
- <li>PP=100: Number of plot points on the x-axis</li>
- <li>LgT=12: Number of trials = 2^12</li>
<li>Crit=LT: Comparison criterion: LT = Less-Than</li>
- <li>Shuffled: Random shuffle of the input stream for each trial</li>
</ul>
<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqHraK50SL20T12_LT_Sh.png"
alt="/req/ReqErrEqHraK50SL20T12_LT_Sh.png" /></p>
@@ -585,13 +583,8 @@
<h3 id="plot-4">Plot 4</h3>
<ul>
<li>K=50: the sketch sizing & accuracy parameter</li>
- <li>SL=2^20: StreamLength</li>
<li>LRA: Low Rank Accuracy</li>
- <li>Eq Spaced: Equally spaced Plot Points (PP)</li>
- <li>PP=100: Number of plot points on the x-axis</li>
- <li>LgT=12: Number of trials = 2^12</li>
<li>Crit=LE: Comparison criterion: LE = Less-Than or Equal</li>
- <li>Shuffled: Random shuffle of the input stream for each trial</li>
</ul>
<p><img class="doc-img-full"
src="/docs/img/req/ReqErrEqLraK50SL20T12_LE_Sh.png"
alt="/req/ReqErrEqLraK50SL20T12_LE_Sh.png" /></p>
diff --git a/output/docs/img/req/FlipFlopPattern.png
b/output/docs/img/req/FlipFlopPattern.png
new file mode 100644
index 0000000..21763d6
Binary files /dev/null and b/output/docs/img/req/FlipFlopPattern.png differ
diff --git a/output/docs/img/req/RandomPattern.png
b/output/docs/img/req/RandomPattern.png
new file mode 100644
index 0000000..b90b3f3
Binary files /dev/null and b/output/docs/img/req/RandomPattern.png differ
diff --git a/output/docs/img/req/ReversedPattern.png
b/output/docs/img/req/ReversedPattern.png
new file mode 100644
index 0000000..163cd03
Binary files /dev/null and b/output/docs/img/req/ReversedPattern.png differ
diff --git a/output/docs/img/req/SortedPattern.png
b/output/docs/img/req/SortedPattern.png
new file mode 100644
index 0000000..6cdb872
Binary files /dev/null and b/output/docs/img/req/SortedPattern.png differ
diff --git a/output/docs/img/req/SqrtPattern.png
b/output/docs/img/req/SqrtPattern.png
new file mode 100644
index 0000000..a89d5e8
Binary files /dev/null and b/output/docs/img/req/SqrtPattern.png differ
diff --git a/output/docs/img/req/ZoominPattern.png
b/output/docs/img/req/ZoominPattern.png
new file mode 100644
index 0000000..f3c40b7
Binary files /dev/null and b/output/docs/img/req/ZoominPattern.png differ
diff --git a/output/docs/img/req/ZoomoutPattern.png
b/output/docs/img/req/ZoomoutPattern.png
new file mode 100644
index 0000000..d9cb057
Binary files /dev/null and b/output/docs/img/req/ZoomoutPattern.png differ
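Illustrative sketch (not part of the commit): the "Test Design" list added to ReqAccuracyAdversarial.html describes one trial as: generate the stream, compute the rank error of every value in each of 100 contiguous intervals, and extract the -3SD, median and +3SD error quantiles per interval. The Python below is a minimal sketch of that procedure under stated assumptions. It is not the ReqSketchAccuracyProfile2 code. `SampledRankEstimator` is a hypothetical stand-in for the sketch (it simply keeps every step-th item), exact sorting replaces the error quantile sketch, only the Sorted, Reversed and Random patterns are shown, the error is taken as the signed difference between estimated and true rank, and each plot point is placed at the end of its interval.

```python
import bisect
import math
import random

# Normal-CDF fractions at -3 SD, the median, and +3 SD
# (approximately 0.00135, 0.5, 0.99865).
FRACTIONS = tuple(0.5 * (1.0 + math.erf(s / math.sqrt(2.0))) for s in (-3, 0, 3))


def make_stream(n, pattern, seed=1):
    """Return the values 1..n ordered by the named pattern (three patterns only)."""
    values = list(range(1, n + 1))
    if pattern == "sorted":
        return values
    if pattern == "reversed":
        return values[::-1]
    if pattern == "random":
        random.Random(seed).shuffle(values)
        return values
    raise ValueError(f"pattern not sketched here: {pattern}")


class SampledRankEstimator:
    """Hypothetical stand-in for a sketch: retains every `step`-th item seen."""

    def __init__(self, step):
        self.step = step
        self.seen = 0
        self.kept = []
        self._sorted = None

    def update(self, value):
        if self.seen % self.step == 0:
            self.kept.append(value)
            self._sorted = None  # invalidate the cached sorted view
        self.seen += 1

    def get_rank(self, value):
        # LT criterion: fraction of retained items strictly less than value.
        if self._sorted is None:
            self._sorted = sorted(self.kept)
        return bisect.bisect_left(self._sorted, value) / len(self._sorted)


def interval_error_quantiles(estimator, n, num_pp=100):
    """For each plot point, return (x, -3SD, median, +3SD) of the rank errors
    of all values between the previous plot point and the current one."""
    if n < num_pp:
        raise ValueError("need at least one value per plot point")
    rows = []
    prev = 0
    for i in range(1, num_pp + 1):
        cur = (i * n) // num_pp
        # True LT rank of value v in the stream 1..n is (v - 1) / n.
        errors = sorted(
            estimator.get_rank(v) - (v - 1) / n for v in range(prev + 1, cur + 1)
        )
        picks = [errors[min(len(errors) - 1, int(f * len(errors)))] for f in FRACTIONS]
        rows.append((cur / n, *picks))
        prev = cur
    return rows


if __name__ == "__main__":
    n = 2 ** 12  # smaller than the 2^20 used on the page, to keep the demo fast
    est = SampledRankEstimator(step=8)
    for v in make_stream(n, "reversed"):
        est.update(v)
    for row in interval_error_quantiles(est, n)[:3]:
        print(row)
```

With a real sketch in place of the stand-in, the three values per plot point would be the points that the page connects into the -3SD, median and +3SD error contours.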