[08/51] [partial] madlib-site git commit: Doc: Add v1.15.1 documentation

nkak Mon, 15 Oct 2018 11:48:52 -0700

http://git-wip-us.apache.org/repos/asf/madlib-site/blob/af0e5f14/docs/v1.15.1/group__grp__strs.html
----------------------------------------------------------------------
diff --git a/docs/v1.15.1/group__grp__strs.html 
b/docs/v1.15.1/group__grp__strs.html
new file mode 100644
index 0000000..74f8305
--- /dev/null
+++ b/docs/v1.15.1/group__grp__strs.html
@@ -0,0 +1,269 @@
+<!-- HTML header for doxygen 1.8.4-->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
+<meta http-equiv="X-UA-Compatible" content="IE=9"/>
+<meta name="generator" content="Doxygen 1.8.14"/>
+<meta name="keywords" content="madlib,postgres,greenplum,machine learning,data 
mining,deep learning,ensemble methods,data science,market basket 
analysis,affinity analysis,pca,lda,regression,elastic net,huber 
white,proportional hazards,k-means,latent dirichlet allocation,bayes,support 
vector machines,svm"/>
+<title>MADlib: Stratified Sampling</title>
+<link href="tabs.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="jquery.js"></script>
+<script type="text/javascript" src="dynsections.js"></script>
+<link href="navtree.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="resize.js"></script>
+<script type="text/javascript" src="navtreedata.js"></script>
+<script type="text/javascript" src="navtree.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(initResizable);
+/* @license-end */</script>
+<link href="search/search.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="search/searchdata.js"></script>
+<script type="text/javascript" src="search/search.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(function() { init_search(); });
+/* @license-end */
+</script>
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
+    jax: ["input/TeX","output/HTML-CSS"],
+});
+</script><script type="text/javascript" async 
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js"></script>
+<!-- hack in the navigation tree -->
+<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
+<link href="doxygen.css" rel="stylesheet" type="text/css" />
+<link href="madlib_extra.css" rel="stylesheet" type="text/css"/>
+<!-- google analytics -->
+<script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+  
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+  ga('create', 'UA-45382226-1', 'madlib.apache.org');
+  ga('send', 'pageview');
+</script>
+</head>
+<body>
+<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
+<div id="titlearea">
+<table cellspacing="0" cellpadding="0">
+ <tbody>
+ <tr style="height: 56px;">
+  <td id="projectlogo"><a href="http://madlib.apache.org"><img alt="Logo" 
src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td>
+  <td style="padding-left: 0.5em;">
+   <div id="projectname">
+   <span id="projectnumber">1.15.1</span>
+   </div>
+   <div id="projectbrief">User Documentation for Apache MADlib</div>
+  </td>
+   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
+        <span class="left">
+          <img id="MSearchSelect" src="search/mag_sel.png"
+               onmouseover="return searchBox.OnSearchSelectShow()"
+               onmouseout="return searchBox.OnSearchSelectHide()"
+               alt=""/>
+          <input type="text" id="MSearchField" value="Search" accesskey="S"
+               onfocus="searchBox.OnSearchFieldFocus(true)" 
+               onblur="searchBox.OnSearchFieldFocus(false)" 
+               onkeyup="searchBox.OnSearchFieldChange(event)"/>
+          </span><span class="right">
+            <a id="MSearchClose" 
href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" 
border="0" src="search/close.png" alt=""/></a>
+          </span>
+        </div>
+</td>
+ </tr>
+ </tbody>
+</table>
+</div>
+<!-- end header part -->
+<!-- Generated by Doxygen 1.8.14 -->
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+var searchBox = new SearchBox("searchBox", "search",false,'Search');
+/* @license-end */
+</script>
+</div><!-- top -->
+<div id="side-nav" class="ui-resizable side-nav-resizable">
+  <div id="nav-tree">
+    <div id="nav-tree-contents">
+      <div id="nav-sync" class="sync"></div>
+    </div>
+  </div>
+  <div id="splitbar" style="-moz-user-select:none;" 
+       class="ui-resizable-handle">
+  </div>
+</div>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+$(document).ready(function(){initNavTree('group__grp__strs.html','');});
+/* @license-end */
+</script>
+<div id="doc-content">
+<!-- window showing the filter options -->
+<div id="MSearchSelectWindow"
+     onmouseover="return searchBox.OnSearchSelectShow()"
+     onmouseout="return searchBox.OnSearchSelectHide()"
+     onkeydown="return searchBox.OnSearchSelectKey(event)">
+</div>
+
+<!-- iframe showing the search results (closed by default) -->
+<div id="MSearchResultsWindow">
+<iframe src="javascript:void(0)" frameborder="0" 
+        name="MSearchResults" id="MSearchResults">
+</iframe>
+</div>
+
+<div class="header">
+  <div class="headertitle">
+<div class="title">Stratified Sampling<div class="ingroups"><a class="el" 
href="group__grp__sampling.html">Sampling</a></div></div>  </div>
+</div><!--header-->
+<div class="contents">
+<div class="toc"><b>Contents</b> <ul>
+<li>
+<a href="#strs">Stratified Sampling</a> </li>
+<li>
+<a href="#examples">Examples</a> </li>
+</ul>
+</div><p>Stratified sampling is a method for independently sampling 
subpopulations (strata). It is commonly used to reduce sampling error by 
ensuring that subgroups are adequately represented in the sample.</p>
+<p><a class="anchor" id="strs"></a></p><dl class="section user"><dt>Stratified 
Sampling</dt><dd></dd></dl>
+<pre class="syntax">
+stratified_sample(  source_table,
+                    output_table,
+                    proportion,
+                    grouping_cols,
+                    target_cols,
+                    with_replacement
+                  )
+</pre><p><b>Arguments</b> </p><dl class="arglist">
+<dt>source_table </dt>
+<dd><p class="startdd">TEXT. Name of the table containing the input data.</p>
+<p class="enddd"></p>
+</dd>
+<dt>output_table </dt>
+<dd><p class="startdd">TEXT. Name of output table that contains the sampled 
data. The output table contains all columns present in the source table unless 
otherwise specified in the 'target_cols' parameter below.</p>
+<p class="enddd"></p>
+</dd>
+<dt>proportion </dt>
+<dd><p class="startdd">FLOAT8 in the range (0,1). Proportion of rows to sample 
from each stratum. Each stratum is sampled independently.</p>
+<p class="enddd"></p>
+</dd>
+<dt>grouping_cols (optional) </dt>
+<dd><p class="startdd">TEXT, default: NULL. A single column or a list of 
comma-separated columns that defines the strata. When this parameter is NULL, 
no grouping is used so the sampling is non-stratified, that is, the whole table 
is treated as a single group.</p>
+<p class="enddd"></p>
+</dd>
+<dt>target_cols (optional) </dt>
+<dd><p class="startdd">TEXT, default NULL. A comma-separated list of columns 
to appear in the 'output_table'. If NULL or '*', all columns from the 
'source_table' will appear in the 'output_table'.</p>
+<p class="enddd"><a class="anchor" id="note"></a></p><dl class="section 
note"><dt>Note</dt><dd>Do not include 'grouping_cols' in the parameter 
'target_cols', because they are always included in the 'output_table'.</dd></dl>
+</dd>
+<dt>with_replacement (optional) </dt>
+<dd>BOOLEAN, default FALSE. Determines whether to sample with replacement or 
without replacement (default). With replacement, the same row may appear in 
the sample more than once. Without replacement, a given row can be selected 
only once. </dd>
+</dl>
+<p><a class="anchor" id="examples"></a></p><dl class="section 
user"><dt>Examples</dt><dd></dd></dl>
+<p>Please note that due to the random nature of sampling, your results may 
look different from those below.</p>
+<ol type="1">
+<li>Create an input table: <pre class="syntax">
+DROP TABLE IF EXISTS test;
+CREATE TABLE test(
+    id1 INTEGER,
+    id2 INTEGER,
+    gr1 INTEGER,
+    gr2 INTEGER
+);
+INSERT INTO test VALUES
+(1,0,1,1),
+(2,0,1,1),
+(3,0,1,1),
+(4,0,1,1),
+(5,0,1,1),
+(6,0,1,1),
+(7,0,1,1),
+(8,0,1,1),
+(9,0,1,1),
+(9,0,1,1),
+(9,0,1,1),
+(9,0,1,1),
+(0,1,1,2),
+(0,2,1,2),
+(0,3,1,2),
+(0,4,1,2),
+(0,5,1,2),
+(0,6,1,2),
+(10,10,2,2),
+(20,20,2,2),
+(30,30,2,2),
+(40,40,2,2),
+(50,50,2,2),
+(60,60,2,2),
+(70,70,2,2);
+</pre></li>
+<li>Sample without replacement: <pre class="syntax">
+DROP TABLE IF EXISTS out;
+SELECT madlib.stratified_sample(
+                                'test',    -- Source table
+                                'out',     -- Output table
+                                0.5,       -- Sample proportion
+                                'gr1,gr2', -- Strata definition
+                                'id1,id2', -- Columns to output
+                                FALSE);    -- Sample without replacement
+SELECT * FROM out ORDER BY gr1,gr2,id1,id2;
+</pre> <pre class="result">
+ gr1 | gr2 | id1 | id2
+-----+-----+-----+-----
+   1 |   1 |   2 |   0
+   1 |   1 |   4 |   0
+   1 |   1 |   7 |   0
+   1 |   1 |   8 |   0
+   1 |   1 |   9 |   0
+   1 |   1 |   9 |   0
+   1 |   2 |   0 |   2
+   1 |   2 |   0 |   3
+   1 |   2 |   0 |   4
+   2 |   2 |  20 |  20
+   2 |   2 |  30 |  30
+   2 |   2 |  40 |  40
+   2 |   2 |  60 |  60
+(13 rows)
+</pre></li>
+<li>Sample with replacement: <pre class="syntax">
+DROP TABLE IF EXISTS out;
+SELECT madlib.stratified_sample(
+                                'test',    -- Source table
+                                'out',     -- Output table
+                                0.5,       -- Sample proportion
+                                'gr1,gr2', -- Strata definition
+                                'id1,id2', -- Columns to output
+                                TRUE);     -- Sample with replacement
+SELECT * FROM out ORDER BY gr1,gr2,id1,id2;
+</pre> <pre class="result">
+ gr1 | gr2 | id1 | id2
+-----+-----+-----+-----
+   1 |   1 |   3 |   0
+   1 |   1 |   6 |   0
+   1 |   1 |   6 |   0
+   1 |   1 |   7 |   0
+   1 |   1 |   7 |   0
+   1 |   1 |   9 |   0
+   1 |   2 |   0 |   1
+   1 |   2 |   0 |   2
+   1 |   2 |   0 |   6
+   2 |   2 |  20 |  20
+   2 |   2 |  30 |  30
+   2 |   2 |  50 |  50
+   2 |   2 |  50 |  50
+</pre> </li>
+</ol>
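+<p>Because each stratum is sampled independently, it can be useful to compare 
the number of rows per stratum in the source and output tables. The query below 
is a minimal sketch in plain SQL that assumes the 'test' and 'out' tables from 
the examples above; it does not call MADlib. In the sample results shown above, 
the three strata contribute 6, 3 and 4 rows, drawn from 12, 6 and 7 source rows 
respectively.</p>
+<pre class="example">
+-- Compare per-stratum row counts in the source table and the sample.
+SELECT s.gr1, s.gr2, s.source_rows, o.sampled_rows
+FROM (SELECT gr1, gr2, count(*) AS source_rows
+      FROM test GROUP BY gr1, gr2) s
+JOIN (SELECT gr1, gr2, count(*) AS sampled_rows
+      FROM out GROUP BY gr1, gr2) o
+  ON s.gr1 = o.gr1 AND s.gr2 = o.gr2
+ORDER BY s.gr1, s.gr2;
+</pre>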
+</div><!-- contents -->
+</div><!-- doc-content -->
+<!-- start footer part -->
+<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
+  <ul>
+    <li class="footer">Generated on Mon Oct 15 2018 11:24:30 for MADlib by
+    <a href="http://www.doxygen.org/index.html">
+    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.14 </li>
+  </ul>
+</div>
+</body>
+</html>


http://git-wip-us.apache.org/repos/asf/madlib-site/blob/af0e5f14/docs/v1.15.1/group__grp__summary.html
----------------------------------------------------------------------
diff --git a/docs/v1.15.1/group__grp__summary.html 
b/docs/v1.15.1/group__grp__summary.html
new file mode 100644
index 0000000..41f9a8c
--- /dev/null
+++ b/docs/v1.15.1/group__grp__summary.html
@@ -0,0 +1,504 @@
+<!-- HTML header for doxygen 1.8.4-->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
+<meta http-equiv="X-UA-Compatible" content="IE=9"/>
+<meta name="generator" content="Doxygen 1.8.14"/>
+<meta name="keywords" content="madlib,postgres,greenplum,machine learning,data 
mining,deep learning,ensemble methods,data science,market basket 
analysis,affinity analysis,pca,lda,regression,elastic net,huber 
white,proportional hazards,k-means,latent dirichlet allocation,bayes,support 
vector machines,svm"/>
+<title>MADlib: Summary</title>
+<link href="tabs.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="jquery.js"></script>
+<script type="text/javascript" src="dynsections.js"></script>
+<link href="navtree.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="resize.js"></script>
+<script type="text/javascript" src="navtreedata.js"></script>
+<script type="text/javascript" src="navtree.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(initResizable);
+/* @license-end */</script>
+<link href="search/search.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="search/searchdata.js"></script>
+<script type="text/javascript" src="search/search.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(function() { init_search(); });
+/* @license-end */
+</script>
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
+    jax: ["input/TeX","output/HTML-CSS"],
+});
+</script><script type="text/javascript" async 
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js"></script>
+<!-- hack in the navigation tree -->
+<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
+<link href="doxygen.css" rel="stylesheet" type="text/css" />
+<link href="madlib_extra.css" rel="stylesheet" type="text/css"/>
+<!-- google analytics -->
+<script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+  
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+  ga('create', 'UA-45382226-1', 'madlib.apache.org');
+  ga('send', 'pageview');
+</script>
+</head>
+<body>
+<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
+<div id="titlearea">
+<table cellspacing="0" cellpadding="0">
+ <tbody>
+ <tr style="height: 56px;">
+  <td id="projectlogo"><a href="http://madlib.apache.org"><img alt="Logo" 
src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td>
+  <td style="padding-left: 0.5em;">
+   <div id="projectname">
+   <span id="projectnumber">1.15.1</span>
+   </div>
+   <div id="projectbrief">User Documentation for Apache MADlib</div>
+  </td>
+   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
+        <span class="left">
+          <img id="MSearchSelect" src="search/mag_sel.png"
+               onmouseover="return searchBox.OnSearchSelectShow()"
+               onmouseout="return searchBox.OnSearchSelectHide()"
+               alt=""/>
+          <input type="text" id="MSearchField" value="Search" accesskey="S"
+               onfocus="searchBox.OnSearchFieldFocus(true)" 
+               onblur="searchBox.OnSearchFieldFocus(false)" 
+               onkeyup="searchBox.OnSearchFieldChange(event)"/>
+          </span><span class="right">
+            <a id="MSearchClose" 
href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" 
border="0" src="search/close.png" alt=""/></a>
+          </span>
+        </div>
+</td>
+ </tr>
+ </tbody>
+</table>
+</div>
+<!-- end header part -->
+<!-- Generated by Doxygen 1.8.14 -->
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+var searchBox = new SearchBox("searchBox", "search",false,'Search');
+/* @license-end */
+</script>
+</div><!-- top -->
+<div id="side-nav" class="ui-resizable side-nav-resizable">
+  <div id="nav-tree">
+    <div id="nav-tree-contents">
+      <div id="nav-sync" class="sync"></div>
+    </div>
+  </div>
+  <div id="splitbar" style="-moz-user-select:none;" 
+       class="ui-resizable-handle">
+  </div>
+</div>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+$(document).ready(function(){initNavTree('group__grp__summary.html','');});
+/* @license-end */
+</script>
+<div id="doc-content">
+<!-- window showing the filter options -->
+<div id="MSearchSelectWindow"
+     onmouseover="return searchBox.OnSearchSelectShow()"
+     onmouseout="return searchBox.OnSearchSelectHide()"
+     onkeydown="return searchBox.OnSearchSelectKey(event)">
+</div>
+
+<!-- iframe showing the search results (closed by default) -->
+<div id="MSearchResultsWindow">
+<iframe src="javascript:void(0)" frameborder="0" 
+        name="MSearchResults" id="MSearchResults">
+</iframe>
+</div>
+
+<div class="header">
+  <div class="headertitle">
+<div class="title">Summary<div class="ingroups"><a class="el" 
href="group__grp__stats.html">Statistics</a> &raquo; <a class="el" 
href="group__grp__desc__stats.html">Descriptive Statistics</a></div></div>  
</div>
+</div><!--header-->
+<div class="contents">
+<div class="toc"><b>Contents</b> <ul>
+<li>
+<a href="#usage">Summary Function Syntax</a> </li>
+<li>
+<a href="#examples">Examples</a> </li>
+<li>
+<a href="#notes">Notes</a> </li>
+<li>
+<a href="#related">Related Topics</a> </li>
+</ul>
+</div><p>The MADlib <b><a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a></b>
 function produces summary statistics for any data table. The function invokes 
various methods from the MADlib library to provide the data overview.</p>
+<p><a class="anchor" id="usage"></a></p><dl class="section user"><dt>Summary 
Function Syntax</dt><dd>The <b><a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a></b>
 function has the following syntax:</dd></dl>
+<pre class="syntax">
+summary ( source_table,
+          output_table,
+          target_cols,
+          grouping_cols,
+          get_distinct,
+          get_quartiles,
+          ntile_array,
+          how_many_mfv,
+          get_estimates,
+          n_cols_per_run
+        )
+</pre><p> The <b><a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a></b>
 function returns a composite type containing three fields: </p><table 
class="output">
+<tr>
+<th>output_table </th><td>TEXT. The name of the output table.  </td></tr>
+<tr>
+<th>num_col_summarized </th><td>INTEGER. The number of columns from the source 
table that have been summarized.  </td></tr>
+<tr>
+<th>duration </th><td>FLOAT8. The time taken (in seconds) to compute the 
summary.  </td></tr>
+</table>
+<p><b>Arguments</b> </p><dl class="arglist">
+<dt>source_table </dt>
+<dd><p class="startdd">TEXT. Name of the table containing the input data.</p>
+<p class="enddd"></p>
+</dd>
+<dt>output_table </dt>
+<dd><p class="startdd">TEXT. Name of the table for the output summary 
statistics. This table contains the following columns: </p><table 
class="output">
+<tr>
+<th>group_by </th><td>Group-by column name. NULL if none provided.  </td></tr>
+<tr>
+<th>group_by_value </th><td>Value of the group-by column. NULL if there is no 
grouping.  </td></tr>
+<tr>
+<th>target_column </th><td>Name of the target column for which the summary is 
requested.  </td></tr>
+<tr>
+<th>column_number </th><td>Physical column number for the target column, as 
described in <em>pg_attribute</em>  catalog.  </td></tr>
+<tr>
+<th>data_type </th><td>Data type of the target column. Standard GPDB type 
descriptors are displayed.  </td></tr>
+<tr>
+<th>row_count </th><td>Number of rows for the target column.  </td></tr>
+<tr>
+<th>distinct_values </th><td>Number of distinct values in the target column. 
If the <a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a> 
function is called with the <em>get_estimates</em> argument set to TRUE 
(default), then this is an estimated statistic based on the Flajolet-Martin 
distinct count estimator. If the <em>get_estimates</em> argument is set to 
FALSE, the exact count is computed using PostgreSQL COUNT DISTINCT.  </td></tr>
+<tr>
+<th>missing_values </th><td>Number of missing values in the target column.  
</td></tr>
+<tr>
+<th>blank_values </th><td>Number of blank values. Blanks are defined by this 
regular expression:<pre class="fragment">'^\w*$'</pre>  </td></tr>
+<tr>
+<th>fraction_missing </th><td>Fraction of total rows that are missing, as a 
decimal value, e.g. 0.3.  </td></tr>
+<tr>
+<th>fraction_blank </th><td>Fraction of total rows that are blank, as a 
decimal value, e.g. 0.3.  </td></tr>
+<tr>
+<th>positive_values </th><td>Number of positive values in the target column if 
target is numeric, otherwise NULL.  </td></tr>
+<tr>
+<th>negative_values </th><td>Number of negative values in the target column if 
target is numeric, otherwise NULL.  </td></tr>
+<tr>
+<th>zero_values </th><td>Number of zero values in the target column if target 
is numeric, otherwise NULL. Note that we are reporting exact equality to 0.0 
here, so even if you have a float value that is extremely small (say due to 
rounding), it will not be reported as a zero value.  </td></tr>
+<tr>
+<th>mean </th><td>Mean value of target column if target is numeric, otherwise 
NULL.  </td></tr>
+<tr>
+<th>variance </th><td>Variance of target column if target is numeric, 
otherwise NULL.  </td></tr>
+<tr>
+<th>confidence_interval </th><td>Confidence interval (95% using z-score) of 
the mean value for the target column if target is numeric, otherwise NULL. 
Presented as an array of two elements in the form {lower bound, upper bound}.  
</td></tr>
+<tr>
+<th>min </th><td>Minimum value of target column. For strings this is the 
length of the shortest string.  </td></tr>
+<tr>
+<th>max </th><td>Maximum value of target column. For strings this is the 
length of the longest string.  </td></tr>
+<tr>
+<th>first_quartile </th><td>First quartile (25th percentile), only for numeric 
columns. (Unavailable for PostgreSQL 9.3 or lower.)  </td></tr>
+<tr>
+<th>median </th><td>Median value of target column, if target is numeric, 
otherwise NULL. (Unavailable for PostgreSQL 9.3 or lower.)  </td></tr>
+<tr>
+<th>third_quartile </th><td>Third quartile (75th percentile), only for numeric 
columns. (Unavailable for PostgreSQL 9.3 or lower.)  </td></tr>
+<tr>
+<th>quantile_array </th><td>Percentile values corresponding to 
<em>ntile_array</em>. (Unavailable for PostgreSQL 9.3 or lower.)  </td></tr>
+<tr>
+<th>most_frequent_values </th><td>An array containing the most frequently 
occurring values. The <em>how_many_mfv</em> argument determines the length of 
the array, which is 10 by default. If the <a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a> 
function is called with the <em>get_estimates</em> argument set to TRUE 
(default), the frequent values computation is performed using a parallel 
aggregation method that is faster, but in some cases may fail to detect the 
exact most frequent values.  </td></tr>
+<tr>
+<th>mfv_frequencies </th><td>Array containing the frequency count for each of 
the most frequent values.   </td></tr>
+</table>
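+<p>As a rough cross-check of the <em>confidence_interval</em> column, the 
bounds can be recomputed from <em>mean</em>, <em>variance</em> and 
<em>row_count</em>. The sketch below assumes the usual normal-approximation 
formula, mean +/- 1.96 * sqrt(variance / row_count), which agrees with the 
sample output in the Examples section to the digits shown; the constant 1.96 
and the table name 'iris_summary' (created in the Examples section) are 
assumptions of this sketch, not a statement of the internal implementation.</p>
+<pre class="example">
+-- Recompute the 95% interval from the summary columns and compare it
+-- with the stored confidence_interval array.
+SELECT target_column,
+       mean - 1.96 * sqrt(variance / row_count) AS lower_bound,
+       mean + 1.96 * sqrt(variance / row_count) AS upper_bound,
+       confidence_interval
+FROM iris_summary
+WHERE target_column = 'sepal_length';
+</pre>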
+<p class="enddd"></p>
+</dd>
+<dt>target_cols (optional) </dt>
+<dd><p class="startdd">TEXT, default NULL. A comma-separated list of columns 
to summarize. If NULL, summaries are produced for all columns.</p>
+<p class="enddd"></p>
+</dd>
+<dt>grouping_cols (optional) </dt>
+<dd>TEXT, default NULL. A comma-separated list of columns on which to group 
results. If NULL, summaries are produced for the complete table. <dl 
class="section note"><dt>Note</dt><dd>Summary statistics are calculated for 
each grouping column independently. That is, grouping columns are not combined 
as in a regular PostgreSQL GROUP BY clause. (This avoids the long run times 
and very large output tables that would otherwise result for large input 
tables with many grouping_cols and target_cols.)</dd></dl>
+</dd>
+<dt>get_distinct (optional) </dt>
+<dd><p class="startdd">BOOLEAN, default TRUE. If TRUE, distinct values are 
counted. The method for computing distinct values depends on the setting of the 
'get_estimates' parameter below.</p>
+<p class="enddd"></p>
+</dd>
+<dt>get_quartiles (optional) </dt>
+<dd><p class="startdd">BOOLEAN, default TRUE. If TRUE, quartiles are 
computed.</p>
+<p class="enddd"></p>
+</dd>
+<dt>ntile_array (optional) </dt>
+<dd>FLOAT8[], default NULL. An array of quantile values to compute. If NULL, 
quantile values are not computed. <dl class="section 
note"><dt>Note</dt><dd>Quartile and quantile functions are not available in 
PostgreSQL 9.3 or lower. If you are using PostgreSQL 9.3 or lower, the output 
table will not contain these values, even if you set 'get_quartiles' = TRUE or 
provide an array of quantile values for the parameter 'ntile_array'.</dd></dl>
+</dd>
+<dt>how_many_mfv (optional) </dt>
+<dd><p class="startdd">INTEGER, default: 10. The number of 
most-frequent-values to compute. The method for computing MFV depends on the 
setting of the 'get_estimates' parameter below.</p>
+<p class="enddd"></p>
+</dd>
+<dt>get_estimates (optional) </dt>
+<dd><p class="startdd">BOOLEAN, default TRUE. If TRUE, estimated values are 
produced for distinct values and most frequent values. If FALSE, exact values 
are calculated which will take longer to run, with the impact depending on data 
size.</p>
+<p class="enddd"></p>
+</dd>
+<dt>n_cols_per_run (optional) </dt>
+<dd>INTEGER, default: 15. The number of columns for which summary statistics 
are collected in one pass over the data. This parameter determines the number 
of passes through the data. For example, with a total of 40 columns to 
summarize and 'n_cols_per_run = 15', there will be 3 passes through the data, 
with each pass summarizing a maximum of 15 columns. <dl class="section 
note"><dt>Note</dt><dd>This parameter 
should be used with caution. Increasing this parameter could decrease the total 
run time (if number of passes decreases), but will increase the memory 
consumption during each run. Since PostgreSQL limits the memory available for a 
single aggregate run, this increased memory consumption could result in an 
out-of-memory termination error.</dd></dl>
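+<p>The pass count in the example above is the number of columns divided by 
'n_cols_per_run', rounded up. A minimal sketch of that arithmetic in plain SQL, 
using the numbers from the example (it does not call MADlib):</p>
+<pre class="example">
+-- 40 columns to summarize, at most 15 columns per pass
+SELECT ceil(40 / 15.0) AS passes;  -- 3
+</pre>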
+</dd>
+</dl>
+<p><a class="anchor" id="examples"></a></p><dl class="section 
user"><dt>Examples</dt><dd></dd></dl>
+<ol type="1">
+<li>View online help for the <a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a> 
function. <pre class="example">
+SELECT * FROM madlib.summary();
+</pre></li>
+<li>Create an input data table using part of the well-known iris data set. 
<pre class="example">
+DROP TABLE IF EXISTS iris;
+CREATE TABLE iris (id INT, sepal_length FLOAT, sepal_width FLOAT,
+                    petal_length FLOAT, petal_width FLOAT,
+                   class_name text);
+INSERT INTO iris VALUES
+(1,5.1,3.5,1.4,0.2,'Iris-setosa'),
+(2,4.9,3.0,1.4,0.2,'Iris-setosa'),
+(3,4.7,3.2,1.3,0.2,'Iris-setosa'),
+(4,4.6,3.1,1.5,0.2,'Iris-setosa'),
+(5,5.0,3.6,1.4,0.2,'Iris-setosa'),
+(6,5.4,3.9,1.7,0.4,'Iris-setosa'),
+(7,4.6,3.4,1.4,0.3,'Iris-setosa'),
+(8,5.0,3.4,1.5,0.2,'Iris-setosa'),
+(9,4.4,2.9,1.4,0.2,'Iris-setosa'),
+(10,4.9,3.1,1.5,0.1,'Iris-setosa'),
+(11,7.0,3.2,4.7,1.4,'Iris-versicolor'),
+(12,6.4,3.2,4.5,1.5,'Iris-versicolor'),
+(13,6.9,3.1,4.9,1.5,'Iris-versicolor'),
+(14,5.5,2.3,4.0,1.3,'Iris-versicolor'),
+(15,6.5,2.8,4.6,1.5,'Iris-versicolor'),
+(16,5.7,2.8,4.5,1.3,'Iris-versicolor'),
+(17,6.3,3.3,4.7,1.6,'Iris-versicolor'),
+(18,4.9,2.4,3.3,1.0,'Iris-versicolor'),
+(19,6.6,2.9,4.6,1.3,'Iris-versicolor'),
+(20,5.2,2.7,3.9,1.4,'Iris-versicolor'),
+(21,6.3,3.3,6.0,2.5,'Iris-virginica'),
+(22,5.8,2.7,5.1,1.9,'Iris-virginica'),
+(23,7.1,3.0,5.9,2.1,'Iris-virginica'),
+(24,6.3,2.9,5.6,1.8,'Iris-virginica'),
+(25,6.5,3.0,5.8,2.2,'Iris-virginica'),
+(26,7.6,3.0,6.6,2.1,'Iris-virginica'),
+(27,4.9,2.5,4.5,1.7,'Iris-virginica'),
+(28,7.3,2.9,6.3,1.8,'Iris-virginica'),
+(29,6.7,2.5,5.8,1.8,'Iris-virginica'),
+(30,7.2,3.6,6.1,2.5,'Iris-virginica');
+</pre></li>
+<li>Run the <b><a class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a></b>
 function using all defaults. <pre class="example">
+DROP TABLE IF EXISTS iris_summary;
+SELECT * FROM madlib.summary( 'iris',            -- Source table
+                              'iris_summary'     -- Output table
+                            );
+</pre> Result: <pre class="result">
+ output_table | num_col_summarized |     duration
+--------------+--------------------+-------------------
+ iris_summary |                  6 | 0.574938058853149
+(1 row)
+</pre> View the summary data. <pre class="example">
+-- Turn on expanded display for readability.
+\x on
+SELECT * FROM iris_summary;
+</pre> Result (partial): <pre class="result">
+...
+&#160;-[ RECORD 2 ]--------+---------------------------------------------
+group_by             |
+group_by_value       |
+target_column        | sepal_length
+column_number        | 2
+data_type            | float8
+row_count            | 30
+distinct_values      | 22
+missing_values       | 0
+blank_values         |
+fraction_missing     | 0
+fraction_blank       |
+positive_values      | 30
+negative_values      | 0
+zero_values          | 0
+mean                 | 5.84333333333333
+variance             | 0.929436781609188
+confidence_interval  | {5.49834423494374,6.18832243172292}
+min                  | 4.4
+max                  | 7.6
+first_quartile       | 4.925
+median               | 5.75
+third_quartile       | 6.575
+most_frequent_values | {4.9,6.3,5,6.5,4.6,7.2,5.5,5.7,7.3,6.7}
+mfv_frequencies      | {4,3,2,2,2,1,1,1,1,1}
+...
+&#160;-[ RECORD 6 ]--------+---------------------------------------------
+group_by             |
+group_by_value       |
+target_column        | class_name
+column_number        | 6
+data_type            | text
+row_count            | 30
+distinct_values      | 3
+missing_values       | 0
+blank_values         | 0
+fraction_missing     | 0
+fraction_blank       | 0
+positive_values      |
+negative_values      |
+zero_values          |
+mean                 |
+variance             |
+confidence_interval  |
+min                  | 11
+max                  | 15
+first_quartile       |
+median               |
+third_quartile       |
+most_frequent_values | {Iris-setosa,Iris-versicolor,Iris-virginica}
+mfv_frequencies      | {10,10,10}
+</pre> Note that for the text column in record 6, the numeric statistics are 
NULL, and the min and max values represent the lengths of the shortest and 
longest strings, respectively.</li>
+<li>Now group by the class of iris: <pre class="example">
+DROP TABLE IF EXISTS iris_summary;
+SELECT * FROM madlib.summary( 'iris',                       -- Source table
+                              'iris_summary',               -- Output table
+                              'sepal_length, sepal_width',  -- Columns to summarize
+                              'class_name'                  -- Grouping column
+                            );
+SELECT * FROM iris_summary;
+</pre> Result (partial): <pre class="result">
+&#160;-[ RECORD 1 ]--------+----------------------------------------
+group_by             | class_name
+group_by_value       | Iris-setosa
+target_column        | sepal_length
+column_number        | 2
+data_type            | float8
+row_count            | 10
+distinct_values      | 7
+missing_values       | 0
+blank_values         |
+fraction_missing     | 0
+fraction_blank       |
+positive_values      | 10
+negative_values      | 0
+zero_values          | 0
+mean                 | 4.86
+variance             | 0.0848888888888875
+confidence_interval  | {4.67941507384182,5.04058492615818}
+min                  | 4.4
+max                  | 5.4
+first_quartile       | 4.625
+median               | 4.9
+third_quartile       | 5
+most_frequent_values | {4.9,5,4.6,5.1,4.7,5.4,4.4}
+mfv_frequencies      | {2,2,2,1,1,1,1}
+...
+&#160;-[ RECORD 3 ]--------+----------------------------------------
+group_by             | class_name
+group_by_value       | Iris-versicolor
+target_column        | sepal_length
+column_number        | 2
+data_type            | float8
+row_count            | 10
+distinct_values      | 10
+missing_values       | 0
+blank_values         |
+fraction_missing     | 0
+fraction_blank       |
+positive_values      | 10
+negative_values      | 0
+zero_values          | 0
+mean                 | 6.1
+variance             | 0.528888888888893
+confidence_interval  | {5.64924734548141,6.55075265451859}
+min                  | 4.9
+max                  | 7
+first_quartile       | 5.55
+median               | 6.35
+third_quartile       | 6.575
+most_frequent_values | {6.9,5.5,6.5,5.7,6.3,4.9,6.6,5.2,7,6.4}
+mfv_frequencies      | {1,1,1,1,1,1,1,1,1,1}
+...
+</pre></li>
+<li>Trying some other parameters: <pre class="example">
+DROP TABLE IF EXISTS iris_summary;
+SELECT * FROM madlib.summary( 'iris',                       -- Source table
+                              'iris_summary',               -- Output table
+                              'sepal_length, sepal_width',  -- Columns to summarize
+                               NULL,                        -- No grouping
+                               TRUE,                        -- Get distinct values
+                               FALSE,                       -- Don't get quartiles
+                               ARRAY[0.33, 0.66],           -- Get ntiles
+                               3,                           -- Number of MFV to compute
+                               FALSE                        -- Get exact values
+                            );
+SELECT * FROM iris_summary;
+</pre> Result: <pre class="result">
+&#160;-[ RECORD 1 ]--------+------------------------------------
+group_by             |
+group_by_value       |
+target_column        | sepal_length
+column_number        | 2
+data_type            | float8
+row_count            | 30
+distinct_values      | 22
+missing_values       | 0
+blank_values         |
+fraction_missing     | 0
+fraction_blank       |
+positive_values      | 30
+negative_values      | 0
+zero_values          | 0
+mean                 | 5.84333333333333
+variance             | 0.929436781609175
+confidence_interval  | {5.49834423494375,6.18832243172292}
+min                  | 4.4
+max                  | 7.6
+quantile_array       | {5.057,6.414}
+most_frequent_values | {4.9,6.3,6.5}
+mfv_frequencies      | {4,3,2}
+&#160;-[ RECORD 2 ]--------+------------------------------------
+group_by             |
+group_by_value       |
+target_column        | sepal_width
+column_number        | 3
+data_type            | float8
+row_count            | 30
+distinct_values      | 14
+missing_values       | 0
+blank_values         |
+fraction_missing     | 0
+fraction_blank       |
+positive_values      | 30
+negative_values      | 0
+zero_values          | 0
+mean                 | 3.04
+variance             | 0.13903448275862
+confidence_interval  | {2.90656901047539,3.17343098952461}
+min                  | 2.3
+max                  | 3.9
+quantile_array       | {2.9,3.2}
+most_frequent_values | {2.9,3,3.2}
+mfv_frequencies      | {4,4,3}
+</pre></li>
+</ol>
+<p><a class="anchor" id="notes"></a></p><dl class="section 
user"><dt>Notes</dt><dd><ul>
+<li>Table names can optionally be schema qualified; if no schema name is 
provided, the schemas returned by current_schemas() are searched. Table and 
column names should follow the case-sensitivity and quoting rules of the 
database. (For instance, 'mytable' and 'MyTable' both resolve to the same 
entity, i.e. 'mytable'. If mixed-case or multi-byte characters are desired for 
entity names then the string should be double-quoted; in this case the input 
would be '"MyTable"'.)</li>
+<li>The <em>get_estimates</em> parameter controls computation for both 
distinct count and most frequent values:<ul>
+<li>If <em>get_estimates</em> is TRUE then the distinct value computation is 
estimated using Flajolet-Martin. MFV is computed using a fast method that does 
parallel aggregation in Greenplum Database at the expense of missing or 
duplicating some of the most frequent values.</li>
+<li>If <em>get_estimates</em> is FALSE then the distinct values are computed 
in a slower but exact method using PostgreSQL COUNT DISTINCT. MFV is computed 
using a faithful implementation that preserves the approximation guarantees of 
the Cormode/Muthukrishnan method (more information at <a class="el" 
href="group__grp__mfvsketch.html">MFV (Most Frequent Values)</a>).</li>
+</ul>
+</li>
+</ul>
+</dd></dl>
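+<p>A minimal sketch of the <em>get_estimates</em> trade-off described in the notes above, assuming the <code>iris</code> table from the examples. The output table names <code>iris_summary_est</code> and <code>iris_summary_exact</code> are illustrative only, and the argument order follows the earlier example on this page. Comparing the two output tables shows how far the estimated distinct counts differ from the exact ones: </p><pre class="example">
+DROP TABLE IF EXISTS iris_summary_est, iris_summary_exact;
+-- Estimated distinct counts and MFVs (faster, approximate)
+SELECT * FROM madlib.summary( 'iris',                       -- Source table
+                              'iris_summary_est',           -- Output table (illustrative name)
+                              'sepal_length, sepal_width',  -- Columns to summarize
+                               NULL,                        -- No grouping
+                               TRUE,                        -- Get distinct values
+                               TRUE,                        -- Get quartiles
+                               NULL::float8[],              -- No ntiles
+                               10,                          -- Number of MFV to compute
+                               TRUE                         -- Use estimates
+                            );
+-- Exact distinct counts and MFVs (slower)
+SELECT * FROM madlib.summary( 'iris',
+                              'iris_summary_exact',         -- Output table (illustrative name)
+                              'sepal_length, sepal_width',
+                               NULL,
+                               TRUE,
+                               TRUE,
+                               NULL::float8[],
+                               10,
+                               FALSE                        -- Get exact values
+                            );
+-- Compare the estimated and exact distinct counts side by side
+SELECT target_column,
+       est.distinct_values AS distinct_estimated,
+       ex.distinct_values  AS distinct_exact
+FROM iris_summary_est est
+JOIN iris_summary_exact ex USING (target_column);
+</pre>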
+<p><a class="anchor" id="related"></a></p><dl class="section user"><dt>Related 
Topics</dt><dd>File <a class="el" href="summary_8sql__in.html" title="Summary 
function for descriptive statistics. ">summary.sql_in</a> documenting the <b><a 
class="el" 
href="summary_8sql__in.html#a4be51e88a1df45191a1692b95429af36">summary()</a></b>
 function</dd></dl>
+<p><a class="el" href="group__grp__fmsketch.html">FM (Flajolet-Martin)</a> <br 
/>
+ <a class="el" href="group__grp__mfvsketch.html">MFV (Most Frequent 
Values)</a> <br />
+ <a class="el" href="group__grp__countmin.html">CountMin 
(Cormode-Muthukrishnan)</a> </p>
+</div><!-- contents -->
+</div><!-- doc-content -->
+<!-- start footer part -->
+<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
+  <ul>
+    <li class="footer">Generated on Mon Oct 15 2018 11:24:30 for MADlib by
+    <a href="http://www.doxygen.org/index.html">
+    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.14 </li>
+  </ul>
+</div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/madlib-site/blob/af0e5f14/docs/v1.15.1/group__grp__super.html
----------------------------------------------------------------------
diff --git a/docs/v1.15.1/group__grp__super.html 
b/docs/v1.15.1/group__grp__super.html
new file mode 100644
index 0000000..e25995b
--- /dev/null
+++ b/docs/v1.15.1/group__grp__super.html
@@ -0,0 +1,158 @@
+<!-- HTML header for doxygen 1.8.4-->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
+<meta http-equiv="X-UA-Compatible" content="IE=9"/>
+<meta name="generator" content="Doxygen 1.8.14"/>
+<meta name="keywords" content="madlib,postgres,greenplum,machine learning,data 
mining,deep learning,ensemble methods,data science,market basket 
analysis,affinity analysis,pca,lda,regression,elastic net,huber 
white,proportional hazards,k-means,latent dirichlet allocation,bayes,support 
vector machines,svm"/>
+<title>MADlib: Supervised Learning</title>
+<link href="tabs.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="jquery.js"></script>
+<script type="text/javascript" src="dynsections.js"></script>
+<link href="navtree.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="resize.js"></script>
+<script type="text/javascript" src="navtreedata.js"></script>
+<script type="text/javascript" src="navtree.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(initResizable);
+/* @license-end */</script>
+<link href="search/search.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="search/searchdata.js"></script>
+<script type="text/javascript" src="search/search.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(function() { init_search(); });
+/* @license-end */
+</script>
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
+    jax: ["input/TeX","output/HTML-CSS"],
+});
+</script><script type="text/javascript" async 
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js"></script>
+<!-- hack in the navigation tree -->
+<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
+<link href="doxygen.css" rel="stylesheet" type="text/css" />
+<link href="madlib_extra.css" rel="stylesheet" type="text/css"/>
+<!-- google analytics -->
+<script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+  
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+  ga('create', 'UA-45382226-1', 'madlib.apache.org');
+  ga('send', 'pageview');
+</script>
+</head>
+<body>
+<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
+<div id="titlearea">
+<table cellspacing="0" cellpadding="0">
+ <tbody>
+ <tr style="height: 56px;">
+  <td id="projectlogo"><a href="http://madlib.apache.org"><img alt="Logo" 
src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td>
+  <td style="padding-left: 0.5em;">
+   <div id="projectname">
+   <span id="projectnumber">1.15.1</span>
+   </div>
+   <div id="projectbrief">User Documentation for Apache MADlib</div>
+  </td>
+   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
+        <span class="left">
+          <img id="MSearchSelect" src="search/mag_sel.png"
+               onmouseover="return searchBox.OnSearchSelectShow()"
+               onmouseout="return searchBox.OnSearchSelectHide()"
+               alt=""/>
+          <input type="text" id="MSearchField" value="Search" accesskey="S"
+               onfocus="searchBox.OnSearchFieldFocus(true)" 
+               onblur="searchBox.OnSearchFieldFocus(false)" 
+               onkeyup="searchBox.OnSearchFieldChange(event)"/>
+          </span><span class="right">
+            <a id="MSearchClose" 
href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" 
border="0" src="search/close.png" alt=""/></a>
+          </span>
+        </div>
+</td>
+ </tr>
+ </tbody>
+</table>
+</div>
+<!-- end header part -->
+<!-- Generated by Doxygen 1.8.14 -->
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+var searchBox = new SearchBox("searchBox", "search",false,'Search');
+/* @license-end */
+</script>
+</div><!-- top -->
+<div id="side-nav" class="ui-resizable side-nav-resizable">
+  <div id="nav-tree">
+    <div id="nav-tree-contents">
+      <div id="nav-sync" class="sync"></div>
+    </div>
+  </div>
+  <div id="splitbar" style="-moz-user-select:none;" 
+       class="ui-resizable-handle">
+  </div>
+</div>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+$(document).ready(function(){initNavTree('group__grp__super.html','');});
+/* @license-end */
+</script>
+<div id="doc-content">
+<!-- window showing the filter options -->
+<div id="MSearchSelectWindow"
+     onmouseover="return searchBox.OnSearchSelectShow()"
+     onmouseout="return searchBox.OnSearchSelectHide()"
+     onkeydown="return searchBox.OnSearchSelectKey(event)">
+</div>
+
+<!-- iframe showing the search results (closed by default) -->
+<div id="MSearchResultsWindow">
+<iframe src="javascript:void(0)" frameborder="0" 
+        name="MSearchResults" id="MSearchResults">
+</iframe>
+</div>
+
+<div class="header">
+  <div class="summary">
+<a href="#groups">Modules</a>  </div>
+  <div class="headertitle">
+<div class="title">Supervised Learning</div>  </div>
+</div><!--header-->
+<div class="contents">
+<a name="details" id="details"></a><h2 class="groupheader">Detailed 
Description</h2>
+<p>Methods to perform a variety of supervised learning tasks. </p>
+<table class="memberdecls">
+<tr class="heading"><td colspan="2"><h2 class="groupheader"><a 
name="groups"></a>
+Modules</h2></td></tr>
+<tr class="memitem:group__grp__crf"><td class="memItemLeft" align="right" 
valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" 
href="group__grp__crf.html">Conditional Random Field</a></td></tr>
+<tr class="memdesc:group__grp__crf"><td class="mdescLeft">&#160;</td><td 
class="mdescRight">Constructs a Conditional Random Fields (CRF) model for 
labeling sequential data. <br /></td></tr>
+<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:group__grp__nn"><td class="memItemLeft" align="right" 
valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" 
href="group__grp__nn.html">Neural Network</a></td></tr>
+<tr class="memdesc:group__grp__nn"><td class="mdescLeft">&#160;</td><td 
class="mdescRight">Solves classification and regression problems with several 
fully connected layers and non-linear activation functions. <br /></td></tr>
+<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:group__grp__regml"><td class="memItemLeft" align="right" 
valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" 
href="group__grp__regml.html">Regression Models</a></td></tr>
+<tr class="memdesc:group__grp__regml"><td class="mdescLeft">&#160;</td><td 
class="mdescRight">A collection of methods for modeling conditional expectation 
of a response variable. <br /></td></tr>
+<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:group__grp__svm"><td class="memItemLeft" align="right" 
valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" 
href="group__grp__svm.html">Support Vector Machines</a></td></tr>
+<tr class="memdesc:group__grp__svm"><td class="mdescLeft">&#160;</td><td 
class="mdescRight">Solves classification and regression problems by separating 
data with a hyperplane or other nonlinear decision boundary. <br /></td></tr>
+<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
+<tr class="memitem:group__grp__tree"><td class="memItemLeft" align="right" 
valign="top">&#160;</td><td class="memItemRight" valign="bottom"><a class="el" 
href="group__grp__tree.html">Tree Methods</a></td></tr>
+<tr class="memdesc:group__grp__tree"><td class="mdescLeft">&#160;</td><td 
class="mdescRight">A collection of recursive partitioning (tree) methods. <br 
/></td></tr>
+<tr class="separator:"><td class="memSeparator" colspan="2">&#160;</td></tr>
+</table>
+</div><!-- contents -->
+</div><!-- doc-content -->
+<!-- start footer part -->
+<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
+  <ul>
+    <li class="footer">Generated on Mon Oct 15 2018 11:24:30 for MADlib by
+    <a href="http://www.doxygen.org/index.html">
+    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.14 </li>
+  </ul>
+</div>
+</body>
+</html>

http://git-wip-us.apache.org/repos/asf/madlib-site/blob/af0e5f14/docs/v1.15.1/group__grp__super.js
----------------------------------------------------------------------
diff --git a/docs/v1.15.1/group__grp__super.js 
b/docs/v1.15.1/group__grp__super.js
new file mode 100644
index 0000000..c36abae
--- /dev/null
+++ b/docs/v1.15.1/group__grp__super.js
@@ -0,0 +1,8 @@
+var group__grp__super =
+[
+    [ "Conditional Random Field", "group__grp__crf.html", null ],
+    [ "Neural Network", "group__grp__nn.html", null ],
+    [ "Regression Models", "group__grp__regml.html", "group__grp__regml" ],
+    [ "Support Vector Machines", "group__grp__svm.html", null ],
+    [ "Tree Methods", "group__grp__tree.html", "group__grp__tree" ]
+];
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/madlib-site/blob/af0e5f14/docs/v1.15.1/group__grp__svd.html
----------------------------------------------------------------------
diff --git a/docs/v1.15.1/group__grp__svd.html 
b/docs/v1.15.1/group__grp__svd.html
new file mode 100644
index 0000000..0a8c79e
--- /dev/null
+++ b/docs/v1.15.1/group__grp__svd.html
@@ -0,0 +1,424 @@
+<!-- HTML header for doxygen 1.8.4-->
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
+<meta http-equiv="X-UA-Compatible" content="IE=9"/>
+<meta name="generator" content="Doxygen 1.8.14"/>
+<meta name="keywords" content="madlib,postgres,greenplum,machine learning,data 
mining,deep learning,ensemble methods,data science,market basket 
analysis,affinity analysis,pca,lda,regression,elastic net,huber 
white,proportional hazards,k-means,latent dirichlet allocation,bayes,support 
vector machines,svm"/>
+<title>MADlib: Singular Value Decomposition</title>
+<link href="tabs.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="jquery.js"></script>
+<script type="text/javascript" src="dynsections.js"></script>
+<link href="navtree.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="resize.js"></script>
+<script type="text/javascript" src="navtreedata.js"></script>
+<script type="text/javascript" src="navtree.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(initResizable);
+/* @license-end */</script>
+<link href="search/search.css" rel="stylesheet" type="text/css"/>
+<script type="text/javascript" src="search/searchdata.js"></script>
+<script type="text/javascript" src="search/search.js"></script>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+  $(document).ready(function() { init_search(); });
+/* @license-end */
+</script>
+<script type="text/x-mathjax-config">
+  MathJax.Hub.Config({
+    extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
+    jax: ["input/TeX","output/HTML-CSS"],
+});
+</script><script type="text/javascript" async 
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/MathJax.js"></script>
+<!-- hack in the navigation tree -->
+<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
+<link href="doxygen.css" rel="stylesheet" type="text/css" />
+<link href="madlib_extra.css" rel="stylesheet" type="text/css"/>
+<!-- google analytics -->
+<script>
+  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new 
Date();a=s.createElement(o),
+  
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+  ga('create', 'UA-45382226-1', 'madlib.apache.org');
+  ga('send', 'pageview');
+</script>
+</head>
+<body>
+<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
+<div id="titlearea">
+<table cellspacing="0" cellpadding="0">
+ <tbody>
+ <tr style="height: 56px;">
+  <td id="projectlogo"><a href="http://madlib.apache.org"><img alt="Logo" 
src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td>
+  <td style="padding-left: 0.5em;">
+   <div id="projectname">
+   <span id="projectnumber">1.15.1</span>
+   </div>
+   <div id="projectbrief">User Documentation for Apache MADlib</div>
+  </td>
+   <td>        <div id="MSearchBox" class="MSearchBoxInactive">
+        <span class="left">
+          <img id="MSearchSelect" src="search/mag_sel.png"
+               onmouseover="return searchBox.OnSearchSelectShow()"
+               onmouseout="return searchBox.OnSearchSelectHide()"
+               alt=""/>
+          <input type="text" id="MSearchField" value="Search" accesskey="S"
+               onfocus="searchBox.OnSearchFieldFocus(true)" 
+               onblur="searchBox.OnSearchFieldFocus(false)" 
+               onkeyup="searchBox.OnSearchFieldChange(event)"/>
+          </span><span class="right">
+            <a id="MSearchClose" 
href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" 
border="0" src="search/close.png" alt=""/></a>
+          </span>
+        </div>
+</td>
+ </tr>
+ </tbody>
+</table>
+</div>
+<!-- end header part -->
+<!-- Generated by Doxygen 1.8.14 -->
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+var searchBox = new SearchBox("searchBox", "search",false,'Search');
+/* @license-end */
+</script>
+</div><!-- top -->
+<div id="side-nav" class="ui-resizable side-nav-resizable">
+  <div id="nav-tree">
+    <div id="nav-tree-contents">
+      <div id="nav-sync" class="sync"></div>
+    </div>
+  </div>
+  <div id="splitbar" style="-moz-user-select:none;" 
+       class="ui-resizable-handle">
+  </div>
+</div>
+<script type="text/javascript">
+/* @license 
magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt 
GPL-v2 */
+$(document).ready(function(){initNavTree('group__grp__svd.html','');});
+/* @license-end */
+</script>
+<div id="doc-content">
+<!-- window showing the filter options -->
+<div id="MSearchSelectWindow"
+     onmouseover="return searchBox.OnSearchSelectShow()"
+     onmouseout="return searchBox.OnSearchSelectHide()"
+     onkeydown="return searchBox.OnSearchSelectKey(event)">
+</div>
+
+<!-- iframe showing the search results (closed by default) -->
+<div id="MSearchResultsWindow">
+<iframe src="javascript:void(0)" frameborder="0" 
+        name="MSearchResults" id="MSearchResults">
+</iframe>
+</div>
+
+<div class="header">
+  <div class="headertitle">
+<div class="title">Singular Value Decomposition<div class="ingroups"><a 
class="el" href="group__grp__datatrans.html">Data Types and Transformations</a> 
&raquo; <a class="el" href="group__grp__arraysmatrix.html">Arrays and 
Matrices</a> &raquo; <a class="el" 
href="group__grp__matrix__factorization.html">Matrix 
Factorization</a></div></div>  </div>
+</div><!--header-->
+<div class="contents">
+<div class="toc"><b>Contents</b> <ul>
+<li>
+<a href="#syntax">SVD Functions</a> </li>
+<li>
+<a href="#output">Output Tables</a> </li>
+<li>
+<a href="#examples">Examples</a></li>
+<li>
+<a href="#background">Technical Background</a> </li>
+</ul>
+</div><p>In linear algebra, the singular value decomposition (SVD) is a 
factorization of a real or complex matrix, with many useful applications in 
signal processing and statistics.</p>
+<p>Let \(A\) be an \(m \times n\) matrix, where \(m \ge n\). Then \(A\) can be 
decomposed as follows: </p><p class="formulaDsp">
+\[ A = U \Sigma V^T, \]
+</p>
+<p> where \(U\) is an \(m \times n\) matrix with orthonormal columns, 
\(\Sigma\) is an \(n \times n\) diagonal matrix, and \(V\) is an \(n \times n\) 
orthonormal matrix. The diagonal elements of \(\Sigma\) are called the 
<em>singular values</em>.</p>
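+<p>As a small worked example (chosen here for illustration, not taken from the MADlib sources), take \(m = 3\) and \(n = 2\): </p><p class="formulaDsp">
+\[ A = \begin{pmatrix} 3 &amp; 0 \\ 0 &amp; 2 \\ 0 &amp; 0 \end{pmatrix}, \quad U = \begin{pmatrix} 1 &amp; 0 \\ 0 &amp; 1 \\ 0 &amp; 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 3 &amp; 0 \\ 0 &amp; 2 \end{pmatrix}, \quad V = \begin{pmatrix} 1 &amp; 0 \\ 0 &amp; 1 \end{pmatrix}. \]
+</p>
+<p> Here \(U^T U = I\) and \(V^T V = I\), the product \(U \Sigma V^T\) reproduces \(A\), and the singular values are 3 and 2.</p>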
+<p><a class="anchor" id="syntax"></a></p><dl class="section user"><dt>SVD 
Functions</dt><dd></dd></dl>
+<p>SVD factorizations are provided for dense and sparse matrices. In addition, 
a native implementation is provided for very sparse matrices for improved 
performance.</p>
+<p><b>SVD Function for Dense Matrices</b></p>
+<pre class="syntax">
+svd( source_table,
+     output_table_prefix,
+     row_id,
+     k,
+     n_iterations,
+     result_summary_table
+);
+</pre><p> <b>Arguments</b> </p><dl class="arglist">
+<dt>source_table </dt>
+<dd><p class="startdd">TEXT. Source table name (dense matrix).</p>
+<p class="enddd">The table contains a <code>row_id</code> column that 
identifies each row, with numbering starting from 1. The other columns contain 
the data for the matrix. There are two possible dense formats as illustrated by 
the 2x2 matrix example below. You can use either of these dense formats:</p><ol 
type="1">
+<li><pre class="example">
+            row_id     col1     col2
+row1         1           1         0
+row2         2           0         1
+    </pre></li>
+<li><pre class="example">
+        row_id     row_vec
+row1        1       {1, 0}
+row2        2       {0, 1}
+    </pre>  </li>
+</ol>
+</dd>
+<dt>output_table_prefix </dt>
+<dd>TEXT. Prefix for output tables. See <a href="#output">Output Tables</a> 
below for a description of the convention used. </dd>
+<dt>row_id </dt>
+<dd>TEXT. Name of the column containing the ID for each row. </dd>
+<dt>k </dt>
+<dd>INTEGER. Number of singular values to compute. </dd>
+<dt>n_iterations (optional) </dt>
+<dd>INTEGER. Number of iterations to run. <dl class="section 
note"><dt>Note</dt><dd>The number of iterations must be in the range [k, column 
dimension], where k is the number of singular values to compute. </dd></dl>
+</dd>
+<dt>result_summary_table (optional) </dt>
+<dd>TEXT. The name of the table to store the result summary. </dd>
+</dl>
+<hr/>
+<p> <b>SVD Function for Sparse Matrices</b></p>
+<p>Use this function for matrices that are represented in the sparse-matrix 
format (example below). <b>Note that the input matrix is converted to a dense 
matrix before the SVD operation, for computational efficiency. </b></p>
+<pre class="syntax">
+svd_sparse( source_table,
+            output_table_prefix,
+            row_id,
+            col_id,
+            value,
+            row_dim,
+            col_dim,
+            k,
+            n_iterations,
+            result_summary_table
+          );
+</pre><p> <b>Arguments</b> </p><dl class="arglist">
+<dt>source_table </dt>
+<dd><p class="startdd">TEXT. Source table name (sparse matrix).</p>
+<p>A sparse matrix is represented using the row and column indices for each 
non-zero entry of the matrix. This representation is useful for matrices 
in which most elements are zero. Below is an example of a sparse 4x7 matrix 
with just 6 out of 28 entries being non-zero. The dimensionality of the matrix 
is inferred using the max value in the <em>row_id</em> and <em>col_id</em> 
columns. Note that the last entry is included (even though its value is 0) to 
provide the dimensionality of the matrix, indicating that the 4th row and 7th 
column contain all zeros. 
</p><pre class="example">
+ row_id | col_id | value
+--------+--------+-------
+      1 |      1 |     9
+      1 |      5 |     6
+      1 |      6 |     6
+      2 |      1 |     8
+      3 |      1 |     3
+      3 |      2 |     9
+      4 |      7 |     0
+(7 rows)
+</pre> <p class="enddd"></p>
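+<p>For reference, the dense 4x7 matrix encoded by the sparse table above can be written out as follows (a hand expansion added for illustration; every position not listed in the table is zero): </p><p class="formulaDsp">
+\[ \begin{pmatrix} 9 &amp; 0 &amp; 0 &amp; 0 &amp; 6 &amp; 6 &amp; 0 \\ 8 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \\ 3 &amp; 9 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \\ 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 &amp; 0 \end{pmatrix} \]
+</p>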
+</dd>
+<dt>output_table_prefix </dt>
+<dd>TEXT. Prefix for output tables. See <a href="#output">Output Tables</a> 
below for a description of the convention used.  </dd>
+<dt>row_id </dt>
+<dd>TEXT. Name of the column containing the row index for each entry in the 
sparse matrix. </dd>
+<dt>col_id </dt>
+<dd>TEXT. Name of the column containing the column index for each entry in 
the sparse matrix. </dd>
+<dt>value </dt>
+<dd>TEXT. Name of the column containing the non-zero values of the sparse 
matrix. </dd>
+<dt>row_dim </dt>
+<dd>INTEGER. Number of rows in matrix. </dd>
+<dt>col_dim </dt>
+<dd>INTEGER. Number of columns in matrix. </dd>
+<dt>k </dt>
+<dd>INTEGER. Number of singular values to compute. </dd>
+<dt>n_iterations (optional) </dt>
+<dd>INTEGER. Number of iterations to run. <dl class="section 
note"><dt>Note</dt><dd>The number of iterations must be in the range [k, column 
dimension], where k is the number of singular values to compute. </dd></dl>
+</dd>
+<dt>result_summary_table (optional) </dt>
+<dd>TEXT. The name of the table to store the result summary. </dd>
+</dl>
+<hr/>
+<p> <b>Native Implementation for Sparse Matrices</b></p>
+<p>Use this function for matrices that are represented in the sparse-matrix 
format (see sparse matrix example above). This function uses the native sparse 
representation while computing the SVD. </p><dl class="section 
note"><dt>Note</dt><dd>Prefer this function when the matrix is highly sparse, 
since it works on the sparse representation directly instead of first 
converting the input to a dense matrix. </dd></dl>
+<pre class="syntax">
+svd_sparse_native( source_table,
+                   output_table_prefix,
+                   row_id,
+                   col_id,
+                   value,
+                   row_dim,
+                   col_dim,
+                   k,
+                   n_iterations,
+                   result_summary_table
+                 );
+</pre><p> <b>Arguments</b> </p><dl class="arglist">
+<dt>source_table </dt>
+<dd>TEXT. Source table name (sparse matrix - see example above). </dd>
+<dt>output_table_prefix </dt>
+<dd>TEXT. Prefix for output tables. See <a href="#output">Output Tables</a> 
below for a description of the convention used. </dd>
+<dt>row_id </dt>
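+<dd>TEXT. Name of the column containing the row index for each entry in the 
sparse matrix. </dd>
+<dt>col_id </dt>
+<dd>TEXT. Name of the column containing the column index for each entry in 
the sparse matrix. </dd>
+<dt>value </dt>
+<dd>TEXT. Name of the column containing the non-zero values of the sparse 
matrix. </dd>
+<dt>row_dim </dt>
+<dd>INTEGER. Number of rows in the sparse matrix. </dd>
+<dt>col_dim </dt>
+<dd>INTEGER. Number of columns in the sparse matrix. </dd>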
+<dt>k </dt>
+<dd>INTEGER. Number of singular values to compute. </dd>
+<dt>n_iterations (optional) </dt>
+<dd>INTEGER. Number of iterations to run. <dl class="section 
note"><dt>Note</dt><dd>The number of iterations must be in the range [k, column 
dimension], where k is the number of singular values to compute. </dd></dl>
+</dd>
+<dt>result_summary_table (optional) </dt>
+<dd>TEXT. Table name to store result summary. </dd>
+</dl>
+<hr/>
+<p><a class="anchor" id="output"></a></p><dl class="section user"><dt>Output 
Tables</dt><dd></dd></dl>
+<p>Output for singular vectors/values is in the following three tables:</p><ul>
+<li>Left singular matrix: Table is named &lt;output_table_prefix&gt;_u (e.g. 
'netflix_u')</li>
+<li>Right singular matrix: Table is named &lt;output_table_prefix&gt;_v (e.g. 
'netflix_v')</li>
+<li>Singular values: Table is named &lt;output_table_prefix&gt;_s (e.g. 
'netflix_s')</li>
+</ul>
+<p>The left and right singular vector tables are of the format: </p><table 
class="output">
+<tr>
+<th>row_id </th><td>INTEGER. The ID corresponding to each singular value (in 
decreasing order).  </td></tr>
+<tr>
+<th>row_vec </th><td>FLOAT8[]. Singular vector elements for this row_id. Each 
array is of size k.  </td></tr>
+</table>
+<p>The singular values table is in sparse table format, since only the 
diagonal elements of the matrix are non-zero: </p><table class="output">
+<tr>
+<th>row_id </th><td>INTEGER. <em>i</em> for <em>ith</em> singular value.  
</td></tr>
+<tr>
+<th>col_id </th><td>INTEGER. <em>i</em> for <em>ith</em> singular value (same 
as row_id).  </td></tr>
+<tr>
+<th>value </th><td>FLOAT8. Singular value.  </td></tr>
+</table>
+<p>All <code>row_id</code> and <code>col_id</code> in the above tables start 
from 1.</p>
+<p>The result summary table has the following columns: </p><table 
class="output">
+<tr>
+<th>rows_used </th><td>INTEGER. Number of rows used for SVD calculation.  
</td></tr>
+<tr>
+<th>exec_time </th><td>FLOAT8. Total time for executing SVD.  </td></tr>
+<tr>
+<th>iter </th><td>INTEGER. Total number of iterations run.  </td></tr>
+<tr>
+<th>recon_error </th><td>FLOAT8. Total quality score (i.e. approximation 
quality) for this orthonormal basis.  </td></tr>
+<tr>
+<th>relative_recon_error </th><td>FLOAT8. Relative quality score.  </td></tr>
+</table>
+<p>In the result summary table, the reconstruction error is computed as \( 
\sqrt{mean((X - USV^T)_{ij}^2)} \), where the average is over all elements of 
the matrices. The relative reconstruction error is then computed as the ratio of 
the reconstruction error to \( \sqrt{mean(X_{ij}^2)} \).</p>
+<p><a class="anchor" id="examples"></a></p><dl class="section 
user"><dt>Examples</dt><dd></dd></dl>
+<ol type="1">
+<li>View online help for the SVD function. <pre class="example">
+SELECT madlib.svd();
+</pre></li>
+<li>Create an input dataset (dense matrix). <pre class="example">
+DROP TABLE IF EXISTS mat, mat_sparse, svd_summary_table, svd_u, svd_v, svd_s;
+CREATE TABLE mat (
+    row_id integer,
+    row_vec double precision[]
+);
+INSERT INTO mat VALUES
+(1,'{396,840,353,446,318,886,15,584,159,383}'),
+(2,'{691,58,899,163,159,533,604,582,269,390}'),
+(3,'{293,742,298,75,404,857,941,662,846,2}'),
+(4,'{462,532,787,265,982,306,600,608,212,885}'),
+(5,'{304,151,337,387,643,753,603,531,459,652}'),
+(6,'{327,946,368,943,7,516,272,24,591,204}'),
+(7,'{877,59,260,302,891,498,710,286,864,675}'),
+(8,'{458,959,774,376,228,354,300,669,718,565}'),
+(9,'{824,390,818,844,180,943,424,520,65,913}'),
+(10,'{882,761,398,688,761,405,125,484,222,873}'),
+(11,'{528,1,860,18,814,242,314,965,935,809}'),
+(12,'{492,220,576,289,321,261,173,1,44,241}'),
+(13,'{415,701,221,503,67,393,479,218,219,916}'),
+(14,'{350,192,211,633,53,783,30,444,176,932}'),
+(15,'{909,472,871,695,930,455,398,893,693,838}'),
+(16,'{739,651,678,577,273,935,661,47,373,618}');
+</pre></li>
+<li>Run SVD function for a dense matrix. <pre class="example">
+SELECT madlib.svd( 'mat',       -- Input table
+                   'svd',       -- Output table prefix
+                   'row_id',    -- Column name with row index
+                   10,          -- Number of singular values to compute
+                   NULL,        -- Use default number of iterations
+                   'svd_summary_table'  -- Result summary table
+                 );
+</pre></li>
+<li>Print out the singular values and the summary table. For the singular 
values: <pre class="example">
+SELECT * FROM svd_s ORDER BY row_id;
+</pre> Result: <pre class="result">
+ row_id | col_id |      value
+&#160;--------+--------+------------------
+      1 |      1 | 6475.67225281804
+      2 |      2 | 1875.18065580415
+      3 |      3 | 1483.25228429636
+      4 |      4 | 1159.72262897427
+      5 |      5 | 1033.86092570574
+      6 |      6 | 948.437358703966
+      7 |      7 | 795.379572772455
+      8 |      8 | 709.086240684469
+      9 |      9 | 462.473775959371
+     10 |     10 | 365.875217945698
+     10 |     10 |
+(11 rows)
+</pre> For the summary table: <pre class="example">
+SELECT * FROM svd_summary_table;
+</pre> Result: <pre class="result">
+ rows_used | exec_time (ms) | iter |    recon_error    | relative_recon_error
+&#160;-----------+----------------+------+-------------------+----------------------
+        16 |        1332.47 |   10 | 4.36920148766e-13 |    7.63134130332e-16
+(1 row)
+</pre></li>
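+<li>Optionally, cross-check the relative reconstruction error. This is a minimal sketch rather than part of the MADlib API: it assumes the <code>mat</code> and <code>svd_summary_table</code> tables from the steps above exist, it uses only standard PostgreSQL functions, and the aliases <code>cells</code>, <code>rms_x</code> and <code>recomputed_relative_error</code> are illustrative names. It divides the reported reconstruction error by \( \sqrt{mean(X_{ij}^2)} \) computed directly from the input matrix, so the two output columns should agree up to floating-point rounding. <pre class="example">
+-- Assumed tables: mat (dense input) and svd_summary_table (SVD summary).
+SELECT s.recon_error / r.rms_x AS recomputed_relative_error,
+       s.relative_recon_error
+FROM svd_summary_table AS s,
+     ( -- Root mean square over every element of the dense matrix
+       SELECT sqrt(avg(x * x)) AS rms_x
+       FROM (SELECT unnest(row_vec) AS x FROM mat) AS cells
+     ) AS r;
+</pre></li>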
+<li>Create a sparse matrix by running the <a class="el" 
href="matrix__ops_8sql__in.html#a390fb7234f49e17c780e961184873759">matrix_sparsify()</a>
 utility function on the dense matrix. <pre class="example">
+SELECT madlib.matrix_sparsify('mat',
+                              'row=row_id, val=row_vec',
+                              'mat_sparse',
+                              'row=row_id, col=col_id, val=value');
+</pre></li>
+<li>Run the SVD function for a sparse matrix. <pre class="example">
+SELECT madlib.svd_sparse( 'mat_sparse',   -- Input table
+                          'svd',          -- Output table prefix
+                          'row_id',       -- Column name with row index
+                          'col_id',       -- Column name with column index
+                          'value',        -- Matrix cell value
+                          16,             -- Number of rows in matrix
+                          10,             -- Number of columns in matrix
+                          10              -- Number of singular values to compute
+                          );
+</pre></li>
+<li>Run the SVD function for a very sparse matrix. <pre class="example">
+SELECT madlib.svd_sparse_native ( 'mat_sparse',   -- Input table
+                          'svd',          -- Output table prefix
+                          'row_id',       -- Column name with row index
+                          'col_id',       -- Column name with column index
+                          'value',        -- Matrix cell value
+                          16,             -- Number of rows in matrix
+                          10,             -- Number of columns in matrix
+                          10              -- Number of singular values to compute
+                          );
+</pre> <a class="anchor" id="background"></a><dl class="section 
user"><dt>Technical Background</dt><dd>In linear algebra, the singular value 
decomposition (SVD) is a factorization of a real or complex matrix, with many 
useful applications in signal processing and statistics. Let \(A\) be an \(m 
\times n\) matrix, where \(m \ge n\). Then \(A\) can be decomposed as follows: 
<p class="formulaDsp">
+\[ A = U \Sigma V^T, \]
+</p>
+ where \(U\) is an \(m \times n\) matrix with orthonormal columns, \(\Sigma\) is an \(n 
\times n\) diagonal matrix, and \(V\) is an \(n \times n\) orthonormal matrix. 
The diagonal elements of \(\Sigma\) are called the <em>singular values</em>. It 
is possible to formulate the problem of computing the singular triplets ( 
\(\sigma_i, u_i, v_i\)) of \(A\) as an eigenvalue problem involving a Hermitian 
matrix related to \(A\). There are two possible ways of achieving 
this:</dd></dl>
+</li>
+</ol>
+<ul>
+<li>With the cross product matrix, either \(A^TA\) or \(AA^T\)</li>
+<li>With the cyclic matrix <p class="formulaDsp">
+\[ H(A) = \begin{bmatrix} 0 &amp; A\\ A^* &amp; 0 \end{bmatrix} \]
+</p>
+ The singular values are the nonnegative square roots of the eigenvalues of 
the cross product matrix. This approach may imply a severe loss of accuracy in 
the smallest singular values. The cyclic matrix approach is an alternative that 
avoids this problem, but at the expense of significantly increasing the cost of 
the computation. Computing the cross product matrix explicitly is not 
recommended, especially in the case of sparse \(A\). Bidiagonalization was proposed 
by Golub and Kahan [citation?] as a way of tridiagonalizing the cross product 
matrix without forming it explicitly. Consider the following decomposition <p 
class="formulaDsp">
+\[ A = P B Q^T, \]
+</p>
+ where \(P\) and \(Q\) are unitary matrices and \(B\) is an \(m \times n\) 
upper bidiagonal matrix. Then the tridiagonal matrix \(B^*B\) is unitarily 
similar to \(A^*A\). Additionally, specific methods exist that compute the 
singular values of \(B\) without forming \(B^*B\). Therefore, after computing 
the SVD of \(B\), <p class="formulaDsp">
+\[ B = X\Sigma Y^T, \]
+</p>
+ it only remains to combine the factors to obtain the SVD of the original 
matrix, with \(U = PX\) and \(V = QY\). </li>
+</ul>
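+<p>The link between the singular values and the cross product matrix can be made explicit with a short worked derivation. This is a sketch for real \(A\) that uses only the definitions above. Substituting \(A = U \Sigma V^T\) and using \(U^TU = I\) gives </p>
+<p class="formulaDsp">
+\[ A^TA = V \Sigma^T U^T U \Sigma V^T = V \Sigma^2 V^T, \]
+</p>
+<p>so the eigenvalues of \(A^TA\) are \(\sigma_i^2\) and the corresponding eigenvectors are the columns \(v_i\) of \(V\). Because the singular values are squared, a singular value that is a factor \(\epsilon\) smaller than the largest one corresponds to an eigenvalue that is a factor \(\epsilon^2\) smaller, which is why the smallest singular values recovered from the cross product matrix can lose accuracy.</p>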
+</div><!-- contents -->
+</div><!-- doc-content -->
+<!-- start footer part -->
+<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
+  <ul>
+    <li class="footer">Generated on Mon Oct 15 2018 11:24:30 for MADlib by
+    <a href="http://www.doxygen.org/index.html">
+    <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.14 </li>
+  </ul>
+</div>
+</body>
+</html>
