Repository: madlib
Updated Branches:
  refs/heads/master a3b59356f -> 5e707f745

add note to user docs on vec2cols about unequal arrays

Project: http://git-wip-us.apache.org/repos/asf/madlib/repo
Commit: http://git-wip-us.apache.org/repos/asf/madlib/commit/5e707f74
Tree: http://git-wip-us.apache.org/repos/asf/madlib/tree/5e707f74
Diff: http://git-wip-us.apache.org/repos/asf/madlib/diff/5e707f74

Branch: refs/heads/master
Commit: 5e707f745c50343dd7395a3e8f86c04428210977
Parents: a3b5935
Author: Frank McQuillan <fmcquil...@pivotal.io>
Authored: Fri Aug 17 13:38:20 2018 -0700
Committer: Frank McQuillan <fmcquil...@pivotal.io>
Committed: Fri Aug 17 13:38:20 2018 -0700

----------------------------------------------------------------------
 .../postgres/modules/stats/correlation.sql_in   | 10 +++++-----
 .../postgres/modules/utilities/cols2vec.sql_in  |  4 ++--
 .../postgres/modules/utilities/vec2cols.sql_in  | 19 ++++++++++++-------
 3 files changed, 19 insertions(+), 14 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/stats/correlation.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/stats/correlation.sql_in b/src/ports/postgres/modules/stats/correlation.sql_in
index 64ed27e..3bf3e46 100644
--- a/src/ports/postgres/modules/stats/correlation.sql_in
+++ b/src/ports/postgres/modules/stats/correlation.sql_in
@@ -222,7 +222,7 @@ SELECT * FROM example_data_output ORDER BY column_position;
 <pre class="result">
  column_position |  variable   |     temperature     | humidity
 -----------------+-------------+---------------------+----------
-               1 | temperature |                   1 | 
+               1 | temperature |                   1 |
                2 | humidity    | 0.00607993890408995 |        1
 (2 rows)
 </pre>
@@ -259,11 +259,11 @@ SELECT * FROM example_data_output ORDER BY day, column_position;
 <pre class="result">
  column_position |  variable   | day  |    temperature    | humidity
 -----------------+-------------+------+-------------------+----------
-               1 | temperature | Mon  |                 1 | 
+               1 | temperature | Mon  |                 1 |
                2 | humidity    | Mon  | 0.616876934548786 |        1
-               1 | temperature | Tues |                 1 | 
+               1 | temperature | Tues |                 1 |
                2 | humidity    | Tues | 0.616876934548786 |        1
-               1 | temperature | Wed  |                 1 | 
+               1 | temperature | Wed  |                 1 |
                2 | humidity    | Wed  | -0.28969669368457 |        1
 (6 rows)
 </pre>
@@ -315,7 +315,7 @@ SELECT * FROM example_data_output ORDER BY column_position;
 <pre class="result">
  column_position |  variable   |   temperature    |     humidity
 -----------------+-------------+------------------+------------------
-               1 | temperature | 507.926664293343 | 
+               1 | temperature | 507.926664293343 |
                2 | humidity    | 2.40227839088644 | 307.359914560342
 (2 rows)
 </pre>

http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/utilities/cols2vec.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/utilities/cols2vec.sql_in b/src/ports/postgres/modules/utilities/cols2vec.sql_in
index 82a1f94..0c54ab5 100644
--- a/src/ports/postgres/modules/utilities/cols2vec.sql_in
+++ b/src/ports/postgres/modules/utilities/cols2vec.sql_in
@@ -82,8 +82,8 @@ values.</dd>
 
 <dt>list_of_features_to_exclude (optional)</dt>
 <dd>TEXT. Default NULL.
-Comma-separated string of column names to exclude from the feature array.
-Typically used when 'list_of_features' is set to '*'.</dd>
+Comma-separated string of column names to exclude from the feature array. Typically used
+when 'list_of_features' is set to '*'.</dd>
 
 <dt>cols_to_output (optional)</dt>
 <dd>TEXT. Default NULL.

http://git-wip-us.apache.org/repos/asf/madlib/blob/5e707f74/src/ports/postgres/modules/utilities/vec2cols.sql_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/utilities/vec2cols.sql_in b/src/ports/postgres/modules/utilities/vec2cols.sql_in
index 989074c..115e015 100644
--- a/src/ports/postgres/modules/utilities/vec2cols.sql_in
+++ b/src/ports/postgres/modules/utilities/vec2cols.sql_in
@@ -72,23 +72,28 @@ vec2cols(
 same name already exists, an error will be returned.</tt>
 
 <dt>vector_col</dt>
-<dd>TEXT. Name of the column containing the feature array.
-Must be a one-dimensional array.</tt>
+<dd>TEXT. Name of the column containing the feature array. Must be a one-dimensional array.</tt>
 
 <dt>feature_names (optional)</dt>
-<dd>TEXT[]. Array of names associated with the feature array.
-Note that this array exists in the
-summary table created by the function 'cols2vec'.
-If the 'feature_names' array is not specified,
+<dd>TEXT[]. Array of names associated with the feature array. Note that
+this array exists in the summary table created by the function 'cols2vec'. If
+the 'feature_names' array is not specified,
 column names will be automatically generated of the form 'f1, f2, ...fn'.</tt>
+@note If you specify the 'feature_names' parameter, you will get exactly that number of
+feature columns in the 'output_table'. It means feature arrays from the 'vector_col' may be
+padded or truncated, if a particular feature array size does not match the target
+number of feature columns. <br><br>If you do not specify the 'feature names' parameter,
+the number of feature columns generated
+in the 'output_table' will be the maximum array size from 'vector_col'.
+Feature arrays that are less than this maximum will be padded.
 
 <dt>cols_to_output (optional)</dt>
 <dd>TEXT, default NULL. Comma-separated string of column names from the source table to keep in the output table, in addition to the feature columns.
 To keep all columns from the source table, use '*'.
-Note: total number of columns in a table cannot exceed the
+The total number of columns in a table cannot exceed the
 PostgreSQL limits.</tt>
 </dd>
 </dl>
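
The column-count rule stated in the new @note for vec2cols can be modeled outside the database. The Python sketch below is only an illustration of that rule, not MADlib's implementation: the function name `expand_vectors` is hypothetical, and the use of `None` as the padding value is an assumption, since the note says arrays are "padded" without naming the fill value.

```python
from typing import List, Optional, Sequence, Tuple


def expand_vectors(
    vectors: Sequence[Sequence[float]],
    feature_names: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[List[Optional[float]]]]:
    """Hypothetical model of the rule in the vec2cols @note.

    With feature_names: the output has exactly len(feature_names) columns,
    so longer arrays are truncated and shorter ones are padded.
    Without feature_names: the width is the longest array, and the
    columns are named f1, f2, ..., fn.
    """
    if feature_names is not None:
        names = list(feature_names)
        width = len(names)
    else:
        width = max((len(v) for v in vectors), default=0)
        names = [f"f{i}" for i in range(1, width + 1)]

    rows: List[List[Optional[float]]] = []
    for v in vectors:
        row: List[Optional[float]] = list(v[:width])  # truncate if too long
        # Assumption: pad with None, standing in for whatever fill value
        # the real function uses.
        row.extend([None] * (width - len(row)))
        rows.append(row)
    return names, rows


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [4.0]]
    print(expand_vectors(data, ["a", "b"]))
    print(expand_vectors(data))
```

Calling the function with two names on arrays of length 3 and 1 truncates the first array and pads the second; calling it without names widens every row to the longest array.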