GitHub user njayaram2 commented on a diff in the pull request:
https://github.com/apache/incubator-madlib/pull/49#discussion_r67965192
--- Diff: src/ports/postgres/modules/utilities/sessionize.py_in ---
@@ -35,41 +36,83 @@ def sessionize(schema_madlib, source_table,
output_table, partition_expr,
@param source_table: str, Name of the input table/view
@param output_table: str, Name of the table to store result
@param partition_expr: str, Expression to partition (group) the
input data
- @param time_stamp: str, Column name with time used for
sessionization calculation
+ @param time_stamp: str, The time stamp column name that is used
for sessionization calculation
@param max_time: interval, Delta time between subsequent events to
define a session
-
+ @param output_cols: str, list of columns the output table/view
must contain (default '*'):
+ * - all columns in the input table, and a new
session ID column
+ 'a,b,c,...' - a comma separated list of column
names/expressions to be projected, along with a new session ID column
--- End diff --
Processing strings will be error-prone and will add a lot of complexity.
Nevertheless, this might be a good-to-have feature. I will leave this as is
for Phase 2 and revisit it in Phase 3.
I have added a comment to the Phase 3 JIRA
(https://issues.apache.org/jira/browse/MADLIB-1002).
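
To illustrate why string processing can be error-prone here, below is a minimal sketch, assuming the concern is splitting an `output_cols` value such as `'coalesce(a, b), c'` into individual expressions. The helper names `naive_split` and `split_top_level` are hypothetical and are not part of MADlib. Splitting on every comma breaks expressions that contain commas, so the second helper only splits on commas outside parentheses and single-quoted literals.

```python
def naive_split(output_cols):
    """Split on every comma; breaks on expressions that contain commas."""
    return [c.strip() for c in output_cols.split(',')]


def split_top_level(output_cols):
    """Split only on commas outside parentheses and single-quoted literals.

    Hypothetical helper for illustration; it does not handle double-quoted
    identifiers, square brackets, or dollar-quoted strings.
    """
    parts, current, depth, in_quote = [], [], 0, False
    for ch in output_cols:
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote:
            if ch == '(':
                depth += 1
            elif ch == ')':
                depth -= 1
            elif ch == ',' and depth == 0:
                # Top-level comma: close the current expression.
                parts.append(''.join(current).strip())
                current = []
                continue
        current.append(ch)
    parts.append(''.join(current).strip())
    return parts


if __name__ == '__main__':
    print(naive_split("coalesce(a, b), c"))
    print(split_top_level("coalesce(a, b), c"))
```

Even the second version leaves several SQL quoting forms unhandled, which is the kind of complexity that argues for deferring this to Phase 3.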