kaknikhil commented on code in PR #615:
URL: https://github.com/apache/madlib/pull/615#discussion_r1511755950


##########
src/ports/postgres/modules/pmml/table_to_pmml.sql_in:
##########
@@ -146,10 +144,25 @@ Alternatively, the above can also be invoked as below if 
custom names are needed
 for fields in the Data Dictionary:
 <pre class="example">
 SELECT madlib.pmml('patients_logregr',
-                   'out_attack~1+in_trait_anxiety+in_treatment');
+                   'out_attack~in_trait_anxiety+in_treatment');
 </pre>
 
-\b Note: If the second argument of 'pmml' function is not specified, a default 
suffix "_pmml_prediction" will be automatically append to the column name to be 
predicted. This can help avoid name conflicts.
+\b Note: 1. If the second argument of 'pmml' function is not specified, a 
default suffix "_pmml_prediction" will be automatically append to the column 
name to be predicted. This can help avoid name conflicts.
+
+\b Note: 2. While training regression models, it is possible to use a non 
array expression. Consider this example:
+<pre>
+-- Create a table where a column named 'x' is an array of the independent 
variables
+CREATE TABLE patients2 AS SELECT second_attack AS y, ARRAY[1, treatment, 
trait_anxiety] AS x from patients;
+
+-- Now use the columns 'x' and 'y' created in the previous step
+SELECT madlib.logregr_train(
+        'patients2',
+        'patients_logregr2',
+        'y',
+        'x');
+</pre>
+In such scenarios, the pmml code always assumes that the intercept variable 
"1," was already included in the independent variable
+expression. If it is not included, the exported PMML would be incorrect.

Review Comment:
   > If it is not included, the exported PMML would not explicitly capture it 
as intercept
   
   Sure, but I would still mention that the exported PMML would be incorrect. 
This is because we would treat x1's coefficient as the intercept's 
coefficient, and the total number of predictors would go down by 1.
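
   To make the concern concrete, here is a minimal sketch in plain Python. It is not MADlib's exporter code: `split_coefficients`, `predict_true`, `predict_exported`, and all coefficient values are hypothetical, made up for illustration. It only assumes what the doc note says, namely that the export treats the first element of the independent variable array as the intercept.

```python
import math


def split_coefficients(coef):
    """Hypothetical export step: assume coef[0] belongs to the intercept."""
    return coef[0], list(coef[1:])


def predict_true(coef, x):
    """Logistic prediction as trained: sigmoid of the dot product coef . x."""
    z = sum(c * v for c, v in zip(coef, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict_exported(coef, x):
    """Prediction implied by the export: intercept plus the remaining terms.

    The first element of x is dropped because the export assumes it is the
    constant 1 that multiplies the intercept.
    """
    intercept, slopes = split_coefficients(coef)
    z = intercept + sum(c * v for c, v in zip(slopes, x[1:]))
    return 1.0 / (1.0 + math.exp(-z))


# Case 1: x = ARRAY[1, treatment, trait_anxiety] (intercept included).
coef_with = [-6.0, 1.0, 0.1]   # made-up coefficients
row_with = [1, 0, 50]
ok_true = predict_true(coef_with, row_with)
ok_exported = predict_exported(coef_with, row_with)

# Case 2: x = ARRAY[treatment, trait_anxiety] (intercept left out).
coef_without = [1.0, 0.1]      # made-up coefficients
row_without = [0, 50]
bad_true = predict_true(coef_without, row_without)
bad_exported = predict_exported(coef_without, row_without)

# In case 2 the coefficient of 'treatment' is taken as the intercept,
# so only one predictor is left instead of two.
_, slopes_without = split_coefficients(coef_without)
print(len(slopes_without), ok_true == ok_exported, bad_true == bad_exported)
```

   In the first case both predictions agree. In the second, the coefficient of `treatment` is always added as if it were the intercept, so the two predictions differ whenever `treatment` is not 1, and the model is left with one predictor fewer than it was trained with.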



