kaknikhil commented on code in PR #615:
URL: https://github.com/apache/madlib/pull/615#discussion_r1511755950
##########
src/ports/postgres/modules/pmml/table_to_pmml.sql_in:
##########
@@ -146,10 +144,25 @@ Alternatively, the above can also be invoked as below if
custom names are needed
for fields in the Data Dictionary:
<pre class="example">
SELECT madlib.pmml('patients_logregr',
- 'out_attack~1+in_trait_anxiety+in_treatment');
+ 'out_attack~in_trait_anxiety+in_treatment');
</pre>
-\b Note: If the second argument of 'pmml' function is not specified, a default
suffix "_pmml_prediction" will be automatically append to the column name to be
predicted. This can help avoid name conflicts.
+\b Note: 1. If the second argument of 'pmml' function is not specified, a
default suffix "_pmml_prediction" will be automatically append to the column
name to be predicted. This can help avoid name conflicts.
+
+\b Note: 2. While training regression models, it is possible to use a non
array expression. Consider this example:
+<pre>
+-- Create a table where a column named 'x' is an array of the independent
variables
+CREATE TABLE patients2 AS SELECT second_attack AS y, ARRAY[1, treatment,
trait_anxiety] AS x from patients;
+
+-- Now use the columns 'x' and 'y' created in the previous step
+SELECT madlib.logregr_train(
+ 'patients2',
+ 'patients_logregr2',
+ 'y',
+ 'x');
+</pre>
+In such scenarios, the pmml code always assumes that the intercept variable
"1," was already included in the independent variable
+expression. If it is not included, the exported PMML would be incorrect.
Review Comment:
> If it is not included, the exported PMML would not explicitly capture it as intercept

Sure, but I would still mention that the exported PMML would be incorrect.
This is because we would treat x1's coefficient as the intercept's
coefficient, and the total number of predictors would go down by 1.
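
To illustrate the concern, here is a minimal Python sketch. It is not MADlib's exporter code: `predict_true` and `predict_as_exported` are hypothetical helpers and the coefficient values are made up. It assumes only what is described above, namely that the export always takes the first coefficient as the intercept and pairs the remaining coefficients with the features after the first one.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_true(coefs, x):
    """Logistic prediction as trained: every coefficient multiplies its feature."""
    return _sigmoid(sum(c * v for c, v in zip(coefs, x)))


def predict_as_exported(coefs, x):
    """Assumed export behaviour: coefs[0] is the intercept and the remaining
    coefficients are paired with the features after x[0]."""
    intercept, rest = coefs[0], coefs[1:]
    return _sigmoid(intercept + sum(c * v for c, v in zip(rest, x[1:])))


# Case 1: x = ARRAY[1, treatment, trait_anxiety]; the leading 1 is present.
coefs_with_1 = [-6.0, -1.0, 0.1]   # made-up values
x_with_1 = [1, 0, 50]

# Case 2: x = ARRAY[treatment, trait_anxiety]; no leading 1.
coefs_without_1 = [-1.0, 0.1]      # made-up values
x_without_1 = [0, 50]

same = predict_true(coefs_with_1, x_with_1) == predict_as_exported(coefs_with_1, x_with_1)
differ = predict_true(coefs_without_1, x_without_1) != predict_as_exported(coefs_without_1, x_without_1)
exported_predictors = len(coefs_without_1) - 1  # one predictor left instead of two
```

With the leading 1 present, both functions agree. Without it, the treatment coefficient is applied as a constant term and only one predictor remains in the exported model.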