[ https://issues.apache.org/jira/browse/MADLIB-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525455#comment-16525455 ]

Nandish Jayaram commented on MADLIB-1232:
-----------------------------------------

w/ Arvind Sridhar
We ran install-check and dev-check on GPDB 5 and GPDB 4.3 clusters with 16 nodes.
We ran the tests 20 times on each cluster, and across all those runs there was only one failure.
The failure happens in SVM, on the following dev-check test query:
{code}
-- serial
-- learning
SELECT svm_classification(
    'svm_normalized',
    'svm_model',
    'label',
    'ind',
    NULL, -- kernel_func
    NULL, -- kernel_params
    NULL, -- grouping_col
    'init_stepsize=0.03, decay_factor=1, max_iter=5, tolerance=0, lambda=0',
    false -- verbose
    );
 svm_classification
--------------------

(1 row)

\x on
Expanded display is on.
SELECT * FROM svm_model;
-[ RECORD 1 ]------+-------------------------------------------------------------------------------------------------
coef               | {0.889674731059023,-0.829317968356413,-0.0191735576783983,0.0164118502023703,-0.948240000000001}
loss               | 291.345980000935
norm_of_gradient   | 141.442489577661
num_iterations     | 5
num_rows_processed | 1000
num_rows_skipped   | 0
dep_var_mapping    | {-1,1}

-- l2
SELECT svm_classification(
    'svm_normalized',
    'svm_model_small_norm2',
    'label',
    'ind',
    NULL, -- kernel_func
    NULL, -- kernel_params
    NULL, -- grouping_col
    'init_stepsize=0.03, decay_factor=1, max_iter=5, tolerance=0, lambda=1'
    );
 svm_classification
--------------------

(1 row)

\x on
Expanded display is on.
SELECT * FROM svm_model_small_norm2;
-[ RECORD 1 ]------+-------------------------------------------------------------------------------------------------
coef               | {0.889186884610682,-0.829281065077366,-0.0194635076935577,0.0182715168960788,-0.949188716810076}
loss               | 291.154163138472
norm_of_gradient   | 140.989704965787
num_iterations     | 5
num_rows_processed | 1000
num_rows_skipped   | 0
dep_var_mapping    | {-1,1}

\x off
Expanded display is off.
SELECT
    assert(
        norm2(l2.coef) < norm2(noreg.coef),
        'l2 regularization should produce coef with smaller l2 norm!')
FROM svm_model AS noreg, svm_model_small_norm2 AS l2;
psql:/tmp/madlib.G2AvRa/svm/test/svm.sql_in.tmp:224: ERROR:  Failed assertion: l2 regularization should produce coef with smaller l2 norm!  (seg3 slice2 192.168.4.15:20001 pid=14554)
{code}
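
As a sanity check on the output above, the two norms can be recomputed outside the database. The following is only a minimal sketch in Python: it assumes that norm2 is the plain Euclidean norm and that the printed coef values are precise enough for the comparison. The variable names are made up for illustration and are not part of the test suite.

```python
import math

# Coefficients copied from the svm_model (lambda=0) and
# svm_model_small_norm2 (lambda=1) records printed above.
coef_noreg = [0.889674731059023, -0.829317968356413, -0.0191735576783983,
              0.0164118502023703, -0.948240000000001]
coef_l2 = [0.889186884610682, -0.829281065077366, -0.0194635076935577,
           0.0182715168960788, -0.949188716810076]


def norm2(v):
    """Euclidean (L2) norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


norm_noreg = norm2(coef_noreg)
norm_l2 = norm2(coef_l2)

print(f"norm2(noreg.coef) = {norm_noreg:.6f}")
print(f"norm2(l2.coef)    = {norm_l2:.6f}")
# Same condition as the failing assert in the test query.
print("assertion would pass:", norm_l2 < norm_noreg)
```

With these values the regularized model's norm comes out slightly larger than the unregularized one (by roughly 3e-4), so the failed assertion is consistent with the printed coefficients rather than a reporting glitch.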

> Install check periodically fails on larger clusters
> ---------------------------------------------------
>
>                 Key: MADLIB-1232
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1232
>             Project: Apache MADlib
>          Issue Type: Task
>          Components: All Modules
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v1.15
>
>
> We have observed that the install check (IC) can fail on database clusters with 
> ~8 segment hosts or more.  This is possibly because the data sets for IC are 
> so small that some segments get no data, causing the query to fail.
> This story is to investigate the current IC and make appropriate changes so that 
> it passes on both small and large clusters.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
