Hi Lijie,

Thank you for reaching out. Unfortunately, right now only the neural network 
module supports minibatching, so you won't be able to use it with SVM or LR.
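
As a rough sketch of the alternative, SVM can be run on the original, unbatched 
table instead of susy_b128, since it expects a scalar dependent variable (which 
is what the error message is complaining about). The names susy, susy_svm_out, 
label and features below are assumptions, because the source schema is not 
shown in this thread; the optimizer parameters are copied from your call.

```sql
-- Assumed (hypothetical) source table, one row per training example:
--   susy(label double precision, features double precision[])
SELECT madlib.svm_classification(
    'susy',          -- source table: unbatched data (assumed name)
    'susy_svm_out',  -- output model table (assumed name)
    'label',         -- scalar dependent variable (assumed column name)
    'features',      -- independent variables as an array (assumed column name)
    'linear',        -- kernel function
    '',              -- kernel parameters
    '',              -- grouping columns
    'init_stepsize=0.1, decay_factor=0.95, max_iter=3, tolerance=0, lambda=0'
);
```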

- Nikhil

________________________________
From: Lijie Xu <[email protected]>
Sent: Thursday, December 23, 2021 3:13 AM
To: [email protected] <[email protected]>
Subject: About the mini-batch training

Hi All,

I noticed that MADlib provides a mini-batch preprocessor 
(https://madlib.apache.org/docs/latest/group__grp__minibatch__preprocessing.html)
for Neural Networks.

I'm wondering if this mini-batch preprocessor can also work with linear models 
such as SVM and LR (i.e., mini-batch SGD).

I ran the mini-batch preprocessor on a dataset and got the batched table shown 
below. When I ran SVM on it, I got the error 'SVM error: dependent_varname 
cannot be of array type!'. It seems that SVM does not work on this batched 
table.

------------------------------------------------------------
db=# \d susy_b128
                         Table "public.susy_b128"
       Column        |        Type        | Collation | Nullable | Default
---------------------+--------------------+-----------+----------+---------
 __id__              | bigint             |           |          |
 dependent_varname   | double precision[] |           |          |
 independent_varname | double precision[] |           |          |

db=# SELECT madlib.svm_classification('susy_b128', 'susy_b128_out', 
'dependent_varname', 'independent_varname', 'linear', '', '', 
'init_stepsize=0.1, decay_factor=0.95, max_iter=3, tolerance=0, lambda=0');
ERROR:  plpy.Error: SVM error: dependent_varname cannot be of array type!
------------------------------------------------------------

Any suggestions are welcome! Thanks!

Best,
Lijie
