Hi, I was curious how LinearSVC as implemented in lightning [*] compares to LinearSVC in scikit-learn so I benchmarked them on the MNIST and News20 datasets.
Here are the results for L1-loss SVM with L2 penalty.

MNIST (class 8 vs. others)
------------------------------------
lightning
Training time: 18.0813570023
Acc: 0.953066666667

scikit-learn
Training time: 17.2167401314
Acc: 0.953183333333

News20 (all classes)
----------------------------
lightning
Training time: 13.0210821629
Acc: 0.966571155683

scikit-learn
Training time: 11.1561429501
Acc: 0.966571155683

So, lightning is slightly slower (roughly 5% on MNIST and 17% on News20), with essentially the same accuracy. The slowdown is probably due to the virtual method calls needed for the dataset abstraction lightning is using. However, the main advantage of lightning is that it works directly on the NumPy array or SciPy sparse matrix *without* a memory copy (scikit-learn converts the data to liblinear's sparse data structure).

I'm attaching the script I used.

Mathieu

[*] https://github.com/mblondel/lightning
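P.S. The attached script is not reproduced in this message. As a rough illustration of how "Training time" and "Acc" figures like the ones above can be measured, here is a minimal sketch. It is not the attached benchmark: the `benchmark` helper and the `MajorityClassifier` stand-in are hypothetical names made up for this example, and the only assumption about the classifier is that it exposes `fit(X, y)` and `predict(X)` in the scikit-learn style.

```python
import time


def benchmark(clf, X_train, y_train, X_test, y_test):
    """Return (training time in seconds, test accuracy) for one classifier.

    `clf` is assumed to follow the scikit-learn convention of
    fit(X, y) and predict(X); nothing else is required of it.
    """
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    train_time = time.perf_counter() - start

    # Accuracy: fraction of test samples whose prediction matches the label.
    y_pred = clf.predict(X_test)
    n_correct = sum(1 for p, t in zip(y_pred, y_test) if p == t)
    return train_time, n_correct / len(y_test)


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


if __name__ == "__main__":
    # Toy data, only to make the sketch runnable on its own.
    X_train, y_train = [[0.0], [1.0], [2.0], [3.0]], [1, 1, 1, -1]
    X_test, y_test = [[0.5], [2.5]], [1, -1]

    train_time, acc = benchmark(
        MajorityClassifier(), X_train, y_train, X_test, y_test
    )
    print("Training time:", train_time)
    print("Acc:", acc)
```

To compare two real implementations, the same `benchmark` call would be made once per classifier on identical train/test splits, so that only the training code differs between the two timings.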
benchmark_lightning.py
_______________________________________________ Scikit-learn-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
