Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Olivier Grisel
2013/9/23 Fred Baba : > I have a python wrapper that allows me to export datasets to Python for > analysis. But since my features are computed in C++, I'd like to compute > predictions in C++ as well. Then I would simply set the coefficients at > startup (v.first) and update the feature values cont

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread abhishek
@Fred. It seems very interesting. Could you please provide more information on how you achieved this? On Mon, Sep 23, 2013 at 5:46 PM, Fred Mailhot wrote: > FYI, I've used sklearn's LogisticRegression in an online/real-time text > classification app without having to dig into the internals and g

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Olivier Grisel
2013/9/23 Fred Baba : > Thanks, Olivier. The application involves continuous classification of > real-time input features. So I only make one prediction at a time. My > intention would be to observe the structure of the fitted model and then > optimize the prediction function by hand in C++. For ins

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Fred Baba
Thanks, Olivier. The application involves continuous classification of real-time input features. So I only make one prediction at a time. My intention would be to observe the structure of the fitted model and then optimize the prediction function by hand in C++. For instance, a linear classifier wou

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Olivier Grisel
2013/9/23 abhishek : > @Fred. It seems very interesting. Could you please provide more information > on how you achieved this? Using the default sklearn API (Pipeline of TfidfVectorizer + LinearSVC) should work to provide ms-level prediction times. I think there is nothing special to do in this c

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Fred Baba
I have a python wrapper that allows me to export datasets to Python for analysis. But since my features are computed in C++, I'd like to compute predictions in C++ as well. Then I would simply set the coefficients at startup (v.first) and update the feature values continuously (v.second). The entir

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Olivier Grisel
2013/9/23 Fred Baba : > System performance is currently on the order of ~1us, so Python overhead > would be unacceptable. For SVM, I'll extract the support vectors and > investigate using libSVM directly, as per federico vaggi's advice. +1 for > PMML support at some point down the road. Thanks for

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Fred Mailhot
FYI, I've used sklearn's LogisticRegression in an online/real-time text classification app without having to dig into the internals and gotten ~2.5ms response time (including vectorizing; vocab size ~200k). On 23 September 2013 06:37, Peter Prettenhofer wrote: > We don't have a PMML interface y

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Fred Baba
System performance is currently on the order of ~1us, so Python overhead would be unacceptable. For SVM, I'll extract the support vectors and investigate using libSVM directly, as per federico vaggi's advice. +1 for PMML support at some point down the road. Thanks for the quick responses. On Mon,

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Eustache DIEMERT
What are your requirements, Fred, in terms of maximum execution time? Predicting could be quite "fast" actually (think milliseconds), but it depends on your model and number of features. Timing the prediction code with timeit [1] should give you numbers for your specific use case. HTH, [1] http://docs
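
A minimal sketch of the timeit suggestion above, using only the standard library. The `predict_one` function and its coefficient values are hypothetical stand-ins for a fitted model's prediction call; swap in your own call to get numbers for your use case.

```python
import timeit


def predict_one(coef, intercept, features):
    # Stand-in for a real prediction: a plain linear score w . x + b.
    return sum(w * x for w, x in zip(coef, features)) + intercept


# Hypothetical model with 100 features (illustrative values only).
coef = [0.01 * i for i in range(100)]
intercept = -0.5
features = [1.0] * 100

n_calls = 1000
# repeat() returns one total time per run; the minimum is the least noisy.
totals = timeit.repeat(
    lambda: predict_one(coef, intercept, features),
    number=n_calls,
    repeat=5,
)
per_call_us = min(totals) / n_calls * 1e6
print(f"{per_call_us:.2f} us per call")
```

The measured figure includes the overhead of the lambda call itself, so it is an upper bound on the cost of the function being timed.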

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread federico vaggi
As a generic answer, no. You'll have to write the equivalent of the predict method yourself given the internal state (or, parameters) of the trained classifier. For classifiers like SVM you can use libSVM directly and avoid the Python wrapper. On Mon, Sep 23, 2013 at 3:28 PM, Fred Baba wrote:
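
To illustrate what "writing the equivalent of the predict method yourself" could look like for a kernel SVM, here is a rough sketch of a binary decision function with an RBF kernel. It assumes the internal state has already been exported as plain lists; in scikit-learn's SVC the corresponding fitted attributes are `support_vectors_`, `dual_coef_` and `intercept_`, with `gamma` as the kernel parameter. All numeric values below are made up, and the multiclass case and the mapping from sign to class label are not covered.

```python
import math


def rbf_kernel(a, b, gamma):
    # K(a, b) = exp(-gamma * ||a - b||^2)
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svm_decision(support_vectors, dual_coef, intercept, gamma, x):
    # Binary decision value: sum_i dual_coef[i] * K(sv_i, x) + intercept
    return sum(
        c * rbf_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, dual_coef)
    ) + intercept


# Hypothetical exported state (not taken from a real fitted model).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coef = [1.0, -1.0]
intercept = 0.0
gamma = 0.5

score_a = svm_decision(support_vectors, dual_coef, intercept, gamma, [0.0, 0.0])
score_b = svm_decision(support_vectors, dual_coef, intercept, gamma, [1.0, 1.0])
print(score_a, score_b)
```

The same loop translates directly to C++ once the support vectors and coefficients are stored in contiguous arrays.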

Re: [Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Peter Prettenhofer
We don't have a PMML interface yet [1] - so you need to write custom code to extract the internal state of each individual classifier. What do you mean by performance critical (<1ms, <<1ms)? Do you make predictions per sample or can you buffer samples and make predictions for batches? In general, what ki

[Scikit-learn-general] Representing classifiers outside of Python

2013-09-23 Thread Fred Baba
I'd like to use classifiers trained via sklearn in a real-time, performance-critical application. How do I access the internal representation of trained classifiers? For linear classifiers/regressions, I can simply store the coefficients and generate the linear combination myself. For
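
The linear case described above can be sketched in a few lines. This assumes the coefficients have been exported as a plain list; in scikit-learn, fitted linear models expose them as the `coef_` and `intercept_` attributes. The values below are hypothetical, and the sigmoid step applies only to a binary logistic regression.

```python
import math


def decision_function(coef, intercept, features):
    # Linear score for a single sample: w . x + b
    if len(coef) != len(features):
        raise ValueError("coef and features must have the same length")
    return sum(w * x for w, x in zip(coef, features)) + intercept


def positive_class_probability(coef, intercept, features):
    # For a binary logistic regression, pass the score through a sigmoid.
    z = decision_function(coef, intercept, features)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical exported parameters and one input sample.
coef = [0.5, -1.0, 2.0]
intercept = 0.25
sample = [1.0, 2.0, 0.5]

score = decision_function(coef, intercept, sample)
probability = positive_class_probability(coef, intercept, sample)
print(score, probability)
```

A negative score corresponds to a probability below 0.5 for the positive class, so thresholding the score at zero gives the same label without computing the exponential.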