Re: [Scikit-learn-general] help installing scikit-learn (and scipy) on Cygwin

2015-08-20 Thread Sebastian Raschka
Hi Howard, I have no experience with Windows in particular, but installing libraries from the Sci-stack (SciPy, NumPy) is usually a bit of a hassle on every platform. I highly recommend looking at Anaconda, a free Python distribution for scientific computing. 99% of t

Re: [Scikit-learn-general] Persisting models

2015-08-20 Thread Joel Nothman
I suspect supporting PMML import is a separate and low-priority project. Higher priority is support for transformers (in pipelines / feature unions), other predictors, and tests that verify the model against an existing PMML predictor. On 21 August 2015 at 01:37, Dale Smith wrote: > Package skle

[Scikit-learn-general] help installing scikit-learn (and scipy) on Cygwin

2015-08-20 Thread Howard Karloff
Hi. I am attempting to install scikit-learn on my Windows 7 laptop on which I'm running Cygwin. I want to be able to use scikit-learn from within a Cygwin window. I have Python 3.4 and numpy installed. However, I am having trouble installing scipy and this obviously may be related. >pip instal

Re: [Scikit-learn-general] Persisting models

2015-08-20 Thread Dale Smith
Package sklearn_pmml appeared on GitHub: https://github.com/alex-pirozhenko/sklearn-pmml It's still in the early stages. I have yet to experiment with it, and I don't think it supports PMML import.

Re: [Scikit-learn-general] How to extract the decision tree rule of each leaf node into Pandas Dataframe query?

2015-08-20 Thread Jacob Schreiber
It sounds like you prefer false negatives over false positives (not catching bad activity, but rarely misclassifying good activity as bad activity). You can weight the different classes currently by setting the sample weight on good activity points to be higher than those of bad activity points. Th
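The class-weighting idea in this message can be sketched as follows. This is only an illustration under stated assumptions: the label convention (0 = good activity, 1 = bad activity), the helper name `make_sample_weights`, and the weight values are all hypothetical and do not come from the thread.

```python
def make_sample_weights(labels, good_label=0, good_weight=5.0, bad_weight=1.0):
    """Return one weight per sample, weighting good-activity points more heavily.

    Assumption: `good_label` marks good activity; the weights are illustrative.
    """
    return [good_weight if y == good_label else bad_weight for y in labels]


# Hypothetical labels: 0 = good activity, 1 = bad activity
y = [0, 0, 1, 0, 1]
weights = make_sample_weights(y)
print(weights)
```

A list like this could then be passed to a scikit-learn tree via `DecisionTreeClassifier().fit(X, y, sample_weight=weights)`, so that misclassifying a good-activity point costs more during training than misclassifying a bad-activity point.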

Re: [Scikit-learn-general] How to extract the decision tree rule of each leaf node into Pandas Dataframe query?

2015-08-20 Thread Rex X
Very nice! Thanks to both of you, Jacob and Andreas! Andreas, yes, I'm interested in all leaves. The additional Pandas query done on each leaf node is a further check to inspect whether this leaf node can be of interest or not. Binary classification for example, fraud detection to be more specific
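One possible way to turn each leaf's decision path into a Pandas query string is sketched below. It is not the solution posted in the thread (the replies are truncated here). The function `leaf_queries` and the toy tree are hypothetical; the input arrays are assumed to mirror the `children_left`, `children_right`, `feature`, and `threshold` arrays of a fitted scikit-learn tree's `tree_` attribute, where a child index of -1 marks a leaf and samples with `feature <= threshold` go left. Column names are assumed to be valid Python identifiers.

```python
TREE_LEAF = -1  # scikit-learn marks a leaf with a child index of -1


def leaf_queries(children_left, children_right, feature, threshold, feature_names):
    """Return {leaf_node_id: query_string} for every leaf of a binary tree.

    A tree consisting of a single root leaf yields an empty query string.
    """
    queries = {}
    stack = [(0, [])]  # (node id, conditions collected on the path from the root)
    while stack:
        node, conds = stack.pop()
        if children_left[node] == TREE_LEAF:
            queries[node] = " and ".join(conds)
            continue
        name = feature_names[feature[node]]
        thr = float(threshold[node])  # plain float so repr() stays query-friendly
        stack.append((children_left[node], conds + [f"{name} <= {thr!r}"]))
        stack.append((children_right[node], conds + [f"{name} > {thr!r}"]))
    return queries


# Hand-built toy tree (hypothetical): node 0 splits on "amount", node 2 on "age"
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
feature = [0, -2, 1, -2, -2]
threshold = [100.0, -2.0, 30.5, -2.0, -2.0]

queries = leaf_queries(children_left, children_right, feature, threshold,
                       ["amount", "age"])
for leaf_id in sorted(queries):
    print(leaf_id, queries[leaf_id])
```

With a fitted classifier `clf`, the same function could be called with `clf.tree_.children_left` and the other `tree_` arrays, and each resulting string passed to `df.query(...)` to select the rows that fall into that leaf.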

[Scikit-learn-general] How to encode categorical feature values containing special characters?

2015-08-20 Thread Rex X
Hi fellows, I found that some of the generated numbers are not right, and it took me half a day of debugging. Finally, I found the cause: the loaded CSV file contains special characters in a few categorical features. These categorical values are all in Unicode in different languages, Hindi, Chinese, Engli
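A minimal sketch of one way to encode such values is shown below. Since the message is truncated, the actual cause of the problem is not known; this example assumes the file is UTF-8 and that visually identical strings may differ in their Unicode representation, so it normalizes each value before assigning an integer code. The helper `encode_categories`, the `city` column, and the sample data are hypothetical.

```python
import csv
import io
import unicodedata


def encode_categories(values):
    """Map each distinct string to an integer code after Unicode normalization.

    Returns (list of codes in input order, dict mapping category -> code).
    """
    codes = {}
    encoded = []
    for value in values:
        # NFC makes composed and decomposed forms of the same text compare equal
        key = unicodedata.normalize("NFC", value.strip())
        if key not in codes:
            codes[key] = len(codes)
        encoded.append(codes[key])
    return encoded, codes


# Stand-in for a real file opened with open(path, encoding="utf-8", newline="")
raw = "city\nदिल्ली\n北京\nLondon\n北京\n"
rows = list(csv.DictReader(io.StringIO(raw)))
encoded, codes = encode_categories(row["city"] for row in rows)
print(encoded)
```

Opening the file with an explicit `encoding="utf-8"` avoids depending on the platform's default encoding, which is a common source of silently mangled category values.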

Re: [Scikit-learn-general] Persisting models

2015-08-20 Thread Alexandre Gramfort
hi,

> Agreed, this is exactly the type of use case I want to support.
> Pickling won't work here, but using HDF5 like MNE does would
> probably be close to ideal (thanks to Chris Holdgraf for the
> heads-up):
>
> https://github.com/mne-tools/mne-python/blob/master/mne/_hdf5.py

For your info Eric L