Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread JAGANADH G
Hi Olivier, here is the output as requested. sklearn version: '0.12.1' Python: 2.7 OS: Ubuntu 11.04 Trace: In [3]: from sklearn.datasets import load_files In [4]: categ = ['pos','neg'] In [5]: dataset = load_files('data_n',categories=categ,shuffle=False) In [6]: from sklearn.feature_extraction

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Leon Palafox
Hello, I just finished with most of the edits. Regards On Fri, Jan 18, 2013 at 7:30 AM, Didier Vila wrote: > All, thanks for your good work. I just completed the form as a user (and not core). Didier > > Didier Vila, PhD | Risk | CapQuest Group Ltd | Fleet 27 | Rye > Close

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Didier Vila
All, thanks for your good work. I just completed the form as a user (and not core). Didier Didier Vila, PhD | Risk | CapQuest Group Ltd | Fleet 27 | Rye Close | Fleet | Hampshire | GU51 2QQ | Tel: 0871 574 7989 | Fax: 0871 574 2992 | Email: dv...@capquestco.com

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Robert Layton
On 18 January 2013 00:31, Jaques Grobler wrote: > Sure, I can do that > > Regards, > J > > > 2013/1/17 Andreas Mueller > >> Hi Leon. >> Looks good, thanks :) >> >> Maybe some others have some ideas for questions? >> >> I think we might get more out of people if we have more multiple choice >> o

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread Olivier Grisel
2013/1/17 Andreas Mueller : > On 01/17/2013 07:02 PM, Olivier Grisel wrote: >> This is a bug. >> >> Could you run the profiler (cProfile or line_profiler) on >> TfidfVectorizer on a slice of your data and post the output? >> >> http://scikit-learn.org/dev/developers/performance.html#profiling-python

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread Andreas Mueller
On 01/17/2013 07:02 PM, Olivier Grisel wrote: > This is a bug. > > Could you run the profiler (cProfile or line_profiler) on > TfidfVectorizer on a slice of your data and post the output? > > http://scikit-learn.org/dev/developers/performance.html#profiling-python-code > Do you think this is specifi

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread Olivier Grisel
This is a bug. Could you run the profiler (cProfile or line_profiler) on TfidfVectorizer on a slice of your data and post the output? http://scikit-learn.org/dev/developers/performance.html#profiling-python-code -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel --
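
The profiling request above could be carried out roughly as follows. This is a minimal sketch, not the poster's actual session: it assumes Python 3 (the thread itself uses Python 2.7), and the names `profile_on_slice` and `toy_vectorize` are hypothetical. `toy_vectorize` is only a stand-in so the sketch runs without the dataset; in practice one would presumably pass something like `TfidfVectorizer().fit_transform` as `transform`.

```python
import cProfile
import io
import pstats
import re


def profile_on_slice(transform, documents, n_docs=100, top=10):
    """Profile transform(documents[:n_docs]) and return the report as text."""
    sample = documents[:n_docs]  # profile a slice only, not the full corpus
    profiler = cProfile.Profile()
    profiler.enable()
    transform(sample)
    profiler.disable()

    # Render the `top` entries, sorted by cumulative time, into a string.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def toy_vectorize(documents):
    """Hypothetical stand-in for a vectorizer: counts lowercase tokens."""
    counts = {}
    for doc in documents:
        for token in re.findall(r"\w\w+", doc.lower()):
            counts[token] = counts.get(token, 0) + 1
    return counts


documents = ["a good movie", "a bad movie"] * 500
report = profile_on_slice(toy_vectorize, documents, n_docs=200)
print(report)
```

Sorting by cumulative time puts the outermost slow call at the top of the report, which is usually the part worth posting to the list.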

Re: [Scikit-learn-general] INRIA teams using scikit-learn

2013-01-17 Thread Charles-Pierre Astolfi
Hi Gael! I used to work (six-month internship) with Gilles Stoltz from the CLASSIC team, but I was the only one using scikit-learn. Cheers -- Cp On Thu, Jan 17, 2013 at 1:26 PM, Gael Varoquaux wrote: > If you are in an INRIA team that is using scikit learn, please tell me > now, I need to list

[Scikit-learn-general] Fwd: [Broken] scikit-learn/scikit-learn#779 (master - 578034e)

2013-01-17 Thread Lars Buitinck
HTTPError: HTTP Error 500: Internal Server Error when fetching the leukemia dataset... can someone disable that doctest? -- Forwarded message -- From: Travis-CI Date: 2013/1/17 Subject: [Broken] scikit-learn/scikit-learn#779 (master - 578034e) To: "larsm...@gmail.com" ** Th

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Jaques Grobler
Sure, I can do that Regards, J 2013/1/17 Andreas Mueller > Hi Leon. > Looks good, thanks :) > > Maybe some others have some ideas for questions? > > I think we might get more out of people if we have more multiple choice or > radio buttons, > as these are way easier to click. > > Maybe we cou

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Olivier Grisel
2013/1/17 Leon Palafox : > Hello, > > I have some sort of form ready in google docs. > > https://docs.google.com/spreadsheet/viewform?formkey=dFdyeGNhMzlCRWZUdldpMEZlZ1B1YkE6MQ For the first question: - Self taught - I am still at high school + add hobbyist as a third option to the academic / in

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Andreas Mueller
Hi Leon. Looks good, thanks :) Maybe some others have some ideas for questions? I think we might get more out of people if we have more multiple choice or radio buttons, as these are way easier to click. Maybe we could also have something about priorities, like "which of the following should

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Jaques Grobler
Looks good. Should there perhaps be an 'other' tickbox for the first question (maybe with a textbox to specify?), or is that overkill? Also, shouldn't TI be IT? Only Texas Instruments and the non-English version of IT come to mind for me. I might be being dumb here :D Thanks for doing this 2013/

Re: [Scikit-learn-general] INRIA teams using scikit-learn

2013-01-17 Thread Gael Varoquaux
On Thu, Jan 17, 2013 at 02:15:48PM +0100, Yogesh Karpate wrote: > I am PhD student from team "VISAGES" INRIA Rennes. I use scikit learn but > don't think other members use it. Excellent! I thought that there was some usage of the scikit in Visage, but I wasn't sure. By the way, I'l

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Leon Palafox
Hello, I have some sort of form ready in google docs. https://docs.google.com/spreadsheet/viewform?formkey=dFdyeGNhMzlCRWZUdldpMEZlZ1B1YkE6MQ Let me know your suggestions Leon On Thu, Jan 17, 2013 at 10:13 PM, Andreas Mueller wrote: > Hi all. > Leon, did you set up a Forms already? > > Do we

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread JAGANADH G
On Thu, Jan 17, 2013 at 3:38 PM, Olivier Grisel wrote: > It sounds like a bug. How many tokens do you have in your corpus? > > If you have the vectorized corpus in a variable X (e.g. `X = > CountVectorizer().fit_transform(list_of_documents)`) you can do: > > >>> print(repr(X)) > > to get the d

Re: [Scikit-learn-general] INRIA teams using scikit-learn

2013-01-17 Thread Yogesh Karpate
Hi Gael! I am a PhD student from team "VISAGES", INRIA Rennes. I use scikit-learn but don't think other members use it. On Thu, Jan 17, 2013 at 1:26 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > If you are in an INRIA team that is using scikit learn, please tell me > now

Re: [Scikit-learn-general] Roadmap / Scope

2013-01-17 Thread Andreas Mueller
Hi all. Leon, did you set up a form already? Do we have any more ideas for questions? I think we should start voting on them if we want to get it ready for the release. Cheers, Andy -- Master Visual Studio, SharePoint,

[Scikit-learn-general] INRIA teams using scikit-learn

2013-01-17 Thread Gael Varoquaux
If you are in an INRIA team that is using scikit learn, please tell me now, I need to list them as I am writing a grant to get money for scikit-learn development. Thanks, Gaël

Re: [Scikit-learn-general] Fwd: [Broken] scikit-learn/scikit-learn#770 (master - c59af39)

2013-01-17 Thread Gael Varoquaux
On Thu, Jan 17, 2013 at 11:52:38AM +0100, Lars Buitinck wrote: >     Executing your (cd scikit-learn/scikit-learn) took longer than 3 minutes > and was terminated. > I bet it's not really the cd that took this long -- does anyone have trouble > building? It builds fine here. G -

Re: [Scikit-learn-general] Fwd: [Broken] scikit-learn/scikit-learn#770 (master - c59af39)

2013-01-17 Thread Jaques Grobler
No problems on my end 2013/1/17 Lars Buitinck > Executing your (cd scikit-learn/scikit-learn) took longer than 3 > minutes and was terminated. > > I bet it's not really the cd that took this long -- does anyone have > trouble building? > > > -- Forwarded message -- > From: Travi

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread Olivier Grisel
It sounds like a bug. How many tokens do you have in your corpus? If you have the vectorized corpus in a variable X (e.g. `X = CountVectorizer().fit_transform(list_of_documents)`) you can do: >>> print(repr(X)) to get the dimension and number of non-zeros in the sparse matrix. -
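
The inspection suggested above can be tried on a toy corpus. This is a minimal sketch, assuming a recent scikit-learn on Python 3 rather than the 0.12.1 / Python 2.7 setup reported in the thread; the three documents are invented for illustration and are not the poster's data.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Toy stand-in for the real corpus (illustrative only).
list_of_documents = [
    "the cat sat",
    "the dog sat",
    "the cat ran",
]

# fit_transform returns a SciPy sparse matrix of token counts.
X = CountVectorizer().fit_transform(list_of_documents)

# repr() summarises the matrix: its shape, dtype and number of stored entries.
print(repr(X))

# The same figures are available as attributes.
n_documents, n_features = X.shape
print(n_documents, n_features, X.nnz)
```

Here `X.shape` gives (documents, distinct tokens) and `X.nnz` gives the number of stored non-zero counts, the two figures asked for in the message.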

Re: [Scikit-learn-general] Saved Classifier Model slow in prediction

2013-01-17 Thread JAGANADH G
On Wed, Jan 16, 2013 at 8:35 PM, Jacob Perkins wrote: > The tfidf transformer is the slow part - I've done a number of speed tests > with scikit-learn classifiers, and adding tfidf always slowed things down > significantly. It also didn't seem to help much with accuracy. > > > > Hi Jacob, W