Re: [Scikit-learn-general] Improving Text Classification

2013-07-11 Thread Nigel Legg
I am just starting down the road towards having a text classifier for social media posts. As this may be used in a variety of situations (currently negotiating 2 freelance analytics positions with research agencies), the classifier will need to have a mechanism for retraining on a project by projec

Re: [Scikit-learn-general] Pystruct website and mailing list

2013-07-11 Thread Mathieu Blondel
You should have named the project scikit-struct :) Mathieu On Fri, Jul 12, 2013 at 8:26 AM, Robert Layton wrote: > Structured prediction in sklearn was one of the outcomes from the survey. > Would it be a better idea to send people to pystruct, rather than > implement it here? > > > On 12 July 2

Re: [Scikit-learn-general] Pystruct website and mailing list

2013-07-11 Thread Robert Layton
Structured prediction in sklearn was one of the outcomes from the survey. Would it be a better idea to send people to pystruct, rather than implement it here? On 12 July 2013 03:12, Andreas Mueller wrote: > Hey everybody. > This is spam about my "new" project pystruct. > Pystruct is my shot at

Re: [Scikit-learn-general] Text processing using nltk, sklearn and pandas

2013-07-11 Thread Tom Fawcett
(Belated continuation of a thread I started.) Joel and Olivier, thanks for your comments. I had seen the docs in http://scikit-learn.org/dev/modules/feature_extraction.html#text-feature-extraction but for some reason I thought there was significant functionality NLTK provided that sklearn didn

[Scikit-learn-general] Pystruct website and mailing list

2013-07-11 Thread Andreas Mueller
Hey everybody. This is spam about my "new" project pystruct. Pystruct is my shot at creating an easy-to-use structured prediction library in the spirit of scikit-learn. I just created a mailing list at https://groups.google.com/forum/#!forum/pystruct The documentation is at http://pystruct.github

Re: [Scikit-learn-general] Improving Text Classification

2013-07-11 Thread Harold Nguyen
Hi Ian, Thank you very much for writing this message, and especially for sharing your experience. I am actually doing the very same thing, and would love to collaborate with you, if possible. I'm not as far along in my journey as you are, but I hope we can help each other in the future! I'm categ

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Andreas Mueller
On 07/11/2013 04:51 PM, Lars Buitinck wrote: > 2013/7/11 Mathieu Blondel : >> What is everyone planning to work on? Just curious :) > Py3 was my aim, but that seems to be almost tackled, so I guess I'll > concentrate on getting my proposed scorer API in master. > I might want to try my hand at impl

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Gael Varoquaux
On Thu, Jul 11, 2013 at 06:46:13PM +0200, Olivier Grisel wrote: > https://github.com/GaelVaroquaux/scikit-learn/compare/logistic_cv > But I think you should issue a WIP PR early now that you have publicly > mentioned this branch :) Are you sure? We are drowning under PRs. G --

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Olivier Grisel
2013/7/11 Gael Varoquaux : > On Fri, Jul 12, 2013 at 12:11:45AM +0900, Mathieu Blondel wrote: >> * L2-logistic with integrated cross-validation (I have a prototype that >> is _fast_), pair programming with Jaques Grobler > >> Sounds interesting. What's the general approach? scipy.optimize

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Gael Varoquaux
On Fri, Jul 12, 2013 at 12:11:45AM +0900, Mathieu Blondel wrote: > * L2-logistic with integrated cross-validation (I have a prototype that >   is _fast_), pair programming with Jaques Grobler > Sounds interesting. What's the general approach? scipy.optimize based solver + > warm start? Ab

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Joel Nothman writes: > > > There are two ways to answer that: 1. Because it hasn't been added > to sklearn/setup.py > 2. Because you don't need to add it to setup.py to develop with it, > if you build in-place > (use `make inplace` or `python setup.py build_ext -i`) Thanks Joel and also Gael.

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Issam
Hi Dominic, I added `config.add_subpackage('neural_network')` to setup.py (didn't know it was necessary), I checked out the pull request in my other machine and it worked! These are the steps I followed for the checkout 1) git clone https://github.com/scikit-learn/scikit-learn 2) cd scikit-lear

Re: [Scikit-learn-general] Scikit-learn in Paris! An AFPyro

2013-07-11 Thread Olivier Grisel
2013/7/11 Nelle Varoquaux : > Hello everyone, > > We are organizing an AFPyro during the scikit-learn sprint in Paris. This is > a traditional event (with a high attendance rate) from the French > Python community that consists of meeting up at a pub and enjoying > Paris with a fresh beer. Anyone is welco

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Mathieu Blondel
On Thu, Jul 11, 2013 at 11:29 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > * L2-logistic with integrated cross-validation (I have a prototype that > is _fast_), pair programming with Jaques Grobler > Sounds interesting. What's the general approach? scipy.optimize based solver +

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Olivier Grisel
Initial goal was Py3 but apparently it might get working before the sprint \o/. So I will sprint on other stuff instead, but using Python 3 :) Stuff that I am interested in: - maybe work (discuss design and review impl) with @larsmans on quadratic feature hashing - discuss the with the tree grow

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Mathieu Blondel
On Thu, Jul 11, 2013 at 11:32 PM, Vlad Niculae wrote: > Will you be joining online? People have been asking this on IRC ;) > I will try to but that will mostly depend on my motivation to do programming after a day of work :) I will try to fix long-standing ridge related issues such as fit_interc

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Peter Prettenhofer
I plan on merging some of the GBRT PRs and praising Gilles' new decision tree impl. 2013/7/11 Lars Buitinck > 2013/7/11 Mathieu Blondel : > > What is everyone planning to work on? Just curious :) > > Py3 was my aim, but that seems to be almost tackled, so I guess I'll > concentrate on getting my p

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Lars Buitinck
2013/7/11 Mathieu Blondel : > What is everyone planning to work on? Just curious :) Py3 was my aim, but that seems to be almost tackled, so I guess I'll concentrate on getting my proposed scorer API in master. I might want to try my hand at implementing quadratic features in FeatureHasher. -- La

[Scikit-learn-general] Scikit-learn in Paris! An AFPyro

2013-07-11 Thread Nelle Varoquaux
Hello everyone, We are organizing an AFPyro during the scikit-learn sprint in Paris. This is a traditional event (with a high attendance rate) from the French Python community that consists of meeting up at a pub and enjoying Paris with a fresh beer. Anyone is welcome to join us. It will take place Thur

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Jaques Grobler
for me same as Gael's first two points :) 2013/7/11 Vlad Niculae > Hi Mathieu, > > Will you be joining online? People have been asking this on IRC ;) > > Personally I want to take care of unfinished business like the omp CV, > the RBM pull request, GSOC PRs, and I was thinking of trying to tack

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Vlad Niculae
Hi Mathieu, Will you be joining online? People have been asking this on IRC ;) Personally I want to take care of unfinished business like the omp CV, the RBM pull request, GSOC PRs, and I was thinking of trying to tackle Averaged SGD; apart from this I'll be side-sprinting on pystruct. Cheers, V

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Gael Varoquaux
On Thu, Jul 11, 2013 at 11:23:57PM +0900, Mathieu Blondel wrote: > What is everyone planning to work on? Just curious :) For me: * Merging stuff, * L2-logistic with integrated cross-validation (I have a prototype that is _fast_), pair programming with Jaques Grobler * Fast hierarchical clusteri

Re: [Scikit-learn-general] Error while building Scikit-learn in Windows (32-bit)

2013-07-11 Thread Maheshakya Wijewardena
I tried with MSVC. It gives me this error No module named msvccompiler in numpy.distutils; trying from distutils customize MSVCCompiler Missing compiler_cxx fix for MSVCCompiler customize MSVCCompiler using build_clib building 'libsvm-skl' library compiling C sources error: Unable to find vcvarsal

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Nelle Varoquaux
> What is everyone planning to work on? Just curious :) I'd like to implement the kernelCCA, but before doing that, I might have to refactor the PLS module, which is starting to be a bit crowded. I also plan on fixing some of the issues of the MDS (use an SVD when possible). Will you be joining u

Re: [Scikit-learn-general] Paris Sprint location

2013-07-11 Thread Mathieu Blondel
What is everyone planning to work on? Just curious :) Mathieu On Wed, Jul 10, 2013 at 6:12 PM, Alexandre Gramfort < alexandre.gramf...@telecom-paristech.fr> wrote: > hi everyone, > > our next sprint will take place at Telecom ParisTech > (http://www.telecom-paristech.fr) located at: > > 46 Rue B

Re: [Scikit-learn-general] Error while building Scikit-learn in Windows (32-bit)

2013-07-11 Thread Vlad Niculae
If you have MSVC from C++ Express 2008 available, could you try with that? Are you trying to build the latest master? Does the last release work well? Vlad On Thu, Jul 11, 2013 at 5:17 PM, Maheshakya Wijewardena wrote: > I do not have MKL. > Can there be any other reason for this to happen? > I'

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Joel Nothman
There are two ways to answer that: 1. Because it hasn't been added to sklearn/setup.py 2. Because you don't need to add it to setup.py to develop with it, if you build in-place (use `make inplace` or `python setup.py build_ext -i`) On Fri, Jul 12, 2013 at 12:09 AM, Dominic Steinitz wrote: > Sorr

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 42, Issue 31

2013-07-11 Thread Gael Varoquaux
On Thu, Jul 11, 2013 at 03:04:39PM +0100, Dominic Steinitz wrote: > And here is the install directory which does *not* contain neural_network > after running There is a missing add_subpackage call in the setup.py. This PR is not correct yet :)

Re: [Scikit-learn-general] Error while building Scikit-learn in Windows (32-bit)

2013-07-11 Thread Maheshakya Wijewardena
I do not have MKL. Can there be any other reason for this to happen? I'm stuck with this. On Thu, Jul 11, 2013 at 7:00 PM, Lars Buitinck wrote: > 2013/7/11 Maheshakya Wijewardena : > > c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: > > build\temp > > .win32-2.7\Release\sk

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Sorry about the spam. I replied to the wrong message. Issam writes: > > > Hi Dominic, > Did you get the pull request of the MLP? It seems you have > installed scikit from the main repository, where the MLP hasn't been > pushed yet. > You can double check if MLP is there b

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 42, Issue 31

2013-07-11 Thread Dominic Steinitz
Issam writes: > > > Hi Dominic, > Did you get the pull request of the MLP? It seems you have > installed scikit from the main repository, where the MLP hasn't been > pushed yet. > You can double check if MLP is there by going to the installation > directory '

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Issam
Hi Dominic, Did you get the pull request of the MLP? It seems you have installed scikit from the main repository, where the MLP hasn't been pushed yet. You can double check if MLP is there by going to the installation directory 'site-packages' of Python, and see if the `neural network` fo
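The check described above (looking inside site-packages for the subpackage) can also be done from Python itself. What follows is only a minimal sketch, not something posted in the thread: the helper name `has_module` is made up for illustration, and the dotted name `sklearn.neural_network` is assumed from the `neural_network` subpackage name mentioned elsewhere in this thread.

```python
import importlib.util


def has_module(dotted_name):
    """Return True if the import system can locate `dotted_name`.

    `has_module` is a hypothetical helper written for this example.
    """
    try:
        # find_spec imports the parent package and then looks for the
        # submodule; it returns None when the submodule is absent.
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when the parent package itself cannot be imported.
        return False


if __name__ == "__main__":
    # Assumed module name; prints False if the installed scikit-learn
    # does not ship the subpackage (or is not installed at all).
    print(has_module("sklearn.neural_network"))
```

If this prints False right after an install from a pull-request branch, the subpackage was probably not copied into site-packages, which matches the symptom reported in this thread.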

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Many thanks for all the help so far. I am running the tests after installation. I get the output below. Should I be worried? Also my python doesn't seem to be able to find the neural network module but it seemed to install ok. I can send the output from the install if necessary. > ~/Dropbox/Pri

Re: [Scikit-learn-general] Error while building Scikit-learn in Windows (32-bit)

2013-07-11 Thread Lars Buitinck
2013/7/11 Maheshakya Wijewardena : > c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: > build\temp > .win32-2.7\Release\sklearn\ensemble\_gradient_boosting.o: bad reloc address > 0x0 > in section `.data' > collect2.exe: error: ld returned 1 exit status > error: Command "g++ -sh

[Scikit-learn-general] Error while building Scikit-learn in Windows (32-bit)

2013-07-11 Thread Maheshakya Wijewardena
Hi, I've installed all the dependencies for scikit-learn, but when I run the `python setup.py build` command I get the following error: c:/mingw/bin/../lib/gcc/mingw32/4.7.2/../../../../mingw32/bin/ld.exe: build\temp.win32-2.7\Release\sklearn\ensemble\_gradient_boosting.o: bad reloc address 0x0 in section

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Lars Buitinck
2013/7/11 Joel Nothman : > That's good advice too. But it took me a while to realise that GitHub > stored pull requests as git refs (and the book doesn't cover ls-remote). I have this stanza in my .git/config inside the scikit-learn source directory: [remote "upstream"] url = g...@github.

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Joel Nothman
That's good advice too. But it took me a while to realise that GitHub stored pull requests as git refs (and the book doesn't cover ls-remote). - Joel On Thu, Jul 11, 2013 at 8:02 PM, Olivier Grisel wrote: > 2013/7/11 Dominic Steinitz : > > > > Hi Lars, > > > > Thanks for this. I am not clear

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Issam
Hi Dominic, For the MLP, I just made a few updates to the code and added an example usage section (and an example file) to the PR description for convenience: https://github.com/scikit-learn/scikit-learn/pull/2120 In case you get a bug or obscure results, your feedback will be more than appreciat

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Joel Nothman writes: > > > Hi Dominic, > After cloning scikit-learn with: > $ git clone https://github.com/scikit-learn/scikit-learn > > you can use: > $ git fetch origin refs/pull/2120/head:mlp > > $ git checkout mlp Hi Joel, Thanks for the cookery instructions. Looking at my magit-log, I can

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Olivier Grisel writes: > This is an excellent opportunity to properly learn git by reading the > first 3 chapters of the git book: > > http://www.git-scm.com/book > > It's probably not the way you intended to spend the rest of the > afternoon (or morning depending on your timezone) but is surel

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Olivier Grisel
2013/7/11 Dominic Steinitz : > > Hi Lars, > > Thanks for this. I am not clear what I have to do git-wise. I tried: > > git clone https://github.com/scikit-learn/scikit-learn/pull/2120 > Cloning into '2120'... > fatal: https://github.com/scikit-learn/scikit-learn/pull/2120/info/refs > not found: did

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Joel Nothman
Hi Dominic, After cloning scikit-learn with: $ git clone https://github.com/scikit-learn/scikit-learn you can use: $ git fetch origin refs/pull/2120/head:mlp $ git checkout mlp On Thu, Jul 11, 2013 at 7:50 PM, Dominic Steinitz wrote: > L
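The two git steps above generalise to any pull request number. As a hedged sketch only, the same recipe can be wrapped in Python; the function names `pr_checkout_commands` and `checkout_pull_request` are invented for this example, and the `refs/pull/<number>/head` refspec is taken directly from the message above.

```python
import subprocess


def pr_checkout_commands(pr_number, branch, remote="origin"):
    """Build the two git commands from the message above.

    Hypothetical helper: returns argument lists, runs nothing.
    """
    refspec = f"refs/pull/{pr_number}/head:{branch}"
    return [
        ["git", "fetch", remote, refspec],  # copy the PR head into a local branch
        ["git", "checkout", branch],        # switch to that branch
    ]


def checkout_pull_request(pr_number, branch, repo_dir=".", remote="origin"):
    """Run the commands inside an existing clone (needs git and network access)."""
    for argv in pr_checkout_commands(pr_number, branch, remote):
        subprocess.run(argv, cwd=repo_dir, check=True)
```

Calling `checkout_pull_request(2120, "mlp", repo_dir="scikit-learn")` would be expected to reproduce the commands quoted above, assuming the repository was cloned into a directory named `scikit-learn`.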

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Lars Buitinck writes: > Pybrain is very slow. Several attempts have been made to get a > performant multi-layer perceptron into scikit-learn, but none has been > merged into the master branch yet, let alone into a recent release. > The latest attempt is [1]; if you like to live on the edge and he

Re: [Scikit-learn-general] Improving Text Classification

2013-07-11 Thread Ian Ozsvald
Hello Mike. Could you give a summary of your problem? It sounds like you're categorising text (tweets? medical text? news articles?) into >2 categories (how many?), is that right? Is the goal really to optimise your f1 score, or maybe to only want accurate categorisations (precision) or maybe high
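Since the question above turns on which metric is actually being optimised, it may help to see how the F1 score is built from precision and recall. This is a small illustrative sketch using the standard definitions; the function name and the counts in the usage note are made up, not taken from the thread.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts for one category.

    tp: true positives, fp: false positives, fn: false negatives.
    Returns 0.0 for any quantity whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With hypothetical counts of 8 true positives, 2 false positives and 8 false negatives, precision is 0.8 while recall is only 0.5, so a classifier can look accurate on what it labels and still miss half of the category; F1 sits between the two.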

Re: [Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Lars Buitinck
2013/7/11 Dominic Steinitz : > Module ann (Artificial Neural Networks) has been removed from the > distribution. Users wanting this sort of algorithms should take a look into > pybrain > > This was part of the release notes for Scikit-learn 0.5 so possibly out of > date. > > Why is there no neural

[Scikit-learn-general] Neural Network in Scikit-learn

2013-07-11 Thread Dominic Steinitz
Perhaps my googling skills are failing me but all I could find on the subject was: > Module ann (Artificial Neural Networks) has been removed from the > distribution. Users wanting this sort of algorithms should take a look into > pybrain This was part of the release notes for Scikit-learn 0.5

Re: [Scikit-learn-general] FeatureHasher for numeric data

2013-07-11 Thread Lars Buitinck
2013/7/11 Gad Abraham : > I'm very much a sklearn beginner, and I'd like to use FeatureHasher to > reduce the dimensionality of a numeric matrix. Any hints on how to do this? > I've seen the examples showing how to use it with text. You mean the input is a NumPy array? There's no special support f
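One possible way to hash a numeric matrix, offered as a sketch and not as the approach recommended in this reply (which is cut off here), is to turn each row into a dict keyed by a column name and pass those dicts to FeatureHasher. The column names `col0`, `col1`, ... and the toy data are assumptions made purely for illustration.

```python
from sklearn.feature_extraction import FeatureHasher

# Toy numeric matrix: 3 samples, 6 columns (made-up values).
X = [
    [0.5, 1.2, 0.0, 3.1, 0.0, 2.2],
    [2.0, 0.0, 0.7, 0.0, 1.5, 0.0],
    [0.0, 4.4, 1.1, 0.9, 0.0, 0.3],
]

# FeatureHasher with input_type="dict" expects {feature_name: value} mappings,
# so each column index is given a hypothetical string name.
rows = [{f"col{j}": value for j, value in enumerate(row)} for row in X]

# Hash the 6 named columns down to 4 output columns.
hasher = FeatureHasher(n_features=4, input_type="dict")
X_hashed = hasher.transform(rows)  # scipy.sparse matrix

print(X_hashed.shape)
```

Columns that collide in the same hash bucket are combined, so the output is a lossy summary of the input and not a learned projection; whether that is acceptable depends on the downstream model.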

[Scikit-learn-general] FeatureHasher for numeric data

2013-07-11 Thread Gad Abraham
Hi, I'm very much a sklearn beginner, and I'd like to use FeatureHasher to reduce the dimensionality of a numeric matrix. Any hints on how to do this? I've seen the examples showing how to use it with text. Thanks, Gad --