Re: [Scikit-learn-general] Searchlight algorithm with scikit-learn?

2011-10-04 Thread Vincent Michel
Hi Michael, The code has been pushed. It still needs some work, but you can give it a try! Best, Vincent 2011/10/2 Michael Waskom > Hi Vincent, thanks for pointing me to this! Looks like a great resource. > I'll be on the lookout for the searchlight code, but this looks quite > helpful in a

Re: [Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Conrad Lee
On Tue, Oct 4, 2011 at 10:36 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > On Tue, Oct 04, 2011 at 10:27:01PM +0200, Lars Buitinck wrote: > > You must mean from Θ(n² lg n) to Θ(n²). A generic Θ(n² lg n) algo is > > listed in many textbooks [1], I sure hope we don't have the naive al

Re: [Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Gael Varoquaux
On Tue, Oct 04, 2011 at 10:27:01PM +0200, Lars Buitinck wrote: > You must mean from Θ(n² lg n) to Θ(n²). A generic Θ(n² lg n) algo is > listed in many textbooks [1], I sure hope we don't have the naive algo > scikit-learn? In the scikit, the only hierarchical clustering algorithm that we have is t

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Olivier Grisel
2011/10/4 Małgorzata Siudek : > >> are you looking the analytic formula for the decision function? >> >> > Yes, exactly, I would like to have numerical formula of line separating > different areas. >> maybe this example : >> >> http://scikit-learn.sourceforge.net/auto_examples/svm/plot_svm_nonlinea

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Alexandre Gramfort
> and how to do that for linear kernel? In first analysis I think linear will > be enough for me. In attachment I send script and input data, it's quite > easy sample, because I have only two different groups, but for first try it > will be enough for me look at http://scikit-learn.sourceforge.ne
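For the linear case asked about here, the separating line can be written out explicitly. A minimal sketch, assuming a two-feature problem and that the weight vector w and offset b have already been read from a fitted linear SVM (in scikit-learn these are typically exposed as the `coef_` and `intercept_` attributes of a linear-kernel SVC). The helper name `boundary_line` and the numeric values are hypothetical and do not come from the thread:

```python
def boundary_line(w, b):
    """Return (slope, intercept) of the line w[0]*x1 + w[1]*x2 + b = 0.

    Solving for x2 gives x2 = -(w[0] / w[1]) * x1 - b / w[1].
    """
    w0, w1 = w
    if w1 == 0:
        # The boundary is the vertical line x1 = -b / w[0]; it has no slope.
        raise ValueError("boundary is vertical: x1 = -b / w[0]")
    return -w0 / w1, -b / w1


# Hypothetical values standing in for clf.coef_[0] and clf.intercept_[0].
w = (1.0, 2.0)
b = -4.0
slope, intercept = boundary_line(w, b)
print("x2 = %.2f * x1 + %.2f" % (slope, intercept))
```

Points with w[0]*x1 + w[1]*x2 + b > 0 fall on one side of this line and points with a negative value on the other, which is how the two groups are told apart.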

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Małgorzata Siudek
Yes, exactly, I would like to have numerical formula of line separating different areas. unless you use a linear kernel there is no simple analytical formula and how to do that for linear kernel? In first analysis I think linear will be enough for me. In attachment I send script and

Re: [Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Lars Buitinck
2011/10/4 Conrad Lee : > I just noticed this recent paper on arXiv about faster hierarchical > clustering.  The author has come up with algorithms to reduce the asymptotic > complexity of many variants from O(N^3) to O(N^2).  Even better, he has come > up with a C++ implementation with a python/num

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Alexandre Gramfort
> Yes, exactly, I would like to have numerical formula of line separating > different areas. unless you use a linear kernel there is no simple analytical formula Alex
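The contrast drawn in this reply can be made concrete with the standard SVM decision function. This is the textbook form rather than anything stated in the thread; the symbols (dual coefficients \alpha_i, labels y_i, support vectors x_i, offset b, RBF width \gamma) are the usual ones and are assumptions as far as this discussion goes:

```latex
f(x) = \sum_{i \in \mathrm{SV}} \alpha_i y_i \, K(x_i, x) + b,
\qquad \text{boundary: } f(x) = 0

\text{linear: } K(x_i, x) = x_i^\top x
\;\Rightarrow\; f(x) = w^\top x + b,
\quad w = \sum_{i \in \mathrm{SV}} \alpha_i y_i \, x_i

\text{RBF: } K(x_i, x) = \exp\!\left(-\gamma \lVert x_i - x \rVert^2\right)
```

With a linear kernel the sum collapses into a single weight vector, so the boundary is a line (or hyperplane). With the RBF kernel the boundary remains an implicit level set of a sum over all support vectors, which is why no simple closed-form line exists.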

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Małgorzata Siudek
> Hi, > > >> I have found scikit-learn (version 0.8.1) recently and I found it useful to >> classify my data using SVM. I've modified the given example: plot_custom_kernel.py >> to use a non-linear SVC with RBF kernel and I have one question. I would >> like to find a function of line separating tw

Re: [Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Olivier Grisel
2011/10/4 Alexandre Gramfort : > Hi Conrad, > > that looks interesting; however, this implementation's license (GPLv3) is not compatible with > the scikit license, so if we want it we'll have to reimplement it. > I'll take a look at the paper to see how hard this would be. Also there is a policy of trying to

Re: [Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Alexandre Gramfort
Hi Conrad, that looks interesting; however, this implementation's license (GPLv3) is not compatible with the scikit license, so if we want it we'll have to reimplement it. I'll take a look at the paper to see how hard this would be. Alex On Tue, Oct 4, 2011 at 3:28 PM, Conrad Lee wrote: > I just noticed t

[Scikit-learn-general] Faster hierarchical clustering

2011-10-04 Thread Conrad Lee
I just noticed this recent paper on arXiv about faster hierarchical clustering. The author has come up with algorithms to reduce the asymptotic complexity of many variants from O(N^3) to O(N^2). Even better, he has come up with a C++ implementation with a python/n

Re: [Scikit-learn-general] Interesting competition on sem-supervised feature extraction now running on kaggle.com

2011-10-04 Thread James Bergstra
The TheanoSGDClassifier that you mentioned is something I wrote, you can find it here: https://github.com/jaberg/scikit-learn/blob/ogrisel_image-patches/examples/applications/plot_image_classification_convolutional_features.py#L296 It is a pretty simple algorithm: - normalize the input features t

Re: [Scikit-learn-general] Interesting competition on sem-supervised feature extraction now running on kaggle.com

2011-10-04 Thread David Warde-Farley
On 2011-10-04, at 3:37 AM, Peter Prettenhofer wrote: > I haven't looked at Theano's SGD yet - do they calibrate the learning > rate on held-out data or do they use an heuristic? Just to clear up, Theano doesn't contain any learning rate logic at all. It's just a tool to let you define your cost

Re: [Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Alexandre Gramfort
Hi, > I have found scikit-learn (version 0.8.1) recently and I found it useful to > classify my data using SVM. I've modified the given example: plot_custom_kernel.py > to use a non-linear SVC with RBF kernel and I have one question. I would > like to find a function of line separating two different are

Re: [Scikit-learn-general] permutation test score

2011-10-04 Thread Alexandre Gramfort
>>> I think that I would favor changing it so that it returns -log(p) (yay, >>> another API breakage :$ ). >> >> not really convinced as for permutations you're unlikely to have numerical >> underflows. >> >> and I prefer - log10(p) for p-values. > > I prefer just straight probabilities when p-values

Re: [Scikit-learn-general] permutation test score

2011-10-04 Thread Lars Buitinck
2011/10/4 Alexandre Gramfort : >> I think that I would favor changing it so that it returns -log(p) (yay, >> another API breakage :$ ). > > not really convinced as for permutations you're unlikely to have numerical > underflows. > > and I prefer - log10(p) for p-values. I prefer just straight probabi
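The three reporting options debated in this thread (p, -log(p), -log10(p)) can be compared on a toy permutation test. A minimal sketch in pure Python, not the scikit-learn implementation: the function names, the toy labels, and the (count + 1) / (n_permutations + 1) convention for the p-value are assumptions made for illustration:

```python
import math
import random


def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(observed, permuted_scores):
    """Empirical p-value: share of permuted scores at least as good as observed.

    The +1 in numerator and denominator counts the observed score itself,
    so the result is never exactly zero.
    """
    n_as_good = sum(1 for s in permuted_scores if s >= observed)
    return (n_as_good + 1) / (len(permuted_scores) + 1)


rng = random.Random(0)           # fixed seed so the sketch is repeatable
y_true = [0, 1] * 20             # hypothetical labels
y_pred = list(y_true)
for i in (0, 5, 10, 15):         # hypothetical predictions: 4 errors out of 40
    y_pred[i] = 1 - y_pred[i]

observed = accuracy(y_true, y_pred)
permuted_scores = []
for _ in range(999):
    shuffled = y_true[:]
    rng.shuffle(shuffled)        # break the link between labels and predictions
    permuted_scores.append(accuracy(shuffled, y_pred))

p = permutation_p_value(observed, permuted_scores)
print("p =", p, " -log(p) =", -math.log(p), " -log10(p) =", -math.log10(p))
```

Because the p-value is bounded below by 1 / (n_permutations + 1), it cannot underflow for any realistic number of permutations, which matches the objection quoted above; the log transforms only change how the value reads.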

Re: [Scikit-learn-general] permutation test score

2011-10-04 Thread Alexandre Gramfort
> Actually, I hadn't realized, but 'permutation_test_score' is named > 'score', and it does not follow the 'bigger is better rule'. So, in > addition to there being a documentation problem, there is a consistency > problem. unless you see it as a permutation test on a score. It's not a score funct

Re: [Scikit-learn-general] bibtex entry for the 0.9 release

2011-10-04 Thread Gael Varoquaux
On Sat, Oct 01, 2011 at 11:16:27PM +0100, Conrad Lee wrote: >So is there a bibtex entry anywhere? I just looked around in a few places >because I'd like to cite the project, but I didn't find any entry so >instead I'll just leave a footnote with the sourceforge url---unless >anyone

[Scikit-learn-general] svm - determining numerical function

2011-10-04 Thread Małgorzata Siudek
Hello, I have found scikit-learn (version 0.8.1) recently and I found it useful to classify my data using SVM. I've modified the given example: plot_custom_kernel.py to use a non-linear SVC with RBF kernel and I have one question. I would like to find a function of line separating two different areas. I

Re: [Scikit-learn-general] permutation test score

2011-10-04 Thread Olivier Grisel
2011/10/4 Gael Varoquaux : > By the way, while we are modifying this `permutation_test_score` function, > it seems to me that the `score_func` argument should be optional, and > that without it, the `permutation_test_score` should try to use the > estimator's score. In such a situation, as such a sco

Re: [Scikit-learn-general] Interesting competition on sem-supervised feature extraction now running on kaggle.com

2011-10-04 Thread Olivier Grisel
2011/10/4 Peter Prettenhofer : > @alexandre: thanks; basically yes -> I use the SGD classifier from > Bolt instead of sklearn because I had to patch it up a bit. > > @ogrisel: have you tried to run MiniBatchKMeans on the unlabeled data? > I'm curious whether that scales... I did run minibatchkmean

Re: [Scikit-learn-general] Interesting competition on sem-supervised feature extraction now running on kaggle.com

2011-10-04 Thread Peter Prettenhofer
I haven't looked at Theano's SGD yet - do they calibrate the learning rate on held-out data or do they use a heuristic? best, Peter PS: the patching up that I did was not related to learning rate or the learning algorithm in general. For the approach that I use I need to mask features (much lik