[Scikit-learn-general] Website down

2013-03-05 Thread Andreas Mueller
Hi everybody. The dns seems to be down again. Should we try to switch? Does Stefan still have the domain or did someone else take care of it? Cheers, Andy

Re: [Scikit-learn-general] Multinomial HMM Issue #1158

2013-03-05 Thread Andreas Mueller
Hi. Should we just deprecate / remove the HMM? We deemed sequence prediction off-topic (Lars' words and I agree) and there is no core-dev maintaining them. Is there any project this could move to? statsmodels, pandas? There should be a go-to place for time-series modelling. There was scikit-times

Re: [Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)

2013-03-05 Thread Jason Rudy
Just anecdotally I can say the goodness of fit and speed seem comparable, but the models produced are slightly different. I'm working on making a more comprehensive comparison. On Tue, Mar 5, 2013 at 11:57 AM, Andreas Mueller wrote: > On 03/05/2013 08:15 PM, Jason Rudy wrote: > > So I've finall

Re: [Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Mohamed Radhouane Aniba
I solved my problem. Thanks, Ronnie. On Mar 5, 2013, at 2:22 PM, Ronnie Ghose wrote: > http://stackoverflow.com/questions/13590247/using-libsvm-format-in-scikit > SO is amazing :) > > > On Tue, Mar 5, 2013 at 2:19 PM, Mohamed Radhouane Aniba > wrote: > Hello everyone, > > I am new to sciki

Re: [Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Mohamed Radhouane Aniba
Thank you Ronnie Sorry for this dumb question but I am a bit confused with the link you sent even if it seems straightforward. I am trying to replace the iris data in http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html#example-svm-plot-rbf-parameters-py I changed the port

[Scikit-learn-general] Multinomial HMM Issue #1158

2013-03-05 Thread David Reed
Hi, I added a comment to issue #1158 but since it is closed, I'm not sure if anyone would be alerted. I am not sure if this should be closed or perhaps a second issue should be opened. As already stated, the attribute n_symbols only gets created when an emission probability matrix is defined. Th

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Christian
For me it works fine. Cheers, Christian

> test.arff
@relation 'test'
@attribute v1 {blonde,blue}
@attribute v2 numeric
@attribute v3 numeric
@attribute class {yes,no}
@data
blonde,17.2 ,1,yes
blue,27.2,2,yes
blue,18.2,3,no
< end test.arff

barray
[['blonde', 17.2, 1.0, 'yes'], ['blue', 27.2,

Re: [Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Robert Layton
On 6 March 2013 06:26, Andreas Mueller wrote: > Hi Jeff. > Thanks for your willingness to contribute. > As the dev guidelines state, there are certain issues that are tagged as > "easy": > > https://github.com/scikit-learn/scikit-learn/issues?labels=Easy&page=1&sort=updated&state=open > These might stil

Re: [Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)

2013-03-05 Thread Andreas Mueller
On 03/05/2013 08:15 PM, Jason Rudy wrote: > So I've finally got something to show. Gael, you were entirely > correct about it being a mouthful. I've been developing it as a > separate package for simplicity, but will be integrating with > scikit-learn as soon as I get the time. Here is what I

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Pavan Mallapragada
Great reference Robert! Thanks. Currently I am satisfied with the performance of scipy.cluster given my data size. However, it will be great to have these fast cluster algorithms added. It will be interesting to look into these. On Mar 5, 2013, at 12:24 PM, Robert McGibbon wrote: > On Mar 5, 20

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread Kenneth C. Arnold
It was a pretty easy build on Mac -- I just used MacPorts to install and select an llvm. Of course Anaconda is even easier. I'd say Numba is a medium-term consideration. It's enough trouble getting everybody using C compilers, so adding LLVM to the mix is probably way too much of a change for the

Re: [Scikit-learn-general] Multivariate Adaptive Regression Splines (MARS, aka earth)

2013-03-05 Thread Jason Rudy
So I've finally got something to show. Gael, you were entirely correct about it being a mouthful. I've been developing it as a separate package for simplicity, but will be integrating with scikit-learn as soon as I get the time. Here is what I've got so far in case anyone wants to take a look:

Re: [Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Andreas Mueller
Hi Jeff. Thanks for your willingness to contribute. As the dev guidelines state, there are certain issues that are tagged as "easy": https://github.com/scikit-learn/scikit-learn/issues?labels=Easy&page=1&sort=updated&state=open These might still vary a lot. Maybe just browse around. How familiar are you

Re: [Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Ronnie Ghose
http://stackoverflow.com/questions/13590247/using-libsvm-format-in-scikit SO is amazing :) On Tue, Mar 5, 2013 at 2:19 PM, Mohamed Radhouane Aniba wrote: >Hello everyone, > > I am new to scikit-learn package, I am still trying some of the examples > on the website. > You have an example on R

[Scikit-learn-general] Request for project to tackle

2013-03-05 Thread Jeff Van Voorst
Greetings, I have read the developer guidelines for scikit learn, and I would like to contribute (to boost my machine learning and python fu). Is there an outstanding, "easy" bug or feature that can be assigned to me or should I select one? Thanks, Jeff Van Voorst --

[Scikit-learn-general] Loading libsvm data formats

2013-03-05 Thread Mohamed Radhouane Aniba
Hello everyone, I am new to the scikit-learn package and am still trying some of the examples on the website. You have an example on RBF SVM parameters that is very interesting, but my only problem is that my data are in libsvm format. I know that you have an option for loading this format through lo
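
For readers unfamiliar with the format the question refers to: each line of a libsvm file holds a label followed by sparse `index:value` pairs, with indices conventionally starting at 1. scikit-learn provides `sklearn.datasets.load_svmlight_file` for reading such files. The sketch below is not that function, only a minimal standard-library illustration of the layout. The names `parse_libsvm_line`, `load_libsvm_text` and `to_dense`, as well as the sample data, are hypothetical, and extensions such as `qid:` tokens are not handled.

```python
def parse_libsvm_line(line):
    """Parse '<label> <index>:<value> ...' into (label, {index: value}).

    Returns None for blank or comment-only lines.
    """
    line = line.split("#", 1)[0].strip()  # drop a trailing comment, if any
    if not line:
        return None
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def load_libsvm_text(text):
    """Return (labels, rows) where each row is a sparse {index: value} dict."""
    labels, rows = [], []
    for line in text.splitlines():
        parsed = parse_libsvm_line(line)
        if parsed is None:
            continue
        labels.append(parsed[0])
        rows.append(parsed[1])
    return labels, rows


def to_dense(rows, n_features=None):
    """Expand sparse rows into dense lists, assuming 1-based feature indices."""
    if n_features is None:
        n_features = max((max(row) for row in rows if row), default=0)
    dense = []
    for row in rows:
        vector = [0.0] * n_features
        for index, value in row.items():
            vector[index - 1] = value
        dense.append(vector)
    return dense


# Hypothetical two-sample file, used only to exercise the functions above.
SAMPLE = "1 1:0.5 3:2.0\n-1 2:1.5  # second sample\n\n"
labels, rows = load_libsvm_text(SAMPLE)
X = to_dense(rows)
```

The dense matrix `X` and the `labels` list can then stand in for the iris arrays in an example script, under the assumption that the data are small enough to hold densely.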

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Rob Zinkov
The import method doesn't support sparse representations: " This function should be able to read most arff files. Not implemented functionality include: date type attributes string type attributes It can read files with numeric and nominal attributes. It cannot read files with sparse data ({}

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Robert McGibbon
On Mar 5, 2013, at 10:10 AM, Olivier Grisel wrote: > This code is in C++ and the scikit-learn core maintainers are not all > experts in C++ and prefer cython for optimized code. > > A cython rewrite of some of those algorithms would be of interest though. For anyone interested in either reimple

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Tom Fawcett
Thanks for your response, Christian. I experimented with the package. FYI, there’s a problem with the pypi arff reader. The package claims to handle numbers but it seems to encode everything (including numbers) as strings, like this: [['blonde' '17.2' '1' 'yes'] ['blue' '27.2' '2' 'yes'] [
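
To illustrate the behaviour being asked for here (numeric attributes returned as numbers rather than strings), the following is a minimal standard-library sketch. It is not the pypi `arff` package and not `scipy.io.arff`; `read_simple_arff` and `convert_cell` are hypothetical names, and the reader only handles dense files with numeric and nominal attributes, no quoting, and `?` for missing values.

```python
NUMERIC_TYPES = {"numeric", "real", "integer"}


def convert_cell(cell, kind):
    """Convert one data cell: numeric attributes become float, '?' becomes None."""
    if cell == "?":
        return None
    return float(cell) if kind == "numeric" else cell


def read_simple_arff(text):
    """Parse a small dense ARFF document into (attribute_names, rows)."""
    names, kinds, rows = [], [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            cells = [cell.strip() for cell in line.split(",")]
            rows.append([convert_cell(c, k) for c, k in zip(cells, kinds)])
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            names.append(name)
            is_numeric = spec.strip().lower() in NUMERIC_TYPES
            kinds.append("numeric" if is_numeric else "nominal")
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


# The small test file quoted elsewhere in this thread.
SAMPLE_ARFF = """@relation 'test'
@attribute v1 {blonde,blue}
@attribute v2 numeric
@attribute v3 numeric
@attribute class {yes,no}
@data
blonde,17.2 ,1,yes
blue,27.2,2,yes
blue,18.2,3,no
"""

names, rows = read_simple_arff(SAMPLE_ARFF)
```

With this input the first row comes back as `['blonde', 17.2, 1.0, 'yes']`, with the two numeric columns as floats.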

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Olivier Grisel
2013/3/5 Robert McGibbon : > The fastcluster project by Dan Mullner, a professor of math and statistics > at Stanford, might be of interest. > > http://math.stanford.edu/~muellner/fastcluster.html > > These routines follow the same API of the hierarchical clustering routines > in scipy, including s

Re: [Scikit-learn-general] Hierarchical Clustering

2013-03-05 Thread Robert McGibbon
The fastcluster project by Dan Mullner, a professor of math and statistics at Stanford, might be of interest. http://math.stanford.edu/~muellner/fastcluster.html These routines follow the same API of the hierarchical clustering routines in scipy, including single linkage and complete linkage, b

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Lars Buitinck
2013/3/5 Rob Zinkov : > We really should have this support within the library. Does it make sense to > just use the functionality in the arff-package? Is it better than the one in scipy.io? -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Rob Zinkov
We really should have this support within the library. Does it make sense to just use the functionality in the arff-package? On Tue, Mar 5, 2013 at 7:34 AM, Christian wrote: > Hi Tom, > > recently I saw the arff-package in pypi. Seems working. > > import arff > import numpy as np > > barray = [

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:11 PM, Jaques Grobler wrote: > > Awesome, thanks. You have my inkscape file, right? > > > I am trying to make the user guide also more flat by putting your java > > script function in the file now :) > > I think as the user guide is much shorter now, it doesn't really hide > > thin

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:04 PM, Nelle Varoquaux wrote: > > > Maybe that is the core problem. The documentation has not > been written to be without sections: before, the user guide was > divided into three parts: > 1. Installation > 2. Tutorials: an overview of the scikit > 3. Unsupervised l

Re: [Scikit-learn-general] Easy way to handle .arff files in sklearn?

2013-03-05 Thread Christian
Hi Tom, recently I saw the arff-package in pypi. Seems working.

    import arff
    import numpy as np

    barray = []
    for row in arff.load('/home/chris/tools/weka-3-7-6/rd54_train.arff'):
        barray.append(list(row))
    nparray = np.array(barray)
    print(nparray.shape)  # (4940, 56)

HTH Christian
> I’m trying t

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 04:04 PM, Nelle Varoquaux wrote: Maybe that is the core problem. The documentation has not been written to be without sections: before, the user guide was divided into three parts: 1. Installation 2. Tutorials: an overview of the scikit 3. Unsupervised learning 4.

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Ronnie Ghose
interesting posts :). so 1) do we want a natural breaks method? https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization 2) have you considered looking at the distribution of the variable as they suggest? any small-d tends to allow this rather than the usual giant-d space. Do you have any

Re: [Scikit-learn-general] Announcement: scikit-image 0.8.0

2013-03-05 Thread Jaques Grobler
Congratulations :) Nice work 2013/3/4 Johannes Schönberger > Announcement: scikit-image 0.8.0 > > > We're happy to announce the 8th version of scikit-image! > > scikit-image is an image processing toolbox for SciPy that includes > algorithms > for segmentation,

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Jaques Grobler
> Awesome, thanks. You have my inkscape file, right? > I am trying to make the user guide also more flat by putting your java > script function in the file now :) > I think as the user guide is much shorter now, it doesn't really hide > things, but rather makes them easier to find. > We'll see. @

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Nelle Varoquaux
On 5 March 2013 15:39, Andreas Mueller wrote: > On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: > > Hi everyone, > > > > I'm actually not convinced about the new layout (sorry Andy :( ). I > > should also say, I'm not convinced about pandas' website. > > > > The menu is, I think, quite confusing.

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: > Hi everyone, > > I'm actually not convinced about the new layout (sorry Andy :( ). I > should also say, I'm not convinced about pandas' website. > > The menu is, I think, quite confusing. Overall, I think there are too > many links which may refer

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread nipun batra
It should. I would have straight away tried it, but read the following 2 posts: 1. http://stackoverflow.com/questions/11513484/1d-number-array-clustering 2. http://stats.stackexchange.com/questions/13781/clustering-1d-data Any thoughts? On Tue, Mar 5, 2013 at 8:24 PM, Ronnie Ghose wrote:

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:51 PM, nipun batra wrote: > Hi, > What clustering technique (with implementation in sklearn) is > recommended for 1d data? I'd recommend looking at it ;) It feels like there might be some sweeping algorithm to get the optimal solution for the k-means algorithm. KMeans should be f
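
One way to make the "sweeping" idea concrete, offered as a sketch and not as anything shipped in scikit-learn: in one dimension an optimal k-means partition consists of contiguous runs of the sorted values, so dynamic programming over the cut points finds the exact minimum of the within-cluster sum of squares. The function name `optimal_1d_kmeans` and the sample data are hypothetical, and the simple version below takes O(k n^2) time.

```python
def optimal_1d_kmeans(values, k):
    """Exact 1-D k-means by dynamic programming over sorted values.

    Returns (cost, clusters), where cost is the minimal within-cluster
    sum of squared deviations and clusters is a list of k sorted lists.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums let us compute the cost of any contiguous run in O(1).
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def run_cost(i, j):
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        total = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting xs[:j] into c contiguous clusters.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = best[c - 1][i] + run_cost(i, j)
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    cut[c][j] = i

    # Walk the stored cut points backwards to recover the clusters.
    clusters = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return best[k][n], clusters


# Hypothetical data with three visually obvious groups.
data = [1.0, 1.1, 0.9, 5.0, 5.2, 9.8, 10.0, 10.1]
cost, clusters = optimal_1d_kmeans(data, 3)
```

Unlike the usual iterative k-means, this does not depend on initialisation, which is the main attraction in the 1-D case. For larger inputs the quadratic inner loop would need a smarter search.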

Re: [Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread Ronnie Ghose
..does kmeans not work? On Tue, Mar 5, 2013 at 9:51 AM, nipun batra wrote: > Hi, > What clustering technique (with implementation in sklearn) is recommended > for 1d data?

[Scikit-learn-general] Suggested technique for 1 D clustering

2013-03-05 Thread nipun batra
Hi, What clustering technique (with implementation in sklearn) is recommended for 1d data?

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 03:46 PM, Ronnie Ghose wrote: > can you make the new website design part of a repo so we can submit > PRs or issues against it? > It is a branch in my sklearn fork, but the branch is not completely up to date, working on it. -

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Ronnie Ghose
can you make the new website design part of a repo so we can submit PRs or issues against it? On Tue, Mar 5, 2013 at 9:39 AM, Andreas Mueller wrote: > On 03/05/2013 03:18 PM, Nelle Varoquaux wrote: > > Hi everyone, > > > > I'm actually not convinced about the new layout (sorry Andy :( ). I > > s

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Olivier Grisel
2013/3/5 Nelle Varoquaux : > Hi everyone, > > I'm actually not convinced about the new layout (sorry Andy :( ). I should > also say, I'm not convinced about pandas' website. > > The menu is, I think, quite confusing. Overall, I think there are too many > links which may refer to the same thing: th

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Nelle Varoquaux
Hi everyone, I'm actually not convinced about the new layout (sorry Andy :( ). I should also say, I'm not convinced about pandas' website. The menu is, I think, quite confusing. Overall, I think there are too many links which may refer to the same thing: the difference between "installation", "g

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 02:46 PM, Jaques Grobler wrote: > I like your changes Andy. It's definitely easier to navigate. I'm > currently also changing your graph from > http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html > > into a documentation-linking version that can

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 02:55 PM, Gilles Louppe wrote: > I feel like the "About us" section on the homepage shouldn't be there. > I'd rather put a "About" link somewhere else than putting this in > front on the home page. Also, I would use the space that we now have > on the front page to highlight more impo

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Gilles Louppe
I feel like the "About us" section on the homepage shouldn't be there. I'd rather put a "About" link somewhere else than putting this in front on the home page. Also, I would use the space that we now have on the front page to highlight more important aspects of the package. On 5 March 2013 14:46,

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Jaques Grobler
I like your changes Andy. It's definitely easier to navigate. I'm currently also changing your graph from http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html into a documentation-linking version that can be added to the documentation. I'll try to put an online build

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Peter Prettenhofer
libsvm does not support probability outputs for one-class SVM. One-class SVM is an algorithm for support estimation (not proper density estimation) - i.e. you get a confidence that P(X) > t - where t is somewhat concealed in the nu parameter. 2013/3/5 Lars Buitinck : > 2013/3/5 Bill Power : >> inv

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Bill Power
thanks lars i figured as much. do you know if there are any papers in the literature that i might be able to implement and then perhaps contribute the code to the package? or do i have to live with either using distances or a non-parameterised sigmoid function? thanks
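
The "non-parameterised sigmoid" option mentioned above could look like the following sketch. It squashes a signed distance (for example a value returned by a one-class SVM's `decision_function`) into the interval (0, 1). The result is a monotone score and not a calibrated probability; the function name `sigmoid_score` and its `scale` parameter are hypothetical.

```python
import math


def sigmoid_score(decision_value, scale=1.0):
    """Map a signed distance to (0, 1) with a logistic function.

    Positive distances (inside the estimated support) map above 0.5,
    negative ones below. The two branches avoid overflow in exp().
    """
    z = scale * decision_value
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical decision values, e.g. distances to the separating surface.
scores = [sigmoid_score(d) for d in (-2.0, 0.0, 2.0)]
```

Because no parameters are fitted, the ordering of samples is the same as with raw distances; only the range changes.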

Re: [Scikit-learn-general] one class svm probability

2013-03-05 Thread Lars Buitinck
2013/3/5 Bill Power : > investigating previous versions i saw that probability was available > in version 0.9 with predict_proba and predict_log_proba functions > http://scikit-learn.org/0.9/modules/generated/sklearn.svm.OneClassSVM.html > > but it's not here in the stable version > http://scikit-l

[Scikit-learn-general] one class svm probability

2013-03-05 Thread Bill Power
hi all. just looking at the one class svm and I'd like to get a probability rather than a distance output. i know that in regular svms you can get parameters for the sigmoid function from five-fold cross validation and that's done by setting the probability=True in the constructor. i presume it's

Re: [Scikit-learn-general] setup script refering to .c

2013-03-05 Thread amueller
Exactly. Not only would you need Cython, it also needs to be a recent version. People with older versions would get cryptic error messages, leading to frustrated users and busy mailing lists. Matthieu Brucher wrote: >Hi, > >If I remember correctly, this is done to avoid an explicit Cython
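
The trade-off described here can be sketched as a small helper that a setup script might use: build from the `.pyx` file only when Cython is importable, and otherwise fall back to the pre-generated `.c` file. This is a hypothetical illustration and not scikit-learn's actual setup code; `pick_extension_source` and the module path are invented names, and the sketch does not check the Cython version, which the message above notes would also matter.

```python
import importlib.util


def pick_extension_source(stem, cython_available=None):
    """Return the source file to compile for an extension module.

    stem is the path without extension, e.g. 'pkg/_tree' (hypothetical).
    If cython_available is None, detect Cython by looking for the module.
    """
    if cython_available is None:
        cython_available = importlib.util.find_spec("Cython") is not None
    return stem + (".pyx" if cython_available else ".c")


# Forced choices, independent of what is installed on this machine.
with_cython = pick_extension_source("pkg/_tree", cython_available=True)
without_cython = pick_extension_source("pkg/_tree", cython_available=False)
detected = pick_extension_source("pkg/_tree")
```

Shipping the generated `.c` files means end users never hit the fallback question at all, at the cost that developers must regenerate them after editing a `.pyx` file.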

Re: [Scikit-learn-general] setup script refering to .c

2013-03-05 Thread Matthieu Brucher
Hi, If I remember correctly, this is done to avoid an explicit Cython dependency. Cheers, Matthieu 2013/3/5 Kevin Kunzmann > Hi all, > > why are the .c files used as sources in the setup scripts and not the > respective .pyx files? It came to my mind when I tried to extend the > decision tre

[Scikit-learn-general] setup script refering to .c

2013-03-05 Thread Kevin Kunzmann
Hi all, why are the .c files used as sources in the setup scripts and not the respective .pyx files? It came to my mind when I tried to extend the decision tree splitting criteria and my changes were not compiled... Would it not be safer to generate the .c on the fly using cythonize()? Can an

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 11:12 AM, Olivier Grisel wrote:
> This looks good. Maybe we could reintroduce a canonical snippet on the
> home page:
>
    from sklearn.datasets import load_digits
    from sklearn.cross_validation import train_test_split
    from sklearn.svm import LinearSVC
    digits = load_di

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Olivier Grisel
This looks good. Maybe we could reintroduce a canonical snippet on the home page:

>>> from sklearn.datasets import load_digits
>>> from sklearn.cross_validation import train_test_split
>>> from sklearn.svm import LinearSVC
>>> digits = load_digits()
>>> X_train, X_test, y_train, y_test = train_te

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
Ok, working now: http://amueller.github.com/ All feedback welcome :) I'd like to avoid bombarding the user with long lists / pages as much as possible. The "Getting Started" and "Development" pages now are a length that mostly fit on a screen and that I can still grasp. If we had an algorithms p

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread federico vaggi
Yup - you can just install those packages, then try to run the default example/tests, and both pass for me! Other packages, like mysqldb, are a breeze to compile on Linux, but compiling them on 64-bit Windows is incredibly painful. Here is a good guide if you want to do it on your ow

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
On 03/05/2013 09:11 AM, Andreas Mueller wrote: > As there is so much positive feedback, I might make something up tonight. > As I like to make small steps, I'd get rid of the defunct search bar and > add some more > menu items instead (and adjust the respective pages obv.) > I made a page but the CS

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread klo uo
So are you saying llvm isn't needed, if numba/llvmpy are installed from Christoph's packages? On Tue, Mar 5, 2013 at 9:21 AM, federico vaggi wrote: > For Windows, installing numba is a breeze using: > > http://www.lfd.uci.edu/~gohlke/pythonlibs/ > > Basically, all the gnarly extensions are avail

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread 党晓彬
+1 On Tue, Mar 5, 2013 at 4:19 PM, Alexandre Gramfort < alexandre.gramf...@inria.fr> wrote: > > ps: For those who are wondering: no, I didn't choose this time because > > Gael is offline. > > :) > > > I'd rather like to have his input :-/ he is back in a month, right? > > 3 weeks > > Alex > > >

Re: [Scikit-learn-general] numba, cython and relation to sklearn future

2013-03-05 Thread federico vaggi
For Windows, installing numba is a breeze using: http://www.lfd.uci.edu/~gohlke/pythonlibs/ Basically, all the gnarly extensions are available already compiled with all the dependencies handled properly. It's absolutely amazing and I strongly encourage everyone who uses Python on Windows sometim

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Alexandre Gramfort
> ps: For those who are wondering: no, I didn't choose this time because > Gael is offline. :) > I'd rather like to have his input :-/ he is back in a month, right? 3 weeks Alex

Re: [Scikit-learn-general] Flat is better than nested: Website edition

2013-03-05 Thread Andreas Mueller
As there is so much positive feedback, I might make something up tonight. As I like to make small steps, I'd get rid of the defunct search bar and add some more menu items instead (and adjust the respective pages obv.) ps: For those who are wondering: no, I didn't choose this time because Gael is