Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Anders Aagaard
That is a good point. With persistent storage sync this isn't a major issue. Thanks a lot for the input so far, everyone. I'll probably end up spending some time on this over the next month or so; if I end up with some interesting scripts I'll stick them on GitHub. On Mon, Sep 1, 2014 at 4:34 P

Re: [Scikit-learn-general] Using subgroup discovery

2014-09-01 Thread Debanjan Bhattacharyya
Hi Gael, I would say the seminal paper is the following: Wrobel, S. (1997). An algorithm for multi-relational discovery of subgroups. In Principles of Data Mining and Knowledge Discovery, pages 78-87. Springer. More papers are here: Abudawood, T. and Flach, P. (2009). Evaluation measures for m

Re: [Scikit-learn-general] Using subgroup discovery

2014-09-01 Thread Gael Varoquaux
On Mon, Sep 01, 2014 at 11:38:46PM +0530, Debanjan Bhattacharyya wrote: > If you type "Cortana Subgroup Discovery" in scholar.google.com, you will get a > list of papers. Which is the seminal paper? I get only 15 papers with this query, and none has many citations. Cheers, Gaël

Re: [Scikit-learn-general] Using subgroup discovery

2014-09-01 Thread Debanjan Bhattacharyya
Hi Gaël, Subgroup Discovery is a well-established algorithm. If you type "Cortana Subgroup Discovery" in scholar.google.com, you will get a list of papers. Some more can be seen here: http://datamining.liacs.nl/background.html I think it will satisfy all the conditions mentioned in the link you p

Re: [Scikit-learn-general] Using subgroup discovery

2014-09-01 Thread Gael Varoquaux
On Mon, Sep 01, 2014 at 11:01:25PM +0530, Debanjan Bhattacharyya wrote: > Subgroup Discovery is a great option to bridge the gap here (like we have > Cortana implemented in Java. Python will be much faster without a doubt.). > Is there any plan to get this incorporated within sklearn? The necess

[Scikit-learn-general] Using subgroup discovery

2014-09-01 Thread Debanjan Bhattacharyya
Hi all, I have been working with sklearn for a while, targeting different real-life machine learning problems. I have found that there is a large range of problems where the tree algorithms are not up to par, because of the way the splits are done (gt and lt) on values. This is specifically applicable

Re: [Scikit-learn-general] Print coordinate descent coefficients at each iteration

2014-09-01 Thread Danny Sullivan
Hey Alberto, Try: with gil: print("something with gil") Let me know if that works. Danny From: albert...@gmail.com Date: Mon, 1 Sep 2014 17:23:02 +0200 To: scikit-learn-general@lists.sourceforge.net Subject: [Scikit-learn-general] Print coordinate descent coefficients at each iteration

Re: [Scikit-learn-general] Print coordinate descent coefficients at each iteration

2014-09-01 Thread Lars Buitinck
2014-09-01 17:23 GMT+02:00 Alberto Torres : > I would like to print the coordinate descent coefficients at each iteration. > So far I've identified the code and variable I want to print. In particular > I want to print the variable w in function enet_coordinate_descent from > cd_fast.pyx. The probl

[Scikit-learn-general] Print coordinate descent coefficients at each iteration

2014-09-01 Thread Alberto Torres
Hi, I would like to print the coordinate descent coefficients at each iteration. So far I've identified the code and variable I want to print. In particular I want to print the variable *w* in function *enet_coordinate_descent* from *cd_fast.pyx*. The problem is I do not have any previous experien

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Sujit Pal
Hi Anders, >> The problem as I see it is the "tearing it down" bit. I don't want the jobs shutting down before the user has had a chance to get the resulting data, but I suspect if we let users shut them down themselves a lot of them will sit around for no reason. With Amazon EMR you read and writ

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Olivier Grisel
2014-09-01 12:39 GMT+02:00 Anders Aagaard : > Data sync is a very good point, and will vary greatly depending on how we > set things up. If we do a single major server thing we can probably get > people to scp things in; if we use containers that are started up and killed > off on VMs that's not

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Anders Aagaard
Data sync is a very good point, and will vary greatly depending on how we set things up. If we do a single major server thing we can probably get people to scp things in; if we use containers that are started up and killed off on VMs that's not really a good option. I've used reverse sshfs (moun

[Scikit-learn-general] Error on AdditiveChi2Sampler documentation?

2014-09-01 Thread Victor Augusto Escorcia Castillo
Hi guys, I was testing (version 0.15.1) the AdditiveChi2Sampler kernel approximation method, and it seems that the output has a different size from the one cited in the documentation. The documentation says that it should be (n_examples, (2n+1)*n_features) and I'm getting (n_examples, (2n-1)*n_features).
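
To make the size mismatch in the message above concrete, here is a minimal arithmetic sketch of the two shape formulas it quotes. It does not call scikit-learn; the helper name chi2_output_width is hypothetical, and reading n as the sampler's number of sample steps is an assumption.

```python
def chi2_output_width(n_features, n, documented=False):
    """Return the number of output columns for n sample steps.

    documented=True  -> (2n+1)*n_features, the size quoted from the docs
    documented=False -> (2n-1)*n_features, the size reported in the message
    Hypothetical helper: it only evaluates the two formulas from the message.
    """
    factor = 2 * n + 1 if documented else 2 * n - 1
    return factor * n_features


if __name__ == "__main__":
    n_features = 10  # illustrative value, not taken from the message
    for n in (1, 2, 3):
        doc = chi2_output_width(n_features, n, documented=True)
        seen = chi2_output_width(n_features, n)
        # The two formulas always differ by 2 * n_features columns.
        print(n, doc, seen, doc - seen)
```

Whatever n is, the two formulas differ by exactly 2*n_features columns, so the discrepancy is easy to spot from the output shape alone.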

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Gavin Gray
I've used git-annex recently. It works basically like git, with a few caveats. I don't know if SparkleShare deals with large files in the same way, but git-annex has no problems with very large data files. -Gavin On Mon, Sep 1, 2014 at 9:36 AM, Gael Varoquaux <

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Olivier Grisel
Le 1 sept. 2014 10:37, "Gael Varoquaux" a écrit : > > On Mon, Sep 01, 2014 at 10:33:20AM +0200, Olivier Grisel wrote: > > However I do not know any ready made equivalent to dropbox that is > > vendor agnostic. > > I like SparkleShare: git-based distributed storage. > http://sparkleshare.org/ Base

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Gael Varoquaux
On Mon, Sep 01, 2014 at 10:33:20AM +0200, Olivier Grisel wrote: > However I do not know any ready made equivalent to dropbox that is > vendor agnostic. I like SparkleShare: git-based distributed storage. http://sparkleshare.org/ Gaël

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Olivier Grisel
This might help: http://file-syncer.readthedocs.org/en/latest/ It looks like an equivalent to rsync based on libcloud. -- Olivier

Re: [Scikit-learn-general] Shared scikit/ipython server

2014-09-01 Thread Olivier Grisel
> The problem as I see it is the "tearing it down" bit. I don't want the jobs > shutting down before the user has had a chance to get the resulting data, but > I suspect if we let users shut them down themselves a lot of them will sit > around for no reason. I think it's important to provide th