[Scikit-learn-general] GSoC 2017 : "Parallel Decision Tree Building"

2017-03-22 Thread Aman Pratik
Hello Developers, This is Aman Pratik. I am currently pursuing my B.Tech from Indian Institute of Technology, Varanasi. After doing some research I have found some material on Decision Trees and Parallelization. Hence, I propose my first draft for the project "Parallel Decision Tree Building" for

[Scikit-learn-general] GSoC 2017

2017-03-02 Thread Aman Pratik
Hello Developers, This is Aman Pratik. I am currently pursuing my B.Tech from Indian Institute of Technology, Varanasi. I am a keen software developer and not very new to the open source community. I am interested in your project "*Improve online learning for linear models*" for GSoC 2017. I have

[Scikit-learn-general] GSoC 2017

2017-02-05 Thread Kanchana Ranasinghe
Hi, I'm an undergrad student from Sri Lanka, interested in working on a machine learning related project. I've been involved in some machine learning related research work at uni. I am currently going through your developers guide and issues page. Do let me know how else I can get involve

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-04-06 Thread Maniteja Nandana
Hi Andreas, Raghav and Jacob, Thank you for your inputs. I have attached the links to the final draft of the proposal. I would really be grateful if anyone has any other suggestions and would be happy to incorporate them. Thanks for your time. Wiki proposal

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-29 Thread Maniteja Nandana
Hi everyone, Thanks for the inputs. I have created a wiki page here for the work aimed to be done in better handling of missing data including working on the stalled PR on Matrix Factorization, KNN imp

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-25 Thread Raghav R V
Yes! Exactly the same! On Fri, Mar 25, 2016 at 6:21 PM, Maniteja Nandana < maniteja.modesty...@gmail.com> wrote: > Hi Raghav, > > Thanks a lot for the idea. I would be glad to work on it and along with > the "output dummy one-hot encoder features for imputer to specify if the > feature > value i

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-25 Thread Maniteja Nandana
Hi Raghav, Thanks a lot for the idea. I would be glad to work on it and along with the "output dummy one-hot encoder features for imputer to specify if the feature value is imputed or not", would the idea to add "binary indicator feature (for each possibly missing feature) that indicate featu

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-25 Thread Andreas Mueller
On 03/25/2016 11:11 AM, Raghav R V wrote: > Hey Maniteja, > > I took a look at your proposal. As I said before I feel it is a bit > broad and you should try to narrow it down to a good theme. > > Since you have chosen several PRs that are missing-value > related, I have a suggestion for

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-25 Thread Raghav R V
Hey Maniteja, I took a look at your proposal. As I said before I feel it is a bit broad and you should try to narrow it down to a good theme. Since you have chosen several PRs that are missing-value related, I have a suggestion for a theme - "Better Missing Value Handling" You could grou

Re: [Scikit-learn-general] GSoC 2016

2016-03-24 Thread Raghav R V
Greetings everyone! Please note that the deadline for students to submit their Final PDF proposal is Friday, March 25th at 19:00 UTC. Students must submit a Final PDF proposal before the deadline or scikit-learn/Python Software Foundation will not be able to select them as a student. Students wit

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-23 Thread Maniteja Nandana
Hi Raghav, Thanks a lot for your reply. That helps so much. I understand that the proposal should be specific to a module but right now I am not sure which of these implementations are the most sought-after. I will update the proposal based on the inputs. I also have looked at the stalled PRs of

Re: [Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-23 Thread Raghav R V
Hey Maniteja, Having taken a quick look at the list... my thoughts - * The KNN imputation is an important addition that got stalled. * The semi-supervised NB with EM seems like a good addition, Olivier, Larsmans (and Joel?) have to comment on whether it should be a priority. * The haversine metri

[Scikit-learn-general] GSoC 2016

2016-03-21 Thread Raghav R V
Greetings everyone! This is a gentle reminder to the prospective GSoC applicants to post their proposals on the summerofcode.withgoogle website, as this is the place where your application becomes formal and the mentors comment and suggest improvements. Please note that the deadline is the 25th and y

[Scikit-learn-general] GSoC suggestions : work on various stalled PRs and issues

2016-03-21 Thread Maniteja Nandana
Hello everyone, My name is Maniteja, a senior year computer science student from India (github). It has been a wonderful learning opportunity contributing to the library for the past few months and I would like to thank everyone for their support and patiently answer

[Scikit-learn-general] GSoC 2016 proposal for the fused type project.

2016-03-21 Thread Devashish Deshpande
Hey everyone, My name is Devashish Deshpande. I will be applying for GSoC with scikit-learn for this summer. My aim is to contribute and work on bigger projects for scikit learn. I would love to keep contributing to scikit learn in the future and work towards making it even better. I have submitte

Re: [Scikit-learn-general] GSoC 2016 Proposal: Adding fused types to Cython files (YenChen Lin)

2016-03-21 Thread lin yenchen
Hello federico vaggi, Thanks for your advice. Would you please provide some example benchmarks for me? I'll do a survey on it and add it into my proposal. Best, YenChen

Re: [Scikit-learn-general] GSoC 2016 Proposal: Adding fused types to Cython files (YenChen Lin)

2016-03-21 Thread federico vaggi
Oh - and an extensive set of benchmarks would be very useful if people wanted to try to re-implement some of those core algorithms using alternative technologies like numba. On Mon, 21 Mar 2016 at 12:55 federico vaggi wrote: > This is incredibly well detailed and explained. The only suggestion

Re: [Scikit-learn-general] GSoC 2016 Proposal: Adding fused types to Cython files (YenChen Lin)

2016-03-21 Thread federico vaggi
This is incredibly well detailed and explained. The only suggestion I'd add is to add some benchmarks (mostly for speed, the change in memory usage should be predictable, I think). On Mon, 21 Mar 2016 at 05:37 lin yenchen wrote: > Hello everyone, > I'm Yen-Chen Lin, a senior student at Tsing Hu

[Scikit-learn-general] GSoC 2016 Proposal: Adding fused types to Cython files (YenChen Lin)

2016-03-20 Thread lin yenchen
Hello everyone, I'm Yen-Chen Lin, a senior student at Tsing Hua University majoring in Computer Science. (Here is my GitHub account.) First, thanks for reviewing and providing lots of constructive advice on my previous PRs

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Sebastian Raschka
> Pardon me if I am saying something stupid, but isn't Theano/Tensorflow > about deep learning and not reinforcement learning. RL can be done with > deep learning, but it's more than that, and I suspect that it requires a > different API, in particular with the notion of actions. Sure, I understan

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Kyle Kastner
Any RL package will have to be heavily focused on non-iid data (timeseries, basically) with the additional difficulty of the agent affecting/interacting with the environment it is operating in. I agree with you Gael - many packages for "deep learning" also don't handle this type of data/these models (

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Gael Varoquaux
Pardon me if I am saying something stupid, but isn't Theano/Tensorflow about deep learning and not reinforcement learning. RL can be done with deep learning, but it's more than that, and I suspect that it requires a different API, in particular with the notion of actions. G On Wed, Mar 02, 2016 a

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Sebastian Raschka
You mean a scikit-like interface to Theano/Tensorflow? That’s actually what skflow intends to do. > On Mar 2, 2016, at 3:02 PM, Nadim Farhat wrote: > > I was just thinking the same but , how about just making pipelines to Theano > , TensorFlow ? > > On Wed, Mar 2, 2016 at 3:00 PM Sebastia

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Gael Varoquaux
On Wed, Mar 02, 2016 at 11:34:17AM -0800, Jacob Schreiber wrote: > Reinforcement learning is an exciting field of machine learning, and you're > right that it seems underrepresented in Python. However, I don't think that it > falls within the strict scope of the scikit-learn API.  Indeed. There's

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Nadim Farhat
I was just thinking the same, but how about just making pipelines to Theano, TensorFlow? On Wed, Mar 2, 2016 at 3:00 PM Sebastian Raschka wrote: > I am not a core developer and thus really can’t comment about the scope of > scikit-learn here :P. But I am curious about how to implement it

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Sebastian Raschka
I am not a core developer and thus really can’t comment about the scope of scikit-learn here :P. But I am curious about how to implement it in scikit-learn efficiently. I think an implementation based on Theano or TensorFlow may be a better place for such a module (maybe skflow, which has a s

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Michał Koziarski
I see. Thank you for quick answer. 2016-03-02 20:31 GMT+01:00 Andreas Mueller : > > > On 03/02/2016 02:21 PM, Michał Koziarski wrote: > > As far as I can tell, except PyBrain (which doesn't seem to be > > actively developed) there are no reinforcement learning libraries in > > Python. I was wonde

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Andreas Mueller
On 03/02/2016 02:21 PM, Michał Koziarski wrote: > As far as I can tell, except PyBrain (which doesn't seem to be > actively developed) there are no reinforcement learning libraries in > Python. I was wondering if the community would be interested in using one > and making it a part of scikit-learn

Re: [Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Jacob Schreiber
Reinforcement learning is an exciting field of machine learning, and you're right that it seems underrepresented in Python. However, I don't think that it falls within the strict scope of the scikit-learn API. On Wed, Mar 2, 2016 at 11:21 AM, Michał Koziarski wrote: > Hello everyone, > > As far

[Scikit-learn-general] GSoC Project Proposal: Reinforcement Learning Module

2016-03-02 Thread Michał Koziarski
Hello everyone, As far as I can tell, except PyBrain (which doesn't seem to be actively developed) there are no reinforcement learning libraries in Python. I was wondering if the community would be interested in using one and making it a part of scikit-learn. Does it lie within the scope of the projec

[Scikit-learn-general] GSoC Ideas Page

2016-03-02 Thread Chirag Nagpal
Hey Guys! Any plans of updating the GSoC Ideas page for 2016? (https://github.com/scikit-learn/scikit-learn/wiki/Google-summer-of-code-%28GSOC%29-2015) Chirag Nagpal Senior Year Undergrad University of Pune, India www.chiragnagpal.com

[Scikit-learn-general] GSoC project discussion

2016-03-01 Thread Devashish Deshpande
Hi everyone, Now that the project discussion phase for GSoC has started, I was wondering if the project page will be updated if possible? Can the project of adding fused type to cython files as discussed in issue #5973 and PR #5464 be taken up if the community is interested? It seems a lot of memo

Re: [Scikit-learn-general] GSoC midterms NOW!

2015-07-03 Thread Wei Xue
Thanks, Olivier, Loïc, Andreas and Vlad. I will keep moving forward! Wei Xue On Wed, Jul 1, 2015 at 3:09 AM, Olivier Grisel wrote: > Hi all, > > Sorry I am late on my emails, I am at a conference. > > I have not invested enough time to mentor Wei Xue on the GMM but he is > responsive and still

Re: [Scikit-learn-general] GSoC midterms NOW!

2015-07-01 Thread Olivier Grisel
Hi all, Sorry I am late on my emails, I am at a conference. I have not invested enough time to mentor Wei Xue on the GMM but he is responsive and still making progress on a regular basis albeit behind schedule. So I plan to make him pass. -- Olivier

Re: [Scikit-learn-general] GSoC midterms NOW!

2015-06-30 Thread Andreas Mueller
Sorry, late to my emails. Terri actually wants the mid-terms done TODAY! On 06/30/2015 02:25 PM, Andreas Mueller wrote: > Hey All. > This is a friendly reminder to all the mentors and students that > mid-terms are coming up this Friday. > Mentors should if possible at all submit their reviews be

[Scikit-learn-general] GSoC midterms

2015-06-30 Thread Andreas Mueller
Hey All. This is a friendly reminder to all the mentors and students that mid-terms are coming up this Friday. Mentors should if possible at all submit their reviews before that. It would be great to have at least parts of the projects merged by then. I was out for the last week and I'm a bit beh

[Scikit-learn-general] GSoC 2015 - Improvements to the Cross Validation - Raghav R V

2015-06-16 Thread Raghav R V
Hey, I am starting this thread to assist in tracking the progress of my GSoC project and to post my weekly blog posts. Main objectives of my GSoC Project - 1. model_selection refactoring 2. Data independent CV Iterators. 3. Multiple Metric support 4. sample_weight etc support in grid_search 5. A

Re: [Scikit-learn-general] GSoC Community Bonding

2015-06-16 Thread Raghav R V
Hey Gael, This is my blog URL - https://rvraghav93.blogspot.in Apologies for the delay in the response!! :) - R

Re: [Scikit-learn-general] GSoC Community Bonding

2015-05-27 Thread Artem
Hi Gael, My GSoC blog URL is http://barmaley-exe.blogspot.com As required, there's a relevant tag, gsoc15. On Mon, May 25, 2015 at 3:08 PM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > Hi GSOC students, > > And welcome. I hope that you will have a fun and productive summer. > > To commun

Re: [Scikit-learn-general] GSoC Community Bonding

2015-05-26 Thread Wei Xue
Sure. Mine is http://xuewei4d.github.io. Currently, I am writing the equations of VBGMM, it is a lot. I switched to the code and the issue today. Thanks, Wei Xue On Mon, May 25, 2015 at 8:08 AM, Gael Varoquaux < gael.varoqu...@normalesup.org> wrote: > Hi GSOC students, > > And welcome. I hope t

Re: [Scikit-learn-general] GSoC Community Bonding

2015-05-25 Thread Gael Varoquaux
Hi GSOC students, And welcome. I hope that you will have a fun and productive summer. To communicate well on your project, and to help you draw the big picture from your work, the PSF requires that you blog every week on your project. Can you send me the URL of your blog, so that I add it to plane

[Scikit-learn-general] GSoC Community Bonding

2015-05-19 Thread Andreas Mueller
Hey all, in particular hey Mentors and hey GSoC Students! We are in the community bonding period right now, and I just want to make sure that mentors and students are engaged and talking. I'd really like you all to join gitter: https://gitter.im/scikit-learn/scikit-learn where currently most di

[Scikit-learn-general] [GSoC] Project Metric Learning

2015-05-02 Thread Artem
Hello Andreas, Hello Michael, First, I'm happy to be selected as this year's scikit-learn student, and hope to do great work. According to my timeline, I'm going to use community

Re: [Scikit-learn-general] GSoC 2015: Semi-supervised learning proposal

2015-04-11 Thread Vinayak Mehta
I've made some changes to the proposal. https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-scikit-learn:-Cross-validation-and-Meta-estimators-for-Semi-supervised-learning On Wed, Apr 8, 2015 at 8:44 PM, Andreas Mueller wrote: > So what would you suggest? > > On 04/07/2015 08:

Re: [Scikit-learn-general] GSoC 2015: Semi-supervised learning proposal

2015-04-08 Thread Andreas Mueller
So what would you suggest? On 04/07/2015 08:01 PM, Charles Martin wrote: > I think this is a great idea and would be happy to help > > I have deep experience with these methods > > Charles Martin, PhD > > > On Tue, Apr 7, 2015 at 4:48 PM, Vinayak Mehta wrote: >> Hi Andy, everyone! >> >> @Andy >>

Re: [Scikit-learn-general] GSoC 2015: Semi-supervised learning proposal

2015-04-07 Thread Charles Martin
I think this is a great idea and would be happy to help I have deep experience with these methods Charles Martin, PhD On Tue, Apr 7, 2015 at 4:48 PM, Vinayak Mehta wrote: > Hi Andy, everyone! > > @Andy > Just saw your comment on melange. Thanks for the suggestions! I'm working on > them to mak

Re: [Scikit-learn-general] GSoC 2015: Semi-supervised learning proposal

2015-04-07 Thread Vinayak Mehta
Hi Andy, everyone! @Andy Just saw your comment on melange. Thanks for the suggestions! I'm working on them to make my proposal more clear. @Everyone It would be awesome if you could suggest new algorithms which you might think would be good for the semi_supervised module, as I've mentioned in the

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Vlad Niculae
> > In order to support discrete parameters, our tree implementation would need > to support categorical variables though. > Ah, good point, I didn’t think about that. But we could use the usual hacks (integer or one-hot encoding). I wonder how that compares to using GPs and rounding when it c
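The two "usual hacks" mentioned in this message for handing a categorical hyperparameter to a tree-based surrogate can be sketched in plain Python. The hyperparameter name `kernel_choices` and its values are hypothetical, chosen only for illustration; this is not code from the thread or from any scikit-learn PR.

```python
def integer_encode(value, categories):
    """Map a category to its position in a fixed category list."""
    return categories.index(value)


def one_hot_encode(value, categories):
    """Map a category to a 0/1 indicator vector, one slot per category."""
    return [1 if c == value else 0 for c in categories]


# Hypothetical discrete hyperparameter with three allowed values.
kernel_choices = ["linear", "poly", "rbf"]

as_int = integer_encode("rbf", kernel_choices)
as_one_hot = one_hot_encode("rbf", kernel_choices)
```

Integer encoding keeps one column but imposes an arbitrary order on the categories; one-hot encoding avoids the fake order at the cost of one column per category.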

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Andreas Mueller
On 03/31/2015 07:30 PM, Mathieu Blondel wrote: > SMAC needs the variance of predictions so we'll need to get this PR > merged > https://github.com/scikit-learn/scikit-learn/pull/3645 > > I would really like #3645 to get merged in any case. I have not even seen that one! Shame on me! It looks pre
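One simple way to get a predictive variance out of a forest is to look at the spread of the individual trees' predictions for the same input. The sketch below assumes that approach; the thread does not say how PR #3645 computes it, and the function name `forest_mean_and_variance` is made up for this example.

```python
from statistics import fmean, pvariance


def forest_mean_and_variance(per_tree_predictions):
    """Return the mean prediction and the population variance across trees.

    per_tree_predictions: one predicted value per tree, all for the same input.
    """
    mean = fmean(per_tree_predictions)
    variance = pvariance(per_tree_predictions, mu=mean)
    return mean, variance


# Hypothetical outputs of three trees for a single candidate configuration.
tree_outputs = [1.0, 2.0, 3.0]
mean, variance = forest_mean_and_variance(tree_outputs)
```

A model-based optimizer can then trade off a low predicted mean against a high variance when choosing the next configuration to evaluate.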

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Mathieu Blondel
On Wed, Apr 1, 2015 at 4:05 AM, Vlad Niculae wrote: > Hi Gael, > > > On 31 Mar 2015, at 14:01, Gael Varoquaux > wrote: > > > >> Why do you think the GP route is easier? > > > > Because we already have GPs. > We have a GP implementation but it's being rewritten... > Well, we already have rando

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Vlad Niculae
Hi Gael, > On 31 Mar 2015, at 14:01, Gael Varoquaux > wrote: > >> Why do you think the GP route is easier? > > Because we already have GPs. Well, we already have random forests too. Both cases would need quite a bit of machinery on top, and I don’t know the extent of it, but I thought it wo

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Gael Varoquaux
On Tue, Mar 31, 2015 at 11:29:41AM -0400, Andreas Mueller wrote: > To do the Bayesian optimization, I think it needs more than an example. Maybe I am naive, but I thought that it wasn't that bad. > Why do you think the GP route is easier? Because we already have GPs. Also, I am worried that the

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-31 Thread Andreas Mueller
To do the Bayesian optimization, I think it needs more than an example. Step two is really non-trivial. Why do you think the GP route is easier? On 03/28/2015 01:29 PM, Gael Varoquaux wrote: > Sorry for the slow reply, > > On Fri, Mar 27, 2015 at 11:49:46AM -0400, hamzeh alsalhi wrote: >> I have

Re: [Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-28 Thread Gael Varoquaux
Sorry for the slow reply, On Fri, Mar 27, 2015 at 11:49:46AM -0400, hamzeh alsalhi wrote: > I have revised my proposal to focus only on SMAC and to prioritize SMAC RF > because it can be worked on independently of GP. I actually believed that GPs were an easier route forward. The way I would have t

[Scikit-learn-general] GSoC 2015: Global optimization based Hyper parameter optimization (SMAC)

2015-03-27 Thread hamzeh alsalhi
I have revised my proposal to focus only on SMAC and to prioritize SMAC RF because it can be worked on independently of GP. Thank you for meeting with me in person Vlad, and for giving me feedback on ways to improve my proposal. https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-G

[Scikit-learn-general] GSoC 2015: Semi-supervised learning proposal

2015-03-27 Thread Vinayak Mehta
Hi everyone! I've updated my proposal. I know it's a bit late for asking a review but I would really appreciate it if you could help me by suggesting new additions and pointing out mistakes. :) Here's the link: https://docs.google.com/document/d/1JCbeakBtPTpfis2grw00I8Y1VVivssAdiHlm1ejS3E8/edit?us

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-26 Thread Raghav R V
Hey Gael, I am sorry that I missed this comment of yours - > > 1. The design of multiple metric support is important and would bring an immense usability gain. > But it will also require a framework of its own. I would say that this is to be considered in a second step. Could you expand a littl

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Artem
Hm, but similarity-based clustering works with inter-data similarities, doesn't it? The result's shape would be like [n_samples_in_transform, n_samples_in_train] which is not what we want. On Thu, Mar 26, 2015 at 12:36 PM, Mathieu Blondel wrote: > Something like this: > > class SimilarityTransfo

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Mathieu Blondel
Something like this: class SimilarityTransformer(TransformerMixin): def fit(self, X, y): self.X_ = X; return self def transform(self, X): return -euclidean_distances(X, self.X_) On Thu, Mar 26, 2015 at 6:28 PM, Artem wrote: > Yes, the only need for such similarity learner
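The one-line class in this message can be laid out as a dependency-free sketch to make the output shape visible. This version swaps scikit-learn's `TransformerMixin` and `euclidean_distances` for plain Python and `math.dist` (Python 3.8+), so it is an illustration of the idea rather than the code proposed in the thread; the sample points are invented.

```python
import math


class SimilarityTransformer:
    """Negated Euclidean distance from each input row to every training row."""

    def fit(self, X, y=None):
        # Remember the training samples; y is accepted but unused.
        self.X_ = [tuple(row) for row in X]
        return self

    def transform(self, X):
        # Result has one row per input sample and one column per training sample.
        return [[-math.dist(row, ref) for ref in self.X_] for row in X]


train = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
query = [(0.0, 0.0), (3.0, 4.0)]
S = SimilarityTransformer().fit(train).transform(query)
```

The output is a list of `len(query)` rows by `len(train)` columns, which is the `[n_samples_in_transform, n_samples_in_train]` shape raised in the follow-up reply.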

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Artem
Oops, missed "Reply all" once again. Copying the message Yes, the only need for such similarity learners is to use them in a pipeline. It's especially convenient if one wants to do non-linear metric learning using Kernel PCA trick. Then it'd be just another step in the pipeline. What do you mean

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Mathieu Blondel
On Thu, Mar 26, 2015 at 5:49 PM, Artem wrote: > 1. Right, forgot to add that parameter. Well, I can apply an RBF kernel to > get a similarity matrix from a distance matrix inside transform. > > 2. Usual transformer returns neither distance, nor similarity, but > transforms the input space so that

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Artem
1. Right, forgot to add that parameter. Well, I can apply an RBF kernel to get a similarity matrix from a distance matrix inside transform. 2. Usual transformer returns neither distance, nor similarity, but transforms the input space so that usual Euclidean distance acts like the learned Mahalanob
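Point 1 above, turning a distance matrix into a similarity matrix with an RBF kernel, can be sketched as follows. The formula `exp(-gamma * d**2)` is the standard RBF form; the `gamma` default and the function name are assumptions made for this example, not values taken from the proposal.

```python
import math


def rbf_similarity(distances, gamma=1.0):
    """Convert a matrix of distances into RBF similarities in (0, 1]."""
    return [[math.exp(-gamma * d * d) for d in row] for row in distances]


# Hypothetical 2x2 distance matrix: zero on the diagonal, 2.0 off it.
D = [[0.0, 2.0],
     [2.0, 0.0]]
S = rbf_similarity(D, gamma=0.5)
```

Zero distance maps to similarity 1, and similarity decays toward 0 as distance grows, which is the orientation a similarity-based method such as spectral clustering with a precomputed affinity expects.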

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Mathieu Blondel
- Spectral clustering uses similarities rather than distances and needs affinity="precomputed" (otherwise, it assumes that X is [n_samples, n_features]) - Instead of duplicating each class, you could create a generic transformer that outputs a similarity / distance matrix from X. M. On Thu, Mar 26

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-26 Thread Artem
Sorry, apparently I clicked reply and my previous message went to Mathieu only. Repeat them here: In case of vector y there's no other way, but to assume transitivity. Which is not general enough, but should work in a classification setting. After all, many of these methods are designed to aid KNN

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Mathieu Blondel
> Each of them is a transformer that utilizes y during fit, where y is a usual vector of labels of training samples, just like in case of classification. I am actually confused by this. How are you going to encode the similarities / dissimilarities between samples if y is a vector? > Another poss

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-25 Thread Raghav R V
Hi all, thanks a lot for the comments! I've just edited/formatted my prop. based on all of your comments... https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements Only thing to be done is to plan what I

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Michael Eickenberg
I do not know the exact state of the algorithm, but the author was working on sklearn compatibility at a sklearn sprint last summer. It seemed like the algorithmic side had been pretty much taken care of, but this needs to be checked. Michael On Wed, Mar 25, 2015 at 11:08 PM, Artem wrote: > Yes

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Artem
Yes, I saw the repo. Didn't know, though, that it's almost completed, thanks for checking! On Thu, Mar 26, 2015 at 1:05 AM, Michael Eickenberg < michael.eickenb...@gmail.com> wrote: > FWIW, although the NCA conversation on github ( > https://github.com/scikit-learn/scikit-learn/issues/3213) is on

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Michael Eickenberg
FWIW, although the NCA conversation on github ( https://github.com/scikit-learn/scikit-learn/issues/3213) is only an issue, Roland (https://github.com/RolT) actually has a full implementation of NCA, which is almost (up to a few details, such as the **kwargs, the class inheritance and some camel ca

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Andreas Mueller
You can always amend your melange proposal, so there is no reason not to submit an early version. On 03/25/2015 04:18 PM, Artem wrote: Ok, so I removed matrix y from the proposal. Therefore I also

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-25 Thread Artem
Ok, so I removed matrix y from the proposal. Therefore I also shortened the first iteration by one week, since no changes to the current code are needed. This allowed me to extend the last iteration by

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-25 Thread Andreas Mueller
On 03/24/2015 07:39 PM, Vlad Niculae wrote: > Hi Raghav, hi everyone, > > If I may, I have a very high-level comment on your proposal. It clearly shows > that you are very involved in the project and understand the internals well. > However, I feel like it’s written from a way too technical per

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-25 Thread Boyuan Deng
Hi everyone: I have updated my proposal according to your suggestions. You can find the updates on the wiki page. Text in the melange system has also been updated. https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:--Cross-validation-and-Meta-estimators-for-Semi-supervised-Lear

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-25 Thread Raghav R V
Hi Vlad!! Thanks a tonne for the detailed review of my proposal. :) > Your proposal contains implementation details, but little or no discussion of why each change is important and how it impacts users Yes, I'll add a section discussing the motivation of the various deliverables. (which actually

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-24 Thread Gael Varoquaux
On Tue, Mar 24, 2015 at 07:39:17PM -0400, Vlad Niculae wrote: > 1. The design of multiple metric support is important and would bring an > immense usability gain. But it will also require a framework of its own. I would say that this is to be considered in a second step. G

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Gael Varoquaux
> I think the problem with matrix-like Y is that Y would be symmetric. Thus for > doing cross-validation one would need to select both rows and columns. Correct. Then indeed it's off limits. These are specifically the kind of problem I would like not to have to worry about. The combination of all t

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Gael Varoquaux
On Tue, Mar 24, 2015 at 09:04:28PM -0400, Vlad Niculae wrote: > There were two API issues and I think both need thought. The first is the > matrix-like Y which at the moment overlaps semantically with multilabel and > multioutput-multiclass (though I think it could be seen as a form of > multi-t

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Gael Varoquaux
On Wed, Mar 25, 2015 at 03:25:40AM +0300, Artem wrote: > You mean matrix-like y? matrix-like y (ie y 2D: n_sample, n_features) is already covered in our API, so I see no problem with it.

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Gael Varoquaux
On Wed, Mar 25, 2015 at 11:22:51AM +0900, Mathieu Blondel wrote: > The part I am most enthusiastic about is fixing the CV generators, though this > could be a merge nightmare since we are in the process of changing the API. We > need it to figure out which modifications are most likely to get in fi

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Mathieu Blondel
The part I am most enthusiastic about is fixing the CV generators, though this could be a merge nightmare since we are in the process of changing the API. We need it to figure out which modifications are most likely to get in first. Lars did some work on semi-supervised naive bayes. Since this is

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Mathieu Blondel
I think the problem with matrix-like Y is that Y would be symmetric. Thus for doing cross-validation one would need to select both rows and columns. This is why I suggested to add a _pairwise_y property like the _pairwise property that we use in kernel methods, e.g., https://github.com/scikit-learn
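The row-and-column selection described here can be illustrated with a small square matrix. This is a plain-Python sketch with an invented 4x4 symmetric `Y` and an invented helper `take_pairwise`; it mirrors the way a precomputed kernel is usually split (train rows by train columns for fitting, test rows by train columns for prediction) and is not the `_pairwise_y` mechanism itself.

```python
def take_pairwise(Y, rows, cols):
    """Select the sub-matrix of Y at the given row and column indices."""
    return [[Y[i][j] for j in cols] for i in rows]


# Hypothetical symmetric pairwise target: Y[i][j] == 1 means "i and j are similar".
Y = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]

train, test = [0, 1, 2], [3]

Y_train = take_pairwise(Y, train, train)  # 3x3: train samples against each other
Y_test = take_pairwise(Y, test, train)    # 1x3: test sample against train samples
```

Indexing rows alone, as is done for an ordinary label vector, would leave columns for held-out samples in the training block, which is why a cross-validation routine needs to know that the target is pairwise.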

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Artem
Hi Vlad 1. Usually metric learning uses supervision in one of 2 forms: either two sets of similar (distance is less than some predefined value u) and dissimilar (distance is bigger than l) pairs, or a set of triplets (x, y, z) such that d(x, y) < d(x, z). Though, I think, it's possible to generali
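Both forms of supervision named in this message can be derived from an ordinary label vector if one assumes that samples with the same label are similar and samples with different labels are dissimilar. The sketch below makes that assumption explicit; the helper names and the toy labels are invented for illustration.

```python
from itertools import combinations


def pairs_from_labels(y):
    """Split all index pairs into same-label and different-label sets."""
    similar, dissimilar = [], []
    for i, j in combinations(range(len(y)), 2):
        (similar if y[i] == y[j] else dissimilar).append((i, j))
    return similar, dissimilar


def triplets_from_labels(y):
    """Build (anchor, same-label, other-label) index triplets.

    Each triplet (i, j, k) encodes the constraint d(x_i, x_j) < d(x_i, x_k).
    """
    n = range(len(y))
    return [
        (i, j, k)
        for i in n for j in n for k in n
        if i != j and y[i] == y[j] and y[i] != y[k]
    ]


labels = [0, 0, 1]
similar, dissimilar = pairs_from_labels(labels)
triplets = triplets_from_labels(labels)
```

The number of triplets grows cubically with the number of samples, so practical methods usually sample or restrict them (for example to nearest neighbours) rather than enumerate all of them.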

Re: [Scikit-learn-general] [GSOC] Global optimization based Hyper parameter optimization Hamzeh Alsalhi

2015-03-24 Thread hamzeh alsalhi
Hi Andy! I improved my proposal. My background is somewhat beginner so I am doing my best to make sure I understand what I am getting myself into. I have removed the redundant problem statement. I added details for what I think implementation will consist of, right now I am mostly referencing the

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Vlad Niculae
Hi Artem, hi everybody, There were two API issues and I think both need thought. The first is the matrix-like Y which at the moment overlaps semantically with multilabel and multioutput-multiclass (though I think it could be seen as a form of multi-target regression…) The second is the `estima

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Boyuan Deng
Hi Vlad: Thank you for your comments! I think I should rename that part as something like "add new implementations and improve existing ones" and mention self-taught learning as an example. We can further discuss what semi-supervised algorithms (one or more) we want later on. Exact dates have

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Artem
You mean matrix-like y? Gael said > > FWIW It'll require some changes to cross-validation routines. > I'd rather we try not to add new needs and use cases to these before we release 1.0. We are already having a hard time covering in a homogeneous way all the possible options.

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Vlad Niculae
Hi Boyuan, hi everyone, On top of what Andy said, I would like to add that you don’t have to commit to certain algorithms in the proposal, as long as you make the plan very clear, and you leave time for discussing alternatives, pros and cons with the community. Since you say there is some ove

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-24 Thread Vlad Niculae
Hi Raghav, hi everyone, If I may, I have a very high-level comment on your proposal. It clearly shows that you are very involved in the project and understand the internals well. However, I feel like it’s written from a way too technical perspective. Your proposal contains implementation detai

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Boyuan Deng
Hi Andreas: when I think there is a closed form solution Yes, I remember that in some paper they first give the analytical solution to the optimization problem, and then prove that it's the same result that the iterative version will converge to. I'll find that paper and read it again. I think

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Olivier Grisel
I also share Gael's concerns with respect to extending our API in yet another direction at a time where we are trying to focus on ironing out consistency issues... -- Olivier

[Scikit-learn-general] [GSOC] Global optimization based Hyper parameter optimization Hamzeh Alsalhi

2015-03-24 Thread Andy
Hi Hamzeh. Somehow I didn't see you posting in this year's GSoC thread, maybe I was looking for the wrong email address. Here is some initial feedback on your GSoC proposal. Problem description and Project abstract seem a bit redundant. I don't think you mean "constitutional neural networks". It

Re: [Scikit-learn-general] [GSoC 2015] Cross-validation and Meta-Estimators for semi-supervised learning

2015-03-24 Thread Andy
Hi Boyuan. I looked over your application and it looks good so far. I think it could be a bit more ambitious. I know the idea page was not very elaborate. It might be interesting to improve the existing graph-based algorithms. There is some discussion in https://github.com/scikit-learn/scikit-l

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Artem
> > > In other words, I would like to get in an "API freeze" state where we add/modify only essential stuff to the API. Ok, then I suppose, the easiest way would be to create 2 kinds of transformers for each method: one that transforms the space so that Euclidean distance acts like Mahalanobis'
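The first kind of transformer can be illustrated with a short numerical check: for a positive-definite matrix M factored as M = LᵀL, the Mahalanobis distance under M equals the Euclidean distance after mapping each point x to Lx. This is a sketch under that assumption; the names `mahalanobis` and `transform` are illustrative and not part of any proposed API:

```python
import numpy as np

rng = np.random.RandomState(0)

# Build an arbitrary positive-definite matrix M to stand in for a learned metric.
A = rng.randn(3, 3)
M = A.T @ A + 1e-3 * np.eye(3)

# np.linalg.cholesky returns lower-triangular C with M = C @ C.T,
# so L = C.T satisfies M = L.T @ L.
L = np.linalg.cholesky(M).T


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y))."""
    d = x - y
    return np.sqrt(d @ M @ d)


def transform(X, L):
    """Map each row x of X to L x."""
    return X @ L.T


x, y = rng.randn(3), rng.randn(3)
d_mahalanobis = mahalanobis(x, y, M)
d_euclidean = np.linalg.norm(transform(x[None], L) - transform(y[None], L))
```

Because the two distances agree, any estimator that relies on Euclidean distance can be applied to the transformed data in a pipeline without needing a new API entry for custom metrics.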

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Joel Nothman
On 25 March 2015 at 00:01, Gael Varoquaux wrote: > > > To make this more concrete, the MetricLearner().metric_ estimator would > > require specialised set_params or clone behaviour, I assume. I.e. it > > involves hacking API fundamentals. > > It's more a general principle of "freeze": to be able

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Gael Varoquaux
> To make this more concrete, the MetricLearner().metric_ estimator would > require specialised set_params or clone behaviour, I assume. I.e. it > involves hacking API fundamentals. It's more a general principle of "freeze": to be able to settle down on something that we _know_ works and is robus

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Joel Nothman
On 24 March 2015 at 23:56, Gael Varoquaux wrote: > > So I just thought: what if metric learners will have an attribute > `metric` > > Before adding features and API entries, I'd really like to focus on > having a 1.0 release, with a fixed API that really solves the problems > that we currently ar

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Gael Varoquaux
> So I just thought: what if metric learners will have an attribute `metric` Before adding features and API entries, I'd really like to focus on having a 1.0 release, with a fixed API that really solves the problems that we currently are trying to solve. In other words, I would like to get in an

Re: [Scikit-learn-general] [GSoC] Metric Learning

2015-03-24 Thread Artem
> > > I'd still call it ``transform`` probably, though. It would be a bit > confusing because it uses the squared transform, but it would make it > possible to build pipelines with clustering algorithms. It's unfortunate that we already have a transform for "linear" metric learners. One could

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-24 Thread Joel Nothman
I agree with everything Andy says. I think the core developers are very enthusiastic to have a project along the lines of "Finish all the things that need finishing", but it's very impractical to do so much context switching both for students and mentors/reviewers. One of the advantages of GSoC is

Re: [Scikit-learn-general] GSoC 2015 Proposal: Multiple Metric Learning

2015-03-24 Thread Raghav R V
Hi Andy, Thanks a lot for your feedback... I'll update my proposal wiki based on your guidelines and also submit the same to melange too by today! Thanks, R On Tue, Mar 24, 2015 at 3:10 AM, Andreas Mueller wrote: > Hi Raghav. > > I feel that your proposal lacks some focus. > I'd remove the
