Hey,
I am starting this thread to assist in tracking the progress of my GSoC
project and to post my weekly blog posts.
Main objectives of my GSoC Project -
1. model_selection refactoring
2. Data independent CV Iterators.
3. Multiple Metric support
4. sample_weight etc support in grid_search
5. A
I've made some changes to the proposal.
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-scikit-learn:-Cross-validation-and-Meta-estimators-for-Semi-supervised-learning
On Wed, Apr 8, 2015 at 8:44 PM, Andreas Mueller wrote:
> So what would you suggest?
>
> On 04/07/2015 08:
So what would you suggest?
On 04/07/2015 08:01 PM, Charles Martin wrote:
> I think this is a great idea and would be happy to help
>
> I have deep experience with these methods
>
> Charles Martin, PhD
>
>
> On Tue, Apr 7, 2015 at 4:48 PM, Vinayak Mehta wrote:
>> Hi Andy, everyone!
>>
>> @Andy
>>
I think this is a great idea and would be happy to help
I have deep experience with these methods
Charles Martin, PhD
On Tue, Apr 7, 2015 at 4:48 PM, Vinayak Mehta wrote:
> Hi Andy, everyone!
>
> @Andy
> Just saw your comment on melange. Thanks for the suggestions! I'm working on
> them to mak
Hi Andy, everyone!
@Andy
Just saw your comment on melange. Thanks for the suggestions! I'm working
on them to make my proposal more clear.
@Everyone
It would be awesome if you could suggest new algorithms which you might
think would be good for the semi_supervised module, as I've mentioned in
the
>
> In order to support discrete parameters, our tree implementation would need
> to support categorical variables though.
>
Ah, good point, I didn’t think about that. But we could use the usual hacks
(integer or one-hot encoding). I wonder how that compares to using GPs and
rounding when it c
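
[Editor's sketch] The "usual hacks" mentioned above can be illustrated with a small, self-contained example. It is not code from the thread or from scikit-learn; the search space (`KERNELS`, `C`) and all function names are hypothetical, chosen only to show how a categorical hyperparameter might be integer-encoded or one-hot encoded before it is fed to a surrogate model, and how a continuous proposal could be rounded back to a valid category.

```python
# Hypothetical search space: one categorical and one numeric hyperparameter.
KERNELS = ["linear", "poly", "rbf"]  # assumed category values


def encode_integer(config):
    """Integer encoding: category -> its index (imposes an artificial order)."""
    return [float(KERNELS.index(config["kernel"])), config["C"]]


def encode_one_hot(config):
    """One-hot encoding: one 0/1 column per category, no implied order."""
    flags = [1.0 if config["kernel"] == k else 0.0 for k in KERNELS]
    return flags + [config["C"]]


def decode_one_hot(vector):
    """Map a (possibly continuous) proposal back to a valid config by
    picking the category with the largest score, i.e. a rounding step."""
    scores = vector[:len(KERNELS)]
    kernel = KERNELS[scores.index(max(scores))]
    return {"kernel": kernel, "C": vector[len(KERNELS)]}


config = {"kernel": "rbf", "C": 10.0}
int_vec = encode_integer(config)
hot_vec = encode_one_hot(config)
roundtrip = decode_one_hot([0.2, 0.1, 0.7, 10.0])
```

The integer form is more compact but suggests that "poly" lies between "linear" and "rbf", which a tree can split around but a distance-based model may misread. The one-hot form avoids that at the cost of extra columns.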
On 03/31/2015 07:30 PM, Mathieu Blondel wrote:
> SMAC needs the variance of predictions so we'll need to get this PR
> merged
> https://github.com/scikit-learn/scikit-learn/pull/3645
>
> I would really like #3645 to get merged in any case.
I have not even seen that one! Shame on me!
It looks pre
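
[Editor's sketch] One common way to obtain a "variance of predictions" from a tree ensemble is to take the spread of the individual trees' outputs for each sample. The example below is only an illustration of that idea under stated assumptions; it does not reproduce PR #3645, and `predict_mean_and_variance` and the toy `tree_outputs` data are hypothetical.

```python
from statistics import mean, pvariance


def predict_mean_and_variance(per_tree_predictions):
    """Combine per-tree predictions into a mean and variance per sample.

    per_tree_predictions: list over trees, each a list over samples.
    """
    # Transpose so that each tuple holds all trees' outputs for one sample.
    per_sample = list(zip(*per_tree_predictions))
    means = [mean(p) for p in per_sample]
    variances = [pvariance(p) for p in per_sample]
    return means, variances


# Toy data: three trees, two samples (made-up numbers for illustration).
tree_outputs = [
    [1.0, 4.0],
    [2.0, 4.0],
    [3.0, 4.0],
]
means, variances = predict_mean_and_variance(tree_outputs)
```

Here the trees disagree on the first sample and agree on the second, so only the first sample gets a non-zero variance. A model-based optimizer could use that kind of signal to decide where the surrogate is uncertain.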
On Wed, Apr 1, 2015 at 4:05 AM, Vlad Niculae wrote:
> Hi Gael,
>
> > On 31 Mar 2015, at 14:01, Gael Varoquaux
> wrote:
> >
> >> Why do you think the GP route is easier?
> >
> > Because we already have GPs.
>
We have a GP implementation but it's being rewritten...
> Well, we already have rando
Hi Gael,
> On 31 Mar 2015, at 14:01, Gael Varoquaux
> wrote:
>
>> Why do you think the GP route is easier?
>
> Because we already have GPs.
Well, we already have random forests too.
Both cases would need quite a bit of machinery on top, and I don’t know the
extent of it, but I thought it wo
On Tue, Mar 31, 2015 at 11:29:41AM -0400, Andreas Mueller wrote:
> To do the Bayesian optimization, I think it needs more than an example.
Maybe I am naive, but I thought that it wasn't that bad.
> Why do you think the GP route is easier?
Because we already have GPs. Also, I am worried that the
To do the Bayesian optimization, I think it needs more than an example.
Step two is really non-trivial.
Why do you think the GP route is easier?
On 03/28/2015 01:29 PM, Gael Varoquaux wrote:
> Sorry for the slow reply,
>
> On Fri, Mar 27, 2015 at 11:49:46AM -0400, hamzeh alsalhi wrote:
>> I have
Sorry for the slow reply,
On Fri, Mar 27, 2015 at 11:49:46AM -0400, hamzeh alsalhi wrote:
> I have revised my proposal to focus only on SMAC and to prioritize SMAC RF
> because it can be worked on independently of GP.
I actually believed that GPs were an easier route forward.
The way I would have t
I have revised my proposal to focus only on SMAC and to prioritize SMAC RF
because it can be worked on independently of GP.
Thank you for meeting with me in person Vlad, and for giving me feedback on
ways to improve my proposal.
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-G
Hi everyone!
I've updated my proposal. I know it's a bit late to ask for a review, but I
would really appreciate it if you could help me by suggesting new additions
and pointing out mistakes. :) Here's the link:
https://docs.google.com/document/d/1JCbeakBtPTpfis2grw00I8Y1VVivssAdiHlm1ejS3E8/edit?us
Hey Gael,
I am sorry that I missed this comment of yours -
> > 1. The design of multiple metric support is important and would bring
> > an immense usability gain.
> But it will also require a framework of its own. I would say that this is
> to be considered in a second step.
Could you expand a littl
Hi all,
thanks a lot for the comments!
I've just edited/formatted my proposal based on all of your comments...
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements
Only thing to be done is to plan what I
On 03/24/2015 07:39 PM, Vlad Niculae wrote:
> Hi Raghav, hi everyone,
>
> If I may, I have a very high-level comment on your proposal. It clearly shows
> that you are very involved in the project and understand the internals well.
> However, I feel like it’s written from a way too technical per
Hi everyone:
I have updated my proposal according to your suggestions.
You can find the updates on the wiki page. Text in the melange system
has also been updated.
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:--Cross-validation-and-Meta-estimators-for-Semi-supervised-Lear
Hi Vlad!!
Thanks a tonne for the detailed review of my proposal. :)
> Your proposal contains implementation details, but little or no
> discussion of why each change is important and how it impacts users
Yes, I'll add a section discussing the motivation for the various
deliverables (which actually
On Tue, Mar 24, 2015 at 07:39:17PM -0400, Vlad Niculae wrote:
> 1. The design of multiple metric support is important and would bring an
> immense usability gain.
But it will also require a framework of its own. I would say that this is
to be considered in a second step.
G
-
On Wed, Mar 25, 2015 at 11:22:51AM +0900, Mathieu Blondel wrote:
> The part I am most enthusiastic about is fixing the CV generators, though this
> could be a merge nightmare since we are in the process of changing the API. We
> need it to figure out which modifications are most likely to get in fi
The part I am most enthusiastic about is fixing the CV generators, though
this could be a merge nightmare since we are in the process of changing the
API. We need it to figure out which modifications are most likely to get in
first.
Lars did some work on semi-supervised naive bayes. Since this is
Hi Vlad:
Thank you for your comments!
I think I should rename that part as something like "add new
implementations and improve existing ones" and mention self-taught
learning as an example. We can further discuss what semi-supervised
algorithms (one or more) we want later on.
Exact dates have
Hi Boyuan, hi everyone,
On top of what Andy said, I would like to add that you don’t have to commit to
certain algorithms in the proposal, as long as you make the plan very clear,
and you leave time for discussing alternatives, pros and cons with the
community.
Since you say there is some ove
Hi Raghav, hi everyone,
If I may, I have a very high-level comment on your proposal. It clearly shows
that you are very involved in the project and understand the internals well.
However, I feel like it’s written from a way too technical perspective. Your
proposal contains implementation detai
Hi Andreas:
when I think there is a closed form solution
Yes, I remember that in some paper they first give the analytical
solution to the optimization problem, and then prove that it's the same
result that the iterative version will converge to. I'll find that paper and
read it again.
I think
Hi Boyuan.
I looked over your application and it looks good so far.
I think it could be a bit more ambitious. I know the idea page was not
very elaborate.
It might be interesting to improve the existing graph-based algorithms.
There is some discussion in
https://github.com/scikit-learn/scikit-l
I agree with everything Andy says. I think the core developers are very
enthusiastic to have a project along the lines of "Finish all the things
that need finishing", but it's very impractical to do so much context
switching both for students and mentors/reviewers.
One of the advantages of GSoC is
Hi Andy,
Thanks a lot for your feedback... I'll update my proposal wiki based on
your guidelines and also submit the same to melange too by today!
Thanks,
R
On Tue, Mar 24, 2015 at 3:10 AM, Andreas Mueller wrote:
> Hi Raghav.
>
> I feel that your proposal lacks some focus.
> I'd remove the
Hi Raghav.
I feel that your proposal lacks some focus.
I'd remove the two:
Mallows' Cp for LASSO / LARS
Implement built in abs max scaler, Nesterov's momentum and finish up the
Multilayer Perceptron module.
And as discussed in this thread probably also
Forge a self sufficient ML tutorial base
Can you please also upload it to melange?
On 03/22/2015 08:52 PM, Raghav R V wrote:
2 things :
* The subject should have been "Multiple Metric Support in grid_search
and cross_validation modules and other general improvements" and not
multiple metric learning! Sorry for that!
* The link was n
Thanks for all the good comments!! I'll replace that section of my proposal
with some other more important work! :)
On Mon, Mar 23, 2015 at 7:53 PM, Matthieu Brucher <
matthieu.bruc...@gmail.com> wrote:
> > For practical purposes, I currently know of 2 (3?) sklearn books
> > published with PACKT.
> For practical purposes, I currently know of 2 (3?) sklearn books
> published with PACKT. There is also an OReilly book coming up:
> http://shop.oreilly.com/product/0636920030515.do
2 general books, 1 cookbook and I think there is another one
half-written as well. Didn't know about O'Reilly, good
On 03/22/2015 07:57 PM, Raghav R V wrote:
>
> 2. Given that there is a huge interest among students in learning
> about ML, do you think it would be within the scope of/beneficial to
> skl to have all the exercises and/or concepts, from a good quality
> book (ESL / PRML / Murphy) or an academi
Have you had a look at the issues tagged "easy"?
On 03/22/2015 05:47 PM, Boyuan Deng wrote:
Hi all:
This is the link to my proposal for the "Cross-validation and
Meta-estimators for Semi-supervised Learning" topic:
https://docs.google.com/document/d/1f2nfFEBk567QhKd2OJzDNM9t21Glkp0XxFgtbpy8Uj
2 things :
* The subject should have been "Multiple Metric Support in grid_search and
cross_validation modules and other general improvements" and not multiple
metric learning! Sorry for that!
* The link was not available due to the trailing "." (dot), which has been
fixed now!
Thanks
R
On Mon,
>
> 1. the link is broken
>
Ah! Sorry :) -
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements
.
2. that sounds quite difficult and unfortunately conducive to cheating
>
Hmm... Should I then simply op
1. the link is broken
2. that sounds quite difficult and unfortunately conducive to cheating
On Sun, Mar 22, 2015 at 7:57 PM, Raghav R V wrote:
> Hi,
>
> 1. This is my proposal for the multiple metric learning project as a wiki
> page -
> https://github.com/scikit-learn/scikit-learn/wiki/GSoC-
Hi,
1. This is my proposal for the multiple metric learning project as a wiki
page -
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:-Multiple-metric-support-for-CV-and-grid_search-and-other-general-improvements
.
Possible mentors : Andreas Mueller (amueller) and Joel Nothma
Hi Boyuan,
I have added your proposal to our wiki :) -
https://github.com/scikit-learn/scikit-learn/wiki/GSoC-2015-Proposal:--Cross-validation-and-Meta-estimators-for-Semi-supervised-Learning
I hope you don't mind. I've also added your name to the possible
candidates under the GSOC 15 pag
Hi all:
This is the link to my proposal for the "Cross-validation and
Meta-estimators for Semi-supervised Learning" topic:
https://docs.google.com/document/d/1f2nfFEBk567QhKd2OJzDNM9t21Glkp0XxFgtbpy8UjI/edit?usp=sharing
Please leave comments and help improve it!
Also I want to contribute a
Hi Boyuan,
It's good to hear you're an experienced scikit-learn user, and that it has
worked for you. It's also pleasing to hear someone's interested in this
project, because I feel the semi-supervised capabilities of scikit-learn
and its API have been left half-baked.
I strongly recommend, howev
Hi all:
I am a Master's student in the European Union's Erasmus Mundus LCT
program, studying natural language processing at Saarland University,
Germany and also doing machine learning and information retrieval at
Max-Planck Institute for Informatics, which is on the same campus.
These years
An ELM implementation from GSOC 2014 is awaiting review before merge, i.e.
it's very close to inclusion. Perhaps you could contribute comments at
https://github.com/scikit-learn/scikit-learn/pull/3306, but I don't think
another contribution of ELMs would be appropriate.
Also, the best persuasion t
Hi Everyone,
I am a sophomore pursuing my B.Tech in Computer Science and Engineering,
with significant experience in Machine Learning and ANNs. I am working on
an independent (for now) Python library for several variants of the Extreme
Learning Machine, some of which I have already implemented in Matlab
Hi scikit-learn community,
The Google Summer of Code 2015 is upon us once again. As always, the
success of the projects depends not only on the students, but also on
having a great team of mentors to guide and support them.
Thus, I'd like to make a call for mentors and project suggestions.
If yo