Re: GSoC proposal

2015-03-26 Thread Gábor Gévay
Hello,

Thank you very much for your comments! I will remove the part about
the windowing optimizations (though that was my favourite part :) ),
and think about what other statistics could be added. And thank you
for the link to the collection of many relevant algorithms; they are
very interesting!

Best regards,
Gabor



2015-03-26 17:35 GMT-05:00 Paris Carbone par...@kth.se:
 Hi Gabor,

 Approximate statistics is a really good topic; I think there is a lot to do
 if you focus there. One idea would also be to include some of your
 contributions in the incremental machine learning library that will be
 available by June. From there you will also be able to use sampling and
 stream mining primitives out of the box, among others. Regarding window
 optimisations, as Gyula said, there is not much to do simply because we are
 working heavily on it already. Good luck and thanks for the proposal!

 Paris

 On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:

 Hey Gabor,

 Thank you for the proposal. It has many interesting ideas and good
 potential.

 My comments:

 We already have a large amount of ongoing work on the windowing
 optimizations, covering your suggestions in section 1. It would be better
 to drop that part from the project because that's very heavily on the
 research side and, as I said, we are working on this at SICS.

 I like the list that you made for section 2, and this should be the main
 emphasis of the project. It would indeed be very nice to have a wide range
 of statistics that we can compute (or approximate - this should be optional
 though) on streams and windows (maybe we should also add some practical
 stuff like top-k, distinct, etc.).

 Here is a list of interesting papers that seem to be related to this
 project:

 https://gist.github.com/debasishg/8172796

 Cheers,
 Gyula

 On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.

 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.

 Best regards,
 Gabor




Re: GSoC proposal

2015-03-26 Thread Gyula Fóra
Hey Gabor,

Thank you for the proposal. It has many interesting ideas and good
potential.

My comments:

We already have a large amount of ongoing work on the windowing
optimizations, covering your suggestions in section 1. It would be better
to drop that part from the project because that's very heavily on the
research side and, as I said, we are working on this at SICS.

I like the list that you made for section 2, and this should be the main
emphasis of the project. It would indeed be very nice to have a wide range
of statistics that we can compute (or approximate - this should be optional
though) on streams and windows (maybe we should also add some practical
stuff like top-k, distinct, etc.).

Here is a list of interesting papers that seem to be related to this
project:

https://gist.github.com/debasishg/8172796

Cheers,
Gyula

On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.

 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.

 Best regards,
 Gabor



Re: GSoC proposal

2015-03-26 Thread Paris Carbone
Hi Gabor,

Approximate statistics is a really good topic; I think there is a lot to do if
you focus there. One idea would also be to include some of your contributions
in the incremental machine learning library that will be available by June.
From there you will also be able to use sampling and stream mining primitives
out of the box, among others. Regarding window optimisations, as Gyula said,
there is not much to do simply because we are working heavily on it already.
Good luck and thanks for the proposal!

Paris

 On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:
 
 Hey Gabor,
 
 Thank you for the proposal. It has many interesting ideas and good
 potential.
 
 My comments:
 
 We already have a large amount of ongoing work on the windowing
 optimizations, covering your suggestions in section 1. It would be better
 to drop that part from the project because that's very heavily on the
 research side and, as I said, we are working on this at SICS.
 
 I like the list that you made for section 2, and this should be the main
 emphasis of the project. It would indeed be very nice to have a wide range
 of statistics that we can compute (or approximate - this should be optional
 though) on streams and windows (maybe we should also add some practical
 stuff like top-k, distinct, etc.).
 
 Here is a list of interesting papers that seem to be related to this
 project:
 
 https://gist.github.com/debasishg/8172796
 
 Cheers,
 Gyula
 
 On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:
 
 Hello,
 
 I will be applying to the Google Summer of Code, and I wrote most of
 the proposal:
 http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
 I would appreciate it if you could comment on it.
 
 Gyula Fora, git blame is telling me that you wrote most of the
 relevant parts of the windowing code, so I would be especially
 interested in what you think of my improvement ideas.
 
 Best regards,
 Gabor