Re: GSoC proposal
Hello,

Thank you very much for your comments! I will remove the part about the windowing optimizations (though that was my favourite part :) ), and think about what other statistics could be added.

And thank you for the link to the collection of relevant algorithms; they are very interesting!

Best regards,
Gabor

2015-03-26 17:35 GMT-05:00 Paris Carbone par...@kth.se:

> Hi Gabor,
>
> Approximate statistics is a really good topic; I think there is a lot to do if you focus there. One idea would also be to include some of your contributions in the incremental machine learning library that will be available by June. From there you will also be able to use sampling and stream mining primitives out of the box, among others.
>
> Regarding window optimisations, as Gyula said, there is not much to do simply because we are working heavily on it already.
>
> Good luck and thanks for the proposal!
>
> Paris
>
> On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:
>
>> Hey Gabor,
>>
>> Thank you for the proposal. It has many interesting ideas and good potential.
>>
>> My comments:
>>
>> We already have a large amount of ongoing work on the windowing optimizations, covering your suggestions in section 1. It would be better to drop that part from the project because that's very heavily on the research side and, as I said, we are working on this at SICS.
>>
>> I like the list that you made for section 2, and this should be the main emphasis of the project. It would indeed be very nice to have a wide range of statistics that we can compute (or approximate - this should be optional though) on streams and windows (maybe we should also add some practical stuff like top-k, distinct etc).
>>
>> Here is a list of interesting papers that seem to be related to this project:
>> https://gist.github.com/debasishg/8172796
>>
>> Cheers,
>> Gyula
>>
>> On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:
>>
>>> Hello,
>>>
>>> I will be applying to the Google Summer of Code, and I wrote most of the proposal:
>>> http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
>>>
>>> I would appreciate it if you could comment on it.
>>>
>>> Gyula Fora, git blame is telling me that you wrote most of the relevant parts of the windowing code, so I would be especially interested in what you think of my improvement ideas.
>>>
>>> Best regards,
>>> Gabor
Re: GSoC proposal
Hey Gabor,

Thank you for the proposal. It has many interesting ideas and good potential.

My comments:

We already have a large amount of ongoing work on the windowing optimizations, covering your suggestions in section 1. It would be better to drop that part from the project because that's very heavily on the research side and, as I said, we are working on this at SICS.

I like the list that you made for section 2, and this should be the main emphasis of the project. It would indeed be very nice to have a wide range of statistics that we can compute (or approximate - this should be optional though) on streams and windows (maybe we should also add some practical stuff like top-k, distinct etc).

Here is a list of interesting papers that seem to be related to this project:
https://gist.github.com/debasishg/8172796

Cheers,
Gyula

On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:

> Hello,
>
> I will be applying to the Google Summer of Code, and I wrote most of the proposal:
> http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
>
> I would appreciate it if you could comment on it.
>
> Gyula Fora, git blame is telling me that you wrote most of the relevant parts of the windowing code, so I would be especially interested in what you think of my improvement ideas.
>
> Best regards,
> Gabor
Re: GSoC proposal
Hi Gabor,

Approximate statistics is a really good topic; I think there is a lot to do if you focus there. One idea would also be to include some of your contributions in the incremental machine learning library that will be available by June. From there you will also be able to use sampling and stream mining primitives out of the box, among others.

Regarding window optimisations, as Gyula said, there is not much to do simply because we are working heavily on it already.

Good luck and thanks for the proposal!

Paris

On 26 Mar 2015, at 22:59, Gyula Fóra gyula.f...@gmail.com wrote:

> Hey Gabor,
>
> Thank you for the proposal. It has many interesting ideas and good potential.
>
> My comments:
>
> We already have a large amount of ongoing work on the windowing optimizations, covering your suggestions in section 1. It would be better to drop that part from the project because that's very heavily on the research side and, as I said, we are working on this at SICS.
>
> I like the list that you made for section 2, and this should be the main emphasis of the project. It would indeed be very nice to have a wide range of statistics that we can compute (or approximate - this should be optional though) on streams and windows (maybe we should also add some practical stuff like top-k, distinct etc).
>
> Here is a list of interesting papers that seem to be related to this project:
> https://gist.github.com/debasishg/8172796
>
> Cheers,
> Gyula
>
> On Thu, Mar 26, 2015 at 7:50 PM, Gábor Gévay gga...@gmail.com wrote:
>
>> Hello,
>>
>> I will be applying to the Google Summer of Code, and I wrote most of the proposal:
>> http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
>>
>> I would appreciate it if you could comment on it.
>>
>> Gyula Fora, git blame is telling me that you wrote most of the relevant parts of the windowing code, so I would be especially interested in what you think of my improvement ideas.
>>
>> Best regards,
>> Gabor