Hi Anish, thanks to you and the others who have expressed interest in this GSoC project on JIRA [4]!

Apache [0] in general is a great place where one can learn a lot, but nobody will teach you per se, so in order to be successful you had better possess and demonstrate a strong self-learning attitude early on.

The first thing to do would be to make yourself familiar with the Zeppelin project: build it and try one of the existing notebooks, e.g. from [1].

The next step is to engage with the community [2] and demonstrate an ability to contribute: update, add, or improve some docs and help other users on the user@ mailing list by answering questions. For this particular project, communication with users and engagement in the community will be a very important requirement. The earlier you start doing all of the above, the higher the chances of your application being accepted for this project.

Then, you should also start preparing a formal proposal (application) for this project, and earlier is better here as well. For this process, please refer to the suggestions I gave in the email thread for another GSoC project, Subj: ZEPPELIN-682 [3]. Please read it carefully. From there I will be happy to assist you with the proposal further down the road, but again, the goal for you here would be to put it in front of the public on this list, get it proofread, and incorporate the feedback before actually submitting it to Google by the student application deadline.

The main objective for this whole project, ZEPPELIN-684 [4], is to build as many sophisticated notebooks as possible (at least 4: 2 before mid-term and 2 after) that demonstrate different cases of using Zeppelin by the end of the summer.

It is very flexible and up to you what to use here, but things we are particularly looking for include:
 - backend processing systems (Flink, Spark, Geode, etc.)
 - Machine Learning, Deep Learning, etc.
 - custom GUI using display systems [5]
 - new features from Helium [6]

The closer the results are to actual "data products", the better.
Ideally, it could solve some actual problem you or somebody else has already been working on. This will include a creative choice of public datasets and use cases (scenarios) that you want to highlight in your proposal for this project.

The expectation is that a high-level overview of the scenarios, use cases, and datasets will be part of your application proposal, as well as a further timeline breakdown for the whole summer period, with milestones/sprints at least every 2 weeks. References to any open-source code you have written before are very welcome as part of the proposal too.

Hope this helps, and looking forward to the proposal drafts!

P.S. Chances are high that, if this project results in high-quality material, there will be a possibility to showcase it at one of the major industry conferences as well!

0. http://theapacheway.com
1. http://zeppelinhub.com/viewer
2. http://zeppelin.incubator.apache.org/community.html
3. http://markmail.org/search/ZEPPELIN-682+list:org.apache.zeppelin.dev
4. https://issues.apache.org/jira/browse/ZEPPELIN-684
5. http://zeppelin.incubator.apache.org/docs/0.6.0-incubating-SNAPSHOT/displaysystem/angular.html
6. https://cwiki.apache.org/confluence/display/ZEPPELIN/Helium+proposal

--
Alex

On Wed, Mar 2, 2016 at 11:11 AM, anish singh <anish18...@gmail.com> wrote:

> Hello All,
> I'm a 2nd year undergraduate student in Computer Science and Engineering at
> the LNMIIT Jaipur, India. For the past four months I've extensively been
> studying Apache Spark to develop a share price prediction program. I'm
> deeply interested in the mentioned project and would like to take it up for
> the summer of 2016 and request to be guided further in studying about
> Zeppelin and the related issues so that I may be able to draw up my
> proposal.
> Thank You,
> Anish.
>