Re: [GSoC 2016][ZEPPELIN-684] DataMining - create Notebooks /w example of Analytics

Alexander Bezzubov Wed, 02 Mar 2016 20:56:29 -0800

Hi Anish,

thank you, and the others who have expressed their interest in this GSoC
project on JIRA [4]!

Apache [0], in general, is a great place where one can learn a lot, but
nobody will teach you per se, so in order to be successful you had better
possess and demonstrate a strong self-learning attitude early on.

The first thing to do would be to familiarize yourself with the Zeppelin
project: build it and try one of the existing notebooks, e.g. from [1]. The
next step is to engage with the community [2] and demonstrate your ability
to contribute: update, add, or improve some docs and help other users on the
user@ mailing list by answering questions.

For this particular project, communication with users and engagement in the
community will be a very important requirement.

The earlier you start doing all of the above, the higher the chances of your
application being accepted for this project.

Then, you should also start preparing a formal proposal or application for
this project, and earlier is better here as well. For this process, please
refer to the suggestions I gave in the email thread for another GSoC
project, Subj: ZEPPELIN-682 [3]. Please read it carefully.
From there I will be happy to assist you with the proposal further down the
road, but again the goal here for you would be to put it in front of the
public on this list, to get it proofread, and to incorporate the feedback
before actually submitting it to Google by the student application
deadline.

The main objective for this whole project, ZEPPELIN-684 [4], is to build as
many sophisticated notebooks as possible (at least 4: 2 before the mid-term
and 2 after) that demonstrate different use cases of Zeppelin by the end of
the summer.

It is very flexible and up to you what to use here, but things we are
particularly looking for include:
- backend processing systems (Flink, Spark, Geode, etc)
- Machine Learning, Deep Learning, etc
- custom GUI using display systems [5]
- new features from Helium [6]
The closer the results are to actual "data products", the better.
Ideally, a notebook could solve some actual problem that you or somebody
else has already been working on.

This will include a creative choice of public datasets and use cases
(scenarios) that you want to highlight in your proposal for this project.
The expectation is that a high-level overview of the scenarios, use cases,
and datasets will be part of your application proposal, as well as a
timeline breakdown for the whole summer period with milestones/sprints at
least every 2 weeks. References to any open source code you wrote before are
very welcome as part of the proposal too.

Hope this helps, and I am looking forward to your proposal drafts!

P.S. Chances are high that, if this project results in high-quality
material, there will be a possibility to showcase it at one of the major
industry conferences as well!

0. http://theapacheway.com
1. http://zeppelinhub.com/viewer
2. http://zeppelin.incubator.apache.org/community.html
3. http://markmail.org/search/ZEPPELIN-682+list:org.apache.zeppelin.dev
4. https://issues.apache.org/jira/browse/ZEPPELIN-684
5. http://zeppelin.incubator.apache.org/docs/0.6.0-incubating-SNAPSHOT/displaysystem/angular.html
6. https://cwiki.apache.org/confluence/display/ZEPPELIN/Helium+proposal

--
Alex

On Wed, Mar 2, 2016 at 11:11 AM, anish singh <[email protected]> wrote:

> Hello All,
> I'm a 2nd year undergraduate student in Computer Science and Engineering at
> the LNMIIT Jaipur, India. For the past four months I've extensively been
> studying Apache Spark to develop a share price prediction program. I'm
> deeply interested in the mentioned project and would like to take it up for
> the summer of 2016 and request to be guided further in studying about
> Zeppelin and the related issues so that I may be able to draw up my
> proposal.
> Thank You,
> Anish.
>
