Re: [Zeppelin Notebooks][GSOC - 2016] About Project

2016-08-29 Thread anish singh
16 at 6:44 PM, anish singh wrote: > Hello, > > I am very thankful to have been selected for the project [Creating Zeppelin > Notebooks]. > > As I mentioned in the proposal, I will have my end-of-semester exams > until 9th May, so I will be able to start work only from that date.

Re: [GSoC - 2016][Zeppelin Notebooks] Fifth Notebook Review

2016-08-18 Thread anish singh
/open?id=0ByXTtaL2yHBuLW9maklXTGg0QlE Thanks, Anish. On Thu, Aug 18, 2016 at 1:28 PM, anish singh wrote: > Hello, > > The fifth notebook, on the SNAP (Stanford Network Analysis Project) datasets, is > ready for review at [0]. > > Documentation and blog for the notebook are ready at [1

[GSoC - 2016][Zeppelin Notebooks] Fifth Notebook Review

2016-08-18 Thread anish singh
Hello, The fifth notebook, on the SNAP (Stanford Network Analysis Project) datasets, is ready for review at [0]. Documentation and blog for the notebook are ready at [1]. For ease of viewing, a sample video showing a demo run of the notebook is available at [2]. Also, please let me know that for submis

[GSoC - 2016][Zeppelin Notebooks] Stanford Datasets Collection

2016-08-08 Thread anish singh
Hello, The Stanford Large Network Dataset Collection [0] lists datasets that are available for use and analysis from their site. The datasets are mainly graph datasets on various Internet activities such as online community interaction and Reddit posts, Amazon product and customer da

Re: [GSoC - 2016][Zeppelin Notebooks] Fourth Notebook Review

2016-08-08 Thread anish singh
Hello, The version of the notebook that I provided [0] is the latest one, but yes: the exception stack trace occurred because that paragraph tried to create a file that already existed in that location/directory. The paragraph was supposed to be run only once, but by mistake I ran

[GSoC - 2016][Zeppelin Notebooks] Fourth Notebook Review

2016-08-08 Thread anish singh
Hello, I am pleased to let you know that the fourth notebook is ready for review at [0]. The notebook uses both the WARC and WET data formats. The Warcbase library is used extensively. The notebook contains seven sections, ending with a search engine built using Apache Lucene. Documentation and blog f

Re: [GSoC - 2016][Zeppelin Notebooks] Issues with Common Crawl Datasets

2016-07-20 Thread anish singh
ell. > > Please keep us posted! > > 1. > http://spark.apache.org/docs/latest/tuning.html#memory-management-overview > 2. > > http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ > 3. https://groups.google.com/forum/#!forum/common-crawl > >

Re: [GSoC - 2016][Zeppelin Notebooks] Issues with Common Crawl Datasets

2016-07-19 Thread anish singh
er, choosing another instance (such as 32 GiB) may also not be sufficient (as per calculations in [0]). Please let me know if I'm missing something or how to proceed with this. [0]. https://drive.google.com/open?id=0ByXTtaL2yHBuYnJSNGt6T2U2RjQ Thanks, Anish. On Tue, Jul 12, 2016 at 12:35 PM,

Re: [GSoC - 2016][Zeppelin Notebooks] Issues with Common Crawl Datasets

2016-07-12 Thread anish singh
> That sounds great, Anish! > Congratulations on getting a new machine. > > No worries, please take your time and keep us posted on your exploration! > Quality is more important than quantity here. > > -- > Alex > > On Mon, Jul 4, 2016 at 10:40 PM, anish singh wrot

Re: [GSoC - 2016][Zeppelin Notebooks] Issues with Common Crawl Datasets

2016-07-04 Thread anish singh
> > I understand that this can be a big task, so do not worry if that > takes time (learning AWS, etc) - just keep us posted on your progress > weekly and I'll be glad to help! > > > 1. > http://zeppelin.apache.org/docs/0.6.0-SNAPSHOT/storage/storage.html#notebook-stor

[GSoC - 2016][Zeppelin Notebooks] Issues with Common Crawl Datasets

2016-07-04 Thread anish singh
Hello, (everything outside Zeppelin) I started work on the Common Crawl datasets and first tried to look at only the data for May 2016. Out of the three formats available, I chose WET (the plain text format). The May data alone is divided into segments, and there are 24492 such segmen

[GSoC - 2016][Zeppelin Notebook] Third Notebook Review

2016-07-01 Thread anish singh
Hello, The third notebook is ready for review at [0]. This notebook uses Apache Flink for the analysis of data and creates the visualizations using the HTML display and a Helium application. The documentation and blog for the notebook are ready for review at [1]. The ZeppelinHub viewer link for the notebook is at

Re: [GSoC-2016][Zeppelin Notebooks] Second Notebook Review

2016-06-16 Thread anish singh
really well. >> >> I appreciate your hard work! >> >> Thanks, >> moon >> >> On Wed, Jun 15, 2016 at 9:00 AM anish singh wrote: >> >> > Hello, >> > >> > Second notebook is up for review at [0]. Custom visualizations for t

[GSoC-2016][Zeppelin Notebooks] Second Notebook Review

2016-06-15 Thread anish singh
Hello, The second notebook is up for review at [0]. Custom visualizations for the notebook were created using the HTML display and a Helium application. I'm sorry for the delay in creating the notebook; filtering and cleaning the datasets for this notebook took a lot of time. Documentation for the notebook i

Re: Doubt about Zeppelin Server

2016-06-13 Thread anish singh
I found the directory: ZEPPELIN_HOME/zeppelin-web/dist Anish. On Mon, Jun 13, 2016 at 9:37 PM, anish singh wrote: > Hello, > > I just wanted to know which directory inside the > 'incubator-zeppelin' directory the Zeppelin Server (localhost:8080) serves. For example, ins

Doubt about Zeppelin Server

2016-06-13 Thread anish singh
Hello, I just wanted to know which directory inside the 'incubator-zeppelin' directory the Zeppelin Server (localhost:8080) serves. For example, inside any directory (in general), we can run 'python -m SimpleHTTPServer 8000' to start a simple HTTP server, and the server (localhost:8000) is set up for th

Re: issues with Visualization

2016-06-07 Thread anish singh
ntainer. > > On Tue, Jun 7, 2016 at 12:23 PM, anish singh wrote: > > > Hello, > > > > Here are the links to the images that contain those visualizations: [0], > > [1], [2]. For the chord diagram in [0], I simply used the visualization > at > > http://bl.ocks

Re: issues with Visualization

2016-06-06 Thread anish singh
n try it out. > You can use https://www.zeppelinhub.com/viewer if the notebook is hosted > somewhere public > > On Tue, Jun 7, 2016 at 11:40 AM, anish singh wrote: > > > Hello, > >> > >> I was trying to create custom visualizations using d3 and html display >

Fwd: issues with Visualization

2016-06-06 Thread anish singh
> > Hello, > > I was trying to create custom visualizations using d3 and the html display, but > after running the paragraph containing the code (%html), the > results (visualizations) don't appear inside the paragraphs; instead they > render in the background outside the paragraphs (as in the image). This

Re: [GSoC - 2016][Zeppelin Notebooks] First Notebook Review

2016-05-31 Thread anish singh
m90ZS5qc29u > > > On Sat, May 28, 2016 at 11:22 PM, moon soo Lee wrote: > > > Hi Anish, > > > > Checked your notebook and article, and they look really good! > > Great work! > > > > Thanks, > > moon > > > > On Fri, May 27, 2016 at 1

[GSoC - 2016][Zeppelin Notebooks] First Notebook Review

2016-05-27 Thread anish singh
Hello, Firstly, congratulations on being accepted as a top-level project. The first notebook, on the World Bank datasets, is ready for review at [0]. Documentation and blog for the notebook are done and ready at [1]. Helium functionality for custom visualization will be added to the notebook at [0].