Re: [GSoC 2015][COMDEV-119] Zeppelin GSoC Project: add more D3 visualization

Alex B. Mon, 30 Mar 2015 20:31:35 -0700

Thank you very much for all the effort that you put into this project!

We and the GSoC management will keep you posted on the status of the proposal.


On Mon, Mar 30, 2015 at 2:45 AM, madhuka udantha <madhukaudan...@gmail.com>
wrote:

> Hi,
>
> I have submitted the GSoC Proposal 2015 under the title of "Extending
> visualization of Zeppelin with Rich GUI and Charting Manager".
>
> My gratitude goes to the active Zeppelin community, especially Alexander,
> Damien, Eran and Moon Soo Lee, for your valuable ideas and feedback.
>
> I hope to write a few blog posts on the tasks I'm currently doing for this
> project and share them with you all.
>
> On Fri, Mar 27, 2015 at 12:20 PM, madhuka udantha <
> madhukaudan...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> This is the email thread which I used to collaborate with the Zeppelin
>> community [1].
>> I'm currently writing the proposal based on the points discussed here and
>> on JIRA. If I have missed anything regarding the task, please feel free to
>> share it.
>> I'll share my proposal when I finish writing it.
>>
>> Thanks.
>>
>>
>> [1] https://issues.apache.org/jira/browse/COMDEV-119
>>
>> On Wed, Mar 25, 2015 at 11:09 PM, madhuka udantha <
>> madhukaudan...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Following the discussion, and to get a clear understanding, I drew two
>>> sequence diagrams that explain how a chart will react to changing the
>>> pivot.
>>>
>>> Safe Level
>>>
>>> https://issues.apache.org/jira/secure/attachment/12707251/Changing%20the%20pivot%20-%20Safe%20Level.png
>>>
>>>
>>> In the Safe Level (the default level) only a limited amount of data is
>>> retrieved (sufficient to draw the chart).
>>> At the initial stage local storage doesn't contain any data. But when you
>>> make a pivot change, the data will already be there to draw the graph. If
>>> the data is outdated, we will get it from the back-end.
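
The Safe Level lookup described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not Zeppelin code:
`LocalChartCache`, `fetch_from_backend` and `max_age_seconds` are hypothetical
names, and an in-memory dict stands in for the browser's local storage.

```python
import time
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


class LocalChartCache:
    """Cache-first lookup: serve chart data locally, refetch when outdated."""

    def __init__(
        self,
        fetch_from_backend: Callable[[str], List[Row]],  # hypothetical back-end call
        max_age_seconds: float = 60.0,                   # assumed staleness limit
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_backend
        self._max_age = max_age_seconds
        self._clock = clock
        # query text -> (time stored, rows); empty at the initial stage
        self._entries: Dict[str, Tuple[float, List[Row]]] = {}

    def get(self, query: str) -> List[Row]:
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] <= self._max_age:
            # Fresh local copy: a pivot change can redraw without the back-end.
            return entry[1]
        # Missing or outdated: go to the back-end and refresh the local copy.
        rows = self._fetch(query)
        self._entries[query] = (now, rows)
        return rows
```

With this shape, the first call for a query hits the back-end and later pivot
changes on the same result are served locally until the entry ages out.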
>>>
>>> Restricted Level
>>>
>>> https://issues.apache.org/jira/secure/attachment/12707250/Changing%20the%20pivot%20-%20Restricted%20Level.png
>>>
>>> The user will reach the Restricted Level after successfully passing the
>>> Safe Level, so local storage will already hold up-to-date data. But this
>>> level uses all the data in the database, so charting will grab the data
>>> from both local storage and the back-end.
>>>
>>> Your ideas are most welcome.
>>>
>>>
>>> On Wed, Mar 25, 2015 at 2:55 PM, madhuka udantha <
>>> madhukaudan...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to know about the code structure and architecture of
>>>> Zeppelin. Is there any good post, article or wiki page about them?
>>>> Also, if there is a quick start guide for Zeppelin development, please
>>>> share it with me.
>>>>
>>>> Thanks.
>>>>
>>>> On Mon, Mar 23, 2015 at 10:44 AM, madhuka udantha <
>>>> madhukaudan...@gmail.com> wrote:
>>>>
>>>>> Hi, moon
>>>>>
>>>>> Yes, since
>>>>>
>>>>>> "Moving computation is cheaper than moving data"
>>>>>
>>>>> we can do the computation in the computing framework.
>>>>>
>>>>> Simple pivot changes or filtering can be handled in local storage with
>>>>> indexed databases, depending on the current user level.
>>>>> As you saw, computations will be handled in the back-ends.
>>>>>
>>>>> Great to hear about building a rich GUI; I will share my chart library
>>>>> ideas there.
>>>>>
>>>>> Your ideas are always welcome; they will be helpful for my task and
>>>>> draft proposal.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Mon, Mar 23, 2015 at 7:59 AM, moon soo Lee <m...@apache.org> wrote:
>>>>>
>>>>>> Hi, madhuka udantha
>>>>>>
>>>>>> I think your idea about the chart library and data transformation
>>>>>> engine sounds cool. For the data transform modules, it's a good idea
>>>>>> to make them pluggable into the data transform engine. But I'm not
>>>>>> sure that getting the result locally and doing the transform for
>>>>>> pivot or filtering there, to avoid running the query again, is a good
>>>>>> idea. Zeppelin is (not limited to, but) trying to build an analytical
>>>>>> environment on top of distributed computing frameworks like Spark,
>>>>>> Flink, Ignite, etc. Most of the distributed computing frameworks
>>>>>> Zeppelin is trying to integrate follow the same paradigm: "Moving
>>>>>> computation is cheaper than moving data". Given that, the size of the
>>>>>> data that the transform engine needs to handle can easily be multiple
>>>>>> TB, which would take a long time to copy to a local machine and
>>>>>> process. So I think the transform module should run on the underlying
>>>>>> distributed computing framework.
>>>>>>
>>>>>> And about the chart library, we have started a discussion thread
>>>>>> about building a rich GUI inside the notebook. It might be related.
>>>>>>
>>>>>> Thanks,
>>>>>> moon
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Mar 23, 2015 at 2:27 AM madhuka udantha <
>>>>>> madhukaudan...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > On Sun, Mar 22, 2015 at 7:05 PM, Corneau Damien <
>>>>>> cornead...@apache.org>
>>>>>> > wrote:
>>>>>> >
>>>>>> > > Hi,
>>>>>> > >
>>>>>> > > Being able to aggregate on the query side is a great idea and would
>>>>>> > > allow us to transfer less data as well as having a full query
>>>>>> > > representation of the visualization.
>>>>>> > >
>>>>>> > > However, creating a SQL query dynamically is a pretty difficult
>>>>>> > > task, and might be too much for that scope.
>>>>>> > >
>>>>>> > > Also, I see some possible problems with this method:
>>>>>> > >  - Changing the pivot or simple filtering would mean running the
>>>>>> > >    query again
>>>>>> > >
>>>>>> > No, the query won't run again.
>>>>>> > In the first run of the query, data is collected and stored locally in
>>>>>> > local storage [1] (using indexing techniques to make retrieval faster),
>>>>>> > so changing the pivot or simple filtering will use the local storage.
>>>>>> > If any attribute or data is missing in local storage, then only that
>>>>>> > will be retrieved, which saves network bandwidth as well.
>>>>>> > Does my explanation make sense?
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > >  - Being able to make a pivot-style SQL query would be really hard;
>>>>>> > >    we would need multiple sub-queries or sometimes even multiple
>>>>>> > >    queries (I tried a few times and could get the wanted result only
>>>>>> > >    with a visualization-side pivot).
>>>>>> > >    It would end up with really bad SQL queries, especially with the
>>>>>> > >    Hive SQL or Spark SQL limitations, and would take way more time
>>>>>> > >    to process.
>>>>>> > >
>>>>>> > Agreed. I'm not planning to use pivot-style queries.
>>>>>> >
>>>>>> > Any suggestions?
>>>>>> >
>>>>>> >
>>>>>> > Thanks.
>>>>>> >
>>>>>> >
>>>>>> > > On Sun, Mar 22, 2015 at 10:08 PM, IT CTO <goi....@gmail.com>
>>>>>> wrote:
>>>>>> > >
>>>>>> > > > Hi,
>>>>>> > > >
>>>>>> > > > The chart library features sound promising.
>>>>>> > > > As for the data engine, one thing that I think is missing is the
>>>>>> > > > ability to use the visualization to drive the aggregation in the
>>>>>> > > > SQL. Today, you first write the SQL, you execute it, *limited by
>>>>>> > > > the number of results sent to the client*, and then you use viz to
>>>>>> > > > understand the results.
>>>>>> > > > Alternatively, if through the visualization I can generate a
>>>>>> > > > better SQL which returns an aggregated data-set, then I can
>>>>>> > > > analyze a bigger amount of data.
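
As a rough illustration of letting the visualization drive the aggregation,
the sketch below builds a single GROUP BY query from a chart's key and value
selections. It assumes only the simplest case (one table, one aggregate) and
does not attempt pivot-style queries; `build_aggregate_query` and its
parameters are hypothetical names, not an existing Zeppelin API.

```python
import re
from typing import Optional, Sequence

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_ALLOWED_AGGREGATES = {"sum", "avg", "min", "max", "count"}


def _check_identifier(name: str) -> str:
    # Accept only plain column/table names so user input cannot inject SQL.
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def build_aggregate_query(
    table: str,
    keys: Sequence[str],       # columns the chart groups by
    value: str,                # column the chart aggregates
    aggregate: str = "sum",
    limit: Optional[int] = None,
) -> str:
    """Return a single-level aggregate query for the chart's selections."""
    if not keys:
        raise ValueError("at least one key column is required")
    agg = aggregate.lower()
    if agg not in _ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate!r}")
    key_list = ", ".join(_check_identifier(k) for k in keys)
    value_col = _check_identifier(value)
    sql = (
        f"SELECT {key_list}, {agg.upper()}({value_col}) AS {value_col}_{agg} "
        f"FROM {_check_identifier(table)} GROUP BY {key_list}"
    )
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql
```

The result limit then applies to the aggregated rows rather than to the raw
rows, which is what allows a larger data set to be analyzed.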
>>>>>> > > >
>>>>>> > > > I hope I was clear enough in my explanation :-)
>>>>>> > > >
>>>>>> > > > Eran
>>>>>> > > >
>>>>>> > > >
>>>>>> > > > On Fri, Mar 20, 2015 at 8:21 AM, madhuka udantha <
>>>>>> > > madhukaudan...@gmail.com
>>>>>> > > > >
>>>>>> > > > wrote:
>>>>>> > > >
>>>>>> > > > > Hi,
>>>>>> > > > >
>>>>>> > > > > Here are my proposed ideas.
>>>>>> > > > > According to the COMDEV-119 JIRA, charts have been hard-coded
>>>>>> > > > > until now, and the data transformation issue was highlighted
>>>>>> > > > > since different charts have different pivot fields, e.g. area
>>>>>> > > > > charts, scatter charts, surface charts, bubble charts, radar
>>>>>> > > > > charts, etc.
>>>>>> > > > >
>>>>>> > > > > To solve this I am introducing two major components: a 'Chart
>>>>>> > > > > library' and a 'Data transformation engine'. The chart library
>>>>>> > > > > is where the currently plugged-in charts are shown. There we can
>>>>>> > > > > plug in chart types, and those can be reused.
>>>>>> > > > >
>>>>>> > > > > *Chart library features*
>>>>>> > > > >
>>>>>> > > > >    - Users can select a chart from the library
>>>>>> > > > >    - Charts are pluggable into the library
>>>>>> > > > >    - Charts can be plugged in by config (JSON) or a UI wizard
>>>>>> > > > >    - The configuration/meta file of a chart contains its
>>>>>> > > > >    interface, libs, themes and data transformation types/mappings
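
A minimal sketch of what plugging a chart in by JSON config could look like.
The field names (`name`, `libs`, `theme`, `pivotFields`) and the
`ChartLibrary` class are assumptions made for illustration; the thread does
not define a concrete config format.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChartDescriptor:
    name: str
    libs: List[str]               # scripts the chart needs (assumed field)
    theme: str                    # assumed field
    pivot_fields: Dict[str, str]  # pivot field -> expected data type


class ChartLibrary:
    """Registry of pluggable chart types, keyed by chart name."""

    def __init__(self) -> None:
        self._charts: Dict[str, ChartDescriptor] = {}

    def plug(self, config_json: str) -> ChartDescriptor:
        raw = json.loads(config_json)
        chart = ChartDescriptor(
            name=raw["name"],
            libs=list(raw.get("libs", [])),
            theme=raw.get("theme", "default"),
            pivot_fields=dict(raw["pivotFields"]),
        )
        if chart.name in self._charts:
            raise ValueError(f"chart already plugged: {chart.name}")
        self._charts[chart.name] = chart
        return chart

    def available(self) -> List[str]:
        # What a user would pick from when selecting a chart.
        return sorted(self._charts)

    def get(self, name: str) -> ChartDescriptor:
        return self._charts[name]
```

Keeping the pivot fields in the descriptor is what would let a transformation
step know which fields each chart expects.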
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > > *Data Transformation Engine*
>>>>>> > > > > The 'Data transformation engine' contains data transformation
>>>>>> > > > > modules. Those modules are also pluggable into the engine and
>>>>>> > > > > are connected to charts. The data transformation engine sits
>>>>>> > > > > between the data (SQL) and the chart, so it converts the data
>>>>>> > > > > and maps it to each chart's pivot fields.
>>>>>> > > > >
>>>>>> > > > >    - This module will look at the pivot fields of the chart
>>>>>> > > > >    - Selected attributes of the SQL query
>>>>>> > > > >    - Attribute value operation improvements (string split, value
>>>>>> > > > >    aggregation, number rounding)
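
The engine described above could be sketched along these lines: pluggable
per-value modules, a mapping from chart pivot fields to query columns, and a
separate aggregation helper. `TransformEngine`, `map_rows` and `sum_by` are
hypothetical names, and rows are assumed to arrive as plain dictionaries.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, object]
Transform = Callable[[object], object]
# pivot field -> (source column in the query result, module name or None)
Mapping = Dict[str, Tuple[str, Optional[str]]]


class TransformEngine:
    """Maps query result rows onto a chart's pivot fields."""

    def __init__(self) -> None:
        self._modules: Dict[str, Transform] = {}

    def plug(self, name: str, module: Transform) -> None:
        # Register a per-value operation, e.g. a string split or rounding.
        self._modules[name] = module

    def map_rows(self, rows: Iterable[Row], mapping: Mapping) -> List[Row]:
        mapped_rows: List[Row] = []
        for row in rows:
            mapped: Row = {}
            for pivot_field, (column, module_name) in mapping.items():
                value = row[column]
                if module_name is not None:
                    value = self._modules[module_name](value)
                mapped[pivot_field] = value
            mapped_rows.append(mapped)
        return mapped_rows


def sum_by(rows: Iterable[Row], key: str, value: str) -> Dict[object, float]:
    """Value aggregation: total of `value` for each distinct `key`."""
    totals: Dict[object, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)
```

A chart would then supply only its mapping, for example
`{"x": ("job", "first_word"), "y": ("balance", "round")}`, and reuse whichever
modules are already plugged in.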
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > > Another improvement that I noticed is:
>>>>>> > > > >
>>>>>> > > > >    - Query Edit auto-completion support (with Ctrl+space)
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > > Your ideas are welcome here
>>>>>> > > > > Thanks
>>>>>> > > > >
>>>>>> > > > > On Fri, Mar 20, 2015 at 10:57 AM, madhuka udantha <
>>>>>> > > > > madhukaudan...@gmail.com>
>>>>>> > > > > wrote:
>>>>>> > > > >
>>>>>> > > > > > Hi All,
>>>>>> > > > > >
>>>>>> > > > > > I'm Udantha, an MSc student at the University of Moratuwa.
>>>>>> > > > > > This GSoC 2015 project, COMDEV-119, captures my interest.
>>>>>> > > > > >
>>>>>> > > > > > I have extensive experience with visualization techniques,
>>>>>> > > > > > having created numerous dashboards [1,2] with JavaScript,
>>>>>> > > > > > HTML5, AngularJS, D3 charting, etc.
>>>>>> > > > > >
>>>>>> > > > > > My current research area is big data, where I have worked
>>>>>> > > > > > with various types of data sets. I'm also working with
>>>>>> > > > > > cluster representation and classification techniques, where
>>>>>> > > > > > visualization amounts to a considerable part. I have been
>>>>>> > > > > > following COMDEV-119 (JIRA) with Alexander Bezzubov and
>>>>>> > > > > > CORNEAU Damien for more than a week.
>>>>>> > > > > >
>>>>>> > > > > > Thanks
>>>>>> > > > > >
>>>>>> > > > > > [1] http://wso2.com/products/user-engagement-server/
>>>>>> > > > > > [2] https://github.com/wso2/jaggery
>>>>>> > > > > > --
>>>>>> > > > > > Cheers,
>>>>>> > > > > > Madhuka Udantha
>>>>>> > > > > > http://madhukaudantha.blogspot.com
>>>>>> > > > > >
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > >
>>>>>> > > > > --
>>>>>> > > > > Cheers,
>>>>>> > > > > Madhuka Udantha
>>>>>> > > > > http://madhukaudantha.blogspot.com
>>>>>> > > > >
>>>>>> > > >
>>>>>> > > >
>>>>>> > > >
>>>>>> > > > --
>>>>>> > > > Eran | CTO
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Cheers,
>>>>>> > Madhuka Udantha
>>>>>> > http://madhukaudantha.blogspot.com
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Madhuka Udantha
>>>>> http://madhukaudantha.blogspot.com
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Cheers,
>>>> Madhuka Udantha
>>>> http://madhukaudantha.blogspot.com
>>>>
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Madhuka Udantha
>>> http://madhukaudantha.blogspot.com
>>>
>>
>>
>>
>> --
>> Cheers,
>> Madhuka Udantha
>> http://madhukaudantha.blogspot.com
>>
>
>
>
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com
>



-- 
--
Kind regards,
Alexander.
