[ https://issues.apache.org/jira/browse/PIG-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244794#comment-13244794 ]
Dimitris Bousis commented on PIG-2586:
--------------------------------------

Hi all,

My name is Dimitris Bousis, and I am currently pursuing a Master's in Computer Engineering & Informatics at the University of Patras, Greece. My research interests include cloud and distributed computing and related technologies such as Hadoop, HBase, Cassandra, Pig, and Hive. Although I have not yet started any research activity with these technologies (I plan to do so after the summer), I took an elective course on Hadoop, HDFS, HBase, and Cassandra during my undergraduate studies.

I am interested in applying for this GSoC 2012 project. Flow visualization is really useful for debugging and breaking down any form of structured query. From the mentor's comment above, I assume that there should be a web interface that parses the DOT format in order to present the plans produced by explain. Furthermore, I'd like to suggest D3.js, a JavaScript library that lets you bind arbitrary data to the Document Object Model (DOM) and then apply data-driven transformations to the document. The library uses HTML5, CSS3, and SVG to represent data within a page.

Please comment on this post with anything you consider necessary. I look forward to working with you this summer.

Dimitris Bousis

> A better plan/data flow visualizer
> ----------------------------------
>
>                 Key: PIG-2586
>                 URL: https://issues.apache.org/jira/browse/PIG-2586
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Daniel Dai
>              Labels: gsoc2012
>
> Pig supports a dot graph style plan to visualize the
> logical/physical/mapreduce plan (explain with -dot option, see
> http://ofps.oreilly.com/titles/9781449302641/developing_and_testing.html).
> However, the dot graph takes an extra step to generate the plan graph, and
> the quality of the output is not good. It would be better to implement a
> better visualizer for Pig. It should:
> 1. show operator type and alias
> 2. turn on/off output schema
> 3. dive into foreach inner plan on demand
> 4. provide a way to show operator source code, e.g., a tooltip on an
> operator (plans don't currently have this information, but you can assume
> it is in place)
> 5. besides visualizing the logical/physical/mapreduce plan, visualizing
> the script itself is also useful
> 6. may rely on a Java graphics library such as Swing
> This is a candidate project for Google Summer of Code 2012. More
> information about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
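
As a rough illustration of the DOT-to-web idea raised in the comment, here is a minimal Python sketch that turns the edge statements of a simple DOT graph into a nodes/links JSON structure, which is the general shape a D3.js force layout can consume. Everything in it is an assumption made for illustration: the function name `dot_edges_to_graph`, the sample graph, and the operator names are hypothetical and are not real `explain -dot` output. It only handles plain `a -> b` edges, not chained edges, attributes, or subgraphs.

```python
import json
import re

# Matches simple DOT edges such as:  a -> b;   or   "load A" -> "filter B";
EDGE = re.compile(r'("[^"]+"|\w+)\s*->\s*("[^"]+"|\w+)')


def dot_edges_to_graph(dot_text):
    """Convert the edge statements of a simple DOT graph into a nodes/links dict."""
    nodes = []  # node ids in first-seen order
    links = []
    for src, dst in EDGE.findall(dot_text):
        src, dst = src.strip('"'), dst.strip('"')
        for name in (src, dst):
            if name not in nodes:
                nodes.append(name)
        links.append({"source": src, "target": dst})
    return {"nodes": [{"id": n} for n in nodes], "links": links}


# Hypothetical plan for illustration only; not real Pig output.
SAMPLE_DOT = '''
digraph plan {
    load_A -> filter_B;
    filter_B -> "foreach C";
    "foreach C" -> store_D;
}
'''

if __name__ == "__main__":
    print(json.dumps(dot_edges_to_graph(SAMPLE_DOT), indent=2))
```

A real implementation would more likely use a full DOT parser, or have Pig emit the plan as JSON directly, so that operator type, alias, and schema can travel with each node.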