[
https://issues.apache.org/jira/browse/CRUNCH-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054166#comment-14054166
]
Gabriel Reid commented on CRUNCH-438:
-------------------------------------
{quote}For the experiment I've also integrated this with the
PlanningParameters.PIPELINE_DOTFILE_OUTPUT_DIR (CRUNCH-418). If the
PIPELINE_DOTFILE_OUTPUT_DIR path is set then 5 dotfiles will be produced.
I agree with Gabriel Reid that those diagrams are more like a debug tool. If the
PIPELINE_DOTFILE_OUTPUT_DIR is not for debugging purposes, then perhaps I should
revert this integration?{quote}
The way I see it, the PIPELINE_DOTFILE_OUTPUT_DIR and
{{PlanningParameters.PIPELINE_PLAN_DOTFILE}} are for helping Crunch users
understand what their pipelines are doing and for pinpointing performance
issues, etc. (at least that's how I use them). I guess you could call them
debugging tools, but they're more for people using Crunch as a library. I think
these new dotfiles are more for understanding the inner workings of the planner,
which is why I think it's better not to just dump them in the
PIPELINE_DOTFILE_OUTPUT_DIR. Just my opinion, of course.
Another nitpick on something minor: am I correct in assuming that
BASE_GRAPH_PLANE_DOTFILE should be BASE_GRAPH_PLAN_DOTFILE (i.e. PLAN vs PLANE)?
> Visualizations of some important internal/intermediate pipeline planning
> states
> -------------------------------------------------------------------------------
>
> Key: CRUNCH-438
> URL: https://issues.apache.org/jira/browse/CRUNCH-438
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.10.0, 0.8.3
> Reporter: Christian Tzolov
> Assignee: Christian Tzolov
> Attachments: CRUNCH-438.2.patch, CRUNCH-438.patch
>
>
> To improve the understandability of the pipeline planning stages it would help
> to visualize some intermediate planning states like:
> - PCollection lineage. (visualizing the output-pcollection-targets structure)
> - MSCRPlanner's planning graphs before and after the split-up of dependent
> GBK nodes
> - RTNode hierarchy along with the Input and Output configurations as
> persisted in the Configuration before the execution of the pipeline.
> Most of the information can be intercepted in the MSCRPlanner#plan() method.