[ 
https://issues.apache.org/jira/browse/BEAM-8016?focusedWorklogId=345720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-345720
 ]

ASF GitHub Bot logged work on BEAM-8016:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Nov/19 00:37
            Start Date: 19/Nov/19 00:37
    Worklog Time Spent: 10m 
      Work Description: KevinGG commented on issue #10132: [BEAM-8016] Pipeline 
Graph
URL: https://github.com/apache/beam/pull/10132#issuecomment-555276401
 
 
   Thanks for the comments!
   > * Are we getting rid of the tooltips displaying the intermediate results? 
Do they not fit in the new model?
   We are offering a `show` API so that users can visualize a larger set of 
their data dynamically instead of peeking at a random static sample (which we 
still offer if the user calls `show` in an IPython terminal rather than a 
notebook web frontend).
   With the new Beam pipeline graph proposal and the pipeline graph library 
work in progress, the tooltip might in the future show other metadata, such as 
elapsed time or throughput, to provide a user experience consistent with what 
users have on Dataflow.
   > * What do the PCollections look like if the user did not specify the 
PCollection name as a variable?
   Those PCollections will not be cached. The idea is that, when building a 
pipeline, if the user does not assign a PCollection to a variable, they cannot 
build the pipeline further from it, and they cannot invoke `show(pcoll)` 
because they don't have access to `pcoll` in their code.
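
As a rough illustration of why only variable-bound PCollections are candidates for caching, the sketch below scans a namespace for names bound to PCollection-like objects. `FakePCollection` and `named_pcollections` are hypothetical stand-ins invented for this example; they are not Beam or Interactive Beam APIs, and the real discovery mechanism may differ.

```python
class FakePCollection:
    """Hypothetical stand-in for a Beam PCollection (not a Beam class)."""

    def __init__(self, label):
        self.label = label


def named_pcollections(namespace):
    """Return {variable name: PCollection} for names bound in `namespace`."""
    return {
        name: value
        for name, value in namespace.items()
        if isinstance(value, FakePCollection)
    }


# Simulate a notebook's user namespace.
user_ns = {}
user_ns["words"] = FakePCollection("Read")
# An intermediate result that is never bound to a name cannot be found later.
FakePCollection("Anonymous intermediate")
user_ns["counts"] = FakePCollection("Count")

cacheable = named_pcollections(user_ns)
print(sorted(cacheable))
```

Only `words` and `counts` are reachable by name, so only they could be passed to something like `show(pcoll)` afterwards.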
   Previously, we had the `leaf pcollection` concept for PCollections that 
have never been used as inputs. It doesn't work for PCollections consumed by 
sinks (transforms with input but no output), even if the user has assigned them 
to a variable and they look like hanging PCollections. It also doesn't work in 
a notebook environment, where new transforms can be added at different 
locations and PCollections can be re-evaluated due to cell re-execution.
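
The limitation of the leaf concept can be sketched with a toy graph. The dictionary layout, the transform names, and `leaf_pcollections` are assumptions made for illustration only, not the actual Interactive Beam implementation.

```python
# Hypothetical pipeline: transform -> (input PCollections, output PCollections).
transforms = {
    "Read": ([], ["lines"]),
    "Split": (["lines"], ["words"]),
    "Count": (["words"], ["counts"]),
    "WriteToSink": (["counts"], []),  # a sink: consumes input, produces no output
}


def leaf_pcollections(transforms):
    """Return PCollections that no transform consumes as an input."""
    produced = {p for _, outs in transforms.values() for p in outs}
    consumed = {p for ins, _ in transforms.values() for p in ins}
    return produced - consumed


# Without the sink, `counts` is a leaf and would be picked up.
without_sink = {k: v for k, v in transforms.items() if k != "WriteToSink"}
leaves_without_sink = leaf_pcollections(without_sink)

# With the sink, `counts` is consumed, so leaf detection misses it even
# though the user may still want to inspect it.
leaves_with_sink = leaf_pcollections(transforms)

print(leaves_without_sink, leaves_with_sink)
```

Once a sink is attached, `counts` no longer qualifies as a leaf, which is the gap that tracking user-assigned variables is meant to close.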
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 345720)
    Time Spent: 2h  (was: 1h 50m)

> Render Beam Pipeline as DOT with Interactive Beam  
> ---------------------------------------------------
>
>                 Key: BEAM-8016
>                 URL: https://issues.apache.org/jira/browse/BEAM-8016
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-py-interactive
>            Reporter: Ning Kang
>            Assignee: Ning Kang
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> With the work in https://issues.apache.org/jira/browse/BEAM-7760, a Beam 
> pipeline converted to DOT and then rendered should mark user-defined 
> variables on edges.
> With the work in https://issues.apache.org/jira/browse/BEAM-7926, it might be 
> redundant or confusing to render arbitrary random sample PCollection data on 
> edges.
> We'll also make sure edges in the graph correspond to the output -> input 
> relationships in the user-defined pipeline. Each edge is one output. If 
> multiple downstream inputs take the same output, it should be rendered as 
> one edge diverging into two instead of two edges.
> For advanced interactive highlighting, where each execution highlights the 
> part of the original pipeline that was actually executed, we'll also provide 
> support in beta.
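
One possible way to get "one edge diverging into two" in DOT is to route each output through a small junction node, so the producer emits a single labeled edge that then fans out to its consumers. This is only a sketch under that assumption: `pipeline_to_dot`, the `pcoll_` prefix, and the input layout are hypothetical and are not taken from the Beam code base.

```python
def pipeline_to_dot(outputs):
    """Build DOT text where each output is one edge that fans out.

    `outputs` maps an output (variable) name to (producer, [consumers]).
    """
    lines = ["digraph pipeline {"]
    for name, (producer, consumers) in outputs.items():
        junction = f"pcoll_{name}"
        # A point-shaped node acts as the place where the edge diverges.
        lines.append(f'  "{junction}" [shape=point];')
        # Exactly one labeled edge leaves the producer for this output.
        lines.append(
            f'  "{producer}" -> "{junction}" [label="{name}", arrowhead=none];'
        )
        # The single edge then branches to every downstream consumer.
        for consumer in consumers:
            lines.append(f'  "{junction}" -> "{consumer}";')
    lines.append("}")
    return "\n".join(lines)


outputs = {"words": ("Split", ["Count", "Filter"])}
dot = pipeline_to_dot(outputs)
print(dot)
```

The resulting text can be passed to Graphviz for rendering. The variable name appears once, on the shared segment, rather than being repeated on two parallel edges.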



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
