[ 
https://issues.apache.org/jira/browse/BEAM-8016?focusedWorklogId=346185&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-346185
 ]

ASF GitHub Bot logged work on BEAM-8016:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Nov/19 19:11
            Start Date: 19/Nov/19 19:11
    Worklog Time Spent: 10m 
      Work Description: KevinGG commented on pull request #10132: [BEAM-8016] 
Pipeline Graph
URL: https://github.com/apache/beam/pull/10132#discussion_r348112567
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_runner.py
 ##########
 @@ -65,8 +65,12 @@ def __init__(self,
     """
     self._underlying_runner = (underlying_runner
                                or direct_runner.DirectRunner())
-    self._cache_manager = cache.FileBasedCacheManager(cache_dir, cache_format)
-    self._renderer = pipeline_graph_renderer.get_renderer(render_option)
+    if not ie.current_env().cache_manager():
+      ie.current_env().set_cache_manager(
+          cache.FileBasedCacheManager(cache_dir,
+                                      cache_format))
+    self._cache_manager = ie.current_env().cache_manager()
 
 Review comment:
   Yes, they will share the same cache manager since the PCollections, 
PTransforms and Pipelines the user has defined are global.
   
   The notebook users own their source code in notebooks, so they should always 
be able to take advantage of Interactive Beam features if they are using the 
same Pipeline and the same PCollection.
   
   The cache is an implicit implementation detail, so the user should not 
know about it or depend on it directly during notebook usage. The most they 
can do is call `watch` and `show`.
   
   Cache eviction happens when the kernel exits. The runners never expire 
unless the user explicitly `del`s them (I don't think there is a valid use case 
for that). So the cache always persists like globals and gets cleaned up at 
kernel exit. There is no difference in this case.
   
   And in our user flow, there isn't any use case where the user needs 2 
versions of the cache. The user holds their data and works with it back and 
forth; the user does not hold the cache and work with it back and forth.
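
   To illustrate the sharing described above, here is a minimal sketch of the 
pattern in the diff. It does not import Beam: `FileCacheManager`, 
`Environment`, `current_env` and `Runner` are hypothetical stand-ins for 
`cache.FileBasedCacheManager`, the interactive environment and the runner, and 
the `atexit` hook is only an assumption about how cleanup at kernel exit could 
be wired up.

```python
import atexit
import shutil
import tempfile


class FileCacheManager:
    """Hypothetical stand-in for a file-based cache manager."""

    def __init__(self, cache_dir=None):
        # Fall back to a fresh temporary directory when none is given.
        self._cache_dir = cache_dir or tempfile.mkdtemp(prefix='cache-')

    def cleanup(self):
        # Remove everything that was cached on disk.
        shutil.rmtree(self._cache_dir, ignore_errors=True)


class Environment:
    """Hypothetical process-wide environment that owns one cache manager."""

    def __init__(self):
        self._cache_manager = None

    def cache_manager(self):
        return self._cache_manager

    def set_cache_manager(self, cache_manager):
        self._cache_manager = cache_manager
        # Assumption: eviction happens once, when the interpreter exits.
        atexit.register(cache_manager.cleanup)


_ENV = Environment()


def current_env():
    """Returns the single environment shared by the whole process."""
    return _ENV


class Runner:
    """Hypothetical runner that reuses the environment's cache manager."""

    def __init__(self, cache_dir=None):
        # Only the first runner creates a cache manager; later ones reuse it.
        if not current_env().cache_manager():
            current_env().set_cache_manager(FileCacheManager(cache_dir))
        self._cache_manager = current_env().cache_manager()
```

   With this shape, two runners created in the same kernel end up holding the 
same cache manager object, and deleting one runner does not evict anything.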
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 346185)
    Time Spent: 4h 20m  (was: 4h 10m)

> Render Beam Pipeline as DOT with Interactive Beam  
> ---------------------------------------------------
>
>                 Key: BEAM-8016
>                 URL: https://issues.apache.org/jira/browse/BEAM-8016
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-py-interactive
>            Reporter: Ning Kang
>            Assignee: Ning Kang
>            Priority: Major
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> With work in https://issues.apache.org/jira/browse/BEAM-7760, a Beam pipeline 
> converted to DOT and then rendered should mark user-defined variables on edges.
> With work in https://issues.apache.org/jira/browse/BEAM-7926, it might be 
> redundant or confusing to render randomly sampled PCollection data on 
> edges.
> We'll also make sure edges in the graph correspond to the output -> input 
> relationships in the user-defined pipeline. Each edge is one output. If 
> multiple downstream inputs take the same output, it should be rendered as 
> one edge diverging into two instead of two edges.
> For advanced interactivity, where each execution highlights the part 
> of the original pipeline that was actually executed, we'll also 
> provide support in beta.
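
The "one edge diverging into two" rule in the description can be sketched in 
plain Python that emits DOT text. This is only an illustration and not Beam's 
actual renderer: the `to_dot` helper, its input shape and the `fork_N` node 
names are all hypothetical. It assumes that a small `shape=point` node is an 
acceptable way to draw the point where a shared output splits.

```python
def to_dot(outputs):
    """Builds DOT text from {(producer, output_label): [consumers]}.

    Hypothetical helper: each output is drawn once, and an output with several
    consumers is routed through a point node so the edge visibly forks.
    """
    lines = ['digraph G {']
    for i, ((producer, label), consumers) in enumerate(sorted(outputs.items())):
        if len(consumers) == 1:
            # One consumer: a single labeled edge is enough.
            lines.append(
                f'  "{producer}" -> "{consumers[0]}" [label="{label}"];')
        else:
            # Several consumers: one labeled edge up to a fork point, then
            # unlabeled branches from the fork to every consumer.
            fork = f'fork_{i}'
            lines.append(f'  "{fork}" [shape=point];')
            lines.append(
                f'  "{producer}" -> "{fork}" '
                f'[label="{label}", arrowhead=none];')
            for consumer in consumers:
                lines.append(f'  "{fork}" -> "{consumer}";')
    lines.append('}')
    return '\n'.join(lines)


example = to_dot({
    ('Read', 'lines'): ['Count', 'Filter'],
    ('Count', 'counts'): ['Write'],
})
```

Here the `lines` output is labeled once even though two transforms consume it, 
which keeps the one-edge-per-output reading of the graph.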



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
