[ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=379373&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379373
 ]

ASF GitHub Bot logged work on BEAM-8618:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Jan/20 13:35
            Start Date: 30/Jan/20 13:35
    Worklog Time Spent: 10m 
      Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r372946063
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -280,6 +283,7 @@ def get(self, instruction_id, bundle_descriptor_id):
     try:
       # pop() is threadsafe
       processor = self.cached_bundle_processors[bundle_descriptor_id].pop()
+      self.last_access_time[bundle_descriptor_id] = time.time()
     except IndexError:
 
 Review comment:
   I still do not understand. The method is named `get`, so we access the 
bundle processor whether or not we create it. It is cached regardless of 
whether it is newly created and then added to the cache, or retrieved from 
the cache. 
   
   Logically, you would want to update the time when putting the processor 
into the cache, which happens in `release`.
   
   What is the advantage of updating the time here? It should be sufficient to 
update it in `release`, directly before putting the processor back.
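
   To make the suggestion concrete, here is a minimal sketch of the 
alternative, not the actual `sdk_worker.py` code. The class name, the 
`factory` and `clock` parameters, and the `active_bundle_processors` 
bookkeeping are assumptions for illustration; only 
`cached_bundle_processors`, `last_access_time`, `get` and `release` come from 
the diff and the comment above.

```python
import collections
import time


class BundleProcessorCacheSketch:
    """Hypothetical stand-in for the cache under review (illustration only)."""

    def __init__(self, factory, clock=time.time):
        # Assumption: `factory` builds a processor for a bundle descriptor id.
        self._factory = factory
        # Assumption: injectable clock, so the behaviour is easy to test.
        self._clock = clock
        self.cached_bundle_processors = collections.defaultdict(list)
        self.active_bundle_processors = {}
        self.last_access_time = {}

    def get(self, instruction_id, bundle_descriptor_id):
        try:
            # Reuse a cached processor if one is available.
            processor = self.cached_bundle_processors[bundle_descriptor_id].pop()
        except IndexError:
            # Nothing cached for this descriptor: create a new processor.
            processor = self._factory(bundle_descriptor_id)
        # Note: no timestamp update here, on either path.
        self.active_bundle_processors[instruction_id] = (
            bundle_descriptor_id, processor)
        return processor

    def release(self, instruction_id):
        bundle_descriptor_id, processor = self.active_bundle_processors.pop(
            instruction_id)
        # Record the time directly before putting the processor back.
        self.last_access_time[bundle_descriptor_id] = self._clock()
        self.cached_bundle_processors[bundle_descriptor_id].append(processor)
```

   With this shape, every processor that ends up in the cache has a 
timestamp, whether `get` created it or reused it, and the timestamp reflects 
when the processor became idle.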
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 379373)
    Time Spent: 2.5h  (was: 2h 20m)

> Tear down unused DoFns periodically in Python SDK harness
> ---------------------------------------------------------
>
>                 Key: BEAM-8618
>                 URL: https://issues.apache.org/jira/browse/BEAM-8618
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-harness
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major
>             Fix For: 2.20.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Per the discussion on the mailing list (details in [1]), the teardown of DoFns 
> should be supported in the portability framework. It happens in two places:
> 1) Upon termination of the control service
> 2) Periodically, for unused DoFns
> The aim of this JIRA is to add support for tearing down unused DoFns 
> periodically in the Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E
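
The periodic teardown described in the issue could look roughly like the 
sketch below. It is an illustration under stated assumptions, not the code 
from the pull request: the function names, the `max_idle_secs` and 
`interval_secs` parameters, and the `shutdown()` method on a processor are 
all hypothetical, and locking around the shared dictionaries is omitted for 
brevity.

```python
import threading
import time


def shutdown_idle(cached, last_access_time, max_idle_secs, now):
    """Shut down and drop processors idle for longer than max_idle_secs.

    `cached` maps a descriptor id to a list of idle processors and
    `last_access_time` maps a descriptor id to a timestamp in seconds.
    Returns the descriptor ids that were torn down.
    """
    torn_down = []
    # Iterate over a copy, because entries are deleted inside the loop.
    for descriptor_id, last_access in list(last_access_time.items()):
        if now - last_access > max_idle_secs:
            for processor in cached.pop(descriptor_id, []):
                # Assumption: shutdown() is the hook that runs DoFn teardown.
                processor.shutdown()
            del last_access_time[descriptor_id]
            torn_down.append(descriptor_id)
    return torn_down


def start_periodic_shutdown(cached, last_access_time, max_idle_secs,
                            interval_secs):
    """Run shutdown_idle every interval_secs on a daemon thread.

    Returns a threading.Event; set it to stop the background loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_secs):
            shutdown_idle(cached, last_access_time, max_idle_secs, time.time())

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Keeping the sweep in a separate function from the timer loop makes the idle 
check testable with a fixed `now` value, without waiting on a real clock.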



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
