[ 
https://issues.apache.org/jira/browse/BEAM-3645?focusedWorklogId=262204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-262204
 ]

ASF GitHub Bot logged work on BEAM-3645:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Jun/19 10:52
            Start Date: 18/Jun/19 10:52
    Worklog Time Spent: 10m 
      Work Description: robertwb commented on pull request #8769: [WIP] 
[BEAM-3645] support multi processes for Python FnApiRunner with 
EmbeddedGrpcWorkerHandler
URL: https://github.com/apache/beam/pull/8769#discussion_r294728596
 
 

 ##########
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner.py
 ##########
 @@ -1418,6 +1449,51 @@ def process_bundle(self, inputs, expected_outputs, 
parallel_uid_counter=None):
 
     return result, split_results
 
+class ParallelBundleManager(BundleManager):
+  _uid_counter = 0
+  def process_bundle(self, inputs, expected_outputs):
+    input_value = list(inputs.values())[0]
+    if isinstance(input_value, list):
 
 Review comment:
   I was suggesting that we create a new class (which could subclass list if we 
want, but has a partition method) to accomplish this. That way the parallelism 
doesn't leak up and down the stack. For example, the BundleManager code can be 
used unchanged, rather than being passed all the inputs plus an (easy to 
forget) flag saying which ones to ignore. It also simplifies the Buffer class, 
which no longer needs an extra attribute redundantly remembering the 
parallelism of the context it must be used in (plus all the locking, 
state-tracking, sleeping, etc.).
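
   A minimal sketch of what such a class might look like, assuming a 
round-robin split. The names PartitionableBuffer, process_bundle and 
process_in_parallel are hypothetical and are not taken from the Beam codebase; 
process_bundle merely stands in for single-bundle code that stays unaware of 
any parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


class PartitionableBuffer(list):
    """Hypothetical list subclass that knows how to split itself."""

    def partition(self, n):
        """Return up to n non-empty sublists, filled round-robin."""
        if n < 1:
            raise ValueError("n must be at least 1")
        parts = [self[i::n] for i in range(n)]
        # Drop empty parts so callers never schedule an empty bundle.
        return [part for part in parts if part]


def process_bundle(elements):
    """Stand-in for unchanged single-bundle logic (assumption): total bytes."""
    return sum(len(element) for element in elements)


def process_in_parallel(buffer, num_workers):
    """Fan the partitions out to workers; only this caller knows about n."""
    parts = buffer.partition(num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        return list(executor.map(process_bundle, parts))


if __name__ == "__main__":
    buf = PartitionableBuffer([b"a", b"bb", b"ccc", b"dddd", b"e"])
    print(buf.partition(2))
    print(process_in_parallel(buf, 2))
```

   In this sketch the degree of parallelism exists only as the argument to 
partition, so neither the buffer nor the per-bundle code has to store it or 
coordinate through locks.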
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 262204)
    Time Spent: 5h 10m  (was: 5h)

> Support multi-process execution on the FnApiRunner
> --------------------------------------------------
>
>                 Key: BEAM-3645
>                 URL: https://issues.apache.org/jira/browse/BEAM-3645
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>    Affects Versions: 2.2.0, 2.3.0
>            Reporter: Charles Chen
>            Assignee: Hannah Jiang
>            Priority: Major
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance 
> gain over the previous DirectRunner.  We can do even better in multi-core 
> environments by supporting multi-process execution in the FnApiRunner, to 
> scale past Python GIL limitations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
