[ 
https://issues.apache.org/jira/browse/BEAM-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Hoyer updated BEAM-5431:
--------------------------------
    Description: 
I'd like to propose a new high-level transform "StarMap" for the Python SDK. 
The transform would be syntactic sugar for ParDo, like Map, but would 
automatically unpack arguments like 
[itertools.starmap|https://docs.python.org/3/library/itertools.html#itertools.starmap]
 from Python's standard library.

The use-case is to handle applying functions to tuples of arguments, which is a 
common pattern when using Beam's combine and group-by transforms. Right now, 
it's common to write functions with manual unpacking, e.g., 
{code:java}
def my_func(inputs):
  key, value = inputs
  ...

beam.Map(my_func) {code}
StarMap offers a much more readable alternative: 
{code:java}
def my_func(key, value):
  ...

beam.StarMap(my_func){code}
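
As a rough illustration of the unpacking behaviour proposed above, here is a minimal pure-Python sketch. It does not use Beam and is not the proposed implementation; the helper name `star` and the sample function `format_pair` are hypothetical, introduced only for this example. It shows that wrapping a function so that it unpacks a single tuple argument gives the same results as itertools.starmap.

```python
import functools
import itertools


def star(func):
    """Hypothetical helper: adapt func(a, b, ...) to accept one tuple (a, b, ...)."""
    @functools.wraps(func)
    def wrapper(args):
        # Unpack the single tuple argument into positional arguments.
        return func(*args)
    return wrapper


def format_pair(key, value):
    """Sample two-argument function, written without manual unpacking."""
    return "%s=%d" % (key, value)


pairs = [("a", 1), ("b", 2)]

# Plain map() needs the wrapper because each element is a single tuple.
via_wrapper = list(map(star(format_pair), pairs))

# itertools.starmap unpacks each tuple itself.
via_starmap = list(itertools.starmap(format_pair, pairs))
```

Under these assumptions, a wrapper of this kind could be passed to the existing beam.Map in the meantime; a built-in StarMap would remove the need for it.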
 

The need for StarMap is especially pressing with the advent of Python 3 support 
and the eventual wind-down of Python 2. Currently, it's common to achieve this 
pattern using unpacking in a function definition, e.g., beam.Map(lambda (k, v): 
my_func(k, v)), but this is invalid syntax in Python 3. My internal search of 
Google's codebase turns up quite a few matches for "beam\.Map\(lambda\ \(", none 
of which would work on Python 3.
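
To make the Python 3 problem concrete, the following sketch checks that the tuple-parameter lambda no longer compiles (tuple parameter unpacking was removed by PEP 3113) and shows two spellings that work on both Python 2 and 3. The function `my_func` and its body are placeholders chosen for illustration only.

```python
def my_func(key, value):
    """Placeholder function taking the key and value as separate arguments."""
    return (key, value * 2)


# Python 2 only: a lambda with a tuple parameter. Compile it from a string so
# that this example file itself remains valid Python 3.
py2_only_source = "lambda (k, v): my_func(k, v)"
try:
    compile(py2_only_source, "<example>", "eval")
    compiles_on_this_python = True
except SyntaxError:
    compiles_on_this_python = False

# Portable alternatives: take one argument and unpack it explicitly.
unpack_by_index = lambda kv: my_func(kv[0], kv[1])
unpack_by_star = lambda kv: my_func(*kv)

results = [unpack_by_star(("a", 1)), unpack_by_index(("b", 2))]
```

Either portable lambda can be passed to beam.Map today, at the cost of the extra wrapper that StarMap is meant to remove.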

 

  was:
I'd like to propose a new high-level transform "StarMap" for the Python SDK. 
The transform would be syntactic sugar for ParDo, like Map, but would 
automatically unpack arguments like 
[itertools.starmap|https://docs.python.org/3/library/itertools.html#itertools.starmap]
 from Python's standard library.

The use-case is to handle applying functions to tuples of arguments, which is a 
common pattern when using Beam's combine and group-by transforms. Right now, 
it's common to write functions with manual unpacking, e.g.,

 

 
{code:java}
def my_func(inputs):
  key, value = inputs
  ...

beam.Map(my_func) {code}
StarMap offers a much more readable alternative: 
{code:java}
def my_func(key, value):
  ...

beam.StarMap(my_func){code}
 

 

The need for StarMap is especially pressing with the advent of Python 3 support 
and the eventual wind-down of Python 2. Currently, it's common to achieve this 
pattern using unpacking in a function definition, e.g., beam.Map(lambda (k, v): 
my_func(k, v)), but this is invalid syntax in Python 3. My internal search of 
Google's codebase turns up quite a few matches for "beam\.Map\(lambda\ \(", 
none of which would work on Python 3.

 


> StarMap transform for Python SDK
> --------------------------------
>
>                 Key: BEAM-5431
>                 URL: https://issues.apache.org/jira/browse/BEAM-5431
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Stephan Hoyer
>            Assignee: Ahmet Altay
>            Priority: Major
>
> I'd like to propose a new high-level transform "StarMap" for the Python SDK. 
> The transform would be syntactic sugar for ParDo, like Map, but would 
> automatically unpack arguments like 
> [itertools.starmap|https://docs.python.org/3/library/itertools.html#itertools.starmap]
>  from Python's standard library.
> The use-case is to handle applying functions to tuples of arguments, which is 
> a common pattern when using Beam's combine and group-by transforms. Right 
> now, it's common to write functions with manual unpacking, e.g., 
> {code:java}
> def my_func(inputs):
>   key, value = inputs
>   ...
> beam.Map(my_func) {code}
> StarMap offers a much more readable alternative: 
> {code:java}
> def my_func(key, value):
>   ...
> beam.StarMap(my_func){code}
>  
> The need for StarMap is especially pressing with the advent of Python 3 
> support and the eventual wind-down of Python 2. Currently, it's common to 
> achieve this pattern using unpacking in a function definition, e.g., 
> beam.Map(lambda (k, v): my_func(k, v)), but this is invalid syntax in Python 
> 3. My internal search of Google's codebase turns up quite a few matches for 
> "beam\.Map\(lambda\ \(", none of which would work on Python 3.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
