This is an automated email from the ASF dual-hosted git repository.
altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 0aa7d15 [BEAM-7389] Add code examples for Map page
new ab37b0f Merge pull request #9265 from davidcavazos/map-page
0aa7d15 is described below
commit 0aa7d159418e47cd1897e07b7cac7ba7925d3103
Author: David Cavazos <[email protected]>
AuthorDate: Mon Aug 5 13:53:24 2019 -0700
[BEAM-7389] Add code examples for Map page
---
.../transforms/python/element-wise/map.md | 252 ++++++++++++++++++++-
1 file changed, 243 insertions(+), 9 deletions(-)
diff --git a/website/src/documentation/transforms/python/element-wise/map.md
b/website/src/documentation/transforms/python/element-wise/map.md
index 76d4d46..4c40d62 100644
--- a/website/src/documentation/transforms/python/element-wise/map.md
+++ b/website/src/documentation/transforms/python/element-wise/map.md
@@ -19,24 +19,258 @@ limitations under the License.
-->
# Map
-<table align="left">
- <a target="_blank" class="button"
+
+<script type="text/javascript">
+localStorage.setItem('language', 'language-py')
+</script>
+
+<table>
+ <td>
+ <a class="button" target="_blank"
href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Map">
- <img src="https://beam.apache.org/images/logos/sdks/python.png"
width="20px" height="20px"
- alt="Pydoc" />
- Pydoc
+ <img src="https://beam.apache.org/images/logos/sdks/python.png"
+ width="20px" height="20px" alt="Pydoc" />
+ Pydoc
</a>
+ </td>
</table>
<br>
+
Applies a simple 1-to-1 mapping function over each element in the collection.
## Examples
-See [BEAM-7389](https://issues.apache.org/jira/browse/BEAM-7389) for updates.
-## Related transforms
+In the following examples, we create a pipeline with a `PCollection` of
produce with their icon, name, and duration.
+Then, we apply `Map` in multiple ways to transform every element in the
`PCollection`.
+
+`Map` accepts a function that returns a single element for every input element
in the `PCollection`.
+
+### Example 1: Map with a predefined function
+
+We use the function `str.strip`, which takes a single `str` element and outputs
a `str`.
+It strips leading and trailing whitespace from the input element, including newlines and tabs.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_simple %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
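
As a rough illustration of what Example 1 does to each element, here is a minimal plain-Python sketch. The sample strings are hypothetical (the pipeline's real data is not shown in this diff), and the built-in `map` is used only as a stand-in for `beam.Map` so the snippet runs without a pipeline.

```python
# Hypothetical input elements; not the data used by the actual snippet.
raw_plants = ['  Strawberry  \n', '\tCarrot\n', ' Eggplant ']

# Built-in map() stands in for beam.Map here: the function is called
# once per element and yields exactly one result per input.
stripped = list(map(str.strip, raw_plants))

print(stripped)
```

Only the ends of each string are affected; whitespace inside an element would be left alone.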
+
+### Example 2: Map with a function
+
+We define a function `strip_header_and_newline`, which strips any leading or trailing `'#'`, `' '`,
and `'\n'` characters from each element.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_function %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
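
A minimal sketch of a function like the one described in Example 2. The function body and the sample strings are assumptions based on the description above, not a copy of the referenced snippet, and a list comprehension stands in for the `Map` step.

```python
def strip_header_and_newline(text):
    # Assumed body: remove '#', ' ' and '\n' characters from both ends.
    return text.strip('# \n')

# Hypothetical input elements.
raw_plants = ['# Strawberry\n', '# Carrot\n', '# Egg plant\n']

# One output per input, as a Map step would produce.
cleaned = [strip_header_and_newline(plant) for plant in raw_plants]

print(cleaned)
```

The space inside `'Egg plant'` survives because `str.strip` only removes characters from the two ends of the string.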
+
+### Example 3: Map with a lambda function
+
+We can also use lambda functions to simplify **Example 2**.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_lambda %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
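
The same idea with a lambda, sketched in plain Python under the same assumptions (hypothetical sample strings, built-in `map` standing in for `beam.Map`):

```python
# Hypothetical input elements.
raw_plants = ['# Strawberry\n', '# Carrot\n']

# The lambda is defined inline, so no separately named function is needed.
cleaned = list(map(lambda text: text.strip('# \n'), raw_plants))

print(cleaned)
```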
+
+### Example 4: Map with multiple arguments
+
+You can pass functions with multiple arguments to `Map`.
+The extra arguments are given to `Map` as additional positional or keyword arguments,
and `Map` forwards them to the function.
+
+In this example, `strip` takes `text` and `chars` as arguments.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_multiple_arguments %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
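
The argument forwarding in Example 4 can be sketched without a pipeline. `apply_map` below is a hypothetical helper written only for this illustration; it is not part of Beam, and it merely mimics how extra arguments reach the function.

```python
def strip(text, chars=None):
    # chars=None falls back to stripping whitespace, as str.strip does.
    return text.strip(chars)

def apply_map(fn, elements, *args, **kwargs):
    """Hypothetical stand-in for a Map step: forwards extra arguments to fn."""
    return [fn(element, *args, **kwargs) for element in elements]

# Hypothetical input elements.
raw_plants = ['# Strawberry\n', '# Carrot\n']

# The extra argument can be supplied by keyword or by position.
by_keyword = apply_map(strip, raw_plants, chars='# \n')
by_position = apply_map(strip, raw_plants, '# \n')

print(by_keyword)
```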
+
+### Example 5: MapTuple for key-value pairs
+
+If your `PCollection` consists of `(key, value)` pairs,
+you can use `MapTuple` to unpack them into different function arguments.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_tuple %}```
+
+Output `PCollection` after `MapTuple`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
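
Tuple unpacking of this kind has a close standard-library analogue in `itertools.starmap`, used here as a stand-in for `MapTuple`. The pairs and the output format are hypothetical.

```python
from itertools import starmap

# Hypothetical (key, value) elements.
pairs = [('fruit', 'Strawberry'), ('root', 'Carrot')]

# Each pair is unpacked into the two lambda parameters,
# so the function does not have to index into a tuple.
labeled = list(starmap(lambda kind, plant: '{}: {}'.format(kind, plant), pairs))

print(labeled)
```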
+
+### Example 6: Map with side inputs as singletons
+
+If the `PCollection` has a single value, such as the average from another
computation,
+passing the `PCollection` as a *singleton* accesses that value.
+
+In this example, we pass a `PCollection` containing the value `'# \n'` as a singleton.
+We then use that value as the characters for the `str.strip` method.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_side_inputs_singleton %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
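
A plain-Python sketch of the singleton idea: a collection holding exactly one value is unwrapped and handed to the function as that bare value. `as_singleton` is a hypothetical helper for illustration only; in a pipeline the side input would be wrapped with `beam.pvalue.AsSingleton` instead.

```python
def as_singleton(collection):
    """Hypothetical helper: unwrap a collection that holds exactly one value."""
    (value,) = collection  # raises ValueError unless there is exactly one element
    return value

# Hypothetical one-element collection and input elements.
chars_collection = ['# \n']
raw_plants = ['# Strawberry\n', '# Carrot\n']

# The function receives the bare string, not the collection.
chars = as_singleton(chars_collection)
cleaned = [text.strip(chars) for text in raw_plants]

print(cleaned)
```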
+
+### Example 7: Map with side inputs as iterators
+
+If the `PCollection` has multiple values, pass the `PCollection` as an
*iterator*.
+This accesses elements lazily as they are needed,
+so it is possible to iterate over large `PCollection`s that won't fit into
memory.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_side_inputs_iter %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plants %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
+
+> **Note**: You can pass the `PCollection` as a *list* with
`beam.pvalue.AsList(pcollection)`,
+> but this requires that all the elements fit into memory.
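
The lazy access described in Example 7 can be sketched with a generator. The names and sample data are hypothetical, and the generator only approximates an iterator side input: values are produced one at a time instead of being held in a list.

```python
def char_source():
    # Generator: each value is produced only when the consumer asks for it.
    for char in ['#', ' ', '\n']:
        yield char

def strip_with(text, chars):
    # chars is any iterable of single-character strings.
    return text.strip(''.join(chars))

# Hypothetical input elements.
raw_plants = ['# Strawberry\n', '# Carrot\n']

# A fresh iterator is created for every element, because a generator
# is exhausted after one pass.
cleaned = [strip_with(text, char_source()) for text in raw_plants]

print(cleaned)
```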
+
+### Example 8: Map with side inputs as dictionaries
+
+If a `PCollection` is small enough to fit into memory, then that `PCollection`
can be passed as a *dictionary*.
+Each element must be a `(key, value)` pair.
+If the `PCollection` won't fit into memory, use
`beam.pvalue.AsIter(pcollection)` instead.
+
+```py
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py
tag:map_side_inputs_dict %}```
+
+Output `PCollection` after `Map`:
+
+```
+{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map_test.py
tag:plant_details %}```
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/transforms/element_wise/map.py">
+ <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png"
+ width="20px" height="20px" alt="View on GitHub" />
+ View on GitHub
+ </a>
+ </td>
+</table>
+<br>
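
A minimal sketch of the dictionary lookup in Example 8, in plain Python. The field names, sample values, and the `replace_duration` helper are assumptions for illustration; `dict(...)` over the pairs stands in for what a dictionary side input would provide to the function.

```python
def replace_duration(plant, durations):
    # Work on a copy so the input element is not mutated.
    updated = dict(plant)
    updated['duration'] = durations[plant['duration']]
    return updated

# Hypothetical (key, value) elements of the side-input collection.
duration_pairs = [(0, 'annual'), (1, 'biennial'), (2, 'perennial')]
durations = dict(duration_pairs)

# Hypothetical main-input elements.
plants = [
    {'name': 'Strawberry', 'duration': 2},
    {'name': 'Carrot', 'duration': 1},
]

plant_details = [replace_duration(plant, durations) for plant in plants]

print(plant_details)
```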
+
+## Related transforms
+
* [FlatMap]({{ site.baseurl
}}/documentation/transforms/python/elementwise/flatmap) behaves the same as
`Map`, but for
each input it may produce zero or more outputs.
-* [Filter]({{ site.baseurl
}}/documentation/transforms/python/elementwise/filter) is useful if the
function is just
+* [Filter]({{ site.baseurl
}}/documentation/transforms/python/elementwise/filter) is useful if the
function is just
deciding whether to output an element or not.
* [ParDo]({{ site.baseurl
}}/documentation/transforms/python/elementwise/pardo) is the most general
element-wise mapping
- operation, and includes other abilities such as multiple output collections
and side-inputs.
\ No newline at end of file
+ operation, and includes other abilities such as multiple output collections
and side-inputs.
+
+<table>
+ <td>
+ <a class="button" target="_blank"
+
href="https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Map">
+ <img src="https://beam.apache.org/images/logos/sdks/python.png"
+ width="20px" height="20px" alt="Pydoc" />
+ Pydoc
+ </a>
+ </td>
+</table>
+<br>