[jira] [Work logged] (BEAM-7246) Create a Spanner IO for Python

ASF GitHub Bot (Jira) Thu, 06 Feb 2020 08:36:43 -0800


     [ 
https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382973
 ]


ASF GitHub Bot logged work on BEAM-7246:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Feb/20 16:35
            Start Date: 06/Feb/20 16:35
    Worklog Time Spent: 10m 
      Work Description: mszb commented on pull request #10712: [BEAM-7246] 
Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945639
 
 

 ##########
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##########
 @@ -109,20 +111,74 @@
 
 ReadFromSpanner takes this transform in the constructor and passes it to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+The WriteToSpanner transform relies on WriteMutation objects, which are
+exposed by the SpannerIO API. WriteMutation has five static methods (insert,
+update, insert_or_update, replace, delete). These methods return an instance
+of the _Mutator object, which contains the mutation type and the Spanner
+Mutation object. For more details, review the docs of the class
+SpannerIO.WriteMutation. For example::
+
+  mutations = [
+      WriteMutation.insert(table='user', columns=('name', 'email'),
+                           values=[('sara', '[email protected]')])
+  ]
+  _ = (p
+       | beam.Create(mutations)
+       | WriteToSpanner(
+          project_id=SPANNER_PROJECT_ID,
+          instance_id=SPANNER_INSTANCE_ID,
+          database_id=SPANNER_DATABASE_NAME)
+        )
+
+You can also create a WriteMutation by calling its constructor. For example::
+
+  mutations = [
+      WriteMutation(insert='users', columns=('name', 'email'),
+                    values=[('sara', '[email protected]')])
+  ]
+
+For more information, review the docs available on the WriteMutation class.
+
+The WriteToSpanner transform also takes a 'max_batch_size_bytes' param, which
+is set to 1MB (1048576 bytes) by default. This parameter is used to reduce the
+number of
 
 Review comment:
   Thanks. I'll update the code!
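
To illustrate the size-based batching that the docstring above describes, here is a minimal sketch in plain Python. It is not the Beam implementation: the helper name `batch_by_size`, the `(size, item)` input shape, and the rule that an oversized item gets a batch of its own are all assumptions made for this example. Only the 1048576-byte default comes from the docstring.

```python
# Hypothetical helper, not part of apache_beam: groups items into batches
# whose total size stays within a byte limit.
DEFAULT_MAX_BATCH_SIZE_BYTES = 1048576  # 1MB default quoted in the docstring


def batch_by_size(sizes_and_items,
                  max_batch_size_bytes=DEFAULT_MAX_BATCH_SIZE_BYTES):
    """Yield lists of items whose summed sizes stay within the limit.

    sizes_and_items is an iterable of (size_in_bytes, item) pairs.
    An item larger than the limit is emitted in a batch of its own.
    """
    batch, batch_bytes = [], 0
    for size, item in sizes_and_items:
        # Close the current batch before it would exceed the limit.
        if batch and batch_bytes + size > max_batch_size_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(item)
        batch_bytes += size
    if batch:
        yield batch


# Stand-in data: (size, label) pairs instead of real Spanner mutations.
rows = [(400, 'a'), (500, 'b'), (300, 'c'), (2000, 'd'), (100, 'e')]
batches = list(batch_by_size(rows, max_batch_size_bytes=1000))
```

A smaller limit produces more, smaller batches. A larger limit means fewer batches, each carrying more data.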
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 382973)
    Time Spent: 17h 40m  (was: 17.5h)

> Create a Spanner IO for Python
> ------------------------------
>
>                 Key: BEAM-7246
>                 URL: https://issues.apache.org/jira/browse/BEAM-7246
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Reuven Lax
>            Assignee: Shehzaad Nakhoda
>            Priority: Major
>          Time Spent: 17h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
