[ https://issues.apache.org/jira/browse/BEAM-7246?focusedWorklogId=382973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-382973 ]
ASF GitHub Bot logged work on BEAM-7246:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Feb/20 16:35
            Start Date: 06/Feb/20 16:35
    Worklog Time Spent: 10m
      Work Description: mszb commented on pull request #10712: [BEAM-7246] Added Google Spanner Write Transform
URL: https://github.com/apache/beam/pull/10712#discussion_r375945639

 ##########
 File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
 ##########
 @@ -109,20 +111,74 @@
 ReadFromSpanner takes this transform in the constructor and pass this to the
 read pipeline as the singleton side input.
+
+Writing Data to Cloud Spanner.
+
+The WriteToSpanner transform writes to Cloud Spanner by executing a
+collection of input rows (WriteMutation). The mutations are grouped into
+batches for efficiency.
+
+WriteToSpanner transform relies on the WriteMutation objects which are exposed
+by the SpannerIO API. WriteMutation has five static methods (insert, update,
+insert_or_update, replace, delete). These methods return an instance of the
+_Mutator object which contains the mutation type and the Spanner Mutation
+object. For more details, review the docs of the class SpannerIO.WriteMutation.
+For example:::
+
+  mutations = [
+      WriteMutation.insert(table='user', columns=('name', 'email'),
+                           values=[('sara', 's...@dev.com')])
+  ]
+  _ = (p
+       | beam.Create(mutations)
+       | WriteToSpanner(
+           project_id=SPANNER_PROJECT_ID,
+           instance_id=SPANNER_INSTANCE_ID,
+           database_id=SPANNER_DATABASE_NAME)
+      )
+
+You can also create WriteMutation via calling its constructor. For example:::
+
+  mutations = [
+      WriteMutation(insert='users', columns=('name', 'email'),
+                    values=[('sara', 's...@example.com')])
+  ]
+
+For more information, review the docs available on WriteMutation class.
+
+WriteToSpanner transform also takes 'max_batch_size_bytes' param which is set
+to 1MB (1048576 bytes) by default. This parameter is used to reduce the number of

 Review comment:
   Thanks. I'll update the code!
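
 Editorial sketch (not part of the pull request): the quoted docstring says
 mutations are grouped into batches and that 'max_batch_size_bytes' defaults
 to 1MB (1048576 bytes). The general idea of size-capped batching can be
 illustrated in plain Python as below. This is NOT the Beam implementation;
 the names estimate_row_size and batch_rows, and the way a row's size is
 estimated, are assumptions made only for this illustration.

```python
from typing import Iterable, Iterator, List, Sequence

# Default cap quoted in the docstring above: 1MB.
DEFAULT_MAX_BATCH_SIZE_BYTES = 1048576


def estimate_row_size(row: Sequence[str]) -> int:
    """Hypothetical size estimate: total UTF-8 byte length of the row's values."""
    return sum(len(value.encode("utf-8")) for value in row)


def batch_rows(
    rows: Iterable[Sequence[str]],
    max_batch_size_bytes: int = DEFAULT_MAX_BATCH_SIZE_BYTES,
) -> Iterator[List[Sequence[str]]]:
    """Group rows into batches whose estimated size stays within the cap.

    A single row larger than the cap is still emitted, alone in its own batch.
    """
    batch: List[Sequence[str]] = []
    batch_size = 0
    for row in rows:
        row_size = estimate_row_size(row)
        # Flush the current batch if adding this row would exceed the cap.
        if batch and batch_size + row_size > max_batch_size_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(row)
        batch_size += row_size
    # Emit whatever is left after the last row.
    if batch:
        yield batch


if __name__ == "__main__":
    rows = [("aaaa",), ("bbbb",), ("cccc",)]
    for i, group in enumerate(batch_rows(rows, max_batch_size_bytes=8)):
        print(i, group)
```

 A smaller cap yields more, smaller batches; a larger cap yields fewer,
 larger ones. How the actual transform measures mutation size is not shown
 in the quoted hunk.
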
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 382973)
    Time Spent: 17h 40m  (was: 17.5h)

> Create a Spanner IO for Python
> ------------------------------
>
>                 Key: BEAM-7246
>                 URL: https://issues.apache.org/jira/browse/BEAM-7246
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Reuven Lax
>            Assignee: Shehzaad Nakhoda
>            Priority: Major
>          Time Spent: 17h 40m
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and
> manual testing.
> Integration and performance tests are a separate work item (not included
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add
> Google Cloud Spanner to the Database column for the Python/Batch row.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)