[ 
https://issues.apache.org/jira/browse/BEAM-5404?focusedWorklogId=144855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144855
 ]

ASF GitHub Bot logged work on BEAM-5404:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Sep/18 14:26
            Start Date: 17/Sep/18 14:26
    Worklog Time Spent: 10m 
      Work Description: nielm opened a new pull request #6407: [BEAM-5404] Use 
Java serialization for MutationGroup objects.
URL: https://github.com/apache/beam/pull/6407
 
 
   The Cloud Spanner connector uses a custom serialization system for 
MutationGroup objects.
   
   Java serialization is much more efficient than the custom serialization
   system used by MutationGroupEncode, in both speed and space: the
   encoded byte arrays are 1/10th of the size.
   
   This PR replaces the custom serialization with a simple Java serialization 
using the Beam SerializableCoder class.
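
As a rough illustration of what plain Java serialization involves, the sketch below round-trips a small object through `ObjectOutputStream` and `ObjectInputStream` using only the JDK. `FakeMutationGroup` is a hypothetical stand-in, not the real Spanner `MutationGroup`, and the `encode`/`decode` helpers are assumptions made for this example rather than code from the PR. In Beam, `SerializableCoder` is the class that wraps this mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JavaSerializationSketch {

  // Hypothetical stand-in for a group of mutations; NOT the real Spanner class.
  static final class FakeMutationGroup implements Serializable {
    private static final long serialVersionUID = 1L;
    final String table;
    final List<String> values;

    FakeMutationGroup(String table, List<String> values) {
      this.table = table;
      this.values = new ArrayList<>(values);
    }
  }

  // Serialize any Serializable value to a byte array with standard Java serialization.
  static byte[] encode(Serializable value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    return bytes.toByteArray();
  }

  // Rebuild the object from the bytes produced by encode().
  static Object decode(byte[] encoded) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(encoded))) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    FakeMutationGroup original =
        new FakeMutationGroup("users", Arrays.asList("1", "alice"));
    byte[] encoded = encode(original);
    FakeMutationGroup copy = (FakeMutationGroup) decode(encoded);

    System.out.println("encoded bytes: " + encoded.length);
    System.out.println(
        "round trip ok: "
            + (copy.table.equals(original.table) && copy.values.equals(original.values)));
  }
}
```

The encoded size printed here says nothing about the 1/10th figure quoted above, which refers to the real connector's custom encoding versus Java serialization of actual MutationGroup objects.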
   
   @chamikaramj 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 144855)
            Time Spent: 10m
    Remaining Estimate: 0h

> Inefficient Serialization of Spanner MutationGroup in pipeline
> --------------------------------------------------------------
>
>                 Key: BEAM-5404
>                 URL: https://issues.apache.org/jira/browse/BEAM-5404
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.6.0
>            Reporter: Niel Markwick
>            Assignee: Chamikara Jayalath
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Cloud Spanner connector uses a custom serialization mechanism to convert 
> MutationGroup objects into a byte array. 
> This mechanism is very inefficient, producing byte arrays approximately 10x 
> larger than simple Java serialization of the MutationGroup objects, which 
> increases the resources needed by the connector to ~40x the size of the 
> original mutations.
> There are no obvious benefits to using this custom serialization system, as 
> the objects are deserialized within the pipeline itself. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
