wsry opened a new pull request #9710: [FLINK-11859][runtime] Improve 
SpanningRecordSerializer performance by serializing record length to data 
buffer directly.
URL: https://github.com/apache/flink/pull/9710
 
 
   ## What is the purpose of the change
   
   This PR improves the performance of 
SpanningRecordSerializer. Currently, the data and the length field of a serialized 
record are stored separately in two buffers (the lengthBuffer and the 
serializationBuffer), so two copies are needed when transferring the 
intermediate data to the BufferBuilder. This PR removes the lengthBuffer and 
writes the length field to the serializationBuffer directly, which avoids 
the separate copy of the length buffer.
   
   ## Brief change log
   
     - *Remove the length buffer of SpanningRecordSerializer and serialize the 
record length to the data buffer directly. More specifically, the initial 4 bytes 
of the data buffer are reserved for the length field, and after the record is 
serialized, the reserved space is filled with the record length.*
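
The reserve-then-backfill idea described above can be sketched as follows. This is a minimal illustration using plain `java.nio.ByteBuffer`, not the actual Flink code: the class `LengthPrefixSketch` and the method `serialize` are hypothetical names, and the record is assumed to be an already-serialized byte array.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Flink's SpanningRecordSerializer.
class LengthPrefixSketch {
    // The length field occupies the first 4 bytes of the data buffer.
    static final int LENGTH_BYTES = 4;

    // Writes the payload after a reserved 4-byte slot, then fills the slot
    // with the payload length, so that length and data share one buffer.
    static ByteBuffer serialize(byte[] payload) {
        ByteBuffer buffer = ByteBuffer.allocate(LENGTH_BYTES + payload.length);

        // Step 1: skip over the space reserved for the length field.
        buffer.position(LENGTH_BYTES);

        // Step 2: serialize the record body right after the reserved space.
        buffer.put(payload);

        // Step 3: backfill the reserved space with the record length.
        // The absolute putInt does not move the buffer's position.
        int recordLength = buffer.position() - LENGTH_BYTES;
        buffer.putInt(0, recordLength);

        // Make the buffer readable from the start (length, then data).
        buffer.flip();
        return buffer;
    }
}
```

With this layout, a consumer only needs to copy one contiguous region (length followed by data) instead of copying a length buffer and a data buffer separately.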
   
   
   ## Verifying this change
   
    - This change is already covered by existing tests, such as 
*SpanningRecordSerializerTest* and *SpanningRecordSerializationTest*.
    - The performance gain is demonstrated by a micro-benchmark; the full 
results can be found in the JIRA issue 
[FLINK-11859](https://issues.apache.org/jira/browse/FLINK-11859). 
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
     - The serializers: (**yes** / no / don't know)
     - The runtime per-record code paths (performance sensitive): (**yes** / no 
/ don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
