kpumuk opened a new pull request, #3398:
URL: https://github.com/apache/thrift/pull/3398

   <!-- Explain the changes in the pull request below: -->
   
   While working on the serialization depth limit improvements, I noticed a quick performance win in a hot path: writing binary strings to a memory buffer or socket. For strings that already have `ASCII-8BIT` encoding, the C extension still calls into Ruby, which simply returns the original string. Worse, the Ruby code always duplicates a frozen string, even when its encoding is already binary, which is unnecessary.
   
   This PR instead reimplements the Ruby-side `Thrift::Bytes.force_binary_encoding` in C and corrects the Ruby logic so that strings are not duplicated unnecessarily.
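
For illustration, the corrected Ruby-side behavior described above could look roughly like the sketch below. This is not the code from the PR: the module name `BytesSketch` is hypothetical, and the body is only an assumption about how the "return early when already binary, duplicate only when frozen and re-encoding is needed" rule might be written.

```ruby
# Hypothetical sketch, not the actual Thrift source.
module BytesSketch
  # Returns a string whose encoding is ASCII-8BIT (Encoding::BINARY).
  def self.force_binary_encoding(str)
    # Fast path: already binary, so return the same object untouched.
    # No duplication happens here, even if the string is frozen.
    return str if str.encoding == Encoding::BINARY

    # A frozen string cannot be re-tagged in place, so copy it first.
    str = str.dup if str.frozen?

    # Re-tag the bytes as binary; the byte content itself is unchanged.
    str.force_encoding(Encoding::BINARY)
  end
end
```

With this shape, a frozen binary string is passed through as-is, and a copy is made only for a frozen string that actually needs a different encoding tag.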
   
   ## Benchmark
   
   Benchmarked using `test/rb/benchmarks/protocol_benchmark.rb` with `--large-runs 5 --small-runs 50000`.
   
   ### Accelerated binary scenarios only
   
   | Scenario | Before | After | Delta |
   | --- | ---: | ---: | ---: |
   | c binary write large (1MB) structure 5 times | 0.758s | 0.473s | -37.6% |
   | c binary read large (1MB) structure 5 times | 0.546s | 0.544s | -0.2% |
   | c binary write 50000 small structures | 0.372s | 0.236s | -36.4% |
   | c binary read 50000 small structures | 0.228s | 0.224s | -1.4% |
   | **c-only suite total** | **1.901s** | **1.484s** | **-21.9%** |
   | **c-only write total** | **1.130s** | **0.709s** | **-37.2%** |
   | **c-only read total** | **0.773s** | **0.769s** | **-0.5%** |
   
   ### Full suite summary
   
   | Metric | Before | After | Delta |
   | --- | ---: | ---: | ---: |
   | Full suite total | 35.912s | 33.222s | -7.5% |
   | Write scenarios total | 16.282s | 13.473s | -17.3% |
   | Read scenarios total | 19.635s | 19.579s | -0.3% |
   
   ### Selected write-path highlights
   
   | Scenario | Before | After | Delta |
   | --- | ---: | ---: | ---: |
   | ruby binary write large (1MB) structure 5 times | 1.286s | 0.999s | -22.3% |
   | c binary write large (1MB) structure 5 times | 0.741s | 0.473s | -36.2% |
   | ruby compact write large (1MB) structure 5 times | 0.760s | 0.505s | -33.7% |
   | ruby json write large (1MB) structure 5 times | 6.387s | 5.366s | -16.0% |
   | ruby binary write 50000 small structures | 0.611s | 0.486s | -20.5% |
   | c binary write 50000 small structures | 0.365s | 0.237s | -35.0% |
   | ruby compact write 50000 small structures | 0.377s | 0.258s | -31.5% |
   | ruby json write 50000 small structures | 2.973s | 2.439s | -18.0% |
   
   Read-side scenarios stayed essentially flat overall, while write-side 
allocation counts remained unchanged, which is consistent with reducing 
write-path overhead rather than reducing allocation volume.
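
As a rough illustration of how elapsed time and allocation counts can be observed side by side, here is a minimal sketch. It is not the benchmark script used for the numbers above; the `measure` helper is hypothetical, and the workload in the block is only a stand-in.

```ruby
require "benchmark"

# Hypothetical helper, not part of Thrift or its benchmark suite.
# Runs the block once and reports wall-clock time plus the number of
# objects allocated while it ran.
def measure
  GC.start # settle the heap before measuring
  allocated_before = GC.stat(:total_allocated_objects)
  seconds = Benchmark.realtime { yield }
  allocated = GC.stat(:total_allocated_objects) - allocated_before
  { seconds: seconds, allocated: allocated }
end

# Stand-in workload: create 50,000 binary-encoded copies of a string.
result = measure { 50_000.times { "payload".b } }
puts format("%.3fs, %d objects allocated", result[:seconds], result[:allocated])
```

A change that lowers `seconds` while leaving `allocated` the same points at less work per call rather than fewer objects, which is the pattern reported here.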
   
   ### Serializer / Deserializer benchmark
   
   This also speeds up `Thrift::Serializer` considerably, while `Thrift::Deserializer` stays flat:
   
   | Operation | Before | After | Delta | Allocations Before | Allocations After |
   | --- | ---: | ---: | ---: | ---: | ---: |
   | `Thrift::Serializer` with `CompactProtocolFactory` (`50000` serializations) | 0.407s | 0.259s | -36.3% | 1,800,031 | 1,800,031 |
   | `Thrift::Deserializer` with `CompactProtocolFactory` (`50000` deserializations) | 0.251s | 0.252s | +0.8% | 850,007 | 850,007 |
   
   <!-- We recommend you review the checklist/tips before submitting a pull 
request. -->
   
   - [x] Did you create an [Apache Jira](https://issues.apache.org/jira/projects/THRIFT/issues/) ticket? [THRIFT-5948](https://issues.apache.org/jira/browse/THRIFT-5948)
   - [x] If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
   - [x] Did you squash your changes to a single commit? (not required, but preferred)
   - [x] Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
   - [ ] If your change does not involve any code, include `[skip ci]` anywhere in the commit message to free up build resources.
   
   <!--
     The Contributing Guide at:
     https://github.com/apache/thrift/blob/master/CONTRIBUTING.md
     has more details and tips for committing properly.
   -->
   


