InCerryGit opened a new pull request, #331:
URL: https://github.com/apache/arrow-dotnet/pull/331

   ## Summary
   
   - Avoid temporary byte-array allocations for small 
`StringArray.Builder.Append(string)` values by encoding into stack memory 
before appending.
   - Pre-reserve offsets, validity, and value-buffer capacity for known-count 
`AppendRange` inputs.
   - Add focused correctness coverage for nulls, empty strings, custom 
encodings, large-string fallback, collection inputs, and non-collection 
enumerables.
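
A minimal sketch of the stack-encoding idea from the first bullet, not the code in this PR. `ValueBufferSketch`, `StackThreshold`, and the 256-byte cutoff are hypothetical names and values chosen for illustration, and the sketch assumes a runtime that provides the span-based `Encoding.GetBytes` overload (.NET Core 2.1 or later). Null handling and the validity bitmap are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a builder's offsets and value buffer.
// This is NOT the Apache.Arrow StringArray.Builder type.
public sealed class ValueBufferSketch
{
    // Assumed cutoff in bytes; the real threshold may differ.
    private const int StackThreshold = 256;

    private readonly List<byte> _values = new List<byte>();
    private readonly List<int> _offsets = new List<int> { 0 };

    public IReadOnlyList<int> Offsets => _offsets;
    public int ByteLength => _values.Count;

    public void Append(string value, Encoding encoding)
    {
        // Worst-case encoded size, so the stack buffer can never overflow.
        int maxBytes = encoding.GetMaxByteCount(value.Length);

        if (maxBytes <= StackThreshold)
        {
            // Small value: encode into stack memory, no temporary byte[].
            Span<byte> scratch = stackalloc byte[StackThreshold];
            int written = encoding.GetBytes(value.AsSpan(), scratch);
            AppendBytes(scratch.Slice(0, written));
        }
        else
        {
            // Large value: fall back to a heap-allocated array.
            AppendBytes(encoding.GetBytes(value));
        }

        // Record the end offset of the value just appended.
        _offsets.Add(_values.Count);
    }

    private void AppendBytes(ReadOnlySpan<byte> bytes)
    {
        foreach (byte b in bytes)
        {
            _values.Add(b);
        }
    }
}
```

Checking `GetMaxByteCount` rather than the exact byte count keeps the small-string path to a single encoding pass, at the cost of sending some mid-sized strings down the fallback path.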
   
   `AppendRange(ICollection<string>)` now performs a counting prepass to 
reserve value-buffer capacity before appending, so collection inputs are 
enumerated twice by design.
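
The two-pass shape described above can be sketched roughly as follows. `PrepassSketch`, `CountValueBytes`, and `EncodeAll` are hypothetical names for illustration only; the actual builder reserves capacity in its own buffers instead of returning a `byte[]`, and also tracks offsets and validity.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration of a counting prepass; not the Apache.Arrow API.
public static class PrepassSketch
{
    // First pass: total encoded size of all non-null values.
    public static int CountValueBytes(ICollection<string> values, Encoding encoding)
    {
        int total = 0;
        foreach (string value in values)
        {
            if (value != null)
            {
                total += encoding.GetByteCount(value);
            }
        }
        return total;
    }

    // Second pass: encode straight into a buffer sized once up front,
    // so the buffer never has to grow while appending.
    public static byte[] EncodeAll(ICollection<string> values, Encoding encoding)
    {
        byte[] buffer = new byte[CountValueBytes(values, encoding)];
        int position = 0;
        foreach (string value in values)
        {
            if (value == null)
            {
                continue; // nulls contribute no value bytes
            }
            position += encoding.GetBytes(value, 0, value.Length, buffer, position);
        }
        return buffer;
    }
}
```

Because the input is enumerated twice, this approach only suits inputs that can be iterated repeatedly with the same result, which is why it is limited to `ICollection<string>` here.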
   
   ## Benchmark
   
   BenchmarkDotNet ShortRun, `StringBuilderAppendBenchmark`, 10,000 ASCII 
strings of length 32:
   
   | Method | Before (time / allocated) | After (time / allocated) |
   | --- | ---: | ---: |
   | `AppendSmallStrings` | 432.0 us / 1.66 MB | 341.0 us / 1157.5 KB |
   | `AppendRangeSmallStrings` | 426.2 us / 1.66 MB | 311.8 us / 353.68 KB |
   
   ## Validation
   
   - `dotnet test test/Apache.Arrow.Tests/Apache.Arrow.Tests.csproj -c Release 
--filter "FullyQualifiedName~Apache.Arrow.Tests.StringArrayTests"`
   - `rtk dotnet build "Apache.Arrow.sln" -c Release`
   - LSP diagnostics clean on changed files
   - Code review completed before commit; no blockers found


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
