InCerryGit opened a new pull request, #331: URL: https://github.com/apache/arrow-dotnet/pull/331
## Summary

- Avoid temporary byte-array allocations for small `StringArray.Builder.Append(string)` values by encoding into stack memory before appending.
- Pre-reserve offsets, validity, and value-buffer capacity for known-count `AppendRange` inputs.
- Add focused correctness coverage for nulls, empty strings, custom encodings, large-string fallback, collection inputs, and non-collection enumerables.

`AppendRange(ICollection<string>)` now performs a counting prepass to reserve value-buffer capacity before appending, so collection inputs are enumerated twice by design.

## Benchmark

BenchmarkDotNet ShortRun, `StringBuilderAppendBenchmark`, 10,000 ASCII strings of length 32:

| Method | Before | After |
| --- | ---: | ---: |
| `AppendSmallStrings` | 432.0 us / 1.66 MB | 341.0 us / 1157.5 KB |
| `AppendRangeSmallStrings` | 426.2 us / 1.66 MB | 311.8 us / 353.68 KB |

## Validation

- `dotnet test test/Apache.Arrow.Tests/Apache.Arrow.Tests.csproj -c Release --filter "FullyQualifiedName~Apache.Arrow.Tests.StringArrayTests"`
- `rtk dotnet build "Apache.Arrow.sln" -c Release`
- LSP diagnostics clean on changed files
- Code review completed before commit; no blockers found
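
For readers who want a concrete picture of the two techniques named in the summary, here is a minimal sketch. It is not the code in this PR and not the Apache.Arrow API: `ValueBufferSketch`, `StackThreshold` (256 bytes), and the growth policy are all hypothetical, and nulls are simply recorded as zero-length entries instead of going through a validity bitmap. It assumes a runtime where `Encoding.GetBytes(ReadOnlySpan<char>, Span<byte>)` is available (.NET Core 2.1 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a string builder's offsets and value buffer.
public sealed class ValueBufferSketch
{
    // Assumed cutoff for stack encoding; the PR does not state its threshold.
    private const int StackThreshold = 256;

    private readonly Encoding _encoding;
    private readonly List<int> _offsets = new List<int> { 0 };
    private byte[] _values = new byte[64];
    private int _length;

    public ValueBufferSketch(Encoding encoding = null)
    {
        _encoding = encoding ?? Encoding.UTF8;
    }

    public int Count => _offsets.Count - 1;
    public int ByteLength => _length;

    public void Append(string value)
    {
        if (!string.IsNullOrEmpty(value))
        {
            int byteCount = _encoding.GetByteCount(value);
            if (byteCount <= StackThreshold)
            {
                // Small value: encode into stack memory, so no temporary byte[] is allocated.
                Span<byte> scratch = stackalloc byte[StackThreshold];
                int written = _encoding.GetBytes(value.AsSpan(), scratch);
                AppendBytes(scratch.Slice(0, written));
            }
            else
            {
                // Large value: fall back to a heap-allocated temporary array.
                AppendBytes(_encoding.GetBytes(value));
            }
        }
        // Null and empty strings add an offset without adding bytes.
        _offsets.Add(_length);
    }

    public void AppendRange(IEnumerable<string> values)
    {
        if (values is ICollection<string> collection)
        {
            // Counting prepass: the collection is enumerated once here and once below.
            int totalBytes = 0;
            foreach (string s in collection)
            {
                if (s != null) totalBytes += _encoding.GetByteCount(s);
            }
            EnsureCapacity(totalBytes);
            _offsets.Capacity = Math.Max(_offsets.Capacity, _offsets.Count + collection.Count);
        }
        foreach (string s in values) Append(s);
    }

    private void EnsureCapacity(int additional)
    {
        int required = _length + additional;
        if (required > _values.Length)
        {
            Array.Resize(ref _values, Math.Max(required, _values.Length * 2));
        }
    }

    private void AppendBytes(ReadOnlySpan<byte> bytes)
    {
        EnsureCapacity(bytes.Length);
        bytes.CopyTo(_values.AsSpan(_length));
        _length += bytes.Length;
    }
}
```

The prepass trades a second enumeration for a single up-front buffer growth, which is why it only applies when the input is an `ICollection<string>`; a plain `IEnumerable<string>` may not be safe or cheap to enumerate twice, so it goes straight to the per-item path.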
