JNSimba opened a new pull request, #356:
URL: https://github.com/apache/doris-spark-connector/pull/356
## Summary
- Enable gzip compression by default for StreamLoad writes to reduce network
transfer and improve write performance
- Arrow format is excluded because binary columnar data gains little from
compression
- Fix the `compress_type` check from `"gzip"` to `"gz"` to match the value
the Doris server expects
- Users can disable compression by setting `compress_type` to an empty value:
`doris.sink.properties.compress_type=`
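
The default-and-override behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `AbstractStreamLoadProcessor.java`: the class name `CompressionDefaults`, the methods `withDefaultCompression` and `isGzipEnabled`, and the `"arrow"` format string are assumptions made for the example. It relies on `Map.putIfAbsent` leaving an explicitly set empty string untouched, which is why an empty `compress_type` disables compression.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not the connector's real class.
final class CompressionDefaults {

    // Returns a copy of the user's stream load properties with gz compression
    // defaulted in for every format except Arrow (format name is an assumption).
    static Map<String, String> withDefaultCompression(Map<String, String> userProps, String format) {
        Map<String, String> props = new HashMap<>(userProps);
        if (!"arrow".equalsIgnoreCase(format)) {
            // putIfAbsent keeps any value the user set, including an empty string,
            // so an explicit empty compress_type is not overwritten by the default.
            props.putIfAbsent("compress_type", "gz");
        }
        return props;
    }

    // Compression is applied only when the value is exactly "gz" (case-insensitive).
    static boolean isGzipEnabled(Map<String, String> props) {
        return "gz".equalsIgnoreCase(props.get("compress_type"));
    }
}
```

With this sketch, an unset `compress_type` resolves to `gz` for CSV and JSON, stays unset for Arrow, and an explicit empty value leaves compression off.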
## Changes
- `AbstractStreamLoadProcessor.java`: Add `putIfAbsent("compress_type",
"gz")` for non-Arrow formats, simplify compression check
- `DorisWriterITCase.scala`: Add ITCases for explicit gz compression and
explicitly disabled compression
## Test plan
- [ ] Existing CSV/JSON/Arrow write ITCases pass (covers default gzip
compression for CSV/JSON, no compression for Arrow)
- [ ] New `testSinkCsvGzCompression` passes (explicit gz compression)
- [ ] New `testSinkCsvNoCompression` passes (compression explicitly disabled)
🤖 Generated with [Claude Code](https://claude.com/claude-code)