spiridonov commented on PR #645:
URL: https://github.com/apache/arrow-go/pull/645#issuecomment-3779708779

Thank you for taking a look, @zeroshade! I have changed a few things:

* Added a private `newSchema` constructor so that `Schema` can be initialized without cloning fields and metadata. It is used internally by `WithEndianness()` and `AddField()`. The latter used to effectively allocate the fields twice (once inside itself and again by calling `NewSchema`).
* Rolled back my `Fields()` change as a middle ground, so the old behavior remains.

Each Loki query can touch hundreds of streams that are spread over thousands of data objects, which results in thousands of Arrow records being processed. Small functions such as `NewSchema` or `Fields`, when called thousands of times, quickly add up to gigabytes of allocations.

I find `iter.Seq[Field]` a bit tricky. By itself it does not allocate anything. But calling `yield` on each iteration effectively allocates a record on the heap because (1) the value fails escape analysis (it is passed to another function and can be used after the current closure returns) and (2) the record has pointers inside. Using `iter.Seq[Field]` avoids allocating a slice, but it still allocates the same number of records on the heap.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
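
For readers who want to see the double allocation in `AddField()` concretely, here is a minimal sketch of the pattern. It assumes simplified stand-in `Field` and `Schema` types and a panicking `AddField`; none of this is the actual arrow-go code, and the real signatures differ.

```go
package main

import "fmt"

// Field and Schema are simplified stand-ins for illustration only;
// they are not the real arrow-go types.
type Field struct {
	Name string
}

type Schema struct {
	fields []Field
}

// NewSchema is the public constructor. It copies the caller's slice so
// later mutations by the caller cannot affect the schema.
func NewSchema(fields []Field) *Schema {
	cloned := make([]Field, len(fields))
	copy(cloned, fields)
	return newSchema(cloned)
}

// newSchema takes ownership of fields without copying. It is only safe
// when the caller has just built the slice and keeps no reference to it.
func newSchema(fields []Field) *Schema {
	return &Schema{fields: fields}
}

// AddField returns a new schema with f inserted at index i.
// Panicking on a bad index is a shortcut taken for this sketch.
func (s *Schema) AddField(i int, f Field) *Schema {
	if i < 0 || i > len(s.fields) {
		panic("AddField: index out of range")
	}
	fields := make([]Field, 0, len(s.fields)+1)
	fields = append(fields, s.fields[:i]...)
	fields = append(fields, f)
	fields = append(fields, s.fields[i:]...)
	// The slice above is freshly allocated and private to this call, so
	// going through NewSchema here would copy it a second time.
	return newSchema(fields)
}

// NumFields reports how many fields the schema holds.
func (s *Schema) NumFields() int { return len(s.fields) }

// Field returns the field at index i.
func (s *Schema) Field(i int) Field { return s.fields[i] }

func main() {
	base := NewSchema([]Field{{Name: "a"}, {Name: "c"}})
	ext := base.AddField(1, Field{Name: "b"})
	for i := 0; i < ext.NumFields(); i++ {
		fmt.Println(ext.Field(i).Name)
	}
}
```

The public constructor keeps its defensive copy for external callers, while internal code that already owns a fresh slice skips it. Whether a given call actually allocates is best confirmed with a benchmark using `-benchmem` rather than assumed from a sketch like this.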
