pitrou commented on pull request #9471: URL: https://github.com/apache/arrow/pull/9471#issuecomment-779969465
I don't think you'll find a lot of software that takes care to secure-erase an S3 private key after having used it. I'm not sure the AWS SDK for C++ even does it.

There are other concerns with uninitialized buffers, though. For example, let's say you call `MakeArrayOfNulls` and it gives you back a values buffer full of stale, effectively random data. You then send it using Arrow IPC with compression enabled. The values buffer, while irrelevant, will compress poorly because it contains random data instead of having been, say, zero-initialized.

Yet another concern is that several runs of the same program will produce non-deterministic output, which is annoying if you try to validate output files using e.g. a checksum (think reproducible builds, but for data).

All in all, I think there are good reasons to initialize null-masked value slots deterministically. The main annoyance is that I can't think of a way to test for it systematically (apart from relying on Valgrind errors, but that will only catch a subset of cases).

cc @emkornfield @wesm for opinions.
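To make the compression and reproducibility points concrete, here is a minimal sketch using only the Python standard library. It does not call Arrow at all: `os.urandom` is an assumed stand-in for stale heap contents (real leftover memory is usually less random than this), `zlib` stands in for the IPC compression codec, and the helper names `stale_buffer` and `zeroed_buffer` are hypothetical.

```python
import hashlib
import os
import zlib

N_BYTES = 1 << 20  # pretend values buffer of 1 MiB whose slots are all null


def stale_buffer(n: int) -> bytes:
    # Assumption: models uninitialized memory as random bytes (worst case).
    return os.urandom(n)


def zeroed_buffer(n: int) -> bytes:
    # Deterministic alternative: every null-masked slot is zero.
    return bytes(n)


# Compression: the zeroed buffer shrinks to almost nothing,
# while the random one stays at roughly its original size.
stale_compressed = len(zlib.compress(stale_buffer(N_BYTES)))
zeroed_compressed = len(zlib.compress(zeroed_buffer(N_BYTES)))
print("stale :", stale_compressed, "bytes after compression")
print("zeroed:", zeroed_compressed, "bytes after compression")

# Reproducibility: simulate two runs of the same program and
# checksum the buffer each run would write out.
stale_digests = {hashlib.sha256(stale_buffer(N_BYTES)).hexdigest() for _ in range(2)}
zeroed_digests = {hashlib.sha256(zeroed_buffer(N_BYTES)).hexdigest() for _ in range(2)}
print("stale runs match :", len(stale_digests) == 1)
print("zeroed runs match:", len(zeroed_digests) == 1)
```

The exact sizes depend on the codec and on what the stale memory actually contains, so treat the numbers as an illustration of the trend rather than a measurement of Arrow IPC.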
