jineshparakh commented on PR #17713: URL: https://github.com/apache/pinot/pull/17713#issuecomment-3912412713
@shounakmk219 Thanks for the review! Good catch on the other `getBlob()` callers. I'd actually noticed the missing null checks in `lastModified(), touch(), open(), and copyFile()` as well, but deliberately kept this PR scoped to just `deleteBatch` to keep things focused. The reason is that those methods aren't as straightforward to fix. The reason is that those methods aren't as straightforward to fix. They each have different return types and semantics. For instance, the PinotFS contract specifies that `touch()` should create an empty file if it doesn't exist, but the GCS implementation currently doesn't do this, so the fix would involve implementing file creation, not just a null guard. In `copyFile()`, the destination blob is created before the source blob is read, so if the source is null we'd also need to clean up the partially created destination. And for `lastModified()`, we'd need to decide whether to return 0L (like `LocalPinotFS` does for missing files) or throw. Each one needs a closer look at the PinotFS contract and how other implementations handle it. Happy to tackle those in a follow-up PR. On the `existsBlob(Blob blob)` suggestion, it actually does more than a null check. It also calls `blob.exists()`, which is an additional call to GCS to verify the blob still exists on the server. For `deleteBatch`, a simple `blob != null` should be sufficient because it avoids that extra network overhead, especially since this method processes batches of up to 100 blob IDs. Side note: `getUpdateTime()` used in `lastModified()` and `touch()` is also [deprecated](https://docs.cloud.google.com/java/docs/reference/google-cloud-storage/latest/com.google.cloud.storage.BlobInfo#com_google_cloud_storage_BlobInfo_getUpdateTime__) in the GCS SDK in favor of `getUpdateTimeOffsetDateTime()`. Could address that alongside the null-safety fixes in the follow-up. Please do let me know if you still feel that all these things should be handled in this PR itself. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
