kunwp1 commented on code in PR #4111:
URL: https://github.com/apache/texera/pull/4111#discussion_r2595250111
##########
amber/src/main/scala/org/apache/texera/web/service/WorkflowService.scala:
##########
@@ -349,6 +349,6 @@ class WorkflowService(
}
}
// Delete big objects
- BigObjectManager.deleteAllObjects()
+ LargeBinaryManager.deleteAllObjects()
Review Comment:
The current version doesn’t handle the large-binary lifecycle properly.
Right now, we expose APIs that users can invoke directly in UDFs or operator
executors. To tie each large binary to the execution that created it, either
(1) the user would need to explicitly specify that a large binary was created
in a particular execution, or (2) the operator executor would need to know that
it is running under a specific execution context.
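To make option (1) concrete, here is a minimal sketch of execution-scoped
tracking. Every name in it (`ExecutionId`, `ScopedLargeBinaryRegistry`,
`register`, `deleteForExecution`, `deleteAll`) is hypothetical and not part of
this PR or of `LargeBinaryManager`. It only tracks URIs in memory and leaves
out the actual storage deletion.

```scala
import scala.collection.mutable

// Hypothetical sketch: none of these names exist in Texera.
final case class ExecutionId(value: Long)

final class ScopedLargeBinaryRegistry {
  // Maps each execution to the URIs of the large binaries it created.
  private val byExecution =
    mutable.Map.empty[ExecutionId, mutable.Set[String]]

  // Option (1): the caller states which execution owns the new large binary.
  def register(executionId: ExecutionId, uri: String): Unit = synchronized {
    byExecution.getOrElseUpdate(executionId, mutable.Set.empty[String]) += uri
    ()
  }

  // Scoped cleanup: forgets and returns only the binaries of one execution.
  def deleteForExecution(executionId: ExecutionId): Set[String] = synchronized {
    byExecution.remove(executionId).map(_.toSet).getOrElse(Set.empty[String])
  }

  // Roughly the spirit of deleteAllObjects(): drop everything, whoever owns it.
  def deleteAll(): Set[String] = synchronized {
    val all = byExecution.values.flatten.toSet
    byExecution.clear()
    all
  }
}
```

With a registry like this, cleanup could call `deleteForExecution` for the
execution that just finished instead of wiping everything, but every UDF or
operator executor would then have to pass an `ExecutionId` on each call.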
Both approaches would complicate the engine logic, so for this first
version we chose a simpler behavior: aggressively removing all large binaries,
regardless of which execution created them. The alternative is to leave them in
place and let the admin remove them manually to reclaim storage space. In the
future, we plan to introduce another abstraction layer that hides these API
details from users and lets the system manage large binaries automatically.
Let me know if any part of this doesn't make sense.