Hi.
After struggeling with storing LOs in the database and making efficient
backups/restore-routines I've decided to move LOs out of the DB and use a
"filename"-column containing an UUID referencing the file one the filesystem.
Having this schema I need a robust book-keeping mechanism so I know when I can
delete files on the filesystem when no longer referenced from the database.
In an email-program I store the whole email-body as a binary in a file and
store all header-info in an email_message table (for this example's sake).
I have a book-keeping table :
create table origo_email_message_file_operation( filename VARCHAR not null,
operationVARCHAR not NULL, PRIMARY KEY (filename, operation) );
And a email_message table holding the messages, with a "filename"-column
referencing the BLOB.
For DELETE I use this routine:
deleteData(fileName, messageId) {
startTX()
deleteEmail() // Deletes the entry in email_message table
INSERT INTO origo_email_message_file_operation(filename, operation) VALUES(
'3b1d18ae-7b54-c055-1016-d928daec7294','DELETE'); deleteEmail(messageId)
commitTX()
}
So for DELETE-operations the file on disk isn't deleted by the main-program,
but by a cleanup-job which runs as a cron-job and the
inspects origo_email_message_file_operation for DELETE-entries and then delete
the referenced files, then removes the DELETE-entries from
origo_email_message_file_operation.
INSERT is like this:
insertData(fileName) {
startTX()
// First, register the INSERT in case it fails
INSERT INTO origo_email_message_file_operation(filename, operation)
VALUES('3b1d18ae-7b54-c055-1016-d928daec7294','INSERT');
commitTX()
startTX()
insertEmail() // Inserts the entry in email_message table
DELETE FROM origo_email_message_file_operation WHERE filename =
'3b1d18ae-7b54-c055-1016-d928daec7294'AND operation = 'INSERT';
// If this commits there is no entry left
in origo_email_message_file_operation and we're all good
commitTX()
}
UPDATE is implemented as INSERT + DELETE.
The challenge with this is what to do if INSERT rolls back. If INSERT rolls
back then we end up with an INSERT-entry in origo_email_message_file_operation
with no corresponding "filename"-entry in email_message. But I fail to se how a
cleanup job can know the difference between such an INSERT-entry
in origo_email_message_file_operation caused by ROLLBACK and and INSERT-entry
caused by an in-progress insertEmail() operation.
Does anyone have a robust mechanism for cleaning up files in such scenarios?
Thanks.
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
[email protected] <mailto:[email protected]>
www.visena.com <https://www.visena.com>
<https://www.visena.com>