Good afternoon,

I am not sure if this is the correct place to ask questions, but I am
running out of options.

I am a MySQL DBA and work for a college that uses Jackrabbit to store
student and class documents. Our LMS is a 3rd party application, and when
we pressed them about data cleanup we were referred to the
'GarbageCollector' and provided some minimal instructions.

Thanks to our junior Java developer, we found the mark and sweep
functions (excerpts included below).

From my perspective as the DBA, knowing that I have a 1TB table/database
that has never been cleaned up, and reviewing what little info I have, it
appears that the first run will flag everything for deletion.

I am hoping that we missed something and that there are additional
criteria applied to the process, as not all of the documents should be
deleted.

Is there someone who can help a non-Java programmer figure this out?

Thanks In Advance
Joe Gibbs


https://github.com/nabils/jackrabbit/blob/master/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/data/GarbageCollector.java

"the mark() function just gets information about the data in the JCR
(modified date, size, id), then it iterates over all non-deleted nodes
updating their modified time to now, meaning that everything older than the
current timestamp has been deleted -

then the sweep() function calls DELETE FROM ${tablePrefix}${table} WHERE
LAST_MODIFIED<${currentTimestamp} to delete everything that wasn't touched
by mark()"

"The query it's running to update the modified date is UPDATE
${tablePrefix}${table} SET LAST_MODIFIED=${currentTimestamp} WHERE
ID=${ID}? AND LAST_MODIFIED<${currentTimestamp}"
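
To make the description quoted above concrete, here is a minimal Python
sketch of the mark-then-sweep logic as we understand it. It is not
Jackrabbit's code: it uses an in-memory SQLite table in place of MySQL, and
the table name DATASTORE, the helper names, the sample IDs, and the
timestamps are all assumptions made up for illustration. The two SQL
statements mirror the ones quoted above.

```python
import sqlite3


def mark(conn, referenced_ids, now):
    """Bump LAST_MODIFIED for every record that is still referenced.

    Mirrors the quoted UPDATE statement. `referenced_ids` stands in for
    whatever the repository scan finds (an assumption of this sketch).
    """
    for record_id in referenced_ids:
        conn.execute(
            "UPDATE DATASTORE SET LAST_MODIFIED = ? "
            "WHERE ID = ? AND LAST_MODIFIED < ?",
            (now, record_id, now),
        )


def sweep(conn, scan_start):
    """Delete every record whose timestamp was not bumped by mark()."""
    cur = conn.execute(
        "DELETE FROM DATASTORE WHERE LAST_MODIFIED < ?", (scan_start,)
    )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DATASTORE (ID TEXT PRIMARY KEY, LAST_MODIFIED INTEGER)"
    )
    # Three old records; only 'a' and 'c' are still referenced (hypothetical).
    conn.executemany(
        "INSERT INTO DATASTORE VALUES (?, ?)",
        [("a", 100), ("b", 200), ("c", 300)],
    )
    scan_start = 1000  # hypothetical "current timestamp" of the GC run
    mark(conn, ["a", "c"], scan_start)
    deleted = sweep(conn, scan_start)
    remaining = [r[0] for r in conn.execute("SELECT ID FROM DATASTORE ORDER BY ID")]
    return deleted, remaining
```

In this sketch, a record survives the sweep only if mark() touched it, so
the outcome of a first run depends entirely on which IDs the mark phase
visits. Whether the real mark phase visits every document we need to keep
is exactly what we are unsure about.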


