-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
I just revived a database that was on a version 3.23 server and moved it
to a 4.1 server. It has big fields of TEXT-based data. The application
compresses the amount of TEXT data by identifying common subchunks,
putting them in a "subchunk" table, and replacing them in the main text
with a marker that pulls the subchunk back in whenever the parent
chunk is requested.

This subchunking seems to have been done somewhat ad hoc, because I've
noticed the database still has quite a few duplicated chunks from one
record to another. The client does not want to buy another drive to
store data (even though he really should for other reasons anyway, but
who cares what I think), so he wants it compressed, and I look on it
as an opportunity for some housecleaning.

Now that we have 4.1, what is the best practice for automatically
finding common subchunks, factoring them out, and then replacing each
original parent text with a copy that has the chunk cut out and a
marker inserted? The hard part is finding them, obviously. The rest
is easy.
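For what it's worth, here is a minimal sketch of one way to do the hard part outside the server: split each TEXT value into paragraph-sized candidate chunks, count how often each chunk recurs across records, then factor the repeated ones into a subchunk map and substitute a marker. The `{{chunk:ID}}` marker format, the paragraph-level granularity, and the 40-character minimum are all my assumptions, not anything from the original schema; you would adapt them to whatever marker convention the existing subchunk table already uses.

```python
import hashlib
from collections import defaultdict

def find_common_chunks(records, min_len=40):
    """Count paragraph-level chunks across all records; return those seen more than once.

    records: dict mapping record id -> TEXT value.
    min_len is an assumed threshold to skip chunks too small to be worth factoring.
    """
    counts = defaultdict(int)
    for text in records.values():
        # Blank-line-separated paragraphs are the assumed chunk granularity.
        for chunk in (p.strip() for p in text.split("\n\n")):
            if len(chunk) >= min_len:
                counts[chunk] += 1
    return {c for c, n in counts.items() if n > 1}

def factor_out(records, common):
    """Move each common chunk into a subchunk map and replace it in place with a marker."""
    subchunks = {}  # marker id -> chunk text (stand-in for the "subchunk" table)
    for chunk in sorted(common):
        # A short content hash makes a stable, reproducible marker id (assumed scheme).
        cid = hashlib.md5(chunk.encode()).hexdigest()[:8]
        subchunks[cid] = chunk
        for rid, text in records.items():
            records[rid] = text.replace(chunk, "{{chunk:%s}}" % cid)
    return subchunks
```

In practice you would pull the rows with a cursor, run the two passes, and write the markers and subchunk rows back inside a transaction. Paragraph splitting only catches chunks that repeat verbatim on paragraph boundaries; catching near-duplicates or arbitrary common substrings needs something heavier, like a suffix-array or shingling pass.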
- best practices for finding duplicate chunks Gerald Taylor
- Re: best practices for finding duplicate chunks Gerald Taylor
- Re: best practices for finding duplicate chunks Alexey Polyakov