I'm running into a brick wall with some regexp comparisons. Here's the 
situation:
At some point during a file upload, all of our HTML files became corrupted 
with duplicated fragments. For example, at a random place a file will contain 
something like this: "come see our><p align = "center">come see our new show!"

The duplicated text always seems to be about 10-20 characters long, and its 
placement isn't consistent. It always seems to occur near some markup, but it 
may occur once, twice, or 10 times in a file.

Is there a way on Linux to search for strings that are repeated near each 
other in this fashion? Because these are HTML files, there will be lots of 
legitimately repeated stretches of markup, so the search needs to require 
that the two copies are nearly next to each other.
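
To make the idea concrete, here is a rough sketch in Python of the kind of 
check I have in mind. It is only an illustration, not a tested fix: it uses a 
regular expression with a backreference to find a run of 10-20 characters that 
repeats within a short distance on the same line. The 40-character gap limit 
and the function name find_nearby_duplicates are my own assumptions, not 
anything standard.

```python
import re

# Assumed limits, based on the description above: the repeated text is
# roughly 10-20 characters long and the two copies sit close together.
# MAX_GAP = 40 is a guess and would need tuning against real files.
MIN_LEN, MAX_LEN, MAX_GAP = 10, 20, 40

# (.{10,20})  captures a candidate run of characters
# .{0,40}?    allows a short gap (for example, a stray tag) between copies
# \1          requires the captured run to appear again
# "." does not match newlines here, so both copies must be on one line.
NEARBY_DUP = re.compile(r"(.{%d,%d}).{0,%d}?\1" % (MIN_LEN, MAX_LEN, MAX_GAP))


def find_nearby_duplicates(text):
    """Return (offset, repeated_text) for each nearby repetition found."""
    return [(m.start(), m.group(1)) for m in NEARBY_DUP.finditer(text)]


sample = 'come see our><p align = "center">come see our new show!'
print(find_nearby_duplicates(sample))
```

This would only report candidates. Legitimately repeated markup that sits 
close together (adjacent table cells, for instance) would match too, so each 
hit would still need to be checked by eye before deleting anything.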

Any help would be appreciated,
David Reynolds
-- 
A human being should be able to change a diaper, plan an invasion,
butcher a hog, design a building, write a sonnet, set a bone, comfort the
dying, take orders, give orders, solve equations, pitch manure, program
a computer,  cook a tasty meal, fight efficiently, die gallantly. 
Specialization is for insects.          -- Robert Heinlein
