I am running ActivePerl 5.8.8.
I am converting a large enterprise database into a new system and have
run across a free-form text field in which users have entered all manner
of garbage.
One recurring problem is two sentences run together with no terminating
'.' and no space between them. Here are some examples:
madeStyle
facilitatedOne
Anti-magneticQuality
As you can see, the new sentence begins with an upper-case letter, so if
I can just break apart the construct like this I'll be OK: "madeStyle"
should become "made. Style".
Difficulty: the fields contain hundreds of words both preceding and
following the "bad" words, so I have to be able to pick out the
lower-case words that contain one embedded upper-case character.
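To make the requirement concrete, here is a minimal sketch of the transformation described above. It assumes a "bad" word is a run of lower-case letters immediately followed by exactly one capitalized word, that plain ASCII letters are enough, and that the field holds no legitimate mixed-case tokens of that shape (something like "iPhone" would be split too). The helper name split_run_together is made up for illustration, and the pattern sticks to features available in Perl 5.8.

```perl
use strict;
use warnings;

# Hypothetical helper, not an existing function: inserts ". " between a
# lower-case run and the capitalized word glued onto the end of it.
sub split_run_together {
    my ($text) = @_;

    # \b           start at a word boundary (so "Anti-" is left alone)
    # ([a-z]+)     the tail of the first sentence, lower-case only
    # ([A-Z][a-z]+) the first word of the next sentence
    # \b           the capitalized word must end the token
    $text =~ s/\b([a-z]+)([A-Z][a-z]+)\b/$1. $2/g;

    return $text;
}

# The three samples from this message, plus one embedded in a longer line.
for my $sample (
    'madeStyle',
    'facilitatedOne',
    'Anti-magneticQuality',
    'the case was madeStyle is modern',
) {
    print split_run_together($sample), "\n";
}
```

Because the pattern requires a word boundary before the lower-case run, words that merely start with a capital and contain another one (for example "ActiveState") are not touched. Whether that is strict enough for the real data is exactly what I am unsure about.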
Any ideas?
Barry Brevik
_______________________________________________
ActivePerl mailing list