Willy wrote:
>
> what i would like to do is the following:::
>
> open a file of undetermined format,
> take all non alphanumeric characters (other than spaces, tabs, \n etc)
> and parse the output around them...

What about punctuation characters?

> open (IN,"file.unknown");

You should _always_ verify that the file was opened.

> while (<IN>)
> {
>     s/insert regular expression here/\n\n;
>     push(@array,$_); # or just shunt it out to another file :P
> }
>
> i'd love suggestions on this :) also, if memory serves, each time
> the regular expression is matched, then $1..$n gets the match value (or
> am i thinking of something else?

Only if the match value is enclosed in parentheses.

> i would like to be able to manipulate them at some other point
> in the program- maybe find the use of various unknown tags, or
> substitute them with html tags that would make the document more
> legible..... (not very worried about that right now though)

There are various modules available on CPAN that allow you to
manipulate HTML documents.

http://search.cpan.org/

John
-- 
use Perl;
program
fulfillment
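
P.S. A minimal sketch of the points above, not a finished solution. It assumes that punctuation counts as "non alphanumeric" (the open question above), that each run of such characters should be set off by blank lines, and that the file name comes from the command line. The helper name split_on_symbols is hypothetical. Note that the substitution needs its closing delimiter, and that $1 is only set because the pattern is wrapped in parentheses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: set off every run of characters that are neither
# alphanumeric nor whitespace with blank lines. Returns the rewritten
# line followed by the list of runs that were found.
sub split_on_symbols {
    my ($line) = @_;

    # In list context with /g, the parenthesized group returns every match.
    my @found = $line =~ /([^[:alnum:]\s]+)/g;

    # The parentheses capture the run into $1 for use in the replacement.
    $line =~ s/([^[:alnum:]\s]+)/\n\n$1\n\n/g;

    return ($line, @found);
}

# Only read a file when one is named on the command line.
if (@ARGV) {
    my $file = $ARGV[0];

    # Always verify that the file was opened; $! holds the reason on failure.
    open my $in, '<', $file or die "Cannot open '$file': $!";

    my @array;
    while (my $line = <$in>) {
        my ($out) = split_on_symbols($line);
        push @array, $out;
    }
    close $in;

    print @array;
}
```

Run it as "perl script.pl file.unknown" (the script name is a placeholder). Whether underscores and punctuation belong in the character class depends on the answer to the question above.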