On Fri, 7 Sep 2007 15:04:50 +0100 Richard Lyons <[EMAIL PROTECTED]> wrote:
> Hi, all you script wizards.
>
> I thought this would be easy, but I haven't found anything to crib
> from...
>
> I need a script to read a text file (actually tex) and parse lines of a
> table that may or may not span newline characters in the file.
> Basically, there are lines of the form
>
>   {some text} & {some more text} & {text c} & {text d} \\
>
> where the braces are only for clarity and do not occur in the files, and
> where the bits of text may include whitespace which may include newline
> characters. There may also be escaped ampersands in the text ('\&'), and
> the text fragments may be empty.
>
> I suspect perl may be the way forward. I need to be able to read each
> file, parse each set of three ampersands with a double backslash
> breaking it into four substrings, manipulate the substrings and write
> the file anew. A typical manipulation will be to take text c and copy
> it to text d. I shall also try to strip leading and trailing whitespace
> to tidy up the file.
>
> Any and all pointers will be gratefully received!

Take a look at the Perl Text::ParseWords module ('man Text::ParseWords').
It may do what you want, depending on your needs with respect to quoting
and escaping.

> richard

Celejar
--
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator