Thanks for responding. Unfortunately, the data already exists. I have no way of instituting limitations on the format, much less reformatting it to suit my needs. It is true that I can make some general assumptions about the data (unrealistically long strings are unlikely to occur), but I can't write a steadfastly robust reader under such assumptions.
The problem is that even if I assume strings have a limited length, that doesn't prescribe a method for handling the possibility of an error. If a string really is too long and the reader fails to detect it, I'm not sure how to ensure that the reader or the subsequent map task fails cleanly. If I could at least impose an assumption of this sort, and then detect and fail cleanly on violations of it, that would go a long way. I'll think about it.

Thanks.

On Feb 22, 2012, at 14:59, Steve Lewis wrote:

> It sounds like you may need to give up a little to make things work -
> Suppose, for example, that you placed a limit on the length of a quoted
> string, say 1024 characters - the reader can then either start at the
> beginning or read back by, say, 1024 characters to see if the start is in
> a quote and proceed accordingly - if quoted strings can be of arbitrary
> length there may be no good solution

________________________________________________________________________________
Keith Wiley     kwi...@keithwiley.com     keithwiley.com     music.keithwiley.com

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
                                           -- Galileo Galilei
________________________________________________________________________________
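
A minimal sketch of the "assume a limit, then detect and fail cleanly on violations" idea discussed above, written in plain Python rather than as a Hadoop reader. Everything named here is hypothetical (`read_records`, `QuoteAssumptionViolated`, `MAX_QUOTED_LEN`); the 1024 cap is only the figure from the quoted suggestion, and the sketch assumes a bare double-quote delimiter with no escaping, which may not match the real data. It scans from a known starting state, so it does not address where to begin reading in the middle of a split.

```python
MAX_QUOTED_LEN = 1024  # assumed cap, taken from the suggestion quoted above


class QuoteAssumptionViolated(ValueError):
    """Raised when the input breaks the assumed limits on quoted strings."""


def read_records(text, max_quoted_len=MAX_QUOTED_LEN):
    """Yield records split on newlines that fall outside quoted strings.

    Assumes a plain double-quote delimiter with no escaping (an assumption,
    not something the original data format is known to guarantee).
    """
    record = []
    in_quote = False
    quoted_len = 0
    for offset, ch in enumerate(text):
        if ch == '"':
            # Entering or leaving a quoted string; restart the length count.
            in_quote = not in_quote
            quoted_len = 0
        elif in_quote:
            # Any character inside quotes, including a newline, is data.
            quoted_len += 1
            if quoted_len > max_quoted_len:
                raise QuoteAssumptionViolated(
                    f"quoted string longer than {max_quoted_len} "
                    f"characters at offset {offset}"
                )
        elif ch == "\n":
            # A newline outside quotes ends the current record.
            yield "".join(record)
            record = []
            continue
        record.append(ch)
    if in_quote:
        raise QuoteAssumptionViolated("input ended inside a quoted string")
    if record:
        yield "".join(record)
```

For example, `list(read_records('a,"x\ny",b\nc,d\n'))` keeps the quoted newline inside the first record, while a quoted string longer than the cap raises `QuoteAssumptionViolated` instead of silently producing misaligned records. In a Hadoop reader the analogous move would presumably be to throw an exception from the record reader so the task attempt fails visibly, but that part is not shown here.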