> I'm afraid I'm not getting what you mean by "unique occurrence"... Why is
> there only one unique occurrence of 'abc', when the string contains 'abc'
> four times? Why are there two unique occurrences of 'de', but only one of
> 'bc'? Why are there no unique occurrences at all of 'abcd'?
I'm probably not explaining myself well (I'm known for that). Maybe "unique occurrence" isn't what I'm really trying to say.

Say we have a stream of text: either a file that is several tens of millions of bytes in size, with a limit on how much of it we can load into memory at a time, or data we're receiving over a connection of some kind (e.g., serial or TCP). The stream contains a varying number of delimiters, of varying lengths (delimiter1, delim2, del3). The value of a delimiter is the delimiter itself plus an unspecified number of bytes, up to the next known delimiter. (The value of delimiter 'del2' in the string 'del1abcdel2def' is 'del2def'.)

I don't understand exactly why this format was decided upon; this was the poser handed to my co-worker, and this is what he came up with as a proof of concept. Of course, this requires that no delimiter can be a substring of another.

Better?

-- Alan
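
To make the format described above concrete, here is a minimal sketch of one way to split such a stream. It is not the co-worker's proof of concept, just an illustration under a few assumptions: the function name split_stream is made up, the data arrives as an iterable of str chunks (bytes would need bytes patterns), anything before the first delimiter is discarded, and no delimiter is a substring of another. It keeps only the current record in memory and allows for a delimiter that straddles two chunks.

```python
import re


def split_stream(chunks, delimiters):
    """Yield each delimiter plus the text that follows it, up to the next delimiter.

    Hypothetical helper: `chunks` is any iterable of str pieces, e.g. successive
    reads from a file or socket that have already been decoded.
    """
    pattern = re.compile("|".join(re.escape(d) for d in delimiters))
    # A delimiter split across two chunks can leave at most this many of its
    # characters at the end of the buffer.
    tail = max(len(d) for d in delimiters) - 1
    buf = ""   # pending text; starts at the current delimiter once one is seen
    head = 0   # length of the delimiter at the start of buf (0 = none seen yet)
    scan = 0   # position in buf from which to resume searching

    for chunk in chunks:
        buf += chunk
        while True:
            m = pattern.search(buf, scan)
            if m is None:
                break
            if head:
                # Everything before the new delimiter belongs to the old one.
                yield buf[:m.start()]
            buf = buf[m.start():]
            head = len(m.group())
            scan = head  # do not re-match the delimiter we are sitting on
        if head:
            # Only the last `tail` characters can begin an unfinished delimiter.
            scan = max(head, len(buf) - tail)
        else:
            # No delimiter yet: keep just enough to catch a straddling one.
            buf = buf[max(0, len(buf) - tail):]
            scan = 0

    if head:
        yield buf  # the last record runs to the end of the stream


if __name__ == "__main__":
    for record in split_stream(["del1ab", "cde", "l2def"], ["del1", "del2"]):
        print(record)
```

With the example string from above, fed in as three arbitrary chunks, this yields 'del1abc' and then 'del2def'. Note that a record is buffered in full until its closing delimiter arrives, so a very long value still has to fit in memory.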
