Re: Interesting little regex

Alan Young Fri, 24 Feb 2006 14:21:36 -0800

> I'm afraid I'm not getting what you mean by "unique occurrence"...  Why is
> there only one unique occurrence of 'abc', when the string contains 'abc'
> four times?  Why are there two unique occurrences of 'de', but only one of
> 'bc'?  Why are there no unique occurrences at all of 'abcd'?


I'm probably not stating myself well (I'm known for that).  Maybe
unique occurrence isn't what I'm really trying to say.

Suppose we have a stream of text: say, a file that is several tens of
millions of bytes in size, where we're limited in how much we can load
into memory at a time, or data we're receiving over a connection of
some kind (e.g., serial or TCP).  The stream contains a varying number
of delimiters of varying length (delimiter1, delim2, del3).  The value
of a delimiter is the delimiter itself plus an unspecified number of
bytes, up to the next known delimiter.  (For example, the value of
delimiter 'del2' in the string 'del1abcdel2def' is 'del2def'.)
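
To make the format concrete, here is a rough sketch in Python of how
such a stream could be cut into delimiter values as it arrives in
chunks.  This is only an illustration, not my co-worker's regex: the
function name iter_records and the chunked input are assumptions made
for the example.  It buffers text until the next delimiter shows up,
so a delimiter split across two chunks is still recognized.

```python
import re

def iter_records(chunks, delimiters):
    """Yield each delimiter together with the bytes that follow it,
    up to (but not including) the next known delimiter.

    `chunks` is any iterable of strings, e.g. pieces read from a file
    or a socket. Hypothetical helper, for illustration only.
    """
    # Try longer delimiters first; escape them so they match literally.
    alternatives = sorted(delimiters, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(d) for d in alternatives))

    buf = ""
    for chunk in chunks:
        buf += chunk
        starts = [m.start() for m in pattern.finditer(buf)]
        # A record is complete once the next delimiter has been seen.
        for begin, end in zip(starts, starts[1:]):
            yield buf[begin:end]
        if starts:
            # Keep only the last, still-open record in the buffer.
            buf = buf[starts[-1]:]

    # At end of input, whatever starts with a delimiter is the last record.
    if pattern.match(buf):
        yield buf

if __name__ == "__main__":
    # 'del2' is deliberately split across the two chunks.
    for record in iter_records(["del1abcde", "l2def"], ["del1", "del2"]):
        print(record)
```

The buffer only ever holds the record currently being read (plus any
text before the first delimiter), so memory use depends on the longest
value rather than on the size of the whole stream.  The sketch rescans
the buffer on every chunk, which is simple but not the fastest option.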

I don't understand exactly why this format was decided upon; this was
the poser handed to my co-worker, and this is what he came up with as a
proof of concept.  Of course, this requires that no delimiter be a
substring of another.
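
That last requirement is easy to check up front.  A minimal sketch,
again in Python and with a made-up function name (substring_conflicts),
that lists every pair where one delimiter is contained in another:

```python
def substring_conflicts(delimiters):
    """Return (inner, outer) pairs where `inner` occurs inside `outer`.

    An empty result means no delimiter is a substring of another.
    Hypothetical helper, for illustration only.
    """
    return [
        (inner, outer)
        for inner in delimiters
        for outer in delimiters
        if inner != outer and inner in outer
    ]

if __name__ == "__main__":
    print(substring_conflicts(["delimiter1", "delim2", "del3"]))
    print(substring_conflicts(["del", "del2"]))
```

With the three example delimiters above the list comes back empty;
adding a bare 'del' would conflict with all of them.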

Better?
--
Alan
