On Friday, April 12, 2002, at 12:38, Raghupathy, Ramesh wrote:
> I am sorry I was not clear in my question.
>
> The word1 and word2 may occur on different lines of the file and may
> occur in different combinations.
>
> for e.g :
>
> (not showing the new lines..)
>
>
> ....word1.....word1.....word1....word2....word1...word2....word2....word2.
> ...
> .....
>
> In this example I like to extract the text which is between the 3rd
> word1
> and the 1st word2 and also the text between the 4th word1 and 2nd word2.
>
> Thanks.
this looks like you would want an XML DTD, and then you would just use
an XML parser with that DTD as the reference.....
allow me to reconstruct what I think you are asking for:
m1: word1 <lA>
m2: word1 <lB>
m3: word1 <LC> word2
<LD>
m4: word1 <LE> word2
<LF>
word2
<LG>
word2
if we can assume then that the set {lA,lB,LD,LF,LG} will only
contain white space elements - and that we are only interested
in the information nested at LC and LE - you have one set of issues.
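for that simpler case, here is a minimal sketch in Python (not from the original thread; the function name innermost_spans and the literal markers are only illustrative assumptions). it grabs the text of every word1...word2 pair that has no other word1 or word2 inside it, which is what LC and LE are in the layout above. the pattern uses DOTALL so a pair may span several lines.

```python
import re

def innermost_spans(text, opener="word1", closer="word2"):
    """Return the text inside each opener...closer pair that contains
    no further opener or closer (the innermost pairs only)."""
    o, c = re.escape(opener), re.escape(closer)
    # Consume one character at a time, but only while we are not
    # standing at the start of another opener or closer.
    pattern = re.compile(o + r"((?:(?!" + o + r"|" + c + r").)*)" + c, re.DOTALL)
    return pattern.findall(text)

sample = "word1 A word1 B word1 C word2 D word1 E word2 F word2 G word2"
print(innermost_spans(sample))  # [' C ', ' E ']
```

this assumes the markers never overlap each other and that the surrounding text (A, B, D, F, G here) can simply be thrown away.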
It's the moment that you decided that all of the <XX> 'messages' have
to be 'preserved' that things get messy....
using the 'm?' side annotation I put up, message m1 would need to
get out of the data stream
m1: lA , m2, LG
and we then have to parse out of m2 its information and nested data....
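if every piece does have to be preserved, one way to sketch it is with a stack: each word1 opens a new nested message, each word2 closes the most recent open one. the Python below is only an illustration under that assumption (the name parse_nested and the list-of-lists layout are my own choices, not anything from the original question); a real XML parser would do the same job with proper tags.

```python
import re

def parse_nested(text, opener="word1", closer="word2"):
    """Build a nested list: each message is a list holding its text
    pieces (str) and its nested messages (list), in document order."""
    token = re.compile("(" + re.escape(opener) + "|" + re.escape(closer) + ")")
    root = []
    stack = [root]                      # stack[-1] is the message being filled
    for piece in token.split(text):     # the capture group keeps the markers
        if piece == opener:
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif piece == closer:
            if len(stack) == 1:
                raise ValueError("closer without a matching opener")
            stack.pop()
        elif piece:                     # skip empty strings from the split
            stack[-1].append(piece)
    if len(stack) != 1:
        raise ValueError("opener without a matching closer")
    return root

sample = "word1Aword1Bword1Cword2Dword1Eword2Fword2Gword2"
print(parse_nested(sample))
# [['A', ['B', ['C'], 'D', ['E'], 'F'], 'G']]
```

the outer list there is m1: its text lA, then the whole of m2, then LG, and m2 in turn carries its own text plus m3 and m4.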
ciao
drieux