changing the first occurrence only of a string in a line

2009-06-19 Thread Peter Alcibiades
How would you do the following in Rev?

We have a file consisting of records with tab separated fields.  Each field 
has a tag followed by contents.  Some tags occur more than once in some 
records which thus have varying numbers of fields. Duplicates are always 
consecutive.  I want to eliminate all the occurrences of any tag except 
the first one.  The duplicate tags can occur any place in the record, but 
if they are duplicated, will always be consecutive.

Doing this in SED is not particularly difficult, but it does require going 
out to shell, and so it's not cross platform.  You just change the tag 
using the local scope to something else.  SED then only changes the first 
occurrence in a record.  Then you use the global scope and change all of 
them.  Then you go back and change the first one back to what it was.  In 
fact, if using SED like this, the only thing you need it for is to do the 
local, first tag, change - once this is done, the rest can be done in Rev.  
But it would be nice to stay in Rev for the whole thing.  

Is there a way in Rev to pick the first occurrence of a string in a record, 
change it and not subsequent occurrences, and then move on to the next 
record and do the same thing? 

That is, mimic the 'local' editing mode of SED?  

Bet you all thought them dinosaurs like SED had to be extinct by now!  But 
no, they are still trampling around in the swamps of text manipulation.

For the sake of clarity, a record might look like this:

A aa TAB B bb TAB B bb TAB B bb TAB C cc TAB D dd TAB D dd

and what is wanted is to change the first occurrence of B to, for 
example !1, and the first occurrence of D to, for example !2, or anyway 
something that will not occur by chance, to allow the subsequent editing 
to work globally on the file.  This is what SED does in local mode.

Peter
___
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Re: changing the first occurrence only of a string in a line

2009-06-19 Thread Bernard Devlin
Peter, given the sample you provide, I believe this will work:

on mouseUp
   -- two sample records; the literal text TAB stands in for a tab character
   put "A aa TAB B bb TAB B bb TAB B bb TAB C cc TAB D dd TAB D dd" & cr after tData
   put "A aa TAB B bb TAB B bb TAB B bb TAB C cc TAB D dd TAB D dd" & cr after tData
   --replace "TAB" with tab in tData
   -- one substitution per line: the tag to find, then its replacement
   put "B 1!" & cr after tSubs
   put "C 2!" & cr after tSubs
   set the wholeMatches to true
   repeat for each line tLine in tSubs
      put word 1 of tLine into tFind
      put word 2 of tLine into tSubstitute
      -- wordOffset returns the first whole-word match in tData, or 0 if none
      put wordOffset(tFind, tData) into tFoundPos
      if tFoundPos > 0 then
         put tSubstitute into word tFoundPos of tData
      end if
   end repeat
   put tData
end mouseUp
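
Note that wordOffset applied to the whole of tData stops at the first match in the entire text, so with several records only the first record is changed. A minimal sketch of a per-record variant follows. The function name markFirstPerRecord and its parameters are made up for illustration and are not from the thread, and it assumes a tag never also appears as a whole word inside a field's contents.

```livecode
-- Hypothetical helper (not from the thread): for every line of pData, replace
-- the first whole-word match of each tag listed in pSubs with its marker.
-- pSubs holds one "tag marker" pair per line, e.g. "B !1".
function markFirstPerRecord pData, pSubs
   local tOut, tRecord, tPos
   set the wholeMatches to true
   repeat for each line tLine in pData
      put tLine into tRecord   -- work on a copy, not the loop variable
      repeat for each line tSub in pSubs
         put wordOffset(word 1 of tSub, tRecord) into tPos
         if tPos > 0 then
            put word 2 of tSub into word tPos of tRecord
         end if
      end repeat
      put tRecord & cr after tOut
   end repeat
   delete the last char of tOut   -- trailing return
   return tOut
end markFirstPerRecord
```

Called as markFirstPerRecord(tData, "B !1" & cr & "D !2"), this should mark the first B and the first D in each record and leave later ones alone, which is the state the subsequent global edit needs.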

However, one of the frustrating things about Rev is that the PCRE
engine behind replaceText and matchText is so poorly documented.  For
example, replaceText can have its functionality altered by the
inclusion of modifiers such as (?i) for case insensitive and (?m)
for multiline.  There are other modifiers
(http://uk.php.net/manual/en/reference.pcre.pattern.modifiers.php)
some of which are unacceptable, and others that seem to be acceptable.
Nowhere have I found these documented with regard to Rev.  The (?A)
anchored modifier (if it was one of the acceptable tokens) may have
worked in your case, but it is apparently unacceptable.

Really you ought to just be able to take your SED expressions and use
them with replaceText, as SED expressions are regex AFAIK.  There may
be some differences between SED and pcre, but since the differences
between pcre and Rev are not documented it is vexatious.
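
For comparison, sed's s/old/new/ without the g flag replaces only the first match on each line. A minimal sketch of the same effect in Rev without any regex is below. It uses the offset function on plain strings, so pOld is matched literally and not as a pattern. The function name replaceFirstInEachLine and its parameters are made up for illustration and are not from the thread.

```livecode
-- Hypothetical helper (not from the thread): replace only the first
-- occurrence of the literal string pOld in each line of pData.
function replaceFirstInEachLine pOld, pNew, pData
   local tOut, tRecord, tPos
   set the caseSensitive to true   -- sed matches case-sensitively by default
   repeat for each line tLine in pData
      put tLine into tRecord   -- work on a copy, not the loop variable
      put offset(pOld, tRecord) into tPos
      if tPos > 0 then
         put pNew into char tPos to (tPos + length(pOld) - 1) of tRecord
      end if
      put tRecord & cr after tOut
   end repeat
   delete the last char of tOut   -- trailing return
   return tOut
end replaceFirstInEachLine
```

Because this is a substring match, a short tag such as B can also be found inside other text. Including the separator in the search string, for example "B " with its trailing space, reduces that risk.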

Bernard



Re: changing the first occurrence only of a string in a line

2009-06-19 Thread Alex Tweedly
I'm not going to answer your direct question (replace first occurrence) 
because that is a subset of the whole question, which I think is more 
interesting :-)


How would I remove the duplicates as described in Peter's message?
(I have assumed that in each field the tag is separated from
the content by a space, so used 'word' to extract the tag)

[I tested this in the msg box]

put "A aaa" & tab & "B bbb" & tab & "B bb2" & tab & "C ccc" & cr into tData
put "A a2a" & tab & "B b2b" & tab & "B 2b2" & tab & "C ccc" & tab & "C cc2" & cr after tData

set the itemDel to tab
repeat for each line L in tData
   put empty into lastTag
   put empty into tLine
   repeat for each item itm in L
      -- keep a field only when its tag differs from the previous field's tag
      if word 1 of itm <> lastTag then
         put itm & tab after tLine
         put word 1 of itm into lastTag
      end if
   end repeat
   delete the last char of tLine   -- trailing tab character
   put tLine & cr after tOutput
end repeat
put tOutput


-- Alex.

