Re: Regex help

Maurits van Rees Fri, 10 Dec 2004 07:48:21 -0800

On Fri, Dec 10, 2004 at 01:58:04PM +0100, Francois Cerbelle wrote:
> My source file looks like :
> ----------------------------------------------------------------
> Text {_Index blabla:we _} bla {_Index {_StartRange_} bla _} bla
> vla
> stuff
> bbgfd {_Index {_EndRange_} bla _}
> ----------------------------------------------------------------
> 
> I would like to have this :
> {_Index blabla:we _}
> {_Index {_StartRange_} bla _}
> {_Index {_EndRange_} bla _}


I don't know FrameMaker, so there may be errors in my code because of
that, but I do know something about regexes.  I put your text in a file
called test:

[EMAIL PROTECTED]:~$ cat test
Text {_Index blabla:we _} bla {_Index {_StartRange_} bla _} bla
vla
stuff
bbgfd {_Index {_EndRange_} bla _}

Some grepping and sedding _should_ be able to produce nice results.
The following comes close:

[EMAIL PROTECTED]:~$ cat test | egrep "{|}" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g"
{_Index blabla:we _} bla {_Index {_StartRange_} bla _}
{_Index {_EndRange_} bla _}

All code is on one line; try to keep it that way when you copy it,
else you could be in for some debugging. ;-)
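
To see what the two substitutions in that sed call do, it can help to run them one at a time on a single line. This is only a minimal sketch, assuming a POSIX shell and sed; the variable names line, step1 and step2 are made up for the illustration:

```shell
# Sample line taken from the test file above.
line='Text {_Index blabla:we _} bla {_Index {_StartRange_} bla _} bla'

# Step 1: replace everything up to and including the first "{" with "{",
# which drops the leading text.
step1=$(printf '%s\n' "$line" | sed 's/^[^{]*{/{/')

# Step 2: replace the last "}" and everything after it with "}",
# which drops the trailing text.
step2=$(printf '%s\n' "$step1" | sed 's/}[^}]*$/}/')

printf '%s\n' "$step1" "$step2"
```

Text between the first "{" and the last "}" is left alone, which is why the " bla " between the two Index entries survives this step.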

Alright, let's take it a step further.  This seems more like what you
want, with the {_StartRange_} and {_EndRange_} markers removed and each
Index entry on one line:

[EMAIL PROTECTED]:~$ cat test | egrep "{|}" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | sed "s/{_StartRange_}//g" | sed "s/{_EndRange_}//g" | sed "s/_}/_}\n/g" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | egrep "{|}"
{_Index blabla:we _}
{_Index  bla _}
{_Index  bla _}
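
If your grep supports -o (GNU and BSD grep do, but it is not in POSIX), the same three lines can probably be had with a shorter pipeline. A sketch under that assumption; the scratch directory and the variable names are made up for the example:

```shell
# Recreate the sample input in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/test" <<'EOF'
Text {_Index blabla:we _} bla {_Index {_StartRange_} bla _} bla
vla
stuff
bbgfd {_Index {_EndRange_} bla _}
EOF

# Strip the range markers first, then let grep -o print every remaining
# {_Index ... _} match on its own line. -o is an extension, not POSIX.
markers=$(sed 's/{_StartRange_}//g; s/{_EndRange_}//g' "$workdir/test" \
  | grep -o '{_Index[^{}]*_}')

printf '%s\n' "$markers"
```

This relies on no braces being left inside an Index entry once the range markers are gone, which holds for the sample text but may not hold for a real FrameMaker file.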

> Or better:
> blabla:we
> bla
> bla

Okay, the code is starting to look scary now, but I simply keep extending
the previous command:

[EMAIL PROTECTED]:~$ cat test | egrep "{|}" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | sed "s/{_StartRange_}//g" | sed "s/{_EndRange_}//g" | sed "s/_}/_}\n/g" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | egrep "{|}" | sed "s/{_Index *//g;s/ *_}//g"
blabla:we
bla
bla

> or even better :
> %s/blabla:we//g
> %s/bla//g
> %s/bla//g

[EMAIL PROTECTED]:~$ cat test | egrep "{|}" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | sed "s/{_StartRange_}//g" | sed "s/{_EndRange_}//g" | sed "s/_}/_}\n/g" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | egrep "{|}" | sed "s/{_Index *//g;s/ *_}//g" | sed "s/^/%s\//g;s/$/\/\/g/g"
%s/blabla:we//g
%s/bla//g
%s/bla//g

You may want to leave off the trailing /g for now, and sort and
deduplicate the result into a file test2:

[EMAIL PROTECTED]:~$ cat test | egrep "{|}" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | sed "s/{_StartRange_}//g" | sed "s/{_EndRange_}//g" | sed "s/_}/_}\n/g" | sed "s/^[^{]*{/{/g;s/}[^}]*$/}/g" | egrep "{|}" | sed "s/{_Index *//g;s/ *_}//g" | sed "s/^/%s\//g;s/$/\//g" | sort | uniq > test2
[EMAIL PROTECTED]:~$ cat test2
%s/bla/
%s/blabla:we/

Then edit the file test2 so it has the translation at the end of each
line. The result could be:

[EMAIL PROTECTED]:~$ cat test2
%s/bla/hello
%s/blabla:we/hello hello

And then finish it by appending /g to every line:
[EMAIL PROTECTED]:~$ cat test2 | sed "s/$/\/g/g"
%s/bla/hello/g
%s/blabla:we/hello hello/g
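
One caveat when feeding this file to sed: the leading % is vi/ex range syntax, and sed does not accept it as a command. The order of the rules also matters, because "bla" would otherwise rewrite part of "blabla:we" before that rule gets its turn. A rough sketch of one way to handle both, assuming no pattern contains a "/"; the file names source and test2.sed and the scratch directory are made up for the example:

```shell
# Scratch copies of the finished rule file and one line of source text.
workdir=$(mktemp -d)
printf '%s\n' '%s/bla/hello/g' '%s/blabla:we/hello hello/g' > "$workdir/test2"
printf '%s\n' 'Text {_Index blabla:we _} bla {_Index {_StartRange_} bla _} bla' \
  > "$workdir/source"

# Drop the leading "%", then order the rules by pattern length, longest
# first: prefix each line with the length of its pattern (field 2 when
# splitting on "/"), sort numerically in reverse, and cut the prefix off.
sed 's/^%//' "$workdir/test2" \
  | awk -F/ '{ print length($2) "\t" $0 }' \
  | sort -rn \
  | cut -f2- > "$workdir/test2.sed"

# Apply all rules to the source text.
translated=$(sed -f "$workdir/test2.sed" "$workdir/source")
printf '%s\n' "$translated"
```

Note that this translates every "bla" in the text, not only the ones inside Index entries. In vi the original lines with the % could instead be loaded with :source.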


> So, I would just have to delete the strings not to be translated and
> double strings, to insert translations and to feed sed with the file.
> 
> I am very bad in regex and I dont succeed. But I know regex can help
> me a lot, reducing the amount of work and keeping consistency in the
> translations.
> 
> Is there a regex expert here ? Please help me.

I've never been called a regex expert, but I hope this code does the
trick.

> Thanks

You're welcome. :)

-- 
Maurits van Rees | http://maurits.vanrees.org/ [Dutch/Nederlands] 
Public GnuPG key: keyserver.net ID 0x1735C5C2
"Let your advance worrying become advance thinking and planning."
 - Winston Churchill
