Solution [Re: Need help with narrowly focused use case of Emacs]

2024-07-05 Thread Richard Owlett

On 06/29/2024 12:17 PM, to...@tuxteam.de wrote:

On Sat, Jun 29, 2024 at 06:37:23AM -0500, Richard Owlett wrote:

[...]


When searching for information on regular expressions I came across one that
did it by searching for
{"1 thru 9" OR "10 thru 99" OR "100 thru 999"} .
I lost the reference ;<


That would be something like ([1-9]|[1-9][0-9]|[1-9][0-9][0-9])
since [x-y] expresses a range of characters, the | does OR and
the () do grouping [1].

If you allow yourself to be a bit sloppy [2], and allow numbers
with leading zeros, many regexp flavors have the "limited count
operator" {min,max}, with which you might say [0-9]{1,3} (you
won't need the grouping here, since the repeat operator binds
strongly enough to not mess up the rest of your regexp).

CAVEAT IMPLEMENTOR: Depending on the flavor of your regexps, the
() and sometimes the | need a backslash in front to give them
their magic meaning. In Emacs they do, in Perl (and PCRE, which
is most probably the engine behind Pluma) they don't. In grep
(and sed) you can switch behavior with an option (-E was it,
IIRC).

Cheers

[1] This grouping is (again, depending on your regexp flavour)
a "capturing grouping", meaning that you can refer later
to what was matched by the sub-expression in the parens.
There are also (flavor blah blah) non-capturing groupings.

[2] You always are somewhat sloppy with regexps. Actually you
are being sloppy already, since every classical textbook
will tell you that they totally suck at understanding
"nested stuff", which HTML is, alas. But under the right
conditions they can butcher it alright :-)



Looks like KDE's Kate is a viable solution for editing the particular HTML 
files of interest. It seems to be an appropriate mix of Pluma's ease of 
use and Emacs' power. And for some reason I had already installed it.




Re: Need help with narrowly focused use case of Emacs

2024-06-30 Thread mick.crane

On 2024-06-30 14:21, Greg Wooledge wrote:

On Sun, Jun 30, 2024 at 12:32:15 +0100, mick.crane wrote:

got it thanks.









I don't know what you're trying to do, but ERE [0-7]{1,2} matches one-
or two-digit *octal* numbers (e.g. 5, 07, 72, 77) but not numbers that
contain the digits 8 or 9.

Do you have a book whose verses are enumerated in octal?


I looked at the original question. Having first misunderstood it, I said 
it could be done with search and replace in an editor, then realised I 
wasn't sure how to do what was asked.
So now I know you can use regular expressions in Geany and a bit more 
about the format.

Previous post could have been clearer but I was trying to be brief.



Re: Need help with narrowly focused use case of Emacs

2024-06-30 Thread Andy Smith
Hello,

On Sun, Jun 30, 2024 at 09:21:57AM -0400, Greg Wooledge wrote:
> Do you have a book whose verses are enumerated in octal?

No one clarified that this was the *Christian* Bible. 

Thanks,
Andy



Re: Need help with narrowly focused use case of Emacs

2024-06-30 Thread Greg Wooledge
On Sun, Jun 30, 2024 at 12:32:15 +0100, mick.crane wrote:
> got it thanks.

I don't know what you're trying to do, but ERE [0-7]{1,2} matches one-
or two-digit *octal* numbers (e.g. 5, 07, 72, 77) but not numbers that
contain the digits 8 or 9.

Do you have a book whose verses are enumerated in octal?
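
To see concretely what that pattern accepts, here is a minimal Python sketch. It is not from the thread; Python's `re` syntax happens to agree with ERE for this particular pattern, and the sample strings are made up for illustration.

```python
import re

# Same pattern as the ERE above: one or two characters, each in the range 0-7.
OCTAL_1_OR_2 = re.compile(r"[0-7]{1,2}")


def is_match(text):
    # fullmatch requires the whole string to fit the pattern.
    return OCTAL_1_OR_2.fullmatch(text) is not None


samples = ["5", "07", "72", "77", "8", "19", "123"]
accepted = [s for s in samples if is_match(s)]
print(accepted)
```

Anything containing an 8 or a 9, or longer than two digits, is rejected, which is why the pattern is a poor fit for verse numbers.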



Re: Need help with narrowly focused use case of Emacs

2024-06-30 Thread mick.crane

On 2024-06-29 20:29, Greg Wooledge wrote:

On Sat, Jun 29, 2024 at 20:18:02 +0100, mick.crane wrote:

Oh, I see what the question was.
There is "use regular expressions", "use multi line matching" in Geany
I'm not very good at regular expressions.
I'd probably do it 3 times
"search for" 
"search for" 
"search for" 


There's more than one regular expression syntax, so the first step is
to figure out which *kind* of regular expression you're writing.

In a Basic Regular Expression (BRE), you can write "one to three
digits" as:

[[:digit:]]\{1,3\}

In an Extended Regular Expression (ERE), you'd remove the backslashes:

[[:digit:]]{1,3}

Some people would use [0-9] instead of [[:digit:]].  [0-9] should work
in any locale I'm aware of, but is theoretically less portable than
[[:digit:]].  If you're actually doing this by typing a regex into an
editor, then [0-9] might be preferred because it's easier to type.  If
you're writing a program, you should probably go with [[:digit:]].


got it thanks.










Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread mick.crane

On 2024-06-29 20:29, Greg Wooledge wrote:

On Sat, Jun 29, 2024 at 20:18:02 +0100, mick.crane wrote:

Oh, I see what the question was.
There is "use regular expressions", "use multi line matching" in Geany
I'm not very good at regular expressions.
I'd probably do it 3 times
"search for" 
"search for" 
"search for" 


There's more than one regular expression syntax, so the first step is
to figure out which *kind* of regular expression you're writing.

In a Basic Regular Expression (BRE), you can write "one to three
digits" as:

[[:digit:]]\{1,3\}

In an Extended Regular Expression (ERE), you'd remove the backslashes:

[[:digit:]]{1,3}

Some people would use [0-9] instead of [[:digit:]].  [0-9] should work
in any locale I'm aware of, but is theoretically less portable than
[[:digit:]].  If you're actually doing this by typing a regex into an
editor, then [0-9] might be preferred because it's easier to type.  If
you're writing a program, you should probably go with [[:digit:]].


Ta,
I'd had a quick look.
the regular expression thing looks to do one character at a time.


I couldn't see how to do a wild card and if it didn't exist.



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Greg Wooledge
On Sat, Jun 29, 2024 at 20:18:02 +0100, mick.crane wrote:
> Oh, I see what the question was.
> There is "use regular expressions", "use multi line matching" in Geany
> I'm not very good at regular expressions.
> I'd probably do it 3 times
> "search for" 
> "search for" 
> "search for" 

There's more than one regular expression syntax, so the first step is
to figure out which *kind* of regular expression you're writing.

In a Basic Regular Expression (BRE), you can write "one to three
digits" as:

[[:digit:]]\{1,3\}

In an Extended Regular Expression (ERE), you'd remove the backslashes:

[[:digit:]]{1,3}

Some people would use [0-9] instead of [[:digit:]].  [0-9] should work
in any locale I'm aware of, but is theoretically less portable than
[[:digit:]].  If you're actually doing this by typing a regex into an
editor, then [0-9] might be preferred because it's easier to type.  If
you're writing a program, you should probably go with [[:digit:]].
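
A side note for anyone scripting this in Python rather than with POSIX tools: Python's `re` module does not implement bracket classes such as [[:digit:]], so the choice there is between [0-9] and \d. The sketch below is illustrative only (the names and sample strings are my own) and shows that \d is broader than [0-9] unless the re.ASCII flag is given.

```python
import re

ASCII_RANGE = re.compile(r"[0-9]{1,3}")      # explicit ASCII range
UNICODE_D = re.compile(r"\d{1,3}")           # any Unicode decimal digit
ASCII_D = re.compile(r"\d{1,3}", re.ASCII)   # \d restricted to 0-9

western = "119"
arabic_indic = "\u0661\u0661\u0669"  # the same number in Arabic-Indic digits


def whole(pattern, text):
    # True when the entire string matches the pattern.
    return pattern.fullmatch(text) is not None


results = {
    "range/western": whole(ASCII_RANGE, western),
    "range/arabic": whole(ASCII_RANGE, arabic_indic),
    "unicode_d/arabic": whole(UNICODE_D, arabic_indic),
    "ascii_d/arabic": whole(ASCII_D, arabic_indic),
}
print(results)
```

For HTML that only ever contains ASCII digits in its ids, the three patterns behave identically.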



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread mick.crane

On 2024-06-29 16:09, Max Nikulin wrote:

On 29/06/2024 20:07, mick.crane wrote:

On 2024-06-29 12:34, Max Nikulin wrote:

To manipulate HTML it is better to write a script in some
programming language, e.g. for python there are lxml etree and
BeautifulSoup packages. This way it is easier to maintain valid
document structure with paired opening and closing tags.

I have not tried Emacs lisp facilities for dealing with HTML.


open in Geany

[...]

click search select replace
copy paste selection into "search for"


By "Emacs *lisp* facilities for dealing with HTML" I meant something
like `libxml-parse-html-region'. Notice that I was suggesting against
search


Oh, I see what the question was.
There is "use regular expressions", "use multi line matching" in Geany
I'm not very good at regular expressions.
I'd probably do it 3 times
"search for" 
"search for" 
"search for" 



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread David Wright
On Sat 29 Jun 2024 at 17:08:04 (+0200), Vincent Lefevre wrote:
> On 2024-06-28 20:53:50 +, Michael Kjörling wrote:
> > Yes, it almost certainly can be done with a single sed (or other
> > similar tool) invocation where the regular expression matches
> > precisely what you want it to match. But unless this is something you
> > will do very often, I tend to prefer readability over being clever,
> > even if the readable version is somewhat less performant.
> 
> To match a range inside a regexp, $(rgxg range 1 119) is readable. :)
> 
> rgxg is provided by the package of the same name.

Perhaps best to ignore the narrow focus on 119 in the OP.
For bible verses per chapter, the largest number is 176.
(An accidental choice of 119 might be explained by that
psalm having the most verses. Only Psalms requires three
digits as it happens; I think the runner-up has only about
half that.)

It would be tedious and error-prone to have to specify the
maximum range for each chapter. Different versions of the
bible don't even agree with each other on numbers of verses.
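
One way around hard-coding a maximum is to read it from the file. The Python sketch below is only an illustration: it assumes the verse markers carry ids of the form id="V<number>" (that fragment does appear in the sed commands quoted in this thread), and the function name is my own.

```python
import re

# Matches the id="V<number>" attribute seen in the quoted sed commands.
VERSE_ID = re.compile(r'id="V([0-9]+)"')


def highest_verse(html):
    """Return the largest verse number found in one chapter, or 0 if none."""
    numbers = [int(n) for n in VERSE_ID.findall(html)]
    return max(numbers, default=0)


chapter = '<x id="V1">..<x id="V176">..<x id="V20">'  # made-up sample markup
print(highest_verse(chapter))
```

A per-chapter maximum obtained this way could then drive whatever replacement loop is used, regardless of which Bible version the files come from.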

Cheers,
David.



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Curt
On 2024-06-29,   wrote:
>
>> Owlett is a notorious troll who never listens to reason.
>
> This is wrong, borderline defamatory. Richard Owlett is not a

Andy Smith:

 It's not an authentic Owlett thread unless it contains an enormous
 XY problem, a monomaniacal obsession with a solution already
 part-dreamed up by the OP, several factual errors, and a constant
 trickle of confounding small details that were never provided up
 front, now delivered with glee.

IOW, a troll. So go fuck yourself, as you should have done years ago.

Defamatory. What are you, a fucking lawyer? Sue me then, you little snit. 






Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread tomas
On Sat, Jun 29, 2024 at 06:37:23AM -0500, Richard Owlett wrote:

[...]

> When searching for information on regular expressions I came across one that
> did it by searching for
>{"1 thru 9" OR "10 thru 99" OR "100 thru 999"} .
> I lost the reference ;<

That would be something like ([1-9]|[1-9][0-9]|[1-9][0-9][0-9])
since [x-y] expresses a range of characters, the | does OR and
the () do grouping [1].
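
One flavor-dependent wrinkle with alternation is worth a sketch. In Perl-style engines, including Python's `re` used here, the alternatives are tried left to right and the first one that succeeds wins, so an unanchored "short first" alternation can stop after a single digit. The example is illustrative only; the id="V..." context is borrowed from the sed commands quoted elsewhere in the thread.

```python
import re

SHORT_FIRST = re.compile(r"[1-9]|[1-9][0-9]|[1-9][0-9][0-9]")
LONG_FIRST = re.compile(r"[1-9][0-9][0-9]|[1-9][0-9]|[1-9]")

# Unanchored search: the first alternative that succeeds is taken.
short_hit = SHORT_FIRST.search("verse 119").group(0)
long_hit = LONG_FIRST.search("verse 119").group(0)

# With surrounding context the engine backtracks until the closing quote
# fits, and the parentheses capture the number for later use.
IN_CONTEXT = re.compile(r'id="V([1-9]|[1-9][0-9]|[1-9][0-9][0-9])"')
captured = IN_CONTEXT.search('id="V42"').group(1)

print(short_hit, long_hit, captured)
```

POSIX tools such as grep and sed pick the longest match among the alternatives instead, so the order matters less there.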

If you allow yourself to be a bit sloppy [2], and allow numbers
with leading zeros, many regexp flavors have the "limited count
operator" {min,max}, with which you might say [0-9]{1,3} (you
won't need the grouping here, since the repeat operator binds
strongly enough to not mess up the rest of your regexp).
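
The difference between the strict and the sloppy form can be sketched in Python, whose `re` syntax takes both patterns unescaped. This is an illustration, not part of the original post; the strict pattern below excludes zero and leading zeros, and the sample strings are made up.

```python
import re

# Strict: 1-9, 10-99 or 100-999, with no leading zeros.
STRICT = re.compile(r"[1-9]|[1-9][0-9]|[1-9][0-9][0-9]")
# Sloppy: any one to three digits, leading zeros allowed.
SLOPPY = re.compile(r"[0-9]{1,3}")


def accepts(pattern, text):
    # fullmatch tries every alternative until the whole string fits.
    return pattern.fullmatch(text) is not None


samples = ["7", "42", "119", "007", "1000"]
strict_ok = [s for s in samples if accepts(STRICT, s)]
sloppy_ok = [s for s in samples if accepts(SLOPPY, s)]
print(strict_ok)
print(sloppy_ok)
```

The only sample the two disagree on is "007", which is exactly the sloppiness being traded for a shorter pattern.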

CAVEAT IMPLEMENTOR: Depending on the flavor of your regexps, the
() and sometimes the | need a backslash in front to give them
their magic meaning. In Emacs they do, in Perl (and PCRE, which
is most probably the engine behind Pluma) they don't. In grep
(and sed) you can switch behavior with an option (-E was it,
IIRC).

Cheers

[1] This grouping is (again, depending on your regexp flavour)
   a "capturing grouping", meaning that you can refer later
   to what was matched by the sub-expression in the parens.
   There are also (flavor blah blah) non-capturing groupings.

[2] You always are somewhat sloppy with regexps. Actually you
   are being sloppy already, since every classical textbook
   will tell you that they totally suck at understanding
   "nested stuff", which HTML is, alas. But under the right
   conditions they can butcher it alright :-)

-- 
tomás




Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread tomas
On Sat, Jun 29, 2024 at 04:02:56PM -, Curt wrote:
> On 2024-06-29, Michael Kjörling  wrote:
> >> 
> >> HUH ??
> >
> > ..._focus on the goal_.
> >
> 
> 
> Owlett is a notorious troll who never listens to reason.

This is wrong, borderline defamatory. Richard Owlett is not a
troll [1]. He may be uncommon in the way he approaches things,
and I do understand his ways may annoy some people.

If they annoy you, you always may choose to not respond. Others
will chime in. Much more polite and much more effective for the
whole mailing list.

Lobbing insults at people doesn't help anyone.

Cheers

[1] by the very definition of "troll", who isn't interested in the topic
   itself, but just in eliciting a response.
-- 
t




Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Lee
Hi,

> > So you may prefer to use regexes as
> > Murphy intended, handling both the opening and closing tags at the same
> > time, leaving the intervening text intact.
>
> In this particular case I suspect it would become overly complex.
> I've already discovered that the order of edits is important.

I guess it depends on what you're used to.  I don't think this bit is
overly complex .. your opinion might be different

$ cat /tmp/z
cat /dev/null > txtfile.html
for v in $(seq 1 12); do echo ' text
text text ' >> txtfile.html; done
sed -Ei.bak 's@([^<]*)@\1@g' txtfile.html

$ bash z

$ cat txtfile*
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 
 text text text 

$

Regards,
Lee



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Curt
On 2024-06-29, Michael Kjörling  wrote:
>> 
>> HUH ??
>
> ..._focus on the goal_.
>


Owlett is a notorious troll who never listens to reason.

But you people adore this kind of troll, inexplicably, perhaps because
he allows you to expand endlessly on your reams of essentially useless
knowledge.



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Max Nikulin

On 29/06/2024 20:07, mick.crane wrote:

On 2024-06-29 12:34, Max Nikulin wrote:

To manipulate HTML it is better to write a script in some
programming language, e.g. for python there are lxml etree and
BeautifulSoup packages. This way it is easier to maintain valid
document structure with paired opening and closing tags.

I have not tried Emacs lisp facilities for dealing with HTML.


open in Geany

[...]

click search select replace
copy paste selection into "search for"


By "Emacs *lisp* facilities for dealing with HTML" I meant something like 
`libxml-parse-html-region'. Notice that I was suggesting against 
search





Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Vincent Lefevre
On 2024-06-28 20:53:50 +, Michael Kjörling wrote:
> Yes, it almost certainly can be done with a single sed (or other
> similar tool) invocation where the regular expression matches
> precisely what you want it to match. But unless this is something you
> will do very often, I tend to prefer readability over being clever,
> even if the readable version is somewhat less performant.

To match a range inside a regexp, $(rgxg range 1 119) is readable. :)

rgxg is provided by the package of the same name.

-- 
Vincent Lefèvre
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Andy Smith
Hello,

On Sat, Jun 29, 2024 at 01:46:27PM +, Michael Kjörling wrote:
> On 29 Jun 2024 06:12 -0500, from rowl...@access.net (Richard Owlett):
> >> there may be other  closing tags you don't want to
> >> change because they close other  tags we haven't seen.
> > 
> > Chuckle ;} The appropriate "" to be replaced by "" is ALWAYS
> > preceded by "#160;" .
> 
> As far as I can see, neither of these was stated in the original
> question. Please don't add arbitrary requirements later to invalidate
> potential answers.

It's not an authentic Owlett thread unless it contains an enormous
XY problem, a monomaniacal obsession with a solution already
part-dreamed up by the OP, several factual errors, and a constant
trickle of confounding small details that were never provided up
front, now delivered with glee.

Otherwise it's just sparkling timewasting.

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Michael Kjörling
On 29 Jun 2024 05:51 -0500, from rowl...@access.net (Richard Owlett):
>> Ignoring the question about Emacs
> 
> Emacs *CAN NOT* be ignored.

I did not say to ignore _Emacs_. I said that I was ignoring the
_question_ about Emacs, to instead...

>> and focusing on the goal (your
   
>> question otherwise is an excellent example of an XY question), this is
>> not something regular expressions are very good at.
> 
> HUH ??

..._focus on the goal_.

(It is usually a good idea to read at least a whole sentence before
responding to it.)

The _goal_ in this case being your stated specific series of string
replacements.

If you want to use Emacs to do that, no one is stopping you from doing
so. You can directly adapt what I suggested to an Emacs workflow. But
just because a nailgun can be used to hang a painting doesn't mean
that a nailgun is the _appropriate_ tool for that particular job;
without detracting from its usability in _other_ applications.

Sometimes really all you are looking for is a small hammer.

-- 
Michael Kjörling  https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Michael Kjörling
On 29 Jun 2024 06:12 -0500, from rowl...@access.net (Richard Owlett):
>>> $ for v in $(seq 1 119); do sed -i 's,>> id="V'$v'">,,g' ./*.html; done
>> 
>> Having done that (or similar), don't forget to change the relevant
>>  closing tags to  closing tags. However, there may be
>> other  closing tags you don't want to change because they close
>> other  tags we haven't seen.
> 
> Chuckle ;} The appropriate "" to be replaced by "" is ALWAYS
> preceded by "#160;" .

As far as I can see, neither of these was stated in the original
question. Please don't add arbitrary requirements later to invalidate
potential answers.

-- 
Michael Kjörling  https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread mick.crane

On 2024-06-29 12:34, Max Nikulin wrote:

On 29/06/2024 11:48, to...@tuxteam.de wrote:

   Do M-x (hold Meta, most of the time your Alt key, then "x").
   You get a prompt for a command. Enter "query-replace-regexp"


And to get help for this function

C-h f query-replace-regexp RET

To open user manual switch to the help buffer and press "i".

A side note since an answer to the asked question has been posted.

To manipulate HTML it is better to write a script in some
programming language, e.g. for python there are lxml etree and
BeautifulSoup packages. This way it is easier to maintain valid
document structure with paired opening and closing tags.

I have not tried Emacs lisp facilities for dealing with HTML.


open in Geany


  thru [at most]

  abcdefg

  thru [at most]

abcdefg

  thru [at most]

abcdefg

click search select replace
copy paste selection into "search for"
paste  in "replace with"
click "In Document"


  abcdefg

abcdefg

abcdefg



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/29/2024 06:51 AM, debian-u...@howorth.org.uk wrote:

Richard Owlett  wrote:

On 06/28/2024 03:53 PM, Michael Kjörling wrote:

On 28 Jun 2024 14:04 -0500, from rowl...@access.net (Richard
Owlett):

I need to replace ANY occurrence of
  
thru [at most]
  
by
  

I'm reformatting a Bible stored in HTML format for a particular
set of vision impaired seniors (myself included). Each chapter is
in its own file.

How do I open a file.
Do the above replacement.
Save and close the file.


Ignoring the question about Emacs


Emacs *CAN NOT* be ignored.
It is the _available_ editor known to be capable of handling regular
expressions.


Err, pluma is available I believe.


May I quote my original post?

On 06/28/2024 02:04 PM, Richard Owlett wrote:

Pluma is my editor of choice.

I've never used it but I just
started it and used the Replace... entry on the Search menu to bring up
a dialog box. In the dialog box there is a tick box labelled "Match
regular expression". So I ticked that and then tested it by editing an
html file using an RE.

So Pluma is an "_available_ editor known to be capable of handling
regular expressions."


So you evidently have a later version than I have available for this 
particular machine.

One does not get the latest and greatest by simply wishing for it.



And as others have pointed out, sed is available and it's easy to
install others. So there are many possible answers to your question
other than emacs.


My definition of "available" includes knowledge of how to use it.
I've investigated it for some past projects and found easier ways to 
accomplish those particular tasks. Part of my interest in Emacs stems 
from having seen what co-workers could do with its predecessor TECO 
decades ago.


Updating MY system is NONtrivial!





Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Greg Wooledge
On Sat, Jun 29, 2024 at 07:43:47 -0400, Dan Ritter wrote:
> The option "g" means that sed should do this multiple times if
> it occurs in the same file (globally, like grep) instead of the
> default behavior which is to find the first match and just
> change that.

The g option in sed's s command means it will apply the substitution
multiple times per *line*.  Not per file.  It always applies multiple
times per file, unless you restrict the line range with a prefix.

hobbit:~$ printf 'foo foo\nfoo foo\n' | sed s/foo/bar/
bar foo
bar foo
hobbit:~$ printf 'foo foo\nfoo foo\n' | sed s/foo/bar/g
bar bar
bar bar
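
The same per-line behaviour can be imitated in Python for anyone who wants to experiment without touching files. This is a rough sketch, not a reimplementation of sed: the function name and its every_match parameter are my own, and it handles only plain substitution.

```python
import re


def sed_substitute(text, pattern, replacement, every_match=False):
    """Apply a substitution to each line, like sed's s command.

    every_match=False mimics s/.../.../  (first match on each line);
    every_match=True  mimics s/.../.../g (all matches on each line).
    """
    count = 0 if every_match else 1  # for re.sub, count=0 means "no limit"
    lines = text.split("\n")
    return "\n".join(
        re.sub(pattern, replacement, line, count=count) for line in lines
    )


text = "foo foo\nfoo foo\n"
print(sed_substitute(text, "foo", "bar"), end="")
print(sed_substitute(text, "foo", "bar", every_match=True), end="")
```

Either way every line is visited; the flag only controls how many matches are replaced within each line.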



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Greg Wooledge
On Fri, Jun 28, 2024 at 21:23:03 -0600, Charles Curley wrote:
> On Fri, 28 Jun 2024 20:53:50 +
> Michael Kjörling  wrote:
> 
> > $ for v in $(seq 1 119); do sed -i 's, > id="V'$v'">,,g' ./*.html; done
> > 
> > Be sure to have a copy in case something goes wrong; and diff(1) a few
> > files afterwards to make sure that the result is as you intended.
> 
> Having done that (or similar), don't forget to change the relevant
>  closing tags to  closing tags. However, there may be
> other  closing tags you don't want to change because they close
> other  tags we haven't seen. So you may prefer to use regexes as
> Murphy intended, handling both the opening and closing tags at the same
> time, leaving the intervening text intact.

https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags#answer-1732454



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Dan Ritter
Richard Owlett wrote: 
> On 06/28/2024 03:53 PM, Michael Kjörling wrote:
> > On 28 Jun 2024 14:04 -0500, from rowl...@access.net (Richard Owlett):
> > > I need to replace ANY occurrence of
> > >  
> > >thru [at most]
> > >  
> > > by
> > >  
> > > 
> > > I'm reformatting a Bible stored in HTML format for a particular set of
> > > vision impaired seniors (myself included). Each chapter is in its own 
> > > file.
> > > 
> > > How do I open a file.
> > > Do the above replacement.
> > > Save and close the file.
> > 
> > Ignoring the question about Emacs
> 
> Emacs *CAN NOT* be ignored.
> It is the _available_ editor known to be capable of handling regular
> expressions.

If your machine doesn't have sed, it is not a working Debian
system. 

Every Debian machine comes with sed by default.  Even the
rescue image has sed. The installer environment, before Debian
is actually installed, has sed. sed is a basic tool that
everyone has access to. emacs needs to be installed, and often
is not.

I know from past experience that it's useless to offer you any
solution that deviates from the vision you have for the way the
world ought to work, but this is a sufficiently common kind of
problem that a full answer will be useful to other people.

> > and focusing on the goal (your
> > question otherwise is an excellent example of an XY question), this is
> > not something regular expressions are very good at.
> 
> HUH ??

An XY question is when someone asks "How can I do specific thing
X?" but what they want to do is task Y, which is more easily
accomplished in a different way that doesn't involve X at all.
Usually this means that they have read something that tells them
about X in a different context, and they think that is an
essential part of solving their Y problem.

If we're lucky, they tell us what Y is. Frequently, XY questions
just show up as "How do I do X?" without context.

It happens a lot on this mailing list.

Or, maybe your expression of disbelief was about regular
expressions? A regular expression (regexp) is a specific kind of
formal language for specifying a pattern of tokens -- what we
often call a "string". If the regexp describes a candidate
string, we call that a "match". A common editing task is to find
all the matches for a regexp and replace them with some other
string.

The program "grep" takes its name from the ed editor command
g/re/p: globally search for a regular expression and print the
matching lines.

Michael says that regexps aren't great at this particular task
because there's a variable component in the pattern which is
hard to describe. He comes up with a clever solution based on
the fact that the variable component is going to be an integer
sequence.


> > However, since
> > it's presumably a once-only operation, I assume that you can live with
> > it being done in a suboptimal way in terms of performance.
> > 
> > In that case, assuming for simplicity that all the files are in a
> > single directory, you could try something similar to:
> > 
> > $ for v in $(seq 1 119); do sed -i 's, > id="V'$v'">,,g' ./*.html; done
 
This sets up a loop which will execute 119 times, incrementing
the variable $v from 1 to 119. Inside the loop, it calls `sed`
to edit in place (-i), which means it will change the files it
encounters rather than writing the result to standard output.

The command passed to sed is

s,,,g

s means string substitution. It takes a pattern, a replacement,
and options, separated by the next character after the s, which
in this case is a comma.



is the pattern. Because of the loop, the value $v is going to be
replaced by the shell before sed sees this, so on various runs
through the loop sed will see:



...




You'll probably need to adjust this for other books.

Anyway, whenever sed sees the pattern above, it will replace it
with:



which is what you said you wanted.

The option "g" means that sed should do this multiple times if
it occurs in the same file (globally, like grep) instead of the
default behavior which is to find the first match and just
change that.

./*.html

tells sed to operate on all the files in the current directory
ending in .html -- the shell expands this glob pattern, a simpler
cousin of regexps, before sed runs. And that's the end of the loop.
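
For comparison, the 119 passes can be collapsed into one pass with a numeric check. The Python sketch below is hedged accordingly: the markup in this archive has been stripped, so the opening tag is hypothetical (only the id="V<number>" part survives in the quoted command), the replacement is a placeholder, and it works on a string rather than on files.

```python
import re

# Hypothetical opening tag; only id="V<number>" is taken from the thread.
OPEN_TAG = re.compile(r'<span class="verse" id="V([0-9]{1,3})">')


def replace_verse_tags(html, replacement="<p>", max_verse=119):
    """Replace opening verse tags numbered 1..max_verse in a single pass."""

    def swap(match):
        number = int(match.group(1))
        if 1 <= number <= max_verse:
            return replacement
        return match.group(0)  # out of range: leave the tag alone

    return OPEN_TAG.sub(swap, html)


sample = '<span class="verse" id="V1">a</span> <span class="verse" id="V120">b</span>'
print(replace_verse_tags(sample))
```

Like the sed loop, this touches only the opening tags; the matching closing tags still need separate handling.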


> I'll have to investigate sed further.
> My project is not yet to the point of automatically editing ALL chapters. I
> need to first establish how to edit all VERSES of an individual chapter.

The solution Michael presented can be run on just one file
instead of all the .html files in the current directory.


> ROFL ;} No one would define me as a "programmer". I took an introduction to
> computers course as a E.E. student in the 60's. Most of my jobs required
> background in component level analog electronics. Got one assignment because
> I was not "afraid" of 8080 ;}

The true UNIX philosophy is that at any moment, any user can
stop being "just a user" and use the tools present to do some
programming to solve their problems. 



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread debian-user
Richard Owlett  wrote:
> On 06/28/2024 03:53 PM, Michael Kjörling wrote:
> > On 28 Jun 2024 14:04 -0500, from rowl...@access.net (Richard
> > Owlett):  
> >> I need to replace ANY occurrence of
> >>  
> >>thru [at most]
> >>  
> >> by
> >>  
> >>
> >> I'm reformatting a Bible stored in HTML format for a particular
> >> set of vision impaired seniors (myself included). Each chapter is
> >> in its own file.
> >>
> >> How do I open a file.
> >> Do the above replacement.
> >> Save and close the file.  
> > 
> > Ignoring the question about Emacs   
> 
> Emacs *CAN NOT* be ignored.
> It is the _available_ editor known to be capable of handling regular 
> expressions.

Err, pluma is available I believe. I've never used it but I just
started it and used the Replace... entry on the Search menu to bring up
a dialog box. In the dialog box there is a tick box labelled "Match
regular expression". So I ticked that and then tested it by editing an
html file using an RE.

So Pluma is an "_available_ editor known to be capable of handling
regular expressions."

And as others have pointed out, sed is available and it's easy to
install others. So there are many possible answers to your question
other than emacs.



Re: Need help with narrowly focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/28/2024 11:48 PM, to...@tuxteam.de wrote:

On Fri, Jun 28, 2024 at 02:04:37PM -0500, Richard Owlett wrote:

Pluma is my editor of choice.
*BUT* it can NOT handle Search and Replace operations involving regular
expressions.


I would be *very* surprised if an editor, in this day and age,
couldn't do regular expressions. Really.


Emacs can. It has plenty of verbose documentation.
But examples seem rather scarce.


Of course, Emacs is the best editor out there, by a long shot.
But learning it is a long and panoramic road. You should at
least have a rough idea that you want to take it.


Definitely interested
I worked for DEC in the 70's. Though a tech in Power Supply 
Engineering, I was exposed to TECO and have recently seen claims that 
Emacs is TECO done right. I've been exposed to many editors since but 
TECO is memorable.





I need to replace ANY occurrence of
 
   thru [at most]
 
by
 

I'm reformatting a Bible stored in HTML format for a particular set of
vision impaired seniors (myself included). Each chapter is in its own file.

How do I open a file.


Two ways of skinning that cat:

   - in a terminal, type "emacs "
   - in an open Emacs instance (be it terminal or GUI, your
 choice), type C-x C-f (hold CTRL, then "x", while holding
 CTRL then "f"). You get a prompt in the bottom line (the
 so-called minibuffer), enter your file name there. You
 get tab completions.

Then there are menus...


Do the above replacement.


   Go to the top of your buffer (this is what you would call
   "your file": Emacs calls the things which hold your text
   while you are on them "buffers").
   Do M-x (hold Meta, most of the time your Alt key, then "x").
   You get a prompt for a command. Enter "query-replace-regexp"
   (you get tab completions, so "que" TAB "re" TAB should suffice,
   roughly speaking). Enter the regular expression you're looking
   for. Then ENTER, then your replacement.


Save and close the file.


   To save, C-x C-s. I don't quite know what you mean by
   "close".

   To quit Emacs, C-x C-c.

Now I don't quite understand what you mean above with your
example, and whether it can be expressed by a regular expression
at all, but that is for a second go.


When searching for information on regular expressions I came across one 
that did it by searching for

   {"1 thru 9" OR "10 thru 99" OR "100 thru 999"} .
I lost the reference ;<





First, find out whether your beloved Pluma can deliver. I'm
sure it can. Unless you want to embark on the Emacs adventure
(very much recommended, mind you, but not the most efficient
path to your problem at hand).


I'm still essentially at the stage of flow-charting how I need to handle 
individual chapters. As there are ~1000 chapters, I'll want to use something 
that can handle macros eventually.


Thank you.



Cheers





Re: Need help with narroely focused use case of Emacs

2024-06-29 Thread Max Nikulin

On 29/06/2024 11:48, to...@tuxteam.de wrote:

   Do M-x (hold Meta, most of the time your Alt key, then "x").
   You get a prompt for a command. Enter "query-replace-regexp"


And to get help for this function

C-h f query-replace-regexp RET

To open user manual switch to the help buffer and press "i".

A side note since an answer to the asked question has been posted.

To manipulate HTML it is better to write a script in some 
programming language, e.g. for python there are lxml etree and 
BeautifulSoup packages. This way it is easier to maintain valid document 
structure with paired opening and closing tags.
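
As a rough illustration of the "paired opening and closing tags" point, here is a sketch that uses only html.parser from the Python standard library rather than lxml or BeautifulSoup, so it runs without installing anything. The class name TagBalanceChecker and the function check are invented for this example, and the check is deliberately strict: it assumes every non-void element is closed explicitly, which real-world HTML does not require.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Collect mismatched or unclosed tags while parsing."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags that are currently open
        self.problems = []   # human-readable descriptions of mismatches

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check(html_text):
    """Return a list of problems; an empty list means the tags pair up."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems + [f"unclosed <{t}>" for t in checker.stack]


if __name__ == "__main__":
    print(check("<p><b>fine</b></p>"))
    print(check("<p><b>broken</p>"))
```

Running such a check before and after a batch of regexp edits would show whether an edit left a closing tag orphaned.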


I have not tried Emacs lisp facilities for dealing with HTML.



Re: Need help with narroely focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/28/2024 10:23 PM, Charles Curley wrote:

On Fri, 28 Jun 2024 20:53:50 +
Michael Kjörling  wrote:


$ for v in $(seq 1 119); do sed -i 's,,,g' ./*.html; done

Be sure to have a copy in case something goes wrong; and diff(1) a few
files afterwards to make sure that the result is as you intended.


Having done that (or similar), don't forget to change the relevant
 closing tags to  closing tags. However, there may be
other  closing tags you don't want to change because they close
other  tags we haven't seen.


Chuckle ;} The appropriate "" to be replaced by "" is 
ALWAYS preceded by "#160;" .



So you may prefer to use regexes as
Murphy intended, handling both the opening and closing tags at the same
time, leaving the intervening text intact.


In this particular case I suspect it would become overly complex.
I've already discovered that the order of edits is important.






Re: Need help with narroely focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/28/2024 03:53 PM, Michael Kjörling wrote:

On 28 Jun 2024 14:04 -0500, from rowl...@access.net (Richard Owlett):

I need to replace ANY occurrence of
 
   thru [at most]
 
by
 

I'm reformatting a Bible stored in HTML format for a particular set of
vision impaired seniors (myself included). Each chapter is in its own file.

How do I open a file.
Do the above replacement.
Save and close the file.


Ignoring the question about Emacs 


Emacs *CAN NOT* be ignored.
It is the _available_ editor known to be capable of handling regular 
expressions.



and focusing on the goal (your
question otherwise is an excellent example of an XY question), this is
not something regular expressions are very good at.


HUH ??


However, since
it's presumably a once-only operation, I assume that you can live with
it being done in a suboptimal way in terms of performance.

In that case, assuming for simplicity that all the files are in a
single directory, you could try something similar to:

$ for v in $(seq 1 119); do sed -i 's,,,g' ./*.html; done


I'll have to investigate sed further.
My project is not yet to the point of automatically editing ALL 
chapters. I need to first establish how to edit all VERSES of an 
individual chapter.





Be sure to have a copy in case something goes wrong; and diff(1) a few
files afterwards to make sure that the result is as you intended.


ROFL ;} No one would define me as a "programmer". I took an introduction 
to computers course as an E.E. student in the 60's. Most of my jobs 
required background in component level analog electronics. Got one 
assignment because I was not "afraid" of 8080 ;}




Yes, it almost certainly can be done with a single sed (or other
similar tool) invocation where the regular expression matches
precisely what you want it to match. But unless this is something you
will do very often, I tend to prefer readability over being clever,
even if the readable version is somewhat less performant.






Re: Need help with narroely focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/28/2024 02:33 PM, Van Snyder wrote:

On Fri, 2024-06-28 at 14:04 -0500, Richard Owlett wrote:

Pluma is my editor of choice.
*BUT* it can NOT handle Search and Replace operations involving
regular
expressions.

Emacs can. It has much verbose documentation.
But examples seem rather scarce.


nedit can handle regular expressions in search and replace operations.
I find nedit easier to use than emacs.


I've seen references to nedit before.
But circumstances require I use this system in its current configuration.

Thank you.



Re: Need help with narroely focused use case of Emacs

2024-06-29 Thread Richard Owlett

On 06/28/2024 02:17 PM, didier gaumet wrote:

Le 28/06/2024 à 21:04, Richard Owlett a écrit :

Pluma is my editor of choice.
*BUT* it can NOT handle Search and Replace operations involving 
regular expressions.

[...]

Hello Richard,

According to the Mate wiki, Pluma handles regular expressions the Perl way:
https://wiki.mate-desktop.org/mate-desktop/applications/pluma/


Hadn't seen that page. I based my opinion on what I saw when doing a 
Search and Replace. Also Pluma's Help function doesn't mention it.



https://perldoc.perl.org/perlre


That page is thin on examples. But now knowing that Pluma does things 
"the Perl way" I can do a web search.


Thank you.






Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread tomas
On Fri, Jun 28, 2024 at 09:17:14PM +0200, didier gaumet wrote:
> Le 28/06/2024 à 21:04, Richard Owlett a écrit :
> > Pluma is my editor of choice.
> > *BUT* it can NOT handle Search and Replace operations involving regular
> > expressions.
> [...]
> 
> Hello Richard,
> 
> According to the Mate wiki, Pluma handles regular expressions the Perl way:
> https://wiki.mate-desktop.org/mate-desktop/applications/pluma/
> https://perldoc.perl.org/perlre

See? I was sure of that. And Perl style regexps are actually somewhat
friendlier than Emacs style (they're roughly one decennium younger).

Thanks, Didier :-)

Cheers
-- 
t




Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread tomas
On Fri, Jun 28, 2024 at 02:04:37PM -0500, Richard Owlett wrote:
> Pluma is my editor of choice.
> *BUT* it can NOT handle Search and Replace operations involving regular
> expressions.

I would be *very* surprised if an editor, in this day and age,
can't do regular expressions. Really.

> Emacs can. It has much verbose documentation.
> But examples seem rather scarce.

Of course, Emacs is the best editor out there, by a long shot.
But learning it is a long and panoramic road. You should at
least have a rough idea that you want to take it.

> I need to replace ANY occurrence of
> 
>   thru [at most]
> 
> by
> 
> 
> I'm reformatting a Bible stored in HTML format for a particular set of
> vision impaired seniors (myself included). Each chapter is in its own file.
> 
> How do I open a file.

Two ways of skinning that cat:

  - in a terminal, type "emacs "
  - in an open Emacs instance (be it terminal or GUI, your
choice), type C-x C-f (hold CTRL and press "x", then, while
still holding CTRL, press "f"). You get a prompt in the bottom line (the
so-called minibuffer), enter your file name there. You
get tab completions.

Then there are menus...

> Do the above replacement.

  Go to the top of your buffer (this is what you would call
  "your file": Emacs calls the things which hold your text
  while you are on them "buffers").
  Do M-x (hold Meta, most of the time your Alt key, then "x").
  You get a prompt for a command. Enter "query-replace-regexp"
  (you get tab completions, so "que" TAB "re" TAB should suffice,
  roughly speaking). Enter the regular expression you're looking
  for. Then ENTER, then your replacement.

> Save and close the file.

  To save, C-x C-s. I don't quite know what you mean by
  "close".

  To quit Emacs, C-x C-c.

Now I don't quite understand what you mean above with your
example, and whether it can be expressed by a regular expression
at all, but that is for a second go.

First, find out whether your beloved Pluma can deliver. I'm
sure it can. Unless you want to embark on the Emacs adventure
(very much recommended, mind you, but not the most efficient
path to your problem at hand).

Cheers
-- 
t




Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread Charles Curley
On Fri, 28 Jun 2024 20:53:50 +
Michael Kjörling  wrote:

> $ for v in $(seq 1 119); do sed -i 's, id="V'$v'">,,g' ./*.html; done
> 
> Be sure to have a copy in case something goes wrong; and diff(1) a few
> files afterwards to make sure that the result is as you intended.

Having done that (or similar), don't forget to change the relevant
 closing tags to  closing tags. However, there may be
other  closing tags you don't want to change because they close
other  tags we haven't seen. So you may prefer to use regexes as
Murphy intended, handling both the opening and closing tags at the same
time, leaving the intervening text intact.
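
To make "both tags at the same time" concrete, here is a sketch in Python. Everything about the markup is an assumption: the archived messages lost the real tag names, so <span> and <sup> below are placeholders I picked, and only the id="V<number>" attribute comes from the sed command quoted above. The function name retag is likewise invented.

```python
import re

# HYPOTHETICAL markup: <span> and <sup> are stand-ins, since the real tag
# names do not survive in the archived messages.
VERSE = re.compile(
    r'<span id="V(1[01][0-9]|[1-9][0-9]?)">'  # opening tag, verse 1..119
    r'(.*?)'                                   # intervening text, shortest match
    r'</span>',                                # the first closing tag after it
    re.DOTALL,
)


def retag(html_text):
    """Rewrite the opening and closing tag together, keeping the text between."""
    # \2 refers to the second group, i.e. the intervening text.
    return VERSE.sub(r'<sup>\2</sup>', html_text)


if __name__ == "__main__":
    sample = '<span id="V3">3</span> In the <span class="x">beginning</span>'
    print(retag(sample))
```

Because the opening and closing tags are matched as a pair, closing tags that belong to other elements are left alone. The shortest-match group does assume that no element of the same name is nested inside the verse element; if one is, the wrong closing tag gets paired.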



-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/



Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread Michael Kjörling
On 28 Jun 2024 14:04 -0500, from rowl...@access.net (Richard Owlett):
> I need to replace ANY occurrence of
> 
>   thru [at most]
> 
> by
> 
> 
> I'm reformatting a Bible stored in HTML format for a particular set of
> vision impaired seniors (myself included). Each chapter is in its own file.
> 
> How do I open a file.
> Do the above replacement.
> Save and close the file.

Ignoring the question about Emacs and focusing on the goal (your
question otherwise is an excellent example of an XY question), this is
not something regular expressions are very good at. However, since
it's presumably a once-only operation, I assume that you can live with
it being done in a suboptimal way in terms of performance.

In that case, assuming for simplicity that all the files are in a
single directory, you could try something similar to:

$ for v in $(seq 1 119); do sed -i 's,,,g' ./*.html; done

Be sure to have a copy in case something goes wrong; and diff(1) a few
files afterwards to make sure that the result is as you intended.
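
For anyone who would rather script the "keep a copy, then diff" step than do it by hand, a small sketch using only the Python standard library follows. The ".orig" suffix and the function names backup and show_changes are my own choices for illustration, and the files are assumed to be UTF-8 text.

```python
import difflib
import shutil
from pathlib import Path


def backup(path):
    """Copy path to '<name>.orig' in the same directory and return the copy."""
    copy = path.with_name(path.name + ".orig")
    shutil.copy2(path, copy)
    return copy


def show_changes(original, edited):
    """Return the unified-diff lines between two text files (empty if identical)."""
    before = original.read_text(encoding="utf-8").splitlines(keepends=True)
    after = edited.read_text(encoding="utf-8").splitlines(keepends=True)
    return list(difflib.unified_diff(before, after,
                                     fromfile=str(original),
                                     tofile=str(edited)))
```

Calling backup(Path("chapter.html")) before editing and printing "".join(show_changes(...)) afterwards gives roughly what diff -u would show.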

Yes, it almost certainly can be done with a single sed (or other
similar tool) invocation where the regular expression matches
precisely what you want it to match. But unless this is something you
will do very often, I tend to prefer readability over being clever,
even if the readable version is somewhat less performant.
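
One way to get a single pass while staying readable is to let the regular expression capture the number and do the range check in ordinary code. The sketch below is in Python rather than sed, and it is only an illustration: the function name replace_verse_ids is invented, and the replacement text is a parameter because the one from the command above did not survive in the archive.

```python
import re

# Matches the attribute text ' id="V<digits>"' and captures the digits.
ID_ATTR = re.compile(r' id="V([0-9]+)"')


def replace_verse_ids(html_text, replacement="", lowest=1, highest=119):
    """Replace every id="V<n>" attribute with lowest <= n <= highest in one pass."""
    def swap(match):
        n = int(match.group(1))
        # Ids outside the range are returned unchanged.
        return replacement if lowest <= n <= highest else match.group(0)
    return ID_ATTR.sub(swap, html_text)


if __name__ == "__main__":
    print(replace_verse_ids('<a id="V7">x</a><a id="V120">y</a>'))
```

Note that int() accepts leading zeros, so id="V007" would be treated as verse 7 here; whether that is wanted depends on the files.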

-- 
Michael Kjörling  https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread Van Snyder
On Fri, 2024-06-28 at 14:04 -0500, Richard Owlett wrote:
> Pluma is my editor of choice.
> *BUT* it can NOT handle Search and Replace operations involving
> regular 
> expressions.
> 
> Emacs can. It has much verbose documentation.
> But examples seem rather scarce.

nedit can handle regular expressions in search and replace operations.
I find nedit easier to use than emacs.

> 
> I need to replace ANY occurrence of
>  
>    thru [at most]
>  
> by
>  
> 
> I'm reformatting a Bible stored in HTML format for a particular set
> of 
> vision impaired seniors (myself included). Each chapter is in its own
> file.
> 
> How do I open a file.
> Do the above replacement.
> Save and close the file.
> 
> Help please.
> TIA
> 



Re: Need help with narroely focused use case of Emacs

2024-06-28 Thread didier gaumet

Le 28/06/2024 à 21:04, Richard Owlett a écrit :

Pluma is my editor of choice.
*BUT* it can NOT handle Search and Replace operations involving regular 
expressions.

[...]

Hello Richard,

According to the Mate wiki, Pluma handles regular expressions the Perl way:
https://wiki.mate-desktop.org/mate-desktop/applications/pluma/
https://perldoc.perl.org/perlre