Re: [Geany-Users] Regular expression, for Unicode characters

Vesta Tue, 02 Aug 2016 04:59:47 -0700

<p(>\W*?[[p{Lu}]][[p{Lu}]\W]*?</p>)
I found this regex for Unicode (Perl) somewhere and tried to modify it, but it 
does not work.


I have Geany 1.23.1. I browsed its regex syntax documentation, but there are no examples.

The text I want to parse has multiple spaces inside paragraph tags. Sometimes 
upper case text inside paragraphs is mixed with lower case characters or words; 
those paragraphs need to be omitted. So we need to match and apply the bold class 
only to paragraphs containing all upper case text, as in my examples.

I tried both regexes, but they do not work.
<p(>.*?[[p{Lu}]].*?</p>)

(<p>).*?[[p{Lu}]].*?</p>
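
For comparison, here is a minimal sketch of the rule described above, written 
in Python rather than as a Geany search pattern. It adds the bold class only 
when a paragraph contains at least one upper case letter and no lower case 
letter. The names PARAGRAPH, is_all_upper and add_bold_class are made up for 
this illustration. The Unicode categories "Lu" and "Ll" play the role that 
\p{Lu} and \p{Ll} (written with a backslash) would play in a PCRE-style 
pattern. Whether those escapes behave the same way in Geany 1.23.1 is an 
assumption that has not been tested here.

```python
import re
import unicodedata

# Matches one <p>...</p> paragraph; DOTALL lets the body span several lines.
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL)


def is_all_upper(text):
    """True if text has at least one uppercase letter and no lowercase letter."""
    categories = [unicodedata.category(ch) for ch in text]
    return "Lu" in categories and "Ll" not in categories


def add_bold_class(html):
    """Turn <p> into <p class="bold"> for paragraphs whose letters are all uppercase."""
    def repl(match):
        body = match.group(1)
        if is_all_upper(body):
            return '<p class="bold">' + body + "</p>"
        # Empty, whitespace-only, or mixed-case paragraphs are left untouched.
        return match.group(0)

    return PARAGRAPH.sub(repl, html)
```

Digits, punctuation and line breaks inside a paragraph do not affect the 
decision, so a heading such as "DICO TEMPOR HABEMUS - PART II, 123" would still 
qualify, and Cyrillic capitals are treated the same way as Latin ones.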

Vesta

> Sent: Tuesday, August 02, 2016 at 12:03 PM
> From: "James Ginns" <starvagr...@yahoo.com>
> To: "Geany general discussion list" <users@lists.geany.org>
> Subject: Re: [Geany-Users] Regular expression, for Unicode characters
>
> Regular Expressions are a tad difficult to master.
> 
> Basic question: you're using lazy modifiers on purpose, right? Just 
> checking.
> 
> So, a dissection: the regex engine (I don't know which one you're using) 
> should hit \W*? and look for as few non-word characters as possible (in 
> some instances zero). Then it will look for ONE character in the 
> character class [p{Lu}] (Unicode?). Then it will look for zero or more 
> instances of [p{Lu}] or a non-word character, until it gets to the 
> closing tag. Since you're only looking for a single capital letter, why 
> not try:
> 
> <p(>.*?[[p{Lu}]].*?</p>)
> 
> Or better yet, since you're only replacing the p tag with p class="bold", 
> why not just capture the initial p tag:
> 
> (<p>).*?[[p{Lu}]].*?</p>
> 
> Hope that gives you some starting ideas.
> 
> On 07/31/2016 08:19 AM, Vesta wrote:
> > Can anyone show what the regular expression should look like for this
> > particular case?
> >
> > This does not work either:
> >
> > <p(>\W*?[[p{Lu}]][[p{Lu}]\W]*?</p>)
> >
> > Regards,
> > Vesta
> >
> >
> >
> >
> >
> >> Sent: Sunday, July 31, 2016 at 3:32 PM
> >> From: "Lex Trotman" <ele...@gmail.com>
> >> To: "Geany general discussion list" <users@lists.geany.org>
> >> Subject: Re: [Geany-Users] Regular expression, for Unicode characters
> >>
> >> Geany uses the Glib regex library whose syntax is described at
> >> https://developer.gnome.org/glib/stable/glib-regex-syntax.html
> >>
> >> Cheers
> >> Lex
> >>
> >> 2016-07-31 22:03 GMT+10:00 Vesta <laguna...@mail.com>:
> >>> How do I create a regular expression to match all UPPER CASE text within 
> >>> paragraph tags, and replace those <p> tags with <p class="bold">?
> >>>
> >>>      <p>                                                   </p>
> >>>      <p>                      USU EA EUISMOD HONESTATIS DETERRUISSET.</p>
> >>>      <p>Qualisque mnesarchum no nam, usu cu fastidii delicata. Eu mei 
> >>> nonumy libris, quas movet vivendo vim at. Prima epicuri conceptam pro ad, 
> >>> in suas nonumes similique duo. Qui mundi essent complectitur eu. Ei 
> >>> laudem veritus democritum vis, te ferri appareat eos. Ceteros pertinacia 
> >>> ea eum, quo integre theophrastus ex, eum et sint omnes detracto. </p>
> >>>      <p>Usu ea euismod honestatis deterruisset. Ne quo malis meliore, duo 
> >>> viris liberavisse no, mea an vide mutat quodsi. Vis an vidit debitis, et 
> >>> noster aliquam pri, case iudicabit te sea. </p>
> >>>      <p>                                                                  
> >>>            </p>
> >>>      <p>                       CU CONGUE IRIURE SCAEVOLA   --
> >>>         UT DOMING IRACUNDIA. </p>
> >>>      <p>                                  DICO TEMPOR HABEMUS - PART II, 
> >>> 123 </p>
> >>>      <p>Homero everti ei nam. An liber euripidis vis, pericula persecuti 
> >>> deseruisse ad mea. Dicant offendit sea et, per esse timeam deserunt ut. 
> >>> In pri enim sadipscing, ei movet soleat suavitate vim. Mea et omnesque 
> >>> phaedrum, paulo luptatum concludaturque vim ea. -- LIBER. </p>
> >>>
> >>> I want to apply the class to
> >>>
> >>> <p class="bold">                      USU EA EUISMOD HONESTATIS 
> >>> DETERRUISSET.</p>
> >>> <p class="bold">                      CU CONGUE IRIURE SCAEVOLA   --
> >>>         UT DOMING IRACUNDIA. </p>
> >>> <p class="bold">                                DICO TEMPOR HABEMUS - PART 
> >>> II, 123 </p>
> >>>
> >>> I need a Unicode solution for Cyrillic text. This does not work:
> >>>
> >>> Find what: <p(>\W*?[[:upper:]][[:upper:]\W]*?</p>)
> >>> Replace with: <p class="bold"\1
> >>> _______________________________________________
> >>> Users mailing list
> >>> Users@lists.geany.org
> >>> https://lists.geany.org/cgi-bin/mailman/listinfo/users
