Re: RE : [fpc-pascal] WideString and TRegexpr

2011-12-04 Thread Mark Morgan Lloyd

Ludo Brands wrote:
> > > Reported as issue 0020806. I'm not able to test using a recent Delphi,
> > > or on a 64-bit CPU.
> >
> > I think you could report directly to sorokin too. If he fixes
> > it, then we can merge the fix here.
> >
> > regexpr is a nearly unchanged copy from sorokin, except for
> > the alignment patch which I have sent to sorokin.
>
> Patch attached to issue. A case of calling move() with the number of chars
> instead of the number of bytes.


OK on SPARC, x86, PPC and ARM (all Linux).

Thanks Ludo, and well spotted. I was expecting something horribly 
computer scienceish :-)


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal


RE : [fpc-pascal] WideString and TRegexpr

2011-12-04 Thread Ludo Brands
> > Reported as issue 0020806. I'm not able to test using a recent Delphi,
> > or on a 64-bit CPU.
>
> I think you could report directly to sorokin too. If he fixes
> it, then we can merge the fix here.
>
> regexpr is a nearly unchanged copy from sorokin, except for
> the alignment patch which I have sent to sorokin.

Patch attached to issue. A case of calling move() with the number of chars
instead of the number of bytes.
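
A minimal sketch of this class of bug, assuming Free Pascal and a 2-byte WideChar. It is not the code from regexpr.pas or from the attached patch; the program and variable names are made up for illustration. Move() takes a byte count, so passing Length() of a WideString copies only half of the data:

```pascal
program MoveCountDemo;
{$mode objfpc}{$H+}

var
  Src, Wrong, Right: WideString;
  N: SizeInt;
begin
  Src := 'regexp';
  N := Length(Src);   // number of WideChars, not bytes

  // Buggy pattern: a character count is passed where Move() expects bytes,
  // so only the first N bytes (half of the characters) are copied.
  SetLength(Wrong, N);
  FillChar(Wrong[1], N * SizeOf(WideChar), 0);
  Move(Src[1], Wrong[1], N);

  // Corrected pattern: scale the count by the size of one character.
  SetLength(Right, N);
  Move(Src[1], Right[1], N * SizeOf(WideChar));

  WriteLn('char count copy equals source: ', Wrong = Src);
  WriteLn('byte count copy equals source: ', Right = Src);
end.
```

With AnsiString the two counts coincide because SizeOf(Char) is 1 there, which would explain why a mistake like this stays hidden until the UniCode define is enabled.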

Ludo



Re: [fpc-pascal] WideString and TRegexpr

2011-12-04 Thread Felipe Monteiro de Carvalho
On Sun, Dec 4, 2011 at 8:30 AM, Mark Morgan Lloyd wrote:
> Reported as issue 0020806. I'm not able to test using a recent Delphi, or on
> a 64-bit CPU.

I think you could report directly to sorokin too. If he fixes it, then
we can merge the fix here.

regexpr is a nearly unchanged copy from sorokin, except for the
alignment patch which I have sent to sorokin.

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Mark Morgan Lloyd

Felipe Monteiro de Carvalho wrote:
> I have no idea about your question, but it might be interesting for
> you to know, in case you don't already, the wiki page about the
> new Regexpr in FPC: http://wiki.lazarus.freepascal.org/Regexpr
>
> > It almost works: Exec() is OK (at least with trunk, I might have seen a
> > problem with a version a few weeks old) but Match[] isn't.
>
> That code does not change so often: Last commit in regexpr.pas 3 months ago:
>
> http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/packages/regexpr/src/


Almost certainly an endianness issue, since ARM behaves the same as x86 
(and SPARC behaves the same as PPC).


Reported as issue 0020806. I'm not able to test using a recent Delphi, 
or on a 64-bit CPU.


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Frank Church
On 3 December 2011 16:29, Felipe Monteiro de Carvalho <
felipemonteiro.carva...@gmail.com> wrote:

> I have no idea about your question, but it might be interesting for
> you to know, in case you don't already, the wiki page about the
> new Regexpr in FPC: http://wiki.lazarus.freepascal.org/Regexpr
>
> >  It almost works: Exec() is OK (at least with trunk, I might have seen a
> problem with a version a few weeks old) but Match[] isn't.
>
> That code does not change so often: Last commit in regexpr.pas 3 months
> ago:
>
> http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/packages/regexpr/src/
>
> --
> Felipe Monteiro de Carvalho

I think the wiki entry should contain some more info about the Sorokin
regexpr and a link to http://regexpstudio.com/.

There isn't much info about regular expressions on the wiki itself.

-- 
Frank Church

===
http://devblog.brahmancreations.com

Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Felipe Monteiro de Carvalho
I have no idea about your question, but it might be interesting for
you to know, in case you don't already, the wiki page about the
new Regexpr in FPC: http://wiki.lazarus.freepascal.org/Regexpr

>  It almost works: Exec() is OK (at least with trunk, I might have seen a 
> problem with a version a few weeks old) but Match[] isn't.

That code does not change so often: Last commit in regexpr.pas 3 months ago:

http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/packages/regexpr/src/

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Mark Morgan Lloyd

Felipe Monteiro de Carvalho wrote:
> On Thu, Dec 1, 2011 at 7:48 PM, Mark Morgan Lloyd wrote:
> > Has anybody with experience of WideStrings tried compiling the "new" Regexpr
> > unit to support them?
>
> Which "new" Regexpr? The one in packages/regexpr or another one?
>
> I always thought that the one I added was focused on utf-8, although I
> only used it for ASCII so far.


The Sorokin one. I've been using it for years for ASCII, but it has an 
internal flag to use widestring:


//  Define options for TRegExpr engine
{.$DEFINE UniCode} // Unicode support
..
type
 {$IFDEF UniCode}
 PRegExprChar = PWideChar;
 RegExprString = WideString;
 REChar = WideChar;
 {$ELSE}
 PRegExprChar = PChar;
 RegExprString = AnsiString; //###0.952 was string
 REChar = Char;
 {$ENDIF}

It almost works: Exec() is OK (at least with trunk, I might have seen a 
problem with a version a few weeks old) but Match[] isn't.
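
For reference, a minimal sketch of the Exec()/Match[] usage in question, assuming the RegExpr unit from packages/regexpr is on the unit path. The pattern and input string are made up for illustration; with the UniCode define enabled, the strings passed in and returned are RegExprString, which is WideString according to the declarations above:

```pascal
program MatchDemo;
{$mode objfpc}{$H+}

uses
  RegExpr;

var
  Re: TRegExpr;
begin
  Re := TRegExpr.Create;
  try
    // Two capture groups: a lowercase name and a decimal value.
    Re.Expression := '([a-z]+)=([0-9]+)';
    if Re.Exec('width=42') then
    begin
      // Match[0] is the whole match; Match[1] and Match[2] are the groups.
      WriteLn('whole: ', Re.Match[0]);
      WriteLn('name:  ', Re.Match[1]);
      WriteLn('value: ', Re.Match[2]);
    end
    else
      WriteLn('no match');
  finally
    Re.Free;
  end;
end.
```

The symptom described here corresponds to Exec() returning True while the strings read back from Match[] are not the expected text.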


--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]


Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Felipe Monteiro de Carvalho
On Thu, Dec 1, 2011 at 7:48 PM, Mark Morgan Lloyd wrote:
> Has anybody with experience of WideStrings tried compiling the "new" Regexpr
> unit to support them?

Which "new" Regexpr? The one in packages/regexpr or another one?

I always thought that the one I added was focused on utf-8, although I
only used it for ASCII so far.

-- 
Felipe Monteiro de Carvalho


Re: [fpc-pascal] WideString and TRegexpr

2011-12-03 Thread Mark Morgan Lloyd

Mark Morgan Lloyd wrote:
> Has anybody with experience of WideStrings tried compiling the "new"
> Regexpr unit to support them?
>
> I'm in a position where I could very much benefit from using these, but
> I think that I'm only seeing patterns match for characters <= #$00ff and
> even then am not seeing the match strings returned.


This appears to be an endianness issue: on a little-endian system 
(including x86) the Match[] entries only contain the LS byte of a 
widechar and on a big-endian system (including PPC) they only contain the 
MS byte. The practical result is that things look OK on x86 until the match 
contains a value > #$00ff.
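
A minimal sketch of why a truncated copy would look different on the two kinds of CPU, assuming Free Pascal, a 2-byte WideChar and the compiler-defined ENDIAN_LITTLE symbol. The program name and test value are made up for illustration. It reads the byte stored first in memory for one WideChar, which is the byte that survives when only one of the two bytes is copied:

```pascal
program FirstByteDemo;
{$mode objfpc}{$H+}

var
  W: WideChar;
  B: Byte;
begin
  W := WideChar($0141);   // LS byte is $41, MS byte is $01
  B := PByte(@W)^;        // the byte stored first in memory

  {$IFDEF ENDIAN_LITTLE}
  WriteLn('little-endian: first byte is $', HexStr(B, 2), ' (the LS byte)');
  {$ELSE}
  WriteLn('big-endian: first byte is $', HexStr(B, 2), ' (the MS byte)');
  {$ENDIF}
end.
```

For characters <= #$00ff the MS byte is zero, so on a little-endian machine the surviving LS byte still looks like the right character, which matches the observation that x86 appears fine until a larger value turns up.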


I'm putting test data together for various CPUs and will raise a bug.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]