Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-14 Thread Martok
Am 14.07.2017 um 10:04 schrieb Marco van de Voort:
> In our previous episode, Martok said:
>> There is a fundamental difference in the type system between a somewhat 
>> sensible
>> (if unexpected) assumption in FPC and a more practical documented definition 
>> in
>> every other Pascal compiler. An assumption that even FPC follows only in this
>> one single spot.
>> This is unexpected and breaks unrelated code. That's the problem.
> 
> Other Pascals don't have sparse enums?

Wait, what do sparse enums have to do with any of that?

But: yes, they do.
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-13 Thread Martok
Am 13.07.2017 um 22:24 schrieb Marco van de Voort:
> Personally I think the input validation angle to justify checking enums is
> dragged-by-the-hairs. 
I completely agree with you on that. Although in a different way ;-)

That was just the easily-observable breakage of a common pattern. If anybody
actually read what I wrote after Florian clarified the actual issue, I already
narrowed it down to 'simple' compatibility and self-consistency.

There is a fundamental difference in the type system between a somewhat sensible
(if unexpected) assumption in FPC and a more practical documented definition in
every other Pascal compiler. An assumption that even FPC follows only in this
one single spot.
This is unexpected and breaks unrelated code. That's the problem.


Good night,

Martok



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-13 Thread Martok
Hi all,

any new ideas on this issue?

I've been thinking about this a lot, and I do see where you're coming from.
There is some theoretical advantage in treating enums like that. Only one minor
issue: a language with that interpretation does not appear to be Pascal...

You can find some results of my investigations here:
<https://www.entwickler-ecke.de/viewtopic.php?p=707764#707764>
(German-language forum post, but I know many of the core team are German or can
read German anyway; I can provide a translation if you want)

Regardless of whether there may be some argument for this language change, I'm
still a firm believer in "don't surprise the user". There is literally no
precedent that this simplification has ever been done in any Pascal compiler
(quite the contrary), and there is no written hint that FPC does it either.
Basically, if people with some 30-ish years of experience (and always keeping up
with current language extensions) write that, I think we may have an issue here:

> In TP with {$R+}, an out-of-range aValue would produce a RangeCheckError.
> 
> With {$R-} it would not, at least as long as the data type is not overrun {$Z..}.
> 
> Accordingly, the jump into the else branch should always be unambiguously
> defined.
> 
> I would consider any other reaction a security problem; Pascal would then no
> longer have any advantage.


I also read all of ncg*.pas again with respect to range simplifications, and it
turns out that there really is only one instance where we simplify to undefined
behaviour: tcgcasenode. tcginnode just produces the else-branch faster for
x>=high(setbasetype) (without bittests), but is still defined. All others work
with the base integer type only.
Point is: there is really no unrelated side effect at all if we were to align
FPC with all the other Pascals out there.


Kind regards,

Martok





Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-05 Thread Martok
Hi all,

Am 02.07.2017 um 22:02 schrieb Florian Klämpfl:
> Am 02.07.2017 um 21:40 schrieb Martok:
>> Honestly, I still don't understand why we're even having this discussion.
> Because it is a fundamental question: if there is any defined behavior 
> possible if a variable
> contains an invalid value. I consider a value outside of the declared range 
> as invalid
So, as this is the core of all this, I have spent the last few days asking
various users of pascal languages in different compilers, intentionally without
telling them what this was about. Not a single one considered out-of-range
ordinal values as something bad (though not terribly useful), especially not
causing undefined behaviour: all assumed that they would continue to behave like
ordinals in comparisons.

Something I hadn't known, and which I find quite funny: that group apparently
includes Anders Hejlsberg, who wrote the original Turbo Pascal compiler and
years later specifically defined C# enums contrary to your assumption. In fact,
this entire thread's topic is an actual example in the language reference:
<https://docs.microsoft.com/en-gb/dotnet/csharp/language-reference/keywords/enum>

I haven't yet told all of them why I asked (one set of answers comes from a
forum thread that I don't want to spoil yet, maybe tomorrow evening), but those
who I asked in private all have at some point written code that relies on that
concept and are "irritated" why that wouldn't work in FPC.

All that seems to leave only one conclusion...


Kind regards,

Martok


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-03 Thread Martok
Good morning!

Am 02.07.2017 um 22:02 schrieb Florian Klämpfl:
> Am 02.07.2017 um 21:40 schrieb Martok:
>> Honestly, I still don't understand why we're even having this discussion.
> 
> Because it is a fundamental question: if there is any defined behavior 
> possible if a variable
> contains an invalid value. _I consider a value outside of the declared range 
> as invalid_,
(emphasis mine)
And this is where you disagree with Borland's explicit documentation, Borland's
implicit extensions via consistent compiler behaviour, and with at least ISO
7185:1990 (that revision has no concept of range checks, and explicitly allows
all operations other than constant assignment to exceed a subrange type).
If this is what you always had in mind for the FPC dialect, fair enough. It is
your project, after all :-) I shall submit appropriate change requests for the
documentation, as well as for several other simplifications for all other
conditionals except CASE..OF that then become possible. I will also submit
another set of change requests to *not* do that in modes TP, DELPHI and ISO for
code compatibility reasons. Probably a 'modeswitch strictenums' or something
like that.

To remind you: CASE..OF is currently the only statement that casts this concept
into code (grep -R getrange compiler/*). Everything else is consistent and
compatible.


Regards,

Martok

PS: starting a mail with "good morning" looks rather stupid if one then spends
two hours writing it. Hm.


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Am 02.07.2017 um 19:47 schrieb Florian Klämpfl:
> Am 02.07.2017 um 19:29 schrieb Martok:
>>   type Percentile = 1..99;
>>   var I: Percentile;
>>   begin
>>     I:= 99;
>>     inc(I);   // I is now 100
> 
> Forgot the mention:
> Tried with $r+ :)?
That case is also documented. RTE in {$R+}, legal in {$R-}. That also means that
while you could make assumptions about the content in {$R+} (Delphi does not*),
you definitely cannot as soon as there is a single write in {$R-}. A C++
compiler could probably try tracing that using constness of variables and
parameters, but we cannot, and so must be defensive.

*) Even FPC makes no such assumptions in all other instances!

type
  TF = 1..25;
var
  t: TF;
begin
  t:= TF(200);
  if t in [1..50] then  // tautology!
    Writeln('a')
  else
    writeln('b');

What does that print?
Yeah. As documented.
Check the codegen in R+: the if is still fully generated.
Only tcgcasenode does something else.


Honestly, I still don't understand why we're even having this discussion.
We're not talking about adding a new check - only not leaving one out that is
already there 99% of the time.
We're not talking about standardising some new behaviour - Borland did that
decades ago.
The correct behaviour is already documented in every Pascal language reference
(partly including our own), and is also the intuitive one.

I just don't get it. Why would you sacrifice the runtime safety, or, if you
prefer, the code compatibility, of your compiler over an (arguably wrong in at
least 2 modes) specific technicality of the type system that is adhered to
nowhere else?


Taking a break for now. Grading a thesis starts to sound like good relaxation.

Kind regards,

Martok







Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Am 02.07.2017 um 20:29 schrieb Ondrej Pokorny:
> On 02.07.2017 20:23, Florian Klämpfl wrote:
>> And the compiler writes no warning during compilation?
> 
> It does indeed.
But about something else.
Can we please stop derailing from the main issue here?


> If we get a convenient way to assign ordinal to enum with range checks, 
> everything will be fine :)
No it will not, we still can no longer elegantly pass/receive enums to/from
libraries from other compilers.
But at least it would be defined then, so programmers would know this is an
incompatibility.




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
> They are:
> http://docwiki.embarcadero.com/Libraries/XE5/en/System.Boolean
That prototype is a recent invention, it wasn't there in older versions. Also
the text sounds quite different somewhere else:
http://docwiki.embarcadero.com/RADStudio/XE5/en/Simple_Types#Boolean_Types

> Yes. What I wanted to point out: also delphi does optimizations on enums 
> which fails if one feeds
> invalid values.
Okay, if you want to believe that Booleans are enums:

  b:=boolean(42);
  if not b then
    writeln('falsy')
  else
    writeln('truthy');

Prints truthy. Doesn't crash.





Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Booleans are not enums in Delphi (not even ordinals), but their own little
thing. "if boolean_expr" is always a jz/jnz, no matter what. They are defined as
0=FALSE and "everything else"=TRUE

However:

var
  b : boolean;
begin
  b:=boolean(3);
  if b = True then
    writeln(true)
  else if b = False then
    writeln(false)
  else
    writeln(ord(b));
end.

That writes 3, which is why you should never compare on the boolean lexicals.
Some Winapi functions returning longbool rely on that.

Wait, that was a trick question, wasn't it?


Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Addendum to this:

> This was also always my intuition that the else block is also triggered for
> invalid enum values (the docs even literally say that, "If none of the case
> constants match the expression value") - and it *is* true in Delphi.
There is a reason why this is true in Delphi: because this is the way it has
been documented in Borland products for at least 25 years!

I have checked with the TP7 language reference (it pays to keep books around),
which defines the following things:
 - Enumeration element names are implicitly defined as typed constants of their
enum type
 - The enum type is either Byte (<=256 elements) or Word.
 - Subrange types are defined as the smallest type that can contain their range
 - Case statements execute the statements of the matching case label, or the
else block otherwise

Note that they actually defined enumerations as what I called 'fancy constants'
before.


The Delphi 4 language reference (also in book form, which is a bit more detailed
than what is in the .hlp files) uses more precise language:
 - Enumeration element names are implicitly defined as typed constants of their
enum type
 - The enum type is either Byte, Word, or Longword, depending on $Z and element
count
 - Subrange types are defined as the smallest type that can contain their range
 - it is legal to inc/dec outside of a subrange, example from the book:
   type Percentile = 1..99;
   var I: Percentile;
   begin
     I:= 99;
     inc(I);   // I is now 100
   So if this is a legal statement, subrange types can contain values outside of
their range. The description in the German version is "Die Variable wird in
ihren Basistyp umgewandelt", the variable becomes its base type.
 - Case statements execute *precisely one* of their branches: the statements of
the matching case label, or the else block otherwise

So, in D4, we have enums as fancy constants, subrange-types are not safe (so
enums can also never be), and case statements cannot fail.


FPC's language reference has no formal definition of what enums or subranges
really are, and the same language as TP7 regarding case statements.


So at least in modes TP and DELPHI, the optimisation in question is formally 
wrong.



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
> Yes, checking the data. I can easily create a similar problem as above with 
> the "range checks" for
> the jump table by reading a negative value into the enum. Unfortunately, the 
> checks are unsigned ...
Actually, fun fact, *fortunately* the checks are unsigned. Having a negative
value (or generally a value before the first element when that has a value
assignment) underflows on the check, and so gets caught by the CMP/JA as well.
Yes, I tried that, your code is safer than you think ;-)

Also, for sparse enums the "gaps" are filled with pointers to else-block, so the
check that is already there turns out to be always safe.

enum = (ela = 5, elb, elc, eld, ele);

1) enum value too small ( = 1):

   mov    1,%al
   sub    5,%al           # al = -4 = $fc
   cmp    (9-5),%al       # $fc > 4
   ja     $#ELSE-BLOCK    # branches
   and    $ff,%eax
   jmp    *0x40c000(,%eax,4)

2) enum value in range or in gap (= 7 = elc)

   mov    7,%al
   sub    5,%al           # al = 2
   cmp    (9-5),%al       # 2 <= 4
   ja     $#ELSE-BLOCK    # no branch
   and    $ff,%eax
   jmp    *0x40c000(,%eax,4)

3) enum value too large ( = 20)

   mov    20,%al
   sub    5,%al           # al = 15
   cmp    (9-5),%al       # 15 > 4
   ja     $#ELSE-BLOCK    # branches
   and    $ff,%eax
   jmp    *0x40c000(,%eax,4)


Same thing on x86_64, where instead of al and eax we use eax and rax, with the
same underflow characteristics.
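
The three cases can also be replayed in Pascal. This is only a sketch of the arithmetic, not of the compiler's code: the constants come from the example enum above, and the function name TakesElseBranch is made up for illustration.

```pascal
program JumpTableCheck;
{$mode objfpc}

const
  MinLabel = 5;  // ord(ela), lowest case label in the example
  MaxLabel = 9;  // ord(ele), highest case label in the example

// Mirrors sub/cmp/ja on an 8-bit register: subtract the lowest label with
// wrap-around, then do one unsigned comparison against the table size.
function TakesElseBranch(Value: Byte): Boolean;
var
  Index: Byte;
begin
  Index := Byte((Value - MinLabel) and $FF);
  Result := Index > (MaxLabel - MinLabel);
end;

begin
  WriteLn('1  -> else: ', TakesElseBranch(1));   // index wraps to $FC
  WriteLn('7  -> else: ', TakesElseBranch(7));   // index 2, inside the table
  WriteLn('20 -> else: ', TakesElseBranch(20));  // index 15, beyond the table
end.
```

A value below the first label wraps to a large unsigned index, so the single comparison catches both directions.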



Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
Am 02.07.2017 um 10:40 schrieb Michael Van Canneyt:
> These cases are without exception covered by the " unchecked (aka explicit)
> typecast," part of Jonas's statement. Including Read(File).

Aye, that was kinda my point ;)
It is really hard to write code that interacts with the outside world without
having a validation problem.
If the validation code then breaks because the compiler thinks it's clever...

Am 02.07.2017 um 10:29 schrieb Michael Van Canneyt:
> GetEnumName from typinfo will already do this for you.
> We could add an additional function that just returns true or false.
> Something as
> function ValueInEnumRange(TypeInfo : PTypeInfo; AValue : Integer) : boolean;

Enum Typeinfo is horribly broken in so many ways, except for the one simple case
needed for published properties. It definitely cannot be used in its current form.


That, probably (not sure about the timeline, but it makes sense to me), is part
of the core issue: Enumeration types have become way more powerful since this
optimization was introduced. Back then, nobody would have translated a C library
enum typedef as an enumerated type - simply because we didn't have sparse enums
then. Now we do, and so it is possible to use the typesafe way -- only that it
turns out to be less safe than a byte variable and some untyped constants.




Re: [fpc-devel] Dangerous optimization in CASE..OF

2017-07-02 Thread Martok
> Is this made safe by always having an else/otherwise? If so, could the 
> compiler at least raise a warning if an enumeration was sparse but there 
> was no else/otherwise to catch unexpected cases?
Interestingly, not in FPC.

This was also always my intuition that the else block is also triggered for
invalid enum values (the docs even literally say that, "If none of the case
constants match the expression value") - and it *is* true in Delphi. In FPC it
is also mostly true, unless you happen to fall into this optimisation.



[fpc-devel] Dangerous optimization in CASE..OF

2017-07-01 Thread Martok
by
'rules-as-written' that is exactly what we should have assumed anyway.

> Just compile with {$RANGECHECKS ON}, then!
I've been trying really hard for the past couple of hours, but I haven't gotten
the compiler to emit a single check at all when doing anything with enums. And
even if, a runtime error is usually not what you want. You'd probably want to
tell the user a file is corrupt instead of killing the program...

> But that still doesn't mean we have to worry about any of this in the
> codegen for CASE..OF - just tell the programmer to manually check their input
> after reading from wherever!
Well, yeah, except... there is no way to do that.
  if EnumValue in [low(TEnumType)..high(TEnumType)] then
will not work for sparse enums or a basetype larger than Byte, and
  case EnumValue of
    All,
    Expected,
    Values : doSomething;
  else
    raise EFileError.Create('Invalid data');
  end;
will obviously also not work because this is just what we're trying to do here
in the first place (NB: this was my original use case).
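
For comparison, a minimal sketch of the only variant that stays inside the rules as interpreted by the compiler: check the raw ordinal while it is still an integer, and convert afterwards. The enum TPacketKind and the function TryOrdToPacketKind are hypothetical names for illustration. This does not help once a value has already arrived in an enum-typed variable, for example from a library built with another compiler.

```pascal
program ValidateEnum;
{$mode objfpc}

type
  // Hypothetical sparse enum standing in for a type mirrored from a file format.
  TPacketKind = (pkHeader = 1, pkData = 4, pkFooter = 9);

// Checks the raw ordinal against the declared values before any conversion,
// so no variable of the enum type ever holds an undeclared value.
function TryOrdToPacketKind(Raw: LongInt; out Kind: TPacketKind): Boolean;
begin
  Kind := Low(TPacketKind);
  Result := True;
  // The selector is a plain integer, so the else branch covers every other value.
  case Raw of
    Ord(pkHeader): Kind := pkHeader;
    Ord(pkData):   Kind := pkData;
    Ord(pkFooter): Kind := pkFooter;
  else
    Result := False;
  end;
end;

var
  Kind: TPacketKind;
begin
  if TryOrdToPacketKind(4, Kind) then
    WriteLn('valid, ordinal ', Ord(Kind))
  else
    WriteLn('invalid data');
  if not TryOrdToPacketKind(5, Kind) then
    WriteLn('5 falls into a gap and is rejected');
end.
```

The price is that every declared element has to be listed by hand a second time, which is exactly the kind of duplication an enumerated type is supposed to avoid.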

So, we have a problem here: either the type system is broken because we can put
stuff in a type without being able to check if it actually belongs there, or
Tcgcasenode is broken because it (and _only_ it, as far as I can see) wants to
be clever by omitting an essentially free check for very little benefit.
I know which interpretation I would choose: the one with the easier fix ;-)


I would very much like for someone to at least acknowledge that there is a
problem here, because I can think of several more or less clever fixes for that
(except the obvious) and would prefer discussing these instead of having to
prove that "not-as-defined"-behaviour is not the same as "undefined behaviour".


Kind regards,

Martok


Re: [fpc-devel] Data flow analysis (dfa) and "case ... of"

2017-07-01 Thread Martok
The attitude displayed over on #32079 is, quite frankly, terrifying. Apparently
a language which from the beginning has intrinsics for reading and writing files
must never be used for doing so, or wild things may happen /and that's okay/.

Implying that input should already be sanitized on a bug about something that
breaks input sanitation code (but only sometimes) is just... wow.

If anybody wants it, here's the patch I'll be rolling on the windows snapshots
from now on.


Have a good weekend,
Martok
Index: compiler/ncgset.pas
===
--- compiler/ncgset.pas (revision 36620)
+++ compiler/ncgset.pas (working copy)
@@ -1080,7 +1080,7 @@
labelcnt:=case_count_labels(labels);
{ can we omit the range check of the jump table ? }
getrange(left.resultdef,lv,hv);
-   jumptable_no_range:=(lv=min_label) and (hv=max_label);
+   jumptable_no_range:=(lv=min_label) and (hv=max_label) and (cs_opt_level4 in current_settings.optimizerswitches) and not (cs_check_range in current_settings.localswitches);
{ hack a little bit, because the range can be greater }
{ than the positive range of a aint}
 


Re: [fpc-devel] Data flow analysis (dfa) and "case ... of"

2017-06-28 Thread Martok
Interestingly, I just ran into "bad" code generation with exactly the properties
discussed in this thread.

Take a function like this:

function SignatureSubpacketTypeToStr(const X: TSignatureSubpacketType): String;
begin
  case X of
    sstReserved00     : Result:= 'Reserved00';
    sstReserved01     : Result:= 'Reserved01';
    sstCreationTime   : Result:= 'CreationTime';


Because every declared element is covered, the generated code for it ends up
being a computed goto:

   0x10047c4c <+28>:    mov    -0x4(%ebp),%al
   0x10047c4f <+31>:    and    $0xff,%eax
   0x10047c54 <+36>:    jmp    *0x10071d08(,%eax,4)

Which is perfectly fine if X is guaranteed to be in range of the elements the
case statement matches to. If it is not, as it may be with invalid input data
(as read from a file), that jump goes somewhere undefined - and most
importantly, not into any else statement.

So, while we have code that looks like Result is always properly initialized,
what we get instead is code that doesn't actually work. And no kind of DFA could
detect that, except also range-checking everything.

Just thought I'd share that, as a less synthetic example than some discussed 
here.


Regards,
Martok



Re: [fpc-devel] UTF-8 string literals

2017-05-11 Thread Martok
That's the one I also think Sven was talking about.
I just searched for "Unicode". Michael's proposal comes up, but I guess the
title is fairly obvious.


But apparently everything is rainbows and unicorns and there is absolutely no
problem with the documentation at all, so I guess this week-long discussion here
never happened anyway.


Martok

Am 10.05.2017 um 08:38 schrieb Mattias Gaertner:
> On Tue, 9 May 2017 14:59:16 +0200
> Michael Schnell  wrote:
> 
>> On 06.05.2017 09:39, Sven Barth via fpc-devel wrote:
>>> That might be the one from Michael Schnell.  
>> Very unlikely, as this text does not mention anything about how a source 
>> file byte sequence is converted in a String constant / literal.
> 
> I think he meant this one:
> http://wiki.lazarus.freepascal.org/index.php?title=not_Delphi_compatible_enhancement_for_Unicode_Support&action=history
> 
> I thought Mschnell is Michael Schnell. Was this wrong?
> 
> Mattias




Re: [fpc-devel] UTF-8 string literals

2017-05-08 Thread Martok
> That might be the one from Michael Schnell. Probably it should be marked with 
> a
> big, fat warning that it's merely a user's suggestion and nothing official.
Not even that. This one looks relatively obvious to me ;)

I've filed a bug as <https://bugs.freepascal.org/view.php?id=31758> for 
reference.


Martok



Re: [fpc-devel] UTF-8 string literals

2017-05-05 Thread Martok

> You should weigh the advantages you outline here against the disadvantages of
> no longer knowing how string literals will be encoded.
As a programmer, either I don't want to know (declared const without giving
explicit type) or I do, then I did declare it correctly:

{$codepage utf8}
var u: UTF8String = 'äöüالعَرَبِيَّة';
  -> UTF8String containing the characters I entered in the source file (in this
case(!!) just 1:1 copy).

{$codepage utf8}
var u: UCS4String= 'äöü';
  -> UCS4 encoded version, either 000000e4 000000f6 000000fc or the equivalent
with combining characters

There should probably be an error if the characters I typed don't actually exist
in the declared type (emoji in an UCS2String), but otherwise, there's no good
reason why that shouldn't "just work".

> It means e.g. the resource string tables will have entries that are UTF16 
> encoded
> or entries that are UTF8 encoded, depending on the unit they come from. 
> This is highly undesirable.
Always convert from "unit CP" to UTF8 (or UTF16 if some binary compat is
required), done. Aren't they just internal anyway?

> By forcing everything UTF16 we ensure delphi compatibility (yes it does 
> matter) 
> and we also ensure a uniform set of string tables.
If that was what happened, ok. But from the error message Mattias listed as (1)
I would assume that the actual string type is UCS2String, at least at some point
in the process.

Just my 2 cents...

Martok

PS: adding to the discussion over on the Lazarus ML: I just found a fourth wiki
page describing a slightly different Unicode support. This is getting 
ridiculous.




Re: [fpc-devel] LineInfo

2017-04-03 Thread Martok
>> Does it is possible that the LineInfo trace (-gl option) are broken (no 
>> output)
>> in 3.0.2 version on Linux (at least)?
> 
> Hm. Indeed. I can reproduce the issue :/
AFAIR lineinfo.pp only works with Stabs? Didn't the default change to Dwarf?


Martok







Re: [fpc-devel] Optimization of redundant mov's

2017-03-20 Thread Martok
Hi,

> It's called register spilling: once there are no registers left to hold
> values, the compiler has to pick registers whose value will be kept in
> memory instead.
I thought it would be something like that...

Still, my main issue was with the repeated fetches. I'd (naively!) say that it
should be relatively easy for an assembly-level optimizer to detect that these
are repeated loads of the same thing, with nothing that could affect the outcome
in between. It's not even a CSE in the technical sense, not a sub-expression but
the entire thing...

> E.g. those memory loads
> are probably optimised by the processor itself (not necessarily coming
> even from the L1 cache, but possibly from the write-back buffer).
Not as well as one might believe, manually fixing (by forcing @CurrentHash into
a register with a local variable) just those 4 lines gives a ~2% increase in
MB/s for this hash. Which is quite a lot, given this is the part *without*
actual computations.

And again, I've seen this happen more than once on i386 code, where it even
creates "fake" register pressure (by using 2 or more registers to hold exactly
the same temporary) that makes the rest of the code worse than it could be.
As a ballpark: the same change as above results in a 10% speedup by freeing up 2
registers (all-int64 operations on i386, so 2 regs needed for everything, having
one more is very noticeable...)

It just strikes me as odd to have some rather good local code but then just
pointlessly add the second-most expensive operation in between ;-)
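
For reference, the manual fix mentioned above looks roughly like this. It is a minimal sketch: TDemoHash, THashState and SumFirstFour are made-up stand-ins for the real class, and whether the local pointer actually ends up in a register depends on the compiler version and settings.

```pascal
program CacheFieldPointer;
{$mode objfpc}

type
  THashState = array[0..7] of QWord;
  PHashState = ^THashState;

  // Hypothetical stand-in for the hash class discussed in this thread.
  TDemoHash = class
  private
    CurrentHash: THashState;
  public
    constructor Create;
    function SumFirstFour: QWord;
  end;

constructor TDemoHash.Create;
var
  i: Integer;
begin
  inherited Create;
  for i := Low(CurrentHash) to High(CurrentHash) do
    CurrentHash[i] := QWord(i) + 1;
end;

function TDemoHash.SumFirstFour: QWord;
var
  H: PHashState;
  a, b, c, d: QWord;
begin
  // Take the address of the field once; all loads below go through this local.
  H := @CurrentHash;
  a := H^[0]; b := H^[1]; c := H^[2]; d := H^[3];
  Result := a + b + c + d;
end;

var
  Hash: TDemoHash;
begin
  Hash := TDemoHash.Create;
  try
    WriteLn(Hash.SumFirstFour);
  finally
    Hash.Free;
  end;
end.
```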


Regards,

Martok





[fpc-devel] Optimization of redundant mov's

2017-03-19 Thread Martok
Hi all,

there has been some discussion about FPC's optimizer in #31444, prompting me to
investigate some of my own code. Generally speaking the generated assembler is
not all that bad (I like how it uses LEA for almost all integer arithmetic),
but I keep seeing sections with lots of redundant MOVs.

Example, from a SHA512 implementation:
CurrentHash is a field of the current class, compiled with anything above -O2,
-CpCOREAVX2, -Px86_64.

 a:= CurrentHash[0]; b:= CurrentHash[1]; c:= CurrentHash[2]; d:= CurrentHash[3];
0000000100074943 488b8424a0020000 mov    0x2a0(%rsp),%rax
000000010007494B 4c8b5038         mov    0x38(%rax),%r10
000000010007494F 488b8424a0020000 mov    0x2a0(%rsp),%rax
0000000100074957 4c8b5840         mov    0x40(%rax),%r11
000000010007495B 488b9424a0020000 mov    0x2a0(%rsp),%rdx
0000000100074963 488b4248         mov    0x48(%rdx),%rax
0000000100074967 488b9424a0020000 mov    0x2a0(%rsp),%rdx
000000010007496F 488b6a50         mov    0x50(%rdx),%rbp

Every single one of the "mov 0x2a0(%rsp), %rxx" instructions except the first is
redundant and causes another memory round-trip. At the same time, more registers
are used, which probably makes other optimizations more difficult, especially
when something similar happens on i386.

Now, the fun part: I haven't been able to build a simple test that causes the
same issue (the self-pointer already is in %rcx and not fetched from the stack
each time), so I have a feeling this may be a side effect of some other part of
the code.

Does this sound familiar to anyone? If so, what could I do about it?


Regards,

Martok



[fpc-devel] Generics: issue with double inline specialization

2016-09-30 Thread Martok
Hi everyone,

I had already reported the issue as
<http://bugs.freepascal.org/view.php?id=30626>, but as this problem is currently
blocking a clean solution in a project for us, I'm asking for help again here.

The problem appears to be that when a generic uses its type parameter to
inline-specialize another generic (ie. inherit from it), that generic cannot be
specialized anywhere else (ie. return value of that type) or a name collision
occurs. Not really sure why that happens, but for some reason the compiler
doesn't recognize that the two instances don't have the same name by accident
but really are the same type.

Do you have any idea what could be done to work around this?

Thank you in advance,

Martok



Re: [fpc-devel] Overflow in TMemoryStream?

2016-09-14 Thread Martok

> I have committed a patch. Please test and report if it is fixed.
> I don't have a 32-bit system available to test on...
Tested on win32: the overflow is fixed, 500M gets incremented by 125M.

I think the RunError is caused by the way ReallocMem works: growing from 869M to
1086M seems to be done by allocating the new area and copying, so there's a
period of almost all of the 2G heap used...




Re: [fpc-devel] Overflow in TMemoryStream?

2016-09-11 Thread Martok
Hi,

yes, I can confirm this as an overflow, but on its own, it should be safe. Above
430MB, the stream doesn't grow by a quarter but just by however much was
requested, luckily the branch fails before the wrong capacity could be set.

Test:
type
  TMS2 = class(TMemoryStream) end;
var
  ms: TMS2;
  ds: Int64;
begin
  ds:= 100*1000*1000;
  ms:= TMS2.Create;
  ms.SetSize(ds);
  WriteLn(ds:15,' ', ms.Size:15, ' ', ms.Capacity:15);
  inc(ds, ds div 10); // grow by less than 25%
  ms.SetSize(ds);
  WriteLn(ds:15,' ', ms.Size:15, ' ', ms.Capacity:15);
end.

with ds=100M, prints:
      100000000       100000000       100003840
      110000000       110000000       125005824   << grew by 1/4*100M

with ds=500M, prints:
      500000000       500000000       500002816
      550000000       550000000       550002688   << bug, grew by 1/10*500M

However, with ds=869M, prints:
      869000000       869000000       869003264
      955900000 18666185569013440       955904000
and mostly crashes with Runtime Error 203 except when I'm step-by-step-debugging
it...
That looks like a *separate* overflow to me, probably caused by the wild mix of
Int64 and Longint that our Streams inherited from Delphi...

I don't have RTL built with full symbols right now, maybe someone else can
investigate?
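
The 430MB threshold matches a 32-bit signed overflow of 5*Capacity (2^31 / 5 is about 429.5M). A minimal sketch of that arithmetic follows, under the assumption that the growth check has the shape "raise the request to (5*OldCap) div 4 if it is smaller"; I have not copied this from the RTL source. WrapToInt32 and GrownCapacity are made-up names, and the block rounding visible in the real capacities is not modelled.

```pascal
program QuarterGrowthOverflow;
{$mode objfpc}

// Reduces a 64-bit intermediate to what a signed 32-bit register would hold.
function WrapToInt32(X: Int64): Int64;
begin
  X := X and $FFFFFFFF;
  if X >= $80000000 then
    X := X - $100000000;
  Result := X;
end;

// Assumed shape of the check: grow to 5/4 of the old capacity when the
// requested size is larger than the old capacity but below that target.
function GrownCapacity(OldCap, Requested: Int64): Int64;
var
  FiveQuarters: Int64;
begin
  FiveQuarters := WrapToInt32(5 * OldCap) div 4;
  if (Requested > OldCap) and (Requested < FiveQuarters) then
    Result := FiveQuarters
  else
    Result := Requested;
end;

begin
  // 5 * 100M fits into 32 bits, so the request is raised to 125M.
  WriteLn(GrownCapacity(100000000, 110000000));
  // 5 * 500M wraps to a negative value, the comparison fails,
  // and only the requested 550M is used.
  WriteLn(GrownCapacity(500000000, 550000000));
end.
```

In this model the wrapped value is never stored as a capacity, because the comparison against a negative target is always false. That is consistent with the first two test runs; it says nothing about the separate failure at 869M.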


Martok


