Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Jonas Maebe via fpc-devel




On 2023-04-11 07:23, Hairy Pixels via fpc-devel wrote:
Strange question but I heard it brought up during a discussion and I 
was very curious what compiler people think about it.


The question is, instead of using a virtual method table would it be 
feasible to use a case statement on the real class type and make a 
dispatch routine for each method call?


This is sometimes done by compilers when value profiling determines that 
a particular function pointer call (almost) always goes to the same 
target.


It can also be done as part of whole program optimisation, as Marco 
mentioned. FWIW, when WPO devirtualisation is applied to the compiler 
(which can be devirtualised almost completely because of the way it is 
constructed), LLVM can apply your proposed optimisation (and some 
others) and gets 8% extra performance. So don't expect miracles from it, 
unless your main loop consists of only calling tiny virtual methods.



Jonas
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Marco van de Voort via fpc-devel


On 11-4-2023 12:11, Hairy Pixels via fpc-devel wrote:

Btw, I was curious because I haven’t done this in so many years but is this 
basically how a VTable looks in procedural code?


Every class has a reference to class type information stored at negative 
offsets. The class type info contains VMT, IVMT, the class name etc.  
The table is compile time (iow a structured constant in procedural 
pascal) and the layout of that table is roughly  described in objpash.inc:


 const
   vmtInstanceSize = 0;
   vmtParent   = sizeof(SizeInt)*2;
   { These were negative value's, but are now positive, else classes
 couldn't be used with shared linking which copies only all 
data from
 the .global directive and not the data before the directive 
(PFV) }

   vmtClassName    = vmtParent+sizeof(pointer);
   vmtDynamicTable = vmtParent+sizeof(pointer)*2;
   vmtMethodTable  = vmtParent+sizeof(pointer)*3;
   vmtFieldTable   = vmtParent+sizeof(pointer)*4;
   vmtTypeInfo = vmtParent+sizeof(pointer)*5;
   vmtInitTable    = vmtParent+sizeof(pointer)*6;
   vmtAutoTable    = vmtParent+sizeof(pointer)*7;
   vmtIntfTable    = vmtParent+sizeof(pointer)*8;
   vmtMsgStrPtr    = vmtParent+sizeof(pointer)*9;
   { methods }
   vmtMethodStart  = vmtParent+sizeof(pointer)*10;
   vmtDestroy  = vmtMethodStart;
   vmtNewInstance  = vmtMethodStart+sizeof(codepointer);
   vmtFreeInstance = vmtMethodStart+sizeof(codepointer)*2;
   vmtSafeCallException    = vmtMethodStart+sizeof(codepointer)*3;
   vmtDefaultHandler   = vmtMethodStart+sizeof(codepointer)*4;
   vmtAfterConstruction    = vmtMethodStart+sizeof(codepointer)*5;
   vmtBeforeDestruction    = vmtMethodStart+sizeof(codepointer)*6;
   vmtDefaultHandlerStr    = vmtMethodStart+sizeof(codepointer)*7;
   vmtDispatch = vmtMethodStart+sizeof(codepointer)*8;
   vmtDispatchStr  = vmtMethodStart+sizeof(codepointer)*9;
   vmtEquals   = vmtMethodStart+sizeof(codepointer)*10;
   vmtGetHashCode  = vmtMethodStart+sizeof(codepointer)*11;
   vmtToString = vmtMethodStart+sizeof(codepointer)*12;

The VMT has one pointer for each virtual method in a class. In e.g. 
TAnimal version of the table those pointers would point to the TAnimal 
copy of the virtual methods or a method that raises an "abstract" 
exception if the method is abstract.  The VMTs are linked (the TDog 
table points to TAnimal in its parent pointer, which in turn points to 
its parent) for the IS and AS functionality, but not for VMTs


The TDog version of the table they point to the TDog version.

So basically the tdog.create would only allocate memory, and initialize 
the first "field" with a pointer to the tdog class structured constants, 
and advance the pointer to the next field and return that as the 
instance pointer.


const ddogclassinfo : array [0..3]of pointer = 
(@parent,@vmtmethod1,@vmtmethod2,@vmtmethod3);


If the class (TDog) overrides the method, it points to the tdog 
reference, if not, this table still lists the TAnimal version of the method.



constructor TDog.Create;

begin

  result:=NewInstance; // allocate memory for the instance

  ppointer(result)^:=@dogclassinfo;

  inc(result,sizeof(pointer));

end;


Dispatch then becomes in simplified Pascal  (assume that "instance" is a 
initialized class)


var vmt : pointer;

vmt:=ppointer(instance)^;

tthemethod(ppointer(vmt)[n]).themethod(parameter);

... with n the number of the respective virtual method, and TTheMethod 
its signature.  Handling all those indexes, signatures and generating 
the parameter loading into registers, stack is the core work of the 
compiler.


Note that Delphi 1 had a different way of dispatch, called "dynamic" 
(the 16-bit compiler had to be much more careful with memory), at the 
expense of performance. IIRC that version kind of worked like your 
scheme, where every class only had a table for the methods it really 
had, leaving the rest up to a function in the parent. I don't recall the 
exact details (I suggest an old Delphi manual or mastering Delphi for 
that).  Free Pascal ignores this directive, and doesn't implement it.




___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Hairy Pixels via fpc-devel


> On Apr 11, 2023, at 4:02 PM, Marco van de Voort via fpc-devel 
>  wrote:
> 
> That's what I thought, yes. But the whole analysis stays the same:
> 
> - you don't have a list of all possible polymorphic types in the application 
> when you compile the average dispatch point, that is in the realm of 
> whole-program optimization.   (I saw your later mail that you understood 
> this, but I already composed this message when it arrived)

Yeah I think that’s kind of kills the idea.

Btw, I was curious because I haven’t done this in so many years but is this 
basically how a VTable looks in procedural code?

The idea is that the record adds all function pointers from all the descendants 
and fills them out with local implementation as pseudo-overriding.



{$mode objfpc}

program procedural_oop;
uses
UDog, UAnimal;

var
dog: PDog;
begin
dog := TDog_Init;
dog^.walk_proc(PAnimal(dog));
dog^.bark_proc(dog);
end.



{$mode objfpc}

unit UDog;
interface
uses
  UAnimal;

type
  PDog = ^TDog;
  TDog = record
// TAnimal methods
walk_proc: procedure(self: PAnimal);
// TDog methods
bark_proc: procedure(self: PDog);
// instance members
sound: string;
  end;

function TDog_Init: PDog;
procedure TDog_Bark(self: PDog);

implementation

procedure TDog_Bark(self: PDog);
begin
  writeln('dog goes ', self^.sound);
end;

function TDog_Init: PDog;
begin
  result := PDog(GetMem(sizeof(TDog)));

  // setup methods
  result^.walk_proc := @TAnimal_Walk;   // use parent implementation
  result^.bark_proc := @TDog_Bark;  // override with current implementation

  // init members
  result^.sound := 'woof!'
end;

end.



{$mode objfpc}

unit UAnimal;
interface

type
  PAnimal = ^TAnimal;
  TAnimal = record
walk_proc: procedure(self: PAnimal);
  end;

function TAnimal_Init: PAnimal;
procedure TAnimal_Walk(self: PAnimal); 

implementation

procedure TAnimal_Walk(self: PAnimal); 
begin
  writeln('animal walks');
end;

function TAnimal_Init: PAnimal;
begin
  result := PAnimal(GetMem(sizeof(TAnimal)));

  // setup methods
  result^.walk_proc := @TAnimal_Walk;
end;

end.

Regards,
Ryan Joseph

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Marco van de Voort via fpc-devel


On 11-4-2023 10:41, Hairy Pixels via fpc-devel wrote:
  

  case animal.type of
TDog: TDog(animal).DoSomething;
TCat: TCat(animal).DoSomething;
TMouse: TMouse(animal).DoSomething;
  end;

This doesn't happen. There is no class that is TDog,Cat and mouse. Usually a 
VMT governs the relation between parent and child, i.e.   TDog and TAnimal.  
The TDog nor the TAnimal implementation has no knowledge of the other mammals.

Address this part first since maybe my example was poor. I was trying to model 
how polymorphism is done in pure procedural code. I used to write code like 
this so I kind of remember.

The idea is that there is a record with an enum which is the type of the 
object. Then anytime you have a polymorphic function (an override) you dispatch 
on it using a case.


That's what I thought, yes. But the whole analysis stays the same:

- you don't have a list of all possible polymorphic types in the 
application when you compile the average dispatch point, that is in the 
realm of whole-program optimization.   (I saw your later mail that you 
understood this, but I already composed this message when it arrived)


- The code is still more bulky. To get the type from the instance 
pointer is as expensive as the getting the vmt, and while the dispatch 
via the VMT is a single instruction on x86 (call *[vmt+offset]), you 
still have to spend a comparison and a branch per case (dog/cat/mouse) 
extra, with as only saving grace that the indirection of the call 
instruction disappears. Both bulkier and slower.


Another issue that I just thought about:

- you now only consider one virtual method per class. But think of more 
complex class hierarchies where various methods are introduced at 
various levels. Every virtual method then gets its own set of possible 
classes, and thus enum. This reduces to one entry per virtual method, 
just like in the VMT, only difference is that it is an enum instead of a 
pointer. (but the VMT pointer table is duplicated in the case statements 
of every dispatch call)


To judge new language schemes, always factor in multi unit examples, and 
ask yourself what you know at each point of the compilation. (e.g. 
typically you only know the interface of an USES'd unit, and not its 
implementation, and not other units). Similarly, the simple case is only 
a step up to also evaluate more elaborate cases. The devil is always in 
the details.




___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Hairy Pixels via fpc-devel


> On Apr 11, 2023, at 2:58 PM, Marco van de Voort via fpc-devel 
>  wrote:
> 
> But the core problem is that you don't have an overview of all other 
> descendants in the compiler. Those can be fragmented over multiple units, and 
> then you touch the problem that whole program optimization like 
> de-virtualization tries to solve.

I didn’t get this at first but I see what you mean. The call site would only 
have access the descendants it knew about that point during compilation. So the 
case may not have a full set of descendants to work from.

I used to write code like this and there was always a way to make things work 
so maybe in practice it’s not as bad as it sounds but the VTable is clearly a 
more general purpose solution.

Regards,
Ryan Joseph

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Hairy Pixels via fpc-devel


> On Apr 11, 2023, at 2:58 PM, Marco van de Voort via fpc-devel 
>  wrote:
> 
>>  case animal.type of
>>TDog: TDog(animal).DoSomething;
>>TCat: TCat(animal).DoSomething;
>>TMouse: TMouse(animal).DoSomething;
>>  end;
> 
> This doesn't happen. There is no class that is TDog,Cat and mouse. Usually a 
> VMT governs the relation between parent and child, i.e.   TDog and TAnimal.  
> The TDog nor the TAnimal implementation has no knowledge of the other mammals.

Address this part first since maybe my example was poor. I was trying to model 
how polymorphism is done in pure procedural code. I used to write code like 
this so I kind of remember.

The idea is that there is a record with an enum which is the type of the 
object. Then anytime you have a polymorphic function (an override) you dispatch 
on it using a case.

Does that make it clearer? I thought the base class TAnimal would have this 
“type” enum set at construction  and it would be available at runtime.

Regards,
Ryan Joseph

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Using case statement instead of VTable

2023-04-11 Thread Marco van de Voort via fpc-devel


On 11-4-2023 07:23, Hairy Pixels via fpc-devel wrote:

Strange question but I heard it brought up during a discussion and I was very 
curious what compiler people think about it.


I hope you don't mind me replying as non compiler person :-)



The question is, instead of using a virtual method table would it be feasible 
to use a case statement on the real class type and make a dispatch routine for 
each method call?

For example assume “type” is the internal type of the class (where a VTable 
would be) and a call to “animal.DoSomething()” is called which would internally 
basically be this dispatch table:

  case animal.type of
TDog: TDog(animal).DoSomething;
TCat: TCat(animal).DoSomething;
TMouse: TMouse(animal).DoSomething;
  end;


This doesn't happen. There is no class that is TDog,Cat and mouse. 
Usually a VMT governs the relation between parent and child, i.e.   TDog 
and TAnimal.  The TDog nor the TAnimal implementation has no knowledge 
of the other mammals.




The reason this was brought up was because virtual methods can’t be inlined and 
this would allow that but I think there would be more overhead overall to have 
all these case statements on every virtual method call.


Much more, keep in mind your example above doesn't even have parameters.

Keep in mind that the virtual call is fairly cheap to begin with, you 
load the VMT from the instance references (first field), and then the 
method pointer at an fixed offset relative to that. Two instructions on 
x86 with its many addressing modes, only slightly more on non-x86, I 
would guess, and no branches.


Your solution would also need to load some reference to the class type, 
(the first instruction) and then a compare and branch for each case.


But the core problem is that you don't have an overview of all other 
descendants in the compiler. Those can be fragmented over multiple 
units, and then you touch the problem that whole program optimization 
like de-virtualization tries to solve.


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel