[fpc-devel] Compiler bottlenecks] (was: Blackfin support)

2010-07-13 Thread Jonas Maebe

On 13 Jul 2010, at 01:46, Hans-Peter Diettrich drdiettri...@aol.com wrote:

> Florian Klaempfl wrote:
>> For me, a much higher priority when doing rewrites might be
>> multithreading of the compiler itself.
>
> That's questionable, depending on the real bottlenecks in compiler operation.
> I suspect that disk I/O is the narrowest bottleneck, that can not be widened
> by parallel processing.

Unless you are doing a cold compile, the main bottlenecks in the compiler are 
the memory manager (mostly the allocation of memory, freeing is faster), 
zero-filling new class instances (and partially resetting the register 
allocator) and tobject.initinstance.

> It also requires further research, for e.g. the determination of the optimal
> number of threads, depending on the currently available resources on a
> concrete machine.

We'd just use the same approach as make: allow the user to specify the number 
of parallel operations.
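
A minimal sketch of this make-style approach, written in Python for brevity
rather than in the compiler's own Pascal. The -j/--jobs option name and the
compile_unit placeholder are assumptions for illustration only; they are not
existing FPC options or routines.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def parse_jobs(argv):
    """Read a make-style -j N option; default to 1 (serial)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-j", "--jobs", type=int, default=1)
    args = parser.parse_args(argv)
    if args.jobs < 1:
        parser.error("--jobs must be at least 1")
    return args.jobs


def compile_unit(name):
    """Hypothetical placeholder for real work; it only tags the unit name."""
    return name + ".ppu"


def compile_all(units, jobs):
    """Run compile_unit over all units with at most `jobs` workers."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # map() returns results in input order, whatever the finish order.
        return list(pool.map(compile_unit, units))
```

The user, not the program, picks the degree of parallelism, so no probing of
the machine's resources is needed.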


Jonas
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Michael Schnell

On 07/13/2010 01:46 AM, Hans-Peter Diettrich wrote:

> That's questionable, depending on the real bottlenecks in compiler
> operation. I suspect that disk I/O is the narrowest bottleneck,

I doubt this. The disk cache does a decent job here. gcc can do this
very effectively on a higher layer, as for each source file gcc is
called separately by make. As FPC internally organizes the unit make
sequence, I suppose internal multithreading needs to be implemented.


-Michael


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Michael Schnell

On 07/12/2010 05:54 PM, Hans-Peter Diettrich wrote:

> M68K machine, which in turn seems to have inherited from the ARM.

I suppose: vice versa :).

> .., but it doesn't allow to support multiple machine back-ends in one
> program.

Do you think it would be an advantage to support multiple archs in a
single compiler executable? I feel that recompiling the compiler when
changing the target CPU is not very harmful.

> I could not find much, and most existing documentation is outdated
> since 2.0 :-(

Of course improvement on that issue would be very desirable :).

-Michael


Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Michael Schnell

On 07/12/2010 06:24 PM, Geoffrey Barton wrote:

> I wrote a procedure to turn on interrupts:-

Are you doing a project without an OS?

-Michael


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Florian Klaempfl
Hans-Peter Diettrich wrote:
>> For me, a much higher priority when doing rewrites might be
>> multithreading of the compiler itself.
>
> That's questionable, depending on the real bottlenecks in compiler
> operation. I suspect that disk I/O is the narrowest bottleneck, that can
> not be widened by parallel processing.

Memory throughput is a bottleneck, I/O not really. So multithreading has
a real advantage on NUMA systems and systems where different cores have
dedicated caches. One or two years ago, I did some experiments with
asynchronous assembler calls and it already significantly improved
compilation times on platforms using an external assembler. The problem
is that the whole compiler is not designed to do so. This could be
solved by an approach we have wanted to implement for years: split the
compilation process into tasks (like parse unit X, load unit Y, code gen
unit X) with dependencies. This should also solve the fundamental
problems with unit loading/compilation that sometimes cause internal
errors. The first step would be to do this without multithreading; later
we could try to execute several tasks in parallel.
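
A minimal sketch of such a task list with dependencies, written in Python
for brevity. It covers only the first, single-threaded step. The task names
and the dependency edges are hypothetical, chosen to follow the examples
above; nothing here reflects actual compiler code.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def run_tasks(deps, run):
    """Run every task after the tasks it depends on.

    deps maps a task name to the set of task names it depends on.
    run is called once per task, serially, in a valid order.
    """
    order = []
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    while sorter.is_active():
        # Everything returned by get_ready() is mutually independent,
        # so a later version could hand this batch to worker threads.
        for task in sorted(sorter.get_ready()):
            run(task)
            order.append(task)
            sorter.done(task)
    return order


# Hypothetical tasks and dependencies, for illustration only.
deps = {
    "parse unit X": {"load unit Y"},
    "code gen unit X": {"parse unit X"},
    "load unit Y": set(),
}
```

Calling run_tasks(deps, print) would print the three tasks in dependency
order. Circular dependencies surface as an explicit error at prepare() time
instead of as a failure somewhere in the middle of a compile.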


Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Geoffrey Barton

yes, trying to :-) It is an embedded LM3S9B92 controller.

Geoffrey

On 13 Jul 2010, at 09:01, Michael Schnell wrote:

> On 07/12/2010 06:24 PM, Geoffrey Barton wrote:
>
>> I wrote a procedure to turn on interrupts:-
>
> Are you doing a project without an OS ?
>
> -Michael


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Marco van de Voort
In our previous episode, Hans-Peter Diettrich said:
>> For me, a much higher priority when doing rewrites might be
>> multithreading of the compiler itself.
>
> That's questionable, depending on the real bottlenecks in compiler
> operation. I suspect that disk I/O is the narrowest bottleneck, that can
> not be widened by parallel processing.

No, that has to be solved by a bigger granularity (compiling more units in
one go).  That avoids ppu reloading and limits directory searching (there is
a cache iirc), freeing up more bandwidth for source loading.

Not only compiling goes in parallel, I assume one could also load a ppu in
parallel? (and so parallelize the blocking time of the disk I/O and the
parsing of the .ppu contents).



Re: [fpc-devel] Blackfin support

2010-07-13 Thread Florian Klaempfl
Marco van de Voort wrote:
> In our previous episode, Hans-Peter Diettrich said:
>>> For me, a much higher priority when doing rewrites might be
>>> multithreading of the compiler itself.
>> That's questionable, depending on the real bottlenecks in compiler
>> operation. I suspect that disk I/O is the narrowest bottleneck, that can
>> not be widened by parallel processing.
>
> No, that has to be solved by a bigger granularity (compiling more units in
> one go).  That avoids ppu reloading and limits directory searching (there is
> a cache iirc), freeing up more bandwidth for source loading.
>
> Not only compiling goes in parallel, I assume one could also load a ppu in
> parallel?

With compiling I meant all tasks the compiler does, even assembling and
linking.


Re: [fpc-devel] Purpose of uses ... in?

2010-07-13 Thread Marco van de Voort
In our previous episode, Jonas Maebe said:
>> Even for portability purposes it often doesn't work, since usually the build
>> systems and files for FPC/Lazarus and Delphi differ anyway (and you noticed
>> the working dir difference)
>
> The working dir difference is a Lazarus difference, not an FPC
> difference. Afaict, that feature works identically in FPC and in Delphi.
>
> Furthermore, at least two of the users have already posted in this
> thread saying that they use this functionality (both in FPC and in
> Delphi). Therefore I don't think it is a good idea to remove or change
> it.

Nobody is talking about removing? It is more a matter of not expanding, and
not guaranteeing too much (more) wrt it. Especially since DoDi in other
posts seemed to state that he wanted to use it to override which unit is
selected in multiple-sources-in-path cases.

So I was not talking about IN in general, but the specific case
that it actually contains paths (IOW not unit renaming).

(btw afaik the consequence of IN (but also of allowing multiple casings in
general) is that we don't use the OS routines to search for files, but read
in the entire dir ourselves? Because one full search is cheaper than many
small ones?)

> If different functionality is desired, I think it's better to add a
> different construct rather than using the same construct but with a
> different meaning.

The whole paths in source is evil IMHO. It should not be expanded. The IN
should remain what it was introduced for, a minimal ability to work around
some case problems, and nothing more.


Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Geoffrey Barton


On 12 Jul 2010, at 19:06, Jeppe Johansen wrote:

> Add the missing instructions to the bottom of armins.dat, run
> mkarmins in the same directory.

It now recognises the mnemonic 'cpsie' but not the following 'i'.

The 'msr' instruction should also allow the interrupts to be
enabled/disabled as

msr primask,r0

but msr gives an unknown identifier error for 'primask' and all the
other 'special' register names ('apsr' etc.) Perhaps they have been
given different names, but I cannot find them listed anywhere in the
FPC source.

> (and then submit patch) :-)

well, once I have some code which works on the chip, I will ask
someone where to put it :-)

Geoffrey

> Geoffrey Barton wrote:
>
>> I wrote a procedure to turn on interrupts:-
>>
>> procedure intenable;nostackframe;
>> begin
>>   asm
>>     cpsie i
>>   end;
>> end;
>>
>> The compilation fails with 'Error: Unrecognized opcode cpsie'
>>
>> The compiler also does not recognise 'cpsid' and also 'primask' as
>> in 'mrs r0,primask'
>>
>> any ideas/workarounds?
>>
>> Geoffrey


Re: [fpc-devel] Purpose of uses ... in?

2010-07-13 Thread Michael Schnell

On 07/13/2010 11:46 AM, Marco van de Voort wrote:

> The whole paths in source is evil IMHO.

+1,

But it _could_ be overcome, e.g. by allowing multiple unit search paths to
be defined and writing something like in 2:xxx to have unit xxx searched
in the 2nd unit search path (while no : means the normal (1st) search path).

(Of course not compatible with Delphi at all :) ).

-Michael


Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Jeppe Johansen
The bit names are a bit hard, since there aren't any parsing facilities
in place for them, to my knowledge. I think they would need a special
syntax to not be seen as symbols by the assembler reader.

It'll take some work on the assembler reader and writer to get those
special instructions to work.

The special registers should be added to armreg.dat (and then run
mkarmreg). I didn't add all the cortex registers. Is primask a real
register btw? It just assembles to cpsr




Re: [fpc-devel] Compiler bottlenecks]

2010-07-13 Thread Hans-Peter Diettrich

Jonas Maebe wrote:

>> I suspect that disk I/O is the narrowest bottleneck,
>> that can not be widened by parallel processing.
>
> Unless you are doing a cold compile, the main bottlenecks in the
> compiler are the memory manager (mostly the allocation of memory,
> freeing is faster), zero-filling new class instances (and partially
> resetting the register allocator) and tobject.initinstance.

Memory management can not normally be parallelized. Also, access to the
newly created instances requires filling the cache, so I fear
that parallel operations in this area will not speed up anything much.
But since more is done in a thread, there may remain tasks that can fill
the gaps in other threads.

Did you already do some performance analysis in the compiler code? I
never did so myself, and don't know how to obtain really meaningful figures.

>> It also requires further research, for e.g. the determination of
>> the optimal number of threads, depending on the currently available
>> resources on a concrete machine.
>
> We'd just use the same approach as make: allow the user to specify
> the number of parallel operations.

A safe bet :-)

DoDi



Re: [fpc-devel] Blackfin support

2010-07-13 Thread Hans-Peter Diettrich

Michael Schnell wrote:

>> That's questionable, depending on the real bottlenecks in compiler
>> operation. I suspect that disk I/O is the narrowest bottleneck,
>
> I doubt this. The disk-cache does a decent work here. gcc can do this
> very effectively on a higher layer, as for each source file gcc is
> called separately by make. As FPC internally organizes the unit make
> sequence, I suppose internal multithreading needs to be implemented.

A C compiler has to access the very same header files over and over
again, so that a file cache can reduce disk I/O considerably. But when
FPC processes every source unit in a project only once, the file cache
is not very helpful.

Nonetheless it may make sense to process the units in threads, so that an
already read unit can be processed while other threads are still waiting
for disk I/O. I only doubt that this will result in a noticeable overall
speed gain, when the results have to be written back to disk after
compilation. But we can know more only after corresponding tests...


DoDi



Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Geoffrey Barton


On 13 Jul 2010, at 12:24, Jeppe Johansen wrote:

> The bit names are a bit hard, since there aren't any parsing
> facilities in place for them, to my knowledge. I think they would
> need a special syntax to not be seen as symbols by the assembler
> reader
>
> It'll take some work on the assembler reader and writer to get those
> special instructions to work
>
> The special registers should be added to armreg.dat (and then run
> mkarmreg). I didn't add all the cortex registers.

ok, I have done that, 'msr primask,r0' compiles ok. I will disassemble
and compare with the C code examples, if I can understand them :-)

> Is primask a real register btw? It just assembles to cpsr

cpsr should not exist in cortexm3 according to ARM (see
www.arm.com/files/pdf/Cortex-M3_programming_for_ARM7_developers.pdf
e.g. page 10). It is not in either the v7m architecture or cortex_m3
tech ref manuals.

primask is a (real as far as I can see) one bit register. I will try
flipping it on the hardware debugger and see if it changes anywhere
else...


rgds
Geoffrey




Re: [fpc-devel] Compiler bottlenecks]

2010-07-13 Thread Michael Schnell

On 07/13/2010 02:41 PM, Hans-Peter Diettrich wrote:

> Memory management can not normally be parallelized.

The FPC memory manager can even handle threaded access to the
memory manager without OS help, using atomic instructions (if running on
X86 or ARMv6)


-Michael


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Michael Schnell

On 07/13/2010 02:49 PM, Hans-Peter Diettrich wrote:

> But when FPC processes every source unit in a project only once, the
> file cache is not very helpful.

Obviously, a sufficiently huge cache can avoid any disk I/O bottleneck
when doing the 2nd+ build.


-Michael


Re: [fpc-devel] Blackfin support

2010-07-13 Thread Hans-Peter Diettrich

Michael Schnell wrote:

>> M68K machine, which in turn seems to have inherited from the ARM.
>
> I suppose: vice versa :).

At least I found files with comments from/for ARM.

>> .., but it doesn't allow to support multiple machine back-ends in one
>> program.
>
> Do you think it would be an advantage to support multiple archs in a
> single compiler executable ? I feel that recompiling the compiler when
> changing the target CPU is not very harmful.

I don't understand the current compilation process yet. How is the
target command line switch handled? Does pp spawn the target-specific
compiler?

>> I could not find much, and most existing documentation is outdated
>> since 2.0 :-(
>
> Of course improvement on that issue would be very desirable :).

What format should it be? Wiki entries are easily extensible, but it's
also easy to lose the overview over the missing pieces. FPDoc is nasty
to format, though it would allow to inline the documentation with the
online help. I'd prefer HTML, or OpenOffice if it allows for embedded links.


DoDi



Re: [fpc-devel] Blackfin support

2010-07-13 Thread Hans-Peter Diettrich

Florian Klaempfl wrote:

> Memory throughput is a bottleneck, I/O not really. So multithreading has
> a real advantage on NUMA systems and systems where different cores have
> dedicated caches. One or two years ago, I did some experiments with
> asynchronous assembler calls and it already improved significantly
> compilation times on platforms using an external assembler.

Good to know :-)

> The problem
> is that the whole compiler is not designed to do so. This could be
> solved by an approach we want to implement for years: split the
> compilation process into tasks (like parse unit X, load unit Y, code gen
> unit X) with dependencies. This should also solve the fundamental
> problems with unit loading/compilation causing sometimes internal
> errors. The first step would be to do this without multithreading, later
> it could be tried to execute several tasks in parallel.

I should know more about available threading features (blocking,
synchronization...). IMO compilation should be done in two steps, with
the first step providing the interface for used units, from a .ppu file
or by a new parse. Once this information is available, the using units
(threads) can resume their work. The final code generation can occur in
further threads.

At least I know now what to look for in my parser redesign. It seems to
be a good idea to reduce the number of global links, so that in a
following compiler redesign multiple threads can do their work
independently.


DoDi



Re: [fpc-devel] Purpose of uses ... in?

2010-07-13 Thread Hans-Peter Diettrich

Marco van de Voort wrote:

> Nobody is talking about removing ? It is more a matter of not expanding, and
> not guaranteeing too much (more) wrt to it. Specially since DoDi in other
> posts seemed to state that he wanted to use it to override which unit is
> selected in multiple sources in path cases.


AFAIR this was intended as a hack, to work around the current compiler 
directory tree structure. When the affected units can be moved into 
dedicated directories (for now in the parser_rewrite branch), such hacks 
will not be necessary.


DoDi



Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Hans-Peter Diettrich

Geoffrey Barton wrote:

> It now recognises the mnemonic 'cpsie' but not the following 'i'.
>
> The 'msr' instruction should also allow the interrupts to be
> enabled/disabled as
>
> msr primask,r0
>
> but msr gives an unknown identifier error for 'primask' and all the
> other 'special' register names ('apsr' etc.) Perhaps they have been
> given different names, but I cannot find them listed anywhere in the FPC
> source.

I don't know details about this CPU, but possibly privileged operations
(and registers) are not part of the CPU definition, because these cannot
be used in ordinary applications.

It may be a good idea to create multiple code generators, for machines
that can be used either for non-privileged (application) or privileged
(system, driver...) coding. At least a privilege level should be passed
to the compiler and assembler, so that it can flag the need for
privileged instructions in the given source code.


Dodi



Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Jeppe Johansen
I think that'll only complicate things. I think the compiler should be 
able to do anything, down to lowest level. Just like you have CLI, HLT, 
FXSTOR, WRMSR, etc instruction support in x86




Re: [fpc-devel] Blackfin support

2010-07-13 Thread Marco van de Voort
In our previous episode, Hans-Peter Diettrich said:
>> No that has to be solved by a bigger granularity (compiling more units in
>> one go).  That avoids ppu reloading and limits directory searching (there is
>> a cache iirc) freeing up more bandwidth for source loading.
>
> ACK. The compiler should process in one go as many units as possible -
> but this is more a matter of the framework (Make, Lazarus...), that
> should pass complete lists of units to the compiler (projects, packages).

Not necessarily. One could also strengthen the make capabilities of the
compiler, think about reworking the compiler to be kept resident etc.

> As a workaround a dedicated server process could hold the least recently
> processed unit objects in RAM, for use in immediately following
> compilation of other units. But this would only cure the symptoms, not
> the reason for slow compiles :-(

(some random wild thinking:)

Jonas seems to indicate most is due to the object model (zeroing) and
memory management in general.

One must keep in mind though that he probably measures on a *nix, and there
is a reason why the make cycle takes twice the time on Windows. I
don't think that under Windows the CPU or the cache halves in speed, so it
must be more in the I/O sphere:
- ntfs is relatively slow in directory operations (seeking)
- Windows is slow starting up binaries.
- Afaik ntfs caching is optimized for fileserver use, not to speed up a
   single application strongly. Especially if that app starts/stops
   constantly (a model that is foreign on Windows)

So one can't entirely rule out limiting I/O and number of compiler startups,
since not all OSes are alike.

For the memory management issues, a memory manager specifically for the
compiler is the first solution that comes to mind. To make it worthwhile to
have a list of zeroed blocks (and have a thread zero big blocks), somehow
the system must know when a zeroed block is needed. For objects this maybe
could be by creating a new root object, and deriving every object from it
(cclasses etc). But that would still leave dynamic arrays and manually
allocated memory.

For manually allocated memory of always the same size (virtual register
map?) a pooling solution could be found.
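
A minimal sketch of such a pool of pre-zeroed, same-size blocks, written in
Python for brevity. The BlockPool class and its method names are
hypothetical; a real compiler memory manager would work on raw memory, and
the zeroing in release() is the step a helper thread could take over.

```python
class BlockPool:
    """Recycle fixed-size zero-filled buffers instead of reallocating."""

    def __init__(self, block_size):
        self.block_size = block_size
        self._free = []      # blocks that were released and re-zeroed
        self.allocated = 0   # number of fresh allocations made so far

    def acquire(self):
        """Return a zeroed block, reusing a released one if available."""
        if self._free:
            return self._free.pop()
        self.allocated += 1
        return bytearray(self.block_size)  # bytearray(n) is zero-filled

    def release(self, block):
        """Zero the block and keep it for the next acquire()."""
        block[:] = bytes(self.block_size)  # overwrite contents in place
        self._free.append(block)
```

Because blocks are zeroed when they are released, acquire() never pays for
the zero-fill of a recycled block at allocation time.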
 
> It may be a good idea to implement different models, that either read
> entire files or use the current (buffered) access. Depending on disk
> fragmentation it may be faster to read entire (unfragmented) source or
> ppu files, before requests for other files can cause disk seeks and slow
> down continued reading of files from other places. Both models can be
> used concurrently, when an arbitration is possible from certain system
> (load) parameters.

Most OSes already read several 10s of kbs in advance. I don't really think
that will bring that much. Such approaches are so lowlevel that the OS could
do it, and probably it will.



Re: [fpc-devel] Purpose of uses ... in?

2010-07-13 Thread Jonas Maebe

Marco van de Voort wrote on Tue, 13 Jul 2010:

> In our previous episode, Jonas Maebe said:
>
>> Furthermore, at least two of the users have already posted in this
>> thread saying that they use this functionality (both in FPC and in
>> Delphi). Therefore I don't think it is a good idea to remove or change
>> it.
>
> Nobody is talking about removing ?

It was suggested (for non-Borland syntax modes) in a quote at the top
of the message you were replying to.

> (btw afaik the consequences of IN (but also allowing multiple casings in
> general) is that we don't use the OS routines to search for files, but read
> in the entire dir ourselves? Because one full search is cheaper than many
> small ones?)

The directory cache is unrelated to the support for IN 'xxx'. It was
mainly added

a) because searching directories is very slow on Windows
b) indeed also because of the searching for filenames with different cases.

In case some directory in the search path contains a lot of files
while only a few units are used (e.g., when doing testsuite runs), it
actually slows things down though -- and quite severely so when a
network file system is involved.



Jonas




Re: [fpc-devel] Blackfin support

2010-07-13 Thread Hans-Peter Diettrich

Michael Schnell wrote:

> On 07/13/2010 02:49 PM, Hans-Peter Diettrich wrote:
>> But when FPC processes every source unit in a project only once, the
>> file cache is not very helpful.
>
> Obviously, a sufficiently huge cache can avoid any disk I/O bottleneck
> when doing the 2nd+ build.

Then the system file cache will make it hard to determine reasonable
figures for the first build. And I wonder how often long builds are run
repeatedly in sequence?

When we rely on an OS file cache, we can read all files entirely into
memory, instead of using buffered I/O. Or we can design an interface
that allows running the compiler e.g. inside Lazarus, using the already
loaded editor files and directory caches.



BTW, should we switch the thread topic?

DoDi



Re: [fpc-devel] arm embedded cortexM3 unrecognized opcode

2010-07-13 Thread Hans-Peter Diettrich

Jeppe Johansen wrote:
> I think that'll only complicate things. I think the compiler should be
> able to do anything, down to lowest level. Just like you have CLI, HLT,
> FXSTOR, WRMSR, etc instruction support in x86


Then many users will wonder why their application with included ASM from 
somewhere else (DOS time...) will compile fine, but fails to run :-(


DoDi



Re: [fpc-devel] Blackfin support

2010-07-13 Thread Hans-Peter Diettrich

Marco van de Voort wrote:

> One must keep in mind though that he probably measures on a *nix, and there
> is a reason why the make cycle takes twice the time on Windows.

One of these issues is memory mapped files, which can speed up file
access a lot (I've been told), perhaps because it maps directly to the
system file cache?

> So one can't entirely rule out limiting I/O and number of compiler startups,
> since not all OSes are alike.

That means optimizing for one platform may slow down the compiler on
other platforms :-(

> For the memory management issues, an memory manager specifically for the
> compiler is the solution first hand. To make it worthwhile to have a list of
> zeroed blocks (and have a thread zero big blocks), somehow the system
> must know when a zeroed block is needed. For objects this maybe could be by
> creating a new root object, and deriving every object from it (cclasses
> etc). But that would still leave dynamic arrays and manually allocated
> memory.

When zeroing blocks really is an issue, then I suspect that it's more an
issue of memory caches. This would mean that the data locality should
be increased, i.e. related pieces of data should reside physically next
to each other (same page). Most list implementations (TList) tend to spread
the list and its entries across the address space.

Special considerations may apply to 64 bit systems, with a (currently)
almost unlimited address space. There it might be a good idea to
allocate lists bigger than really needed, which should do no harm when
the unused elements are never allocated to RAM (thanks to paged memory
management). Then a TList with buckets only is slower on such a system,
for no other gain.

> For manually allocated memory of always the same size (virtual register
> map?) a pooling solution could be found.

Again candidates for huge pre-allocated memory arrays. But when these
elements then are not used together, they may occupy one or two memory
pages, and the remaining RAM in these pages is unused.

> Most OSes already read several 10s of kbs in advance. I don't really think
> that will bring that much. Such approaches are so lowlevel that the OS could
> do it, and probably it will.

Every OS with MMF will do so, when only memory mapped files are used.
The rest IMO is so platform specific, that a single optimization
strategy may not be a good solution for other platforms.

But I think that such low-level considerations should be left for later,
when the big issues are fixed, and the requirements for exploring the
real behaviour of various strategies have been implemented.


DoDi
