Re: App hangs, GC.collect() fixet it. Why?

2020-09-29 Thread Bastiaan Veelo via Digitalmars-d-learn
On Monday, 28 September 2020 at 21:58:31 UTC, Steven 
Schveighoffer wrote:

On 9/28/20 3:28 PM, Bastiaan Veelo wrote:
I’m leaning towards ditching the memory mapped I/O on the D 
end and replacing it with regular serialisation/deserialisation. 
That will be a manual rewrite though, which is a bit of a bummer 
as memory mapped files are widely used in our Pascal code. But 
this will probably give the best end result.


2 things:

1. I agree this is the answer. If you ever ditch the old Pascal 
code, then you can reactivate the memory-mapped code.
2. You can possibly do the translation outside of your 
programs. That is, it wouldn't be entirely impossible to simply 
have a process running that ensures the "D view" and the 
"Pascal view" of the same file are kept in sync. Then you can 
keep the memory mapped code the same, and just define sane 
structures in your D code.


If you aren't required to have both Pascal and D programs 
reading and writing the file at the same time, this shouldn't 
be a problem.


There is no need to run both versions concurrently. The issue is 
that design offices typically maintain a library of past designs 
for as long as they are in existence, to build new designs off 
of. So being able to read or import the files that were written 
with an ancient version of our software is very valuable. Our old 
compiler offered two alternatives for file i/o: one where all 
elements are of the same type, the other one (memory mapped 
files) being the "only" option for files of mixed type. Ideally, 
the structs that are used for i/o do not have any pointers in 
them, and certainly in the more recent file versions that would 
be the case. In older versions that might not hold; the 
pointers would then obviously be given meaningful values after 
the structs have been read back in. We would be able to work 
around these cases, though, by converting the old structs to new 
ones upon import.


BTW, one further thing I don't understand -- if this is memory 
mapped data, how come it has issues with the GC? And what do 
the "pointers" mean in the memory mapped data? I'm sure there are 
good answers, and your actual code is more complex than the 
simple example, but I'm just curious.


The main problem is that the transpiler doesn't know which 
structs are used for i/o and would need 1-byte alignment, and 
which structs have pointers into GC memory and must not be 1-byte 
aligned. The alternative to switching to 
serialisation/deserialisation is to stay with the automated 
translation of the memory mapped file implementation, not 
automatically 1-byte align every struct but manually align the 
ones that are used in i/o. This is however sensitive to mistakes, 
and the translated mmfile implementation has a bit of a smell to 
it. It is also not portable, as it uses the WinAPI directly. 
Still, it may be the quickest route to get us back on track.


I am very glad to have identified the problem, and that there 
are ways to deal with it. I just hope this will be the last big 
hurdle :-)


-Bastiaan.


Re: App hangs, GC.collect() fixet it. Why?

2020-09-28 Thread Steven Schveighoffer via Digitalmars-d-learn

On 9/28/20 3:28 PM, Bastiaan Veelo wrote:
I’m leaning towards ditching the memory mapped I/O on the D end and 
replacing it with regular serialisation/deserialisation. That will be a 
manual rewrite though, which is a bit of a bummer as memory mapped files 
are widely used in our Pascal code. But this will probably give the best 
end result.


2 things:

1. I agree this is the answer. If you ever ditch the old Pascal code, 
then you can reactivate the memory-mapped code.
2. You can possibly do the translation outside of your programs. That 
is, it wouldn't be entirely impossible to simply have a process running 
that ensures the "D view" and the "Pascal view" of the same file are kept 
in sync. Then you can keep the memory mapped code the same, and just 
define sane structures in your D code.


If you aren't required to have both Pascal and D programs reading and 
writing the file at the same time, this shouldn't be a problem.


BTW, one further thing I don't understand -- if this is memory mapped 
data, how come it has issues with the GC? And what do the "pointers" 
mean in the memory mapped data? I'm sure there are good answers, and your 
actual code is more complex than the simple example, but I'm just curious.


-Steve


Re: App hangs, GC.collect() fixet it. Why?

2020-09-28 Thread Bastiaan Veelo via Digitalmars-d-learn
On Monday, 28 September 2020 at 15:44:44 UTC, Steven 
Schveighoffer wrote:

On 9/28/20 8:57 AM, Bastiaan Veelo wrote:

I am glad to have found the cause of the breakage finally, but 
it won't be easy to find a generic solution...


Obviously, this isn't a real piece of code, but there is no way 
around this. You have to align your pointers. The other option 
is to not use the GC and use manual memory management.


If this is a compatibility thing between D and Pascal, and you 
absolutely have to have the same layout, is there a way to 
adjust the structure in Pascal? Like put the elements that 
misalign the pointers at the end of the structure?


Another totally drastic approach would be to supply your own 
even-more-conservative GC which will scan misaligned pointers. 
Probably going to hurt performance quite a bit. You might be 
able to get away with marking only certain blocks as having 
misaligned pointers, but you will have to scan all the stacks 
with this assumption.


Some more information about the setup you are using might help 
(I'm assuming D and Pascal are using the same memory in the 
same process, otherwise this wouldn't be a problem). In 
particular, where does the data come from, and how malleable is 
it in your system? Are there times where references to the D 
data only exist in Pascal?


-Steve


Thanks a lot for thinking along with me. I’m not linking any 
Pascal objects, so I don’t need to maintain binary compatibility 
in memory, only compatibility of data files. The problem arises when 
those files are read using memory mapped files, from which 
structs are memcpy’d over. This is of course the result of 
machine translation of the current Pascal implementation.


Manual memory management is an option and would be 
straightforward in principle, as we’ve done that for ages. The 
only thing is that this memory cannot contain other allocations 
on the GC heap, such as strings or other slices, unless they are 
both aligned and their root is registered.
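
A minimal sketch of this manual route, assuming a hypothetical `Node` struct with default alignment. The block comes from `malloc` and is registered with `GC.addRange`, so the collector scans it for references into the GC heap. `Node`, `allocNode` and `freeNode` are illustrative names, not taken from the real code base.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical struct: default alignment, so `name` sits on a pointer boundary.
struct Node
{
    int id;
    string name; // slice that may point into the GC heap
}

// Allocate a Node outside the GC heap and register it as a range to scan.
Node* allocNode(int id, string name)
{
    auto p = cast(Node*) malloc(Node.sizeof);
    if (p is null)
        assert(0, "malloc failed");
    *p = Node.init;                 // start from a known state
    GC.addRange(p, Node.sizeof);    // register before storing GC pointers
    p.id = id;
    p.name = name;
    return p;
}

// Unregister the range before handing the memory back.
void freeNode(Node* p)
{
    GC.removeRange(p);
    free(p);
}

void main()
{
    auto n = allocNode(1, "design".idup); // .idup allocates on the GC heap
    GC.collect();                         // the string stays reachable via the range
    assert(n.name == "design");
    freeNode(n);
}
```

Registration only helps if the pointers inside the block are aligned, which is why the struct here is not packed.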


Fixing the alignment in Pascal is possible in principle, but any 
old files would then need to first be processed by the last 
Pascal version of the programs, which we then would need to keep 
around indefinitely. There would also be issues when we port from 
32 bit to 64 bit.


Another option could be to use 1-byte aligned structs for I/O, 
and copy the members over in default aligned versions. But this 
cannot be part of the automated transcompilation.
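
A rough sketch of that idea, assuming a hypothetical record layout and the same byte order in the file as on the host. `RecordOnDisk`, `Record` and `toRecord` are made-up names; the packed struct holds plain values only, and the default-aligned struct is the one that may carry GC pointers.

```d
import core.stdc.string : memcpy;

// File layout: packed on 1-byte boundaries, no pointers.
align(1) struct RecordOnDisk
{
align(1):
    ubyte flag;    // offset 0
    int count;     // offset 1
    double value;  // offset 5
}
static assert(RecordOnDisk.sizeof == 13);

// In-memory layout: default alignment, so GC pointers are safe here.
struct Record
{
    ubyte flag;
    int count;
    double value;
    string note;   // filled in after import, never stored in the file
}

// Copy the raw bytes into the packed struct, then copy member by member.
Record toRecord(const(ubyte)[] raw)
{
    assert(raw.length >= RecordOnDisk.sizeof, "buffer too short");
    RecordOnDisk disk;
    memcpy(&disk, raw.ptr, RecordOnDisk.sizeof);
    return Record(disk.flag, disk.count, disk.value);
}

void main()
{
    RecordOnDisk d;
    d.flag = 1;
    d.count = 42;
    d.value = 1.5;
    auto raw = (cast(const(ubyte)*) &d)[0 .. RecordOnDisk.sizeof];
    auto r = toRecord(raw);
    assert(r.flag == 1 && r.count == 42 && r.value == 1.5);
}
```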


Thanks for suggesting a custom gc, which I had not thought of.

I’m leaning towards ditching the memory mapped I/O on the D end 
and replacing it with regular serialisation/deserialisation. That 
will be a manual rewrite though, which is a bit of a bummer as 
memory mapped files are widely used in our Pascal code. But this 
will probably give the best end result.
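
What field-by-field deserialisation could look like, as a sketch only: it assumes a hypothetical `Header` with three fields stored little-endian and uses `std.bitmanip.read`, which consumes bytes from the front of the buffer. The field names and their order are invented for illustration.

```d
import std.bitmanip : read;
import std.system : Endian;

// Hypothetical in-memory struct with default alignment.
struct Header
{
    ubyte ver;
    int count;
    double scale;
}

// Read the fields one by one; `buf` is advanced past the consumed bytes.
Header readHeader(ref const(ubyte)[] buf)
{
    Header h;
    h.ver = buf.read!(ubyte, Endian.littleEndian);
    h.count = buf.read!(int, Endian.littleEndian);
    h.scale = buf.read!(double, Endian.littleEndian);
    return h;
}

void main()
{
    // 3, then 42 as a 32-bit int, then 1.5 as a 64-bit float, little-endian.
    const(ubyte)[] buf = [3, 42, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xF8, 0x3F];
    auto h = readHeader(buf);
    assert(h.ver == 3 && h.count == 42 && h.scale == 1.5);
    assert(buf.length == 0);
}
```

Because no struct is ever copied as a block of bytes, the in-memory layout no longer has to match the file layout.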


-Bastiaan.


Re: App hangs, GC.collect() fixet it. Why?

2020-09-28 Thread Steven Schveighoffer via Digitalmars-d-learn

On 9/28/20 8:57 AM, Bastiaan Veelo wrote:

I am glad to have found the cause of the breakage finally, but it won't 
be easy to find a generic solution...


Obviously, this isn't a real piece of code, but there is no way around 
this. You have to align your pointers. The other option is to not use 
the GC and use manual memory management.


If this is a compatibility thing between D and Pascal, and you 
absolutely have to have the same layout, is there a way to adjust the 
structure in Pascal? Like put the elements that misalign the pointers at 
the end of the structure?


Another totally drastic approach would be to supply your own 
even-more-conservative GC which will scan misaligned pointers. Probably 
going to hurt performance quite a bit. You might be able to get away 
with marking only certain blocks as having misaligned pointers, but you 
will have to scan all the stacks with this assumption.


Some more information about the setup you are using might help (I'm 
assuming D and Pascal are using the same memory in the same process, 
otherwise this wouldn't be a problem). In particular, where does the 
data come from, and how malleable is it in your system? Are there times 
where references to the D data only exist in Pascal?


-Steve


Re: App hangs, GC.collect() fixet it. Why?

2020-09-28 Thread Bastiaan Veelo via Digitalmars-d-learn
On Friday, 5 June 2020 at 21:20:09 UTC, Steven Schveighoffer 
wrote:
This kind of sounds like a codegen bug, a race condition, or 
(worst case) memory corruption.


I think it must have been memory corruption: I had not realized 
that our old Pascal compiler aligns struct members on one byte 
boundaries, and also uses ubyte as the base type for enumerations 
(or ushort if required) instead of uint. When using memory mapped 
files this binary incompatibility likely caused the corruption.
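
To illustrate the size difference, here is a small sketch with invented enum names. It assumes the Pascal enumeration fits in one byte; D's default enum base type is `int`, which is four bytes.

```d
import std.stdio : writeln;

// Matches a Pascal enumeration stored in one byte.
enum PascalColour : ubyte { red, green, blue }

// Default base type in D (int).
enum DefaultColour { red, green, blue }

static assert(PascalColour.sizeof == 1);
static assert(DefaultColour.sizeof == int.sizeof);

void main()
{
    writeln("PascalColour:  ", PascalColour.sizeof, " byte(s)");
    writeln("DefaultColour: ", DefaultColour.sizeof, " byte(s)");
}
```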


But, after correcting that mistake, suddenly things broke that 
had been working for a long time. Having no idea what could be 
wrong this time, I spent quite some time dustmiting (thanks 
Vladimir!) and manually reducing the code. Voilà:



import std.stdio;
import core.memory;

struct Nothing
{
}

struct Info
{
align(1):               // pack members on 1-byte boundaries
  ubyte u;
  Nothing*[2] arr;      // starts at offset 1: misaligned pointers
}

Info* info;             // keeps the Info block itself alive

void main()
{
  info = new Info;
  writeln("1");
  GC.collect();
  info.arr[0] = new Nothing;
  writeln("2");
  GC.collect();         // may free arr[0]: misaligned pointers are not seen
  info.arr[1] = new Nothing;  // may reuse the memory of arr[0]
  writeln("info.arr[0]  = ", info.arr[0]);
  writeln("info.arr[1]  = ", info.arr[1]);
  assert(info.arr[0] != info.arr[1], "Live object was collected!");
}


(The assert triggers on Windows, not on run.dlang.org.) 
Unfortunately for me, I cannot blame this on the compiler. The 
code violates a requirement from the spec:


  "Do not misalign pointers if those pointers may point into the 
GC heap" (https://dlang.org/spec/garbage.html)


I am glad to have found the cause of the breakage finally, but it 
won't be easy to find a generic solution...
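
One possible way to at least detect such structs at compile time, sketched under assumptions: the hypothetical helper `hasMisalignedPointers` inspects only the top-level fields of a struct and flags any field containing indirections whose offset is not a multiple of the pointer size. Nested packed structs would need a recursive check that is not shown.

```d
import std.traits : hasIndirections;

struct Nothing {}

struct Packed
{
align(1):
    ubyte u;
    Nothing*[2] arr; // offset 1
}

struct Natural
{
    ubyte u;
    Nothing*[2] arr; // padded up to pointer alignment
}

// True if any top-level field that holds pointers starts at an offset
// that is not a multiple of the pointer size.
bool hasMisalignedPointers(T)()
{
    bool found = false;
    static foreach (i; 0 .. T.tupleof.length)
    {
        static if (hasIndirections!(typeof(T.tupleof[i])))
            if (T.tupleof[i].offsetof % (void*).sizeof != 0)
                found = true;
    }
    return found;
}

static assert(hasMisalignedPointers!Packed);
static assert(!hasMisalignedPointers!Natural);

void main()
{
    assert(Packed.arr.offsetof == 1);
    assert(Natural.arr.offsetof == (void*).sizeof);
}
```

A `static assert(!hasMisalignedPointers!T)` next to each GC-allocated struct would turn the silent failure into a compile error, although it does not decide which structs should be packed.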


-Bastiaan.


Re: App hangs, GC.collect() fixet it. Why?

2020-06-05 Thread Steven Schveighoffer via Digitalmars-d-learn

On 6/5/20 1:57 PM, Bastiaan Veelo wrote:
I've been tracking down a hang in our pilot app. Using writeln, it 
appears to hang at newing a slice. After many hours of trying things, I 
discovered that program flow would continue past that point when I 
inserted a call to `GC.collect()` just before. Then it stalled again at 
a call to Win32 `SetMenu()`. Again, inserting `GC.collect()` before that 
made the problem go away.


This band-aid isn't going to scale in the long run. I feel I'm treating 
symptoms, and wonder what the cause is. Any ideas?


I know the GC is not disabled somehow because if I print 
`GC.profileStats()`, I see that there are collections even without my 
explicit calls to `GC.collect()`.


1. Collections happen automatically when you allocate memory and the GC 
can't find any free memory to allocate with.
2. Even if it can't find any free memory after a collection, and it runs 
out of memory, it should throw an Error instead of hanging.


The only thing I can think of is to open in a debugger and see what it 
is doing.


This kind of sounds like a codegen bug, a race condition, or (worst 
case) memory corruption.


-Steve


App hangs, GC.collect() fixet it. Why?

2020-06-05 Thread Bastiaan Veelo via Digitalmars-d-learn
I've been tracking down a hang in our pilot app. Using writeln, 
it appears to hang at newing a slice. After many hours of trying 
things, I discovered that program flow would continue past that 
point when I inserted a call to `GC.collect()` just before. Then 
it stalled again at a call to Win32 `SetMenu()`. Again, inserting 
`GC.collect()` before that made the problem go away.


This band-aid isn't going to scale in the long run. I feel I'm 
treating symptoms, and wonder what the cause is. Any ideas?


I know the GC is not disabled somehow because if I print 
`GC.profileStats()`, I see that there are collections even 
without my explicit calls to `GC.collect()`.
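
For reference, a small sketch of the kind of check described here. `collectionsSoFar` is an illustrative helper; `numCollections` is a field of the struct returned by `GC.profileStats()`.

```d
import core.memory : GC;
import std.stdio : writeln;

// Number of collections the GC reports so far.
size_t collectionsSoFar()
{
    return GC.profileStats().numCollections;
}

void main()
{
    writeln("collections before: ", collectionsSoFar());

    // Allocate enough garbage that the GC has a reason to collect on its own.
    foreach (i; 0 .. 100_000)
    {
        auto junk = new ubyte[](1024);
        junk[0] = cast(ubyte) i; // touch the block so it is actually used
    }

    writeln("collections after:  ", collectionsSoFar());
}
```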


Thanks,

Bastiaan.