Re: ARSD PNG memory usage

2016-08-29 Thread Guillaume Piolat via Digitalmars-d-learn

On Tuesday, 16 August 2016 at 16:40:30 UTC, Adam D. Ruppe wrote:
On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
wrote:
Hey, I also stumbled upon this with imageformats decoding PNG. 
Image loading makes 10x the garbage it should.

Let's see what this thread unveils...


Let me know how it is now


So I made a small benchmark for testing PNG loading in D:
  * dplug.gui.pngload ( = stb_image): 134 ms / 4.4 MB memory
  * arsd.png: 118 ms / 7 MB memory
  * imageformats: 108 ms / 13.1 MB memory

Compiler: ldc-1.0.0-b2, release-nobounds build type
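
For readers who want to take comparable measurements, here is a minimal sketch of a timing harness. It is not the benchmark used for the numbers above (that code is not shown in the thread); `Measurement` and `measure` are hypothetical names, and the workload in `main` is only a stand-in for a real PNG loader. It samples the GC heap only, so memory obtained through malloc would not show up.

```d
import core.memory : GC;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

/// Result of one timed run (hypothetical helper type, not from any library).
struct Measurement
{
    long msecs;        // wall-clock time of the run
    size_t gcUsedSize; // GC heap bytes in use right after the run
}

/// Times `work` and samples the GC heap afterwards.
Measurement measure(scope void delegate() work)
{
    GC.collect(); // start from a freshly collected heap
    auto sw = StopWatch(AutoStart.yes);
    work();
    sw.stop();
    return Measurement(sw.peek.total!"msecs", GC.stats().usedSize);
}

void main()
{
    // Stand-in workload; replace it with the PNG loader under test.
    auto m = measure({
        foreach (i; 0 .. 100)
        {
            auto tmp = new ubyte[](64 * 1024);
            tmp[0] = cast(ubyte) i;
        }
    });
    writefln("%s ms, %s KiB of GC heap in use", m.msecs, m.gcUsedSize / 1024);
}
```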


Re: ARSD PNG memory usage

2016-08-17 Thread Guillaume Piolat via Digitalmars-d-learn

On Tuesday, 16 August 2016 at 16:40:30 UTC, Adam D. Ruppe wrote:
On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
wrote:
Hey, I also stumbled upon this with imageformats decoding PNG. 
Image loading makes 10x the garbage it should.

Let's see what this thread unveils...


Let me know how it is now


Reverted back to a stb_image translation to avoid the problem 
altogether (though it's a bit slower now). Rewriting offending 
allocations in std.zlib was harder than expected.


Re: ARSD PNG memory usage

2016-08-16 Thread Adam D. Ruppe via Digitalmars-d-learn
On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
wrote:
Hey, I also stumbled upon this with imageformats decoding PNG. 
Image loading makes 10x the garbage it should.

Let's see what this thread unveils...


Let me know how it is now


Re: ARSD PNG memory usage

2016-08-16 Thread Guillaume Piolat via Digitalmars-d-learn

On Friday, 17 June 2016 at 02:55:43 UTC, thedeemon wrote:

On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
Hi, so, do you have any idea why when I load an image with 
png.d it takes a ton of memory?


I've bumped into this previously. It allocates a lot of 
temporary arrays for decoded chunks of data, and I managed to 
reduce those allocations a bit, here's the version I used:

http://stuff.thedeemon.com/png.d
(last changed Oct 2014, so may need some tweaks today)




Hey, I also stumbled upon this with imageformats decoding PNG. 
Image loading makes 10x the garbage it should.

Let's see what this thread unveils...


Re: ARSD PNG memory usage

2016-06-19 Thread Joerg Joergonson via Digitalmars-d-learn
Also, for some reason one image has a weird horizontal line at 
the bottom of the image that is not part of the original. This is 
as if the height was 1 pixel too much and it's reading "junk". I 
have basically a few duplicate images that were generated from 
the same base image. None of the others have this problem.


If I reduce the image dimensions it doesn't have this problem. My 
guess is that there is probably a bug with a > vs >= or 
something. When the image dimensions are "just right" an extra 
line is added that may be non-zero.


The image dimensions are 124x123.

This is all speculation but it seems like it is a png.d or 
opengltexture issue. I cannot see this added line in any image 
editor I've tried (PS, IrfanView), and changing the dimensions of 
the image fixes it.


Since it's a hard one to debug without a test case I will work on 
it... Hoping you have some possible points of attack though.









Re: ARSD PNG memory usage

2016-06-19 Thread Joerg Joergonson via Digitalmars-d-learn

On Saturday, 18 June 2016 at 02:17:01 UTC, Adam D. Ruppe wrote:

I have an auto generator for pngs and 99% of the time it works, 
but every once in a while I get an error when loading the png's. 
Usually re-running the generator "fixes the problem" so it might 
be on my end. Regardless of where the problem stems, it would be 
nice to have more info why instead of a range violation. 
previousLine is null in the break.


All the PNGs generated are loadable by external apps like 
IrfanView, so they are not completely corrupt but possibly could 
have something that is screwing png.d up.



The code where the error happens is:

case 3:
    auto arr = data.dup;
    foreach(i; 0 .. arr.length) {
        auto prev = i < bpp ? 0 : arr[i - bpp];
        arr[i] += cast(ubyte)
            /*std.math.floor*/( cast(int) (prev + previousLine[i]) / 2);
    }

Range violation at png.d(1815)

Any ideas?
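
For reference, the PNG "Average" filter (type 3) treats the scanline above as all zeros when there is no previous scanline. The following is a minimal sketch of a defensive unfilter step along those lines. It is not the code from png.d; `unfilterAverage` is a hypothetical helper, and it only shows one way to avoid indexing an empty `previousLine`.

```d
import std.stdio : writeln;

/// Reverses the PNG "Average" filter (type 3) on one scanline, in place.
/// `previousLine` may be null or empty for the first scanline.
void unfilterAverage(ubyte[] line, const(ubyte)[] previousLine, size_t bpp)
{
    foreach (i; 0 .. line.length)
    {
        // Byte `bpp` positions to the left, or 0 at the start of the line.
        const uint left = i < bpp ? 0 : line[i - bpp];
        // Byte directly above, or 0 when there is no previous scanline.
        const uint up = i < previousLine.length ? previousLine[i] : 0;
        // The sum is done in 32 bits; the cast wraps the result modulo 256.
        line[i] = cast(ubyte)(line[i] + (left + up) / 2);
    }
}

void main()
{
    ubyte[] first = [10, 20, 30, 40];
    unfilterAverage(first, null, 2); // no previous line: no range violation
    writeln(first);                  // prints [10, 20, 35, 50]
}
```

If the generator sometimes emits a type 3 filter on the first row, a guard like this would turn the range violation into normal decoding; whether that is what happens here is an assumption.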



Re: ARSD PNG memory usage

2016-06-18 Thread Joerg Joergonson via Digitalmars-d-learn

On Saturday, 18 June 2016 at 02:01:29 UTC, Adam D. Ruppe wrote:
On Saturday, 18 June 2016 at 01:20:16 UTC, Joerg Joergonson 
wrote:
Error: undefined identifier 'sleep', did you mean function 
'Sleep'?		


"import core.thread; sleep(10);"


It is `Thread.sleep(10.msecs)` or whatever time - `sleep` is a 
static member of the Thread class.



They mention using PeekMessage and I don't see you doing 
that, not sure if it would change things though?


I am using MsgWaitForMultipleObjectsEx which blocks until 
something happens. That something can be a timer, input event, 
other message, or an I/O thing... it doesn't eat CPU unless 
*something* is happening.


Yeah, I don't know what though. Adding Sleep(5); reduces its 
consumption to 0% so it is probably just spinning. It might be 
the nvidia issue that creates some weird messages to the app.


I'm not too concerned about it as it's now down to 0, and it is a 
minimal wait time for my app (maybe not acceptable for performance 
apps but ok for mine... at least for now).


As I continue to work on it, I might stumble on the problem or it 
might disappear spontaneously.




Re: ARSD PNG memory usage

2016-06-17 Thread Adam D. Ruppe via Digitalmars-d-learn

On Saturday, 18 June 2016 at 01:57:49 UTC, Joerg Joergonson wrote:
Ok. Also, maybe the GC hasn't freed some of those temporaries 
yet.


The way GC works in general is it allows allocations to just 
continue until it considers itself under memory pressure. Then, 
it tries to do a collection. Since collections are expensive, it 
puts them off as long as it can and tries to do them as 
infrequently as reasonable. (some GCs do smaller collections to 
spread the cost out though, so the details always differ based on 
implementation)


So, you'd normally see it go up to some threshold then stabilize 
there, even if it is doing a lot of little garbage allocations.


However, once the initialization is done here, it shouldn't be 
allocating any more. The event loop itself doesn't when all is 
running normally.
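
As a rough illustration of the behaviour described above, the following sketch allocates short-lived buffers in rounds and prints the GC heap statistics after each round. The buffer size and the round counts are arbitrary assumptions, and the exact numbers depend on the druntime version and platform; the point is only that the used size tends to level off rather than grow without bound.

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    foreach (round; 0 .. 5)
    {
        // Create lots of small, immediately unreachable garbage.
        foreach (i; 0 .. 1000)
        {
            auto garbage = new ubyte[](64 * 1024);
            garbage[0] = cast(ubyte) i;
        }
        const s = GC.stats();
        writefln("round %s: used %s KiB, free %s KiB",
                 round, s.usedSize / 1024, s.freeSize / 1024);
    }

    // An explicit collection; normally the GC decides when to run by itself.
    GC.collect();
    GC.minimize(); // ask the GC to return unused memory to the OS if it can
    writefln("after collect: used %s KiB", GC.stats().usedSize / 1024);
}
```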




Re: ARSD PNG memory usage

2016-06-17 Thread Adam D. Ruppe via Digitalmars-d-learn

On Saturday, 18 June 2016 at 01:20:16 UTC, Joerg Joergonson wrote:
Error: undefined identifier 'sleep', did you mean function 
'Sleep'?		


"import core.thread; sleep(10);"


It is `Thread.sleep(10.msecs)` or whatever time - `sleep` is a 
static member of the Thread class.
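
A minimal, self-contained sketch of that call follows. The loop body is a placeholder and the 10 ms interval is just the value from the question.

```d
import core.thread : Thread;
import core.time : msecs;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

void main()
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (tick; 0 .. 3)
    {
        // ... poll for messages or do one unit of work here ...
        Thread.sleep(10.msecs); // static member of Thread, takes a Duration
    }
    writefln("3 iterations took about %s ms", sw.peek.total!"msecs");
}
```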



They mention using PeekMessage and I don't see you doing that, 
not sure if it would change things though?


I am using MsgWaitForMultipleObjectsEx which blocks until 
something happens. That something can be a timer, input event, 
other message, or an I/O thing... it doesn't eat CPU unless 
*something* is happening.


Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn

On Saturday, 18 June 2016 at 01:46:32 UTC, Adam D. Ruppe wrote:
On Saturday, 18 June 2016 at 01:44:28 UTC, Joerg Joergonson 
wrote:
I simply removed your nextpowerof2 code (so the width and 
height weren't being enlarged) and saw no memory change. 
Obviously because they are temporary buffers, I guess?


Right, the new code free()s them right at scope exit.

If this is the case, then maybe there is one odd temporary 
still hanging around in png?


Could be, though the png itself has relatively small overhead, 
and the opengl texture adds to it still. I'm not sure if video 
memory is counted by task manager or not... but it could be 
loading up the whole ogl driver that accounts for some of it. I 
don't know.


Ok. Also, maybe the GC hasn't freed some of those temporaries 
yet. What's strange is that when the app is run, it seems to do a 
lot of small allocations around 64kB or something for about 10 
seconds(I watch the memory increase in TM) then it stabilizes. 
Not a big deal, just seems a bit weird (maybe some type of lazy 
allocations going on).



Anyways, I'm much happier now ;) Thanks!


Re: ARSD PNG memory usage

2016-06-17 Thread Adam D. Ruppe via Digitalmars-d-learn

On Saturday, 18 June 2016 at 01:44:28 UTC, Joerg Joergonson wrote:
I simply removed your nextpowerof2 code (so the width and height 
weren't being enlarged) and saw no memory change. Obviously 
because they are temporary buffers, I guess?


Right, the new code free()s them right at scope exit.

If this is the case, then maybe there is one odd temporary 
still hanging around in png?


Could be, though the png itself has relatively small overhead, 
and the opengl texture adds to it still. I'm not sure if video 
memory is counted by task manager or not... but it could be 
loading up the whole ogl driver that accounts for some of it. I 
don't know.


Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn

On Friday, 17 June 2016 at 14:39:32 UTC, kinke wrote:

On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:

LDC x64 uses about 250MB and 13% cpu.

I couldn't check on x86 because of the error

phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module 
machine type 'x64' conflicts with target machine type 'X86'


not sure what that means with gzlib.c.obj. Must be another bug 
in ldc alpha ;/


It looks like you're trying to link 32-bit objects to a 64-bit 
Phobos.
The only pre-built LDC for Windows capable of linking both 
32-bit and 64-bit code is the multilib CI release, see 
https://github.com/ldc-developers/ldc/releases/tag/LDC-Win64-master.



Yes, it looks that way but I believe that's not the case (I did 
check when this error first came up). I'm using the Phobos libs 
from ldc that are x86.


I could be mistaken but

phobos2-ldc.lib(gzlib.c.obj)

suggests that the problem isn't with the entire phobos lib but 
with gzlib.c.obj, and that it is the only object marked incorrectly. 
Since the error doesn't show up for any of the other imports, it 
seems something got marked wrong in that specific case?








Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn

On Saturday, 18 June 2016 at 00:56:57 UTC, Joerg Joergonson wrote:

On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:

[...]


Yes, same here! Great! It runs around 122MB in x86 and 107MB 
x64. Much better!



[...]


Yeah, strange but good catch! It now works in x64! I modified 
it to to!wstring(title).dup simply to have the same title and 
classname.



[...]


I have the opposite on memory but not a big deal.



[...]


I will investigate this soon and report back anything. It 
probably is something straightforward.



[...]


I found this on non-power of 2 textures:

https://www.opengl.org/wiki/NPOT_Texture


https://www.opengl.org/registry/specs/ARB/texture_non_power_of_two.txt

It seems like it's probably a quick and easy add on and you 
already have the padding code, it could easily be optional(set 
a flag or pass a bool or whatever).


It could definitely save some serious memory for large textures.

e.g., a 3000x3000x4 texture takes about 36MB or 2^25.1 bytes. 
Since this has to be rounded up to 2^26 = 67MB, we have almost 
doubled the memory used, and the difference is wasted space.


Hence, allowing for non-power of two would probably reduce the 
memory footprint of my code to near 50MB(around 40MB being the 
minimum using uncompressed textures).


I might try to get a working version of that at some point. 
Going to deal with the cpu thing now though.


Thanks again.


Never mind about this. I wasn't keeping in mind that these 
textures are ultimately going to end up in the video card memory.


I simply removed your nextpowerof2 code (so the width and height 
weren't being enlarged) and saw no memory change. Obviously 
because they are temporary buffers, I guess?


If this is the case, then maybe there is one odd temporary still 
hanging around in png?





Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn
The CPU usage is consistently very low on my computer. I still 
don't know what could be causing it for you, but maybe it is 
the temporary garbage... let us know if the new patches make a 
difference there.




Ok, I tried the breaking at random method and I always ended up 
in system code and no stack trace to... seems it was an alternate 
thread(maybe GC?). I did a sampling profile and got this:


Function Name          Inclusive  Exclusive  Inclusive %  Exclusive %
_DispatchMessageW@4       10,361          5        88.32         0.04
[nvoglv32.dll]             7,874        745        67.12         6.35
_GetExitCodeThread@8       5,745      5,745        48.97        48.97
_SwitchToThread@0          2,166      2,166        18.46        18.46

So possibly it is simply my system and graphics card. For some 
reason NVidia might be using a lot of cpu here for no apparent 
reason?


DispatchMessage is still taking quite a bit of that though?


Seems like someone else has a similar issue:

https://devtalk.nvidia.com/default/topic/832506/opengl/nvoglv32-consuming-a-ton-of-cpu/


https://github.com/mpv-player/mpv/issues/152


BTW, trying sleep in the MSG loop

Error: undefined identifier 'sleep', did you mean function 
'Sleep'?		


"import core.thread; sleep(10);"

;)

Adding a Sleep(10); to the loop dropped the cpu usage down to 
0-1% cpu!


http://stackoverflow.com/questions/33948837/win32-application-with-high-cpu-usage/33948865

Not sure if that's the best approach though but it does work.

They mention using PeekMessage and I don't see you doing that, 
not sure if it would change things though?






Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn

On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:

On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
ok, then it's somewhere in TrueColorImage or the loading of 
the png.


So, opengltexture actually does reallocate if the size isn't 
right for the texture... and your image was one of those sizes.


The texture pixel size needs to be a power of two, so 3000 gets 
rounded up to 4096, which means an internal allocation.


But it can be a temporary one! So ketmar tackled png.d's 
loaders' temporaries and I took care of gamehelper.d's...


And the test program went down to about 1/3 of its memory 
usage. Try grabbing the new ones from github now and see if it 
works for you too.




Yes, same here! Great! It runs around 122MB in x86 and 107MB x64. 
Much better!




Well, it works on LDC x64 again! ;) This seems like an issue 
with DMD x64? I was thinking maybe it has to do with the layout of 
the struct or something, but not sure.


I have a fix for this too, though I don't understand why it 
works


I just .dup'd the string literal before passing it to Windows. 
I think dmd is putting the literal in a bad place for these 
functions (they do bit tests to see if it is a pointer or an 
atom, so maybe it is in an address where the wrong bits are set)




Yeah, strange but good catch! It now works in x64! I modified it 
to to!wstring(title).dup simply to have the same title and 
classname.


In any case, the .dup seems to fix it, so all should work on 32 
or 64 bit now. In my tests, now that the big temporary arrays 
are manually freed, the memory usage is actually slightly lower 
on 32 bit, but it isn't bad on 64 bit either.


I have the opposite on memory but not a big deal.


The CPU usage is consistently very low on my computer. I still 
don't know what could be causing it for you, but maybe it is 
the temporary garbage... let us know if the new patches make a 
difference there.


I will investigate this soon and report back anything. It 
probably is something straightforward.


Anyways, We'll figure it all out at some point ;) I'm really 
liking your lib by the way. It's let me build a gui and get a 
lot done and just "work". Not sure if it will work on X11 with 
just a recompile, but I hope ;)



It often will! If you aren't using any of the native event 
handler functions or any of the impl.* members, most things 
just work (exception being the windows hotkey functions, but 
those are marked Windows anyway!). The basic opengl stuff is 
all done for both platforms. Advanced opengl isn't implemented 
on Windows yet though (I don't know it; my opengl knowledge 
stops in like 1998 with opengl 1.1 so yeah, I depend on 
people's contributions for that and someone did Linux for me, 
but not Windows yet. I think.)


I found this on non-power of 2 textures:

https://www.opengl.org/wiki/NPOT_Texture


https://www.opengl.org/registry/specs/ARB/texture_non_power_of_two.txt

It seems like it's probably a quick and easy add on and you 
already have the padding code, it could easily be optional(set a 
flag or pass a bool or whatever).


It could definitely save some serious memory for large textures.

e.g., a 3000x3000x4 texture takes about 36MB or 2^25.1 bytes. 
Since this has to be rounded up to 2^26 = 67MB, we have almost 
doubled the memory used, and the difference is wasted space.
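
The arithmetic can be checked with a short sketch. `nextPowerOfTwo` and `imageBytes` are hypothetical helpers written for this illustration (they are not the functions used in gamehelpers.d), and 4 bytes per pixel (RGBA) is assumed.

```d
import std.stdio : writefln;

/// Smallest power of two >= n. Hypothetical helper; valid for 1 <= n <= 2^31.
uint nextPowerOfTwo(uint n)
{
    assert(n >= 1 && n <= 0x8000_0000, "n out of range for a 32-bit result");
    uint p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/// Bytes needed for an uncompressed image with the given bytes per pixel.
ulong imageBytes(uint width, uint height, uint bytesPerPixel = 4)
{
    return cast(ulong) width * height * bytesPerPixel;
}

void main()
{
    immutable uint w = 3000;
    immutable uint h = 3000;
    const exact = imageBytes(w, h);
    const padded = imageBytes(nextPowerOfTwo(w), nextPowerOfTwo(h));
    writefln("exact:  %s bytes", exact);  // 36000000
    writefln("padded: %s bytes", padded); // 67108864, i.e. 4096 * 4096 * 4
}
```

Each side is rounded up separately (3000 becomes 4096), which is how the total lands on exactly 2^26 bytes.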


Hence, allowing for non-power of two would probably reduce the 
memory footprint of my code to near 50MB(around 40MB being the 
minimum using uncompressed textures).


I might try to get a working version of that at some point. Going 
to deal with the cpu thing now though.


Thanks again.




Re: ARSD PNG memory usage

2016-06-17 Thread Joerg Joergonson via Digitalmars-d-learn

On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:

On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:

[...]


So, opengltexture actually does reallocate if the size isn't 
right for the texture... and your image was one of those sizes.


[...]



Cool, I'll check all this out and report back. I'll look into the 
cpu issue too.


Thanks!


Re: ARSD PNG memory usage

2016-06-17 Thread Adam D. Ruppe via Digitalmars-d-learn

On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
ok, then it's somewhere in TrueColorImage or the loading of the 
png.


So, opengltexture actually does reallocate if the size isn't 
right for the texture... and your image was one of those sizes.


The texture pixel size needs to be a power of two, so 3000 gets 
rounded up to 4096, which means an internal allocation.


But it can be a temporary one! So ketmar tackled png.d's loaders' 
temporaries and I took care of gamehelper.d's...


And the test program went down to about 1/3 of its memory usage. 
Try grabbing the new ones from github now and see if it works for 
you too.



Well, it works on LDC x64 again! ;) This seems like an issue 
with DMD x64? I was thinking maybe it has to do with the layout of 
the struct or something, but not sure.


I have a fix for this too, though I don't understand why it 
works


I just .dup'd the string literal before passing it to Windows. I 
think dmd is putting the literal in a bad place for these 
functions (they do bit tests to see if it is a pointer or an 
atom, so maybe it is in an address where the wrong bits are set)


In any case, the .dup seems to fix it, so all should work on 32 
or 64 bit now. In my tests, now that the big temporary arrays are 
manually freed, the memory usage is actually slightly lower on 32 
bit, but it isn't bad on 64 bit either.



The CPU usage is consistently very low on my computer. I still 
don't know what could be causing it for you, but maybe it is the 
temporary garbage... let us know if the new patches make a 
difference there.


Anyways, We'll figure it all out at some point ;) I'm really 
liking your lib by the way. It's let me build a gui and get a 
lot done and just "work". Not sure if it will work on X11 with 
just a recompile, but I hope ;)



It often will! If you aren't using any of the native event 
handler functions or any of the impl.* members, most things just 
work (exception being the windows hotkey functions, but those are 
marked Windows anyway!). The basic opengl stuff is all done for 
both platforms. Advanced opengl isn't implemented on Windows yet 
though (I don't know it; my opengl knowledge stops in like 1998 
with opengl 1.1 so yeah, I depend on people's contributions 
for that and someone did Linux for me, but not Windows yet. I 
think.)


Re: ARSD PNG memory usage

2016-06-17 Thread kinke via Digitalmars-d-learn

On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:

LDC x64 uses about 250MB and 13% cpu.

I couldn't check on x86 because of the error

phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module 
machine type 'x64' conflicts with target machine type 'X86'


not sure what that means with gzlib.c.obj. Must be another bug 
in ldc alpha ;/


It looks like you're trying to link 32-bit objects to a 64-bit 
Phobos.
The only pre-built LDC for Windows capable of linking both 32-bit 
and 64-bit code is the multilib CI release, see 
https://github.com/ldc-developers/ldc/releases/tag/LDC-Win64-master.


Re: ARSD PNG memory usage

2016-06-17 Thread ketmar via Digitalmars-d-learn

On Friday, 17 June 2016 at 03:41:02 UTC, Adam D. Ruppe wrote:
It actually has been on my todo list for a while to change the 
decoder to generate less garbage. I have had trouble in the 
past with temporary arrays being pinned by false pointers and 
the memory use ballooning from that, and the lifetime is really 
easy to manage so just malloc/freeing it would be an easy 
solution, just like you said, std.zlib basically sucks so I 
have to use the underlying C functions and I just haven't 
gotten around to it.


did that. decoding still sux, but now it should suck less. ;-) 
encoder is still using "std.zlib", though. next time, maybe.


Re: ARSD PNG memory usage

2016-06-16 Thread Joerg Joergonson via Digitalmars-d-learn

On Friday, 17 June 2016 at 04:32:02 UTC, Adam D. Ruppe wrote:

On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
Are you keeping multiple buffers of the image around? A 
trueimage, a memoryimage, an opengl texture


MemoryImage and TrueImage are the same thing, memory is just 
the interface, true image is the implementation.


OpenGL texture is separate, but it references the same memory 
as a TrueColorImage, so it wouldn't be adding.




ok, then it's somewhere in TrueColorImage or the loading of the 
png.




You might have pinned temporary buffers though. That shouldn't 
happen on 64 bit, but on 32 bit I have seen it happen a lot.




Ok, IIRC LDC both x64 and x86 had high memory usage too, so if it 
shouldn't happen on 64-bit(if it applies to ldc), this then is 
not the problem. I'll run a -vgc on it and see if it shows up 
anything interesting.


When I do a bare loop minimum project(create2dwindow + event 
handler) I get 13% cpu(on 8-core skylake 4ghz) and 14MB memory.


I haven't seen that here but I have a theory now: you have 
some pinned temporary buffer on 32 bit (on 64 bit, the GC would 
actually clean it up) that keeps memory usage near the 
collection boundary.


Again, it might be true but I'm pretty sure I saw the problem 
with ldc x64.


Then, a small allocation in the loop - which shouldn't be 
happening, I don't see any in here... - but if there is a small 
allocation I'm missing, it could be triggering a GC collection 
cycle each time, eating CPU to scan all that wasted memory 
without being able to free anything.




Ok, Maybe... -vgc might show that.

If you can run it in the debugger and just see where it is by 
breaking at random, you might be able to prove it.




Good idea, not thought about doing that ;) Might be a crap shoot 
but who knows...


That's a possible theory. I can reproduce the memory usage 
here, but not the CPU usage. Sitting idle, it is always 
<1% here (0 if doing nothing, like 0.5% if I move the mouse in 
the window to generate some activity).


 I need to get to bed though, we'll have to check this out in 
more detail later.


me too ;) I'll try to test stuff out a little more when I get a 
chance.




Thanks!  Also, when I try to run the app in 64-bit windows, 
RegisterClassW throws for some reason ;/ I haven't been able 
to figure that one out yet ;/


err this is a mystery to me too... a hello world on 64 bit 
seems to work fine, but your program tells me error 998 
(invalid memory access) when I run it. WTF, both register class 
the same way.


I'm kinda lost on that.


Well, it works on LDC x64 again! ;) This seems like an issue with 
DMD x64? I was thinking maybe it has to do with the layout of the 
struct or something, but not sure.


---

I just ran a quick test:

LDC x64 uses about 250MB and 13% cpu.

I couldn't check on x86 because of the error

phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module 
machine type 'x64' conflicts with target machine type 'X86'


not sure what that means with gzlib.c.obj. Must be another bug in 
ldc alpha ;/



Anyways, We'll figure it all out at some point ;) I'm really 
liking your lib by the way. It's let me build a gui and get a lot 
done and just "work". Not sure if it will work on X11 with just a 
recompile, but I hope ;)




Re: ARSD PNG memory usage

2016-06-16 Thread Adam D. Ruppe via Digitalmars-d-learn

On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
Are you keeping multiple buffers of the image around? A 
trueimage, a memoryimage, an opengl texture


MemoryImage and TrueImage are the same thing, memory is just the 
interface, true image is the implementation.


OpenGL texture is separate, but it references the same memory as 
a TrueColorImage, so it wouldn't be adding.



You might have pinned temporary buffers though. That shouldn't 
happen on 64 bit, but on 32 bit I have seen it happen a lot.


When I do a bare loop minimum project(create2dwindow + event 
handler) I get 13% cpu(on 8-core skylake 4ghz) and 14MB memory.


I haven't seen that here but I have a theory now: you have 
some pinned temporary buffer on 32 bit (on 64 bit, the GC would 
actually clean it up) that keeps memory usage near the collection 
boundary.


Then, a small allocation in the loop - which shouldn't be 
happening, I don't see any in here... - but if there is a small 
allocation I'm missing, it could be triggering a GC collection 
cycle each time, eating CPU to scan all that wasted memory 
without being able to free anything.


If you can run it in the debugger and just see where it is by 
breaking at random, you might be able to prove it.


That's a possible theory. I can reproduce the memory usage 
here, but not the CPU usage. Sitting idle, it is always 
<1% here (0 if doing nothing, like 0.5% if I move the mouse in 
the window to generate some activity).


 I need to get to bed though, we'll have to check this out in 
more detail later.



Thanks!  Also, when I try to run the app in 64-bit windows, 
RegisterClassW throws for some reason ;/ I haven't been able to 
figure that one out yet ;/


err this is a mystery to me too... a hello world on 64 bit 
seems to work fine, but your program tells me error 998 (invalid 
memory access) when I run it. WTF, both register class the same 
way.


I'm kinda lost on that.


Re: ARSD PNG memory usage

2016-06-16 Thread Adam D. Ruppe via Digitalmars-d-learn

On Friday, 17 June 2016 at 02:55:43 UTC, thedeemon wrote:
I've bumped into this previously. It allocates a lot of 
temporary arrays for decoded chunks of data, and I managed to 
reduce those allocations a bit, here's the version I used:


If you can PR any of it to me, I'll merge.

It actually has been on my todo list for a while to change the 
decoder to generate less garbage. I have had trouble in the past 
with temporary arrays being pinned by false pointers and the 
memory use ballooning from that, and the lifetime is really easy 
to manage so just malloc/freeing it would be an easy solution, 
just like you said, std.zlib basically sucks so I have to use the 
underlying C functions and I just haven't gotten around to it.
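
The malloc/free approach for a short-lived buffer can be sketched as follows. This is not the code that ended up in png.d; `sumViaTempBuffer` is a hypothetical example function, and the summing is only a stand-in for real work on the buffer. The GC never allocates this memory, so it cannot be kept alive by a false pointer, and `scope (exit)` releases it on every path out of the function.

```d
import core.exception : onOutOfMemoryError;
import core.stdc.stdlib : free, malloc;
import std.stdio : writeln;

/// Copies `src` into a zero-padded temporary buffer outside the GC heap,
/// does some work on it (here: summing the bytes), and frees it on exit.
ulong sumViaTempBuffer(const(ubyte)[] src, size_t paddedLength)
{
    assert(paddedLength >= src.length);
    if (paddedLength == 0)
        return 0; // malloc(0) may legitimately return null

    auto p = cast(ubyte*) malloc(paddedLength);
    if (p is null)
        onOutOfMemoryError();
    scope (exit) free(p); // runs on normal return and when an exception is thrown

    ubyte[] tmp = p[0 .. paddedLength]; // slice over the malloc'd block
    tmp[0 .. src.length] = src[];
    tmp[src.length .. $] = 0;

    ulong total = 0;
    foreach (b; tmp)
        total += b;
    return total;
}

void main()
{
    ubyte[] src = [1, 2, 3];
    writeln(sumViaTempBuffer(src, 8)); // prints 6
}
```

The slice must not escape the function, since the memory behind it is gone once the scope exits.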





Re: ARSD PNG memory usage

2016-06-16 Thread thedeemon via Digitalmars-d-learn

On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
Hi, so, do you have any idea why when I load an image with 
png.d it takes a ton of memory?


I've bumped into this previously. It allocates a lot of temporary 
arrays for decoded chunks of data, and I managed to reduce those 
allocations a bit, here's the version I used:

http://stuff.thedeemon.com/png.d
(last changed Oct 2014, so may need some tweaks today)

But most of the allocations are really caused by using std.zlib. This 
thing creates tons of temporary arrays/slices and they are not 
collected well by the GC. To deal with that I had to use GC 
arenas for each PNG file I decode. This way all the junk created 
during PNG decoding is eliminated completely after the decoding 
ends. See gcarena module here:

https://bitbucket.org/infognition/dstuff
You may see Adam's PNG reader was really the source of motivation 
for it. ;)


ARSD PNG memory usage

2016-06-16 Thread Joerg Joergonson via Digitalmars-d-learn
Hi, so, do you have any idea why when I load an image with png.d 
it takes a ton of memory?


I have a 3360x2100 image that should take around 26 MB of memory 
uncompressed, plus a bunch of other smaller PNG files.


Are you keeping multiple buffers of the image around? A 
trueimage, a memoryimage, an opengl texture thing that might be 
in main memory, etc? Total file space of all the images is only 
about 3MB compressed and 40MB uncompressed. So it's using around 
10x more memory than it should! I tried a GC collect and all that.


I don't think my program will have a chance in hell using that 
much memory. That's just a few images for gui work. I'll be 
loading full page png's later on that might have many pages(100+) 
that I would want to pre-cache. This would probably cause the 
program to use TB's of space.


I don't know where to begin diagnosing the problem. I am using 
openGL but I imagine that shouldn't really allocate anything new?


I have embedded the images using `import` but that shouldn't 
really add much size(since it is compressed) or change things.


You could try it out yourself on a test case to see? (might be a 
windows thing too) Create a high res image(3000x3000, say) and 
load it like


auto eImage = cast(ubyte[])import("mylargepng.png");

TrueColorImage image = imageFromPng(readPng(eImage)).getAsTrueColorImage;
OpenGlTexture oGLimage = new OpenGlTexture(image); // Will crash without create2dwindow

//oGLimage.draw(0,0,3000,3000);


When I do a bare loop minimum project(create2dwindow + event 
handler) I get 13% cpu(on 8-core skylake 4ghz) and 14MB memory.


When I add the code above I get 291MB of memory (for one image).

Here's the full D code source:


module winmain;

import arsd.simpledisplay;
import arsd.png;
import arsd.gamehelpers;

void main()
{
    auto window = create2dWindow(1680, 1050, "Test");

    auto eImage = cast(ubyte[])import("Mock.png");

    TrueColorImage image = imageFromPng(readPng(eImage)).getAsTrueColorImage;   // 178MB

    OpenGlTexture oGLimage = new OpenGlTexture(image);   // 291MB
    //oGLimage.draw(0,0,3000,3000);

    window.eventLoop(50,
        delegate ()
        {
            window.redrawOpenGlSceneNow();
        },
    );
}

Note that I have modified create2dWindow to take the viewport and 
set it to 2x as large in my own code (I removed that here). It 
shouldn't matter though as it's the png and OpenGlTexture that 
seem to have the issue.


Surely once the image is loaded by opengl we could potentially 
discard the other images and virtually no extra memory would be 
required? I do use getpixel though, not sure if that could be 
used on an OpenGlTexture? I don't mind keeping a main memory copy 
though but I just need it to have a realistic size ;)


So two problems: 1 is the cpu usage, which I'll try to get more 
info on from my side when I can profile, and 2 is the 10x memory 
usage. If it doesn't happen on your machine, can you try the 
alternate platform (if 'nix, go for Windows, or vice versa)? This 
way we can get an idea where the problem might be.


Thanks!  Also, when I try to run the app in 64-bit windows, 
RegisterClassW throws for some reason ;/ I haven't been able to 
figure that one out yet ;/