Hi Greg,

While rendering the VR scenes for Kynthia, we occasionally (i.e. not 100% reproducibly) ran into memory problems.

We rendered 4 images at the same time, all sharing one ambient file. Each rtrace was run with either the -n 2 or the -n 3 option.

I captured the output of top and ps for some of the processes (see below). If you look at PID 88263, it seems that the "mother process" uses 41 GB (VIRT) in total! Since some of our machines don't have much swap space, some of these processes failed with "cannot allocate memory". I know that virtual memory size is not a reliable indicator of what is actually used, but around 10 of our 400 jobs failed with this error.

The "children" use around 800-900 MB each, which is fine and what we expected. But we don't know how to estimate the total memory usage. Say a single rtrace needs 500 MB: I would have expected a run with -n 2 to use 1 GB, but there is also the mother process, whose size is somewhat unpredictable and sometimes explodes.
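The naive expectation described above can be written down as a few lines of Python. This is only a sketch of that arithmetic, not a description of how rtrace actually allocates memory; the function name and the parent_mb parameter are hypothetical and chosen for illustration.

```python
def naive_total_mb(single_mb: float, nproc: int, parent_mb: float = 0.0) -> float:
    """Naive estimate: each of the nproc children holds its own full copy
    of what a single rtrace needs, plus whatever the mother process uses."""
    return nproc * single_mb + parent_mb


# Figures from the paragraph above: 500 MB per process, run with -n 2.
print(naive_total_mb(500, 2))

# The same estimate with a hypothetical 21000 MB mother process and -n 3.
print(naive_total_mb(850, 3, parent_mb=21000))
```

The open question is the parent_mb term: as long as the size of the mother process cannot be predicted, this estimate gives only a lower bound.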

This "growth" of the mother process always happens near the end of an image (say, at about 90% finished).

Interestingly, after restarting the processes the failure never occurred again. (I have to admit that I didn't explicitly restart the simulation on the same machine: the whole process was fully automated, and failed jobs were automatically restarted on one of the 50 machines we had available.)

In the end we finished all 400(!) renderings with very good quality.

So this is not an urgent issue, but we wanted to report it. Do you perhaps have a rule of thumb for calculating the memory usage with the -n option when the usage of a single process is known?

Best,

Jan


top - 19:16:33 up 57 days,  6:55, 24 users,  load average: 88.11, 81.49, 70.31
Tasks: 891 total,  51 running, 840 sleeping,   0 stopped,   0 zombie
%Cpu(s): 20.7 us,  0.8 sy, 72.3 ni,  1.2 id,  5.0 wa,  0.0 hi, 0.0 si,  0.0 st
KiB Mem:  13191659+total, 11070876+used, 21207820 free,    42656 buffers
KiB Swap: 13411430+total,  9556232 used, 12455806+free. 23473420 cached Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 88260 lipid4    20   0 21.316g 0.021t   2496 R  12.6 16.8  12:45.60 rtrace
 88263 lipid4    20   0 41.816g 0.021t   2424 S   0.0 16.8  15:28.52 rtrace
 88254 lipid4    20   0 2198728 1.913g   2452 S   0.0  1.5   8:58.91 rtrace
 88257 lipid4    20   0  884028 871708   2720 R  99.5  0.7 840:05.96 rtrace
 88292 lipid4    20   0  884300 868240   2212 R  99.5  0.7 824:08.50 rtrace
 88287 lipid4    20   0  883484 856984   2176 R  22.4  0.6 825:37.93 rtrace
 88286 lipid4    20   0  883988 839512   2176 R  32.3  0.6 825:06.91 rtrace
 88291 lipid4    20   0  884352 806224   2192 D   7.1  0.6 821:36.40 rtrace
 88289 lipid4    20   0  884280 796720   2160 D   9.8  0.6 817:12.97 rtrace
 88288 lipid4    20   0  883532 783388   2160 D   1.1  0.6 816:19.17 rtrace
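A note on reading the VIRT and RES columns above: top prints plain numbers in KiB and switches to a suffix (m, g, t) once it scales a value. The following Python sketch converts a few of the values above to GiB so they can be compared directly. The helper name is hypothetical. Scaled values such as 0.021t are heavily rounded, so the result is only approximate.

```python
# Multipliers from top's unit suffixes to KiB (unsuffixed values are KiB).
UNITS_KIB = {"k": 1, "m": 1024, "g": 1024**2, "t": 1024**3}


def top_field_to_gib(field: str) -> float:
    """Convert a VIRT/RES field as printed by top into GiB."""
    suffix = field[-1].lower()
    if suffix in UNITS_KIB:
        kib = float(field[:-1]) * UNITS_KIB[suffix]
    else:
        kib = float(field)  # plain number: already in KiB
    return kib / 1024**2


# (VIRT, RES) pairs copied from the top output above.
rows = {
    88260: ("21.316g", "0.021t"),
    88263: ("41.816g", "0.021t"),
    88254: ("2198728", "1.913g"),
    88257: ("884028", "871708"),
}

for pid, (virt, res) in rows.items():
    print(f"{pid}: VIRT {top_field_to_gib(virt):.2f} GiB, RES {top_field_to_gib(res):.2f} GiB")
```

Summing RES over several processes can count shared pages more than once, so such a sum is best treated as an upper bound on the physical memory in use.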

lipid4    88262  1.1  0.0   7548  1964 ?        S    03:38  10:52 rcalc -f /home/lipid4/finalrun/files/view360stereo.cal -e XD:12960;YD:12960;X:-2.69452;Y:-31.0606;Z:2.098;IPD:0.06;EX:0;EZ:0
lipid4    88263  1.6 16.7 43846856 22123884 ?   S    03:38  15:28 rtrace -w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88286 87.9  0.6 883988 861640 ?       R    03:38 825:20 rtrace -w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88287 87.9  0.6 883484 863988 ?       R    03:38 826:02 rtrace -w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88288 86.9  0.6 883532 807460 ?       D    03:38 816:20 rtrace -w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88289 87.0  0.6 884280 815772 ?       D    03:38 817:18 rtrace -w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88290 88.5  0.5 884308 784652 ?       R    03:38 831:32 rtrace -w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88291 87.5  0.6 884352 822328 ?       D    03:38 821:40 rtrace -w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4    88292 87.8  0.6 884300 869016 ?       R    03:38 824:47 rtrace -w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02 -ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af -x 12960 -y 12960 -fac /home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct


--
Dr.-Ing.  Jan Wienold
Ecole Polytechnique Fédérale de Lausanne (EPFL)
EPFL ENAC IA LIPID

http://people.epfl.ch/jan.wienold
LE 1 111 (Office)
Phone    +41 21 69 30849


_______________________________________________
Radiance-dev mailing list
Radiance-dev@radiance-online.org
https://www.radiance-online.org/mailman/listinfo/radiance-dev
