Hi Greg,
While rendering the images for the VR of Kynthia, we occasionally ran into
memory problems (they were not 100% reproducible).
We rendered 4 images at the same time, all sharing one ambient file; each
rtrace was run with either -n 2 or -n 3.
I took a snapshot of top and of some of the processes. If you look at PID
88263, the "mother process" appears to use 41 GB (VIRT) in total! Since
some of our machines don't have much swap space, some of these
processes failed with "cannot allocate memory". I know that virtual memory
size is not a reliable indicator of what is actually used, but out of our
400 jobs around 10 failed with this issue.
The "children" use around 800-900 MB each, which is fine and what we
expected. But we don't know how to estimate the total memory usage. Say
a single rtrace needs 500 MB: I would have expected a run with -n 2
to use 1 GB, but there is also the mother process, whose size is a
bit unpredictable and sometimes explodes.
This "growth" of the mother process always happens near the end of an
image (say, at 90% finished).
Interestingly, when the failed processes were restarted, the failure never
occurred again (though I have to admit I didn't explicitly restart the
simulation on the same machine: the process was fully automated, and
failed jobs were automatically restarted on one of the 50 machines we
had available).
In the end we finished all 400(!) renderings with very good quality.
So this is not an urgent issue, but we wanted to report it. Maybe you
have a rule of thumb for calculating the memory usage with the
-n option when the usage of a single process is known?
best
Jan
top - 19:16:33 up 57 days, 6:55, 24 users, load average: 88.11, 81.49, 70.31
Tasks: 891 total, 51 running, 840 sleeping, 0 stopped, 0 zombie
%Cpu(s): 20.7 us, 0.8 sy, 72.3 ni, 1.2 id, 5.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 13191659+total, 11070876+used, 21207820 free, 42656 buffers
KiB Swap: 13411430+total, 9556232 used, 12455806+free. 23473420 cached Mem

  PID USER   PR NI    VIRT     RES  SHR S %CPU %MEM      TIME+ COMMAND
88260 lipid4 20  0 21.316g  0.021t 2496 R 12.6 16.8   12:45.60 rtrace
88263 lipid4 20  0 41.816g  0.021t 2424 S  0.0 16.8   15:28.52 rtrace
88254 lipid4 20  0 2198728  1.913g 2452 S  0.0  1.5    8:58.91 rtrace
88257 lipid4 20  0  884028  871708 2720 R 99.5  0.7  840:05.96 rtrace
88292 lipid4 20  0  884300  868240 2212 R 99.5  0.7  824:08.50 rtrace
88287 lipid4 20  0  883484  856984 2176 R 22.4  0.6  825:37.93 rtrace
88286 lipid4 20  0  883988  839512 2176 R 32.3  0.6  825:06.91 rtrace
88291 lipid4 20  0  884352  806224 2192 D  7.1  0.6  821:36.40 rtrace
88289 lipid4 20  0  884280  796720 2160 D  9.8  0.6  817:12.97 rtrace
88288 lipid4 20  0  883532  783388 2160 D  1.1  0.6  816:19.17 rtrace
lipid4 88262 1.1 0.0 7548 1964 ? S 03:38 10:52 rcalc
-f /home/lipid4/finalrun/files/view360stereo.cal -e
XD:12960;YD:12960;X:-2.69452;Y:-31.0606;Z:2.098;IPD:0.06;EX:0;EZ:0
lipid4 88263 1.6 16.7 43846856 22123884 ? S 03:38 15:28 rtrace
-w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88286 87.9 0.6 883988 861640 ? R 03:38 825:20 rtrace
-w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88287 87.9 0.6 883484 863988 ? R 03:38 826:02 rtrace
-w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88288 86.9 0.6 883532 807460 ? D 03:38 816:20 rtrace
-w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88289 87.0 0.6 884280 815772 ? D 03:38 817:18 rtrace
-w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88290 88.5 0.5 884308 784652 ? R 03:38 831:32 rtrace
-w -n 3 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88291 87.5 0.6 884352 822328 ? D 03:38 821:40 rtrace
-w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
lipid4 88292 87.8 0.6 884300 869016 ? R 03:38 824:47 rtrace
-w -n 2 -dj 0.02 -ds 0.05 -dt .05 -dc .5 -dp 256 -st 0.5 -ab 4 -aa 0.02
-ar 32 -ad 50000 -as 25000 -lr 4 -lw 0.000003 -af
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.af
-x 12960 -y 12960 -fac
/home/lipid4/finalrun/p5_social_overcast_ntnu_largewin_simu/p5_social_overcast_ntnu_largewin_simu.oct
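
The per-process figures in the listing above can be totalled with a short
script. What follows is only a minimal sketch, not a rule of thumb for
rtrace itself. It assumes `ps aux`-style columns (USER PID %CPU %MEM VSZ
RSS TTY STAT START TIME COMMAND), with VSZ and RSS in KiB and each process
on a single unwrapped line. The function name `sum_rtrace_memory` is
hypothetical, and grouping PID 88263 with 88288-88290 as one -n 3 job is
an assumption based only on their matching -n 3 arguments. The sample
lines reuse values from the listing with the rtrace arguments omitted.

```python
def sum_rtrace_memory(ps_lines, command="rtrace"):
    """Return (total_vsz_kib, total_rss_kib, count) for matching processes."""
    total_vsz = total_rss = count = 0
    for line in ps_lines:
        fields = line.split()
        # Assumed layout: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND ...
        if len(fields) < 11 or fields[10] != command:
            continue  # skip malformed lines and other commands (e.g. rcalc)
        total_vsz += int(fields[4])  # virtual size, KiB
        total_rss += int(fields[5])  # resident set size, KiB
        count += 1
    return total_vsz, total_rss, count


# Values copied from the listing above; arguments after "rtrace" omitted.
sample = [
    "lipid4 88262 1.1 0.0 7548 1964 ? S 03:38 10:52 rcalc",
    "lipid4 88263 1.6 16.7 43846856 22123884 ? S 03:38 15:28 rtrace",
    "lipid4 88288 86.9 0.6 883532 807460 ? D 03:38 816:20 rtrace",
    "lipid4 88289 87.0 0.6 884280 815772 ? D 03:38 817:18 rtrace",
    "lipid4 88290 88.5 0.5 884308 784652 ? R 03:38 831:32 rtrace",
]

vsz, rss, n = sum_rtrace_memory(sample)
print(f"{n} rtrace processes: VSZ {vsz / 2**20:.1f} GiB, RSS {rss / 2**20:.1f} GiB")
```

Note that summing RSS counts pages shared between forked processes once
per process, so the total is an upper bound on physical memory rather
than an exact figure.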
--
Dr.-Ing. Jan Wienold
Ecole Polytechnique Fédérale de Lausanne (EPFL)
EPFL ENAC IA LIPID
http://people.epfl.ch/jan.wienold
LE 1 111 (Office)
Phone +41 21 69 30849