Hello again,

I suggested MALLOCOPTION=disclaim because we noticed that on one of our test machines the memory situation looked like this:
01:02 AM: ~ jBASE Free in total 350 MB, jBASE Used in total 5 GB
03:20 AM: ~ jBASE Free in total 8 GB, jBASE Used in total 7 GB (more "Free" than "Used"; this is not a typo)

The numbers are the totals of Free and Used from WHERE (V across all jBASE processes. I guess that memory consumption during COB looks similar at other banks (by the way, I had never checked such things before, so it would be great if some group members could check and share their results here).

This output really surprised me. The numbers shown are not exact; I am recalling them from memory, but they are fairly close. The situation at 3:20 AM was exactly like that (the server has 24 GB of physical memory plus 10 GB of swap).

We have also noticed that the jumps in memory allocation happen on SSELECTs, so I am guessing that the freed memory should form contiguous blocks.

That is why I kept claiming that "Free" memory is not given back to other processes and that this is our problem. I understand that this may be a memory fragmentation problem, but I think that the yorktown allocator never gives memory back unless (probably) MALLOCOPTION=disclaim is set. That is why I asked about the memory page size. I keep thinking about all this and cannot imagine such heavy fragmentation (note that at 3:20 AM there were processes which, for example, had allocated and used 300 MB but immediately dropped back to 100 MB used and 200 MB free).

We have also checked whether any processes try to allocate more than 2 GB. There are such COB agents, but fortunately not many. We need to analyse this more carefully. Testing of Watson is scheduled for the weekend; you can imagine how the people from the test team are protesting against such changes :)

I forgot to mention that the QA area has been running with MALLOCTYPE=buckets all along. We set this on the advice of people from CSHD. We wonder whether it could somehow have made our problems worse. Recently we have been facing out-of-memory problems on the test servers, and we did not (!) increase the memory/swap size.

That is all for the moment, otherwise this thread will be classified as spam ;)

Kind regards
Pawel

On 11-02-2009 at 16:57, Jim Idle wrote:
Pawel (privately) wrote:
Hi Jim,

After deeper observations we think that this is exactly what is happening.
No, not quite. As I say, the reason that subsequent allocations cannot reuse space in the free chain is the pattern of allocation. That is the main reason to use watson as an allocator. Hence you get a lot of accumulation in the free space chain. Now, it could be that the option gives some relief, but the issue is really the allocation pattern. The way watson doles out address space is key to the reuse of blocks that are already in the free space chain.

Hence you should try:

1) Switch to watson without other MALLOCOPTS and observe the differences;
2) Add options singly to see their individual effect;
3) Combine options.

The key here is to make one change at a time of course. Have you done any of this yet?
I have shown one stupid process that allocated almost 2 GB of memory (it crashed soon after), but this is really exceptional.
That is good.
Our COB sessions (tSA) allocate a lot of memory on SSELECTs and free most of it immediately. I think you can easily see that it is the sorting, not the key selection, that increases memory consumption.
When you sort, you can either make one pass and accumulate the sort keys and primary key, or you can accumulate the output at the same time. I think that SSELECT does the latter, hence you must store all referenced elements and need that much memory. As the keys and list grow, they must be reallocated and the old smaller block is released. If that smaller block is never big enough to satisfy a subsequent malloc, then it will sit in the free chain unused but still require address space. Hence the option you are looking at MAY allow some of this to return to the system, but the address space is in system pages. Hence increasing the page size, as mentioned in an earlier reply, may encourage better fits within the watson allocator, but with the standard allocator it may not help much anyway.
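
To illustrate the grow-and-release pattern described above, here is a toy simulation in Python. It is a sketch under stated assumptions, not a model of the yorktown or watson allocators: the buffer doubles each time (an assumed growth factor), a freed block is reused only if it alone is big enough, and neighbouring free blocks are not merged.

```python
def grow_by_realloc(final_size, start=1, factor=2):
    """Grow a buffer by allocating a bigger block and releasing the old one.

    Returns (used, free_chain, reused), where free_chain lists the sizes of
    released blocks and reused counts requests served from the free chain.
    """
    free_chain = []
    reused = 0
    current = start
    while current < final_size:
        wanted = current * factor
        # First fit: is any single free block large enough for the request?
        fit = next((b for b in free_chain if b >= wanted), None)
        if fit is not None:
            free_chain.remove(fit)
            if fit > wanted:
                free_chain.append(fit - wanted)   # keep the unused remainder
            reused += 1
        # Otherwise the request is served from fresh address space.
        free_chain.append(current)                # the old, smaller block is released
        current = wanted
    return current, free_chain, reused

used, free_chain, reused = grow_by_realloc(1024)
print("used:", used, "free:", sum(free_chain), "free blocks:", len(free_chain), "reused:", reused)
```

In this toy model no released block is ever reused, because every later request is larger than any block freed so far, and the free chain ends up almost as large as the block in use. That is at least consistent with processes showing "Free" totals close to their "Used" totals.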

I am not showing the WHERE (V output for all processes here, but the "Free" values add up to a large total.
Sure, but they are not causing you any problems per se. This isn't the root problem, but a symptom. Also, you must assess the free chain as a percentage of the allocation used. What looks like a large number may be a small percentage. Further, because of statistical aberrations, one rogue process could throw out your entire assessment.
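
A small Python sketch of that kind of assessment follows. The per-process figures are made up for illustration, and taking "percentage" to mean free / (used + free) is an assumption; the median is used so that a single rogue process does not dominate the picture.

```python
from statistics import median

def free_percentages(processes):
    """processes: list of (used_mb, free_mb). Free as a % of used + free, per process."""
    return [100.0 * free / (used + free)
            for used, free in processes if used + free > 0]

def summarize(processes):
    """Return (overall free %, median per-process free %)."""
    total_used = sum(used for used, _ in processes)
    total_free = sum(free for _, free in processes)
    overall = 100.0 * total_free / (total_used + total_free)
    return overall, median(free_percentages(processes))

# Hypothetical (used MB, free MB) pairs; the last entry plays the rogue process.
sample = [(100, 10), (120, 12), (90, 9), (110, 11), (200, 1800)]
overall, typical = summarize(sample)
print(f"overall free: {overall:.1f}%   median per-process free: {typical:.1f}%")
```

With these invented figures the overall total looks alarming while the typical process holds only a small share of free space, which is the distortion a single outlier can cause.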

We have also analysed the nmon results, and you can see that memory consumption during COB is always growing. There are almost no points where it falls (there are a few such points in fact, because COB is usually stopped once during the night and its processes are stopped). This is clearly visible, but the totals of "Free" across processes (at a given point in time) can reach serious amounts.

I am not really 100% sure how memory is allocated by malloc - in pages, right?
Yes, hence for large allocations, using the 64 page size may help. But there are system pages, the malloc allocator, and then the malloced piece. They all interact.
So will MALLOCOPTION=disclaim give back (fully) unused pages to the OS? Will that happen only when the system runs out of physical memory?
I have not looked at the precise description of this option, but the short description implies that it tries to give back pages as soon as nothing occupies them when free is called. Because the reallocs are relatively large, this might give back pages to the system. Try it, but only once you have tried watson, then tried watson with the bigger page allocation; then try the combinations.
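
The idea that only pages with nothing left on them can be handed back can be sketched as follows. This is an illustration in Python, not a description of how the disclaim option is implemented; the block layout and the 4 KB and 64 KB page sizes are assumptions chosen for the example.

```python
def releasable_pages(free_blocks, page_size):
    """Count pages that lie entirely inside a single free block.

    free_blocks: list of (offset, length) in bytes, assumed non-overlapping.
    """
    pages = 0
    for offset, length in free_blocks:
        first = -(-offset // page_size)          # first page boundary at or after the start
        last = (offset + length) // page_size    # last page boundary at or before the end
        pages += max(0, last - first)
    return pages

KB = 1024
# Hypothetical free chain: two small blocks and one large one.
blocks = [(1 * KB, 6 * KB), (10 * KB, 3 * KB), (64 * KB, 200 * KB)]
for page in (4 * KB, 64 * KB):
    n = releasable_pages(blocks, page)
    print(f"page size {page // KB} KB: {n} whole pages free ({n * page // KB} KB)")
```

In this example the small free blocks never cover a whole page, so only the large block could be returned, and how much of it depends on the page size. This is why the effect of such an option can only be judged by experiment.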

Additionally: how can we control page size? (size of chunks of memory being allocated?) Is it advisable to change it?
Changing the page size may help, or it may make things give up earlier. Only experimentation will show this. The settings for MALLOCOPT to change the page size for larger allocations were in my earlier posts in this thread.

I am going to send you the results of a simple test program soon. We will try to test the allocators on our test server over a weekend.

I know that .profile changes should be considered dangerous and must be well pre-tested (they count as configuration changes).
Only for the live system of course and they are not dangerous if they are pre-tested ;-). On your test system you should be willing to experiment after taking a backup and expecting that you will screw things up a few times before finding the right options.

We have not had a response from CSHD yet, but I am sure they will answer shortly.
Perhaps a nudge is in order if you are in dire straits.

Jim





Kind regards
Pawel






--~--~---------~--~----~------------~-------~--~----~
Please read the posting guidelines at: http://groups.google.com/group/jBASE/web/Posting%20Guidelines

IMPORTANT: Type T24: at the start of the subject line for questions specific to Globus/T24

To post, send email to [email protected]
To unsubscribe, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/jBASE?hl=en
-~----------~----~----~----~------~----~------~--~---
