Re: [AOLSERVER] malloc replacement libraries?
In a message dated 11/6/02 10:57:34 PM, [EMAIL PROTECTED] writes:

> > Unfortunately, I don't have much more to report than Andy did. I had an
> > AOLserver 3.3+ad13 running ACS, and it was periodically crashing.
>
> Lately I've been seeing more instances where AOLserver 3.3+ad13 just stops serving pages. It's still running, hasn't crashed, no errors or anything suspicious in the error log. I've already bumped up the stack sizes.
>
> I haven't a clue what's going on in there, but it can't be good!
>
> janine

What are the major differences that would need to be bridged between the stock 3.5.1 code base and 3.3+ad13? There seems to be a large amount of email, bugs, etc. around the ad versions. Would things become easier if we were all working from a single common code base?

...I know this is a somewhat loaded question, but I'm curious as to what features are considered missing from the 3.5.1 code base.

Thanks!

- n
Re: [AOLSERVER] malloc replacement libraries?
On Wednesday, November 6, 2002, at 10:56 PM, Janine Sisk wrote:

> I haven't a clue what's going on in there, but it can't be good!

You could try to attach gdb to the running process, and then poke around (start with a stack backtrace); many Unix variants allow this. You could also just "kill -ABRT" the process, which should generate a core dump (assuming you have core dumps enabled -- check your ulimit -c and, if you're running Solaris, your coreadm), then start gdb and get a stack backtrace.
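The "check your ulimit -c" step above can also be done from inside a process. The following is a minimal sketch in C, assuming a POSIX system with getrlimit and setrlimit; the function names core_dumps_enabled and raise_core_limit are made up for illustration and are not part of AOLserver.

```c
#include <sys/resource.h>

/* Hypothetical helper: returns 1 if the soft core-file size limit is
 * non-zero (a core dump can be written), 0 if it is zero, and -1 if the
 * limit could not be read. This is what "ulimit -c" reports in a shell. */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0 ? 1 : 0;
}

/* Hypothetical helper: raises the soft limit to the hard limit, which an
 * unprivileged process is allowed to do. Returns 0 on success. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is zero, raising the soft limit changes nothing, and the limit has to be lifted by whatever starts the server. Where the core file ends up is a separate, system-specific setting (coreadm on Solaris, as noted above).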
Re: [AOLSERVER] malloc replacement libraries?
On Wednesday, November 6, 2002, at 10:48 PM, Peter M. Jansson wrote:

> Unfortunately, I don't have much more to report than Andy did. I had an AOLserver 3.3+ad13 running ACS, and it was periodically crashing.

Lately I've been seeing more instances where AOLserver 3.3+ad13 just stops serving pages. It's still running, hasn't crashed, no errors or anything suspicious in the error log. I've already bumped up the stack sizes.

I haven't a clue what's going on in there, but it can't be good!

janine

-- 
Janine Sisk
President/CEO
furfly.net, LLC
Mont Vernon, NH
Phone: 603-672-1122
Re: [AOLSERVER] malloc replacement libraries?
Yes, Purify definitely seems to be the way to go.

On Wednesday, November 6, 2002, at 06:12 PM, Andrew Piskorski wrote:

> But then I got Purify, which AFAIK covers everything that Electric Fence can do
Re: [AOLSERVER] malloc replacement libraries?
On Wednesday, November 6, 2002, at 06:00 PM, Dossy wrote:

> > Sorry I can't be more constructive. I just had a problem like this, and didn't solve it, so the system just crashes regularly.
>
> Ouch. That's a real drag. Want to describe the problem in case there might be some ideas from the community?

Unfortunately, I don't have much more to report than Andy did. I had an AOLserver 3.3+ad13 running ACS, and it was periodically crashing. At first, no coredumps, then I figured out how to use coreadm on Solaris, so we got the coredumps, and they all showed death inside malloc. Different times of day, different server loads, different uptimes. After I failed to achieve a diagnosis after a while, the customer decided not to pursue the matter further. The customer is bailing on AOLserver anyway, with low availability of support being a contributing factor; this experience didn't really help AOLserver's case any.

It doesn't seem to be a "clock format..." either.
Re: [AOLSERVER] malloc replacement libraries?
On Wed, Nov 06, 2002 at 06:44:06PM -0600, Rob Mayoff wrote:

> You haven't said what errors Purify reports in the vendor library. If
> the library double-frees, or writes to freed memory, then adding padding
> around allocated blocks won't help.

Hm, good point. It's primarily Array Bounds Write, Array Bounds Read, and Uninitialized Memory Read, but there are some Free Memory Read and Free Memory Write errors in there too.

> On systems with GNU libc (such as Linux systems), you can easily use
> __malloc_hook and friends to do what you want. See the "Hooks for
> Malloc" node in the libc info. Perhaps you can do something similar if

Ah, interesting, I didn't know about that:

http://www.gnu.org/manual/glibc-2.2.5/html_node/Hooks-for-Malloc.html#Hooks%20for%20Malloc

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
Re: [AOLSERVER] malloc replacement libraries?
+-- On Nov 6, Andrew Piskorski said:

> That is good advice in general, but probably isn't relevant for my
> particular problem. I want to make the vendor code more robust by
> sliding a different malloc in underneath it, not simply wall it off in
> its own sandbox and let it corrupt itself as much as it wants there.

If you could restart the proxy and have it restore its state somehow, then it's relevant to your problem.

You haven't said what errors Purify reports in the vendor library. If the library double-frees, or writes to freed memory, then adding padding around allocated blocks won't help.

On systems with GNU libc (such as Linux systems), you can easily use __malloc_hook and friends to do what you want. See the "Hooks for Malloc" node in the libc info. Perhaps you can do something similar if you're on another system.
Re: [AOLSERVER] malloc replacement libraries?
On Wed, Nov 06, 2002 at 05:44:43PM -0500, Peter M. Jansson wrote:

> I recently tried the Solaris malloc debugging facility, but the AOLserver
> (running ACS) went from taking about 5 minutes to start up to taking over
> 72 hours to start up. AOLserver uses so much dynamic memory that any

Way back when, I tried a trial license of Insure++ with AOLserver, and it was so slow as to be completely unusable. Purify, on the other hand, slows things down noticeably, but is very useful.

> I think electric fence is not thread-safe, which is a problem that a lot
> of the malloc debugging libraries have.

Maybe that's why I couldn't make Electric Fence work with AOLserver, back when I tried 8 months or so ago. It seemed to be working but would find an "error" and halt AOLserver pretty much immediately, while it was still starting up. But then I got Purify, which AFAIK covers everything that Electric Fence can do, so I didn't worry about it.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
Re: [AOLSERVER] malloc replacement libraries?
On Wed, Nov 06, 2002 at 04:21:43PM -0500, Nathan Folkman wrote:

> Another option to consider might be to move the vendor code into a proxy. We
> have some code here which is not thread-safe for example, that we have moved

That is good advice in general, but probably isn't relevant for my particular problem. I want to make the vendor code more robust by sliding a different malloc in underneath it, not simply wall it off in its own sandbox and let it corrupt itself as much as it wants there.

> vendor code, which provides process isolation and protection. Might be
> something we could look at providing back to the community.

It definitely sounds useful.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
Re: [AOLSERVER] malloc replacement libraries?
On 2002.11.06, Peter M. Jansson <[EMAIL PROTECTED]> wrote:

> I recently tried the Solaris malloc debugging facility, but the AOLserver
> (running ACS) went from taking about 5 minutes to start up to taking over
> 72 hours to start up. AOLserver uses so much dynamic memory that any
> malloc debugging solutions that work by adding virtual-memory hardware
> guard buffers around the malloc'd area are going to be r e a l l y
> s l o w.

Speaking of slow, I've heard a testimonial from someone regarding Valgrind. Unfortunately, it's an x86- and Linux-specific thing, but I was told "if you're willing to tolerate it being ungodly slow, Valgrind will help you stomp out those remaining few memory problems."

More about Valgrind here:
http://developer.kde.org/~sewardj/

At a glance, it's an awesome looking tool. When I get the time, I definitely want to put Tcl and AOLserver through their paces with Valgrind. Probably once the automated test suite for AOLserver tests a good amount of functionality, so I can just run the test suite under Valgrind ...

> I think electric fence is not thread-safe, which is a problem that a lot
> of the malloc debugging libraries have.

Oh, yuck.

> Sorry I can't be more constructive. I just had a problem like this, and
> didn't solve it, so the system just crashes regularly.

Ouch. That's a real drag. Want to describe the problem in case there might be some ideas from the community?

-- Dossy

-- 
Dossy Shiobara mail: [EMAIL PROTECTED]
Panoptic Computer Network web: http://www.panoptic.com/
"He realized the fastest way to change is to laugh at your own folly -- then you can let go and quickly move on." (p. 70)
Re: [AOLSERVER] malloc replacement libraries?
On Wed, Nov 06, 2002 at 04:35:10PM -0500, Dossy wrote:

> Since I didn't see it on your list, I'll add one more that I think will
> do what you want:
>
> Electric Fence (efence)
> http://cs.ecs.baylor.edu/~donahoo/tools/efence/

No, I think Electric Fence works by stopping the process when it detects a violation, which is not what I want. I want it to keep running, but make the violation not hurt anything.

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
Re: [AOLSERVER] malloc replacement libraries?
I recently tried the Solaris malloc debugging facility, but the AOLserver (running ACS) went from taking about 5 minutes to start up to taking over 72 hours to start up. AOLserver uses so much dynamic memory that any malloc debugging solutions that work by adding virtual-memory hardware guard buffers around the malloc'd area are going to be r e a l l y  s l o w.

I think electric fence is not thread-safe, which is a problem that a lot of the malloc debugging libraries have.

Sorry I can't be more constructive. I just had a problem like this, and didn't solve it, so the system just crashes regularly.
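To illustrate why guard-buffer allocators of the kind described above are so expensive, here is a minimal sketch in C of the general technique, assuming a POSIX system with mmap and mprotect. It is not the code of the Solaris facility or of Electric Fence; guarded_alloc and guarded_free are hypothetical names.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Number of whole pages needed to hold `size` bytes. */
static size_t data_pages_for(size_t size, size_t page)
{
    return (size + page - 1) / page;
}

/* Hypothetical helper: returns `size` usable bytes whose end sits directly
 * against an inaccessible guard page, so the first byte written past the
 * end faults immediately. Returns NULL on failure. The result is only as
 * aligned as `size` allows, which a real allocator would have to handle. */
void *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (size == 0 || size > SIZE_MAX / 2)
        return NULL;

    size_t npages = data_pages_for(size, page);
    size_t total = (npages + 1) * page;       /* data pages + 1 guard page */
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    char *guard = base + npages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size;                      /* block ends at the guard */
}

/* Releases a block from guarded_alloc; the caller must pass the same size. */
void guarded_free(void *p, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = data_pages_for(size, page);
    char *guard = (char *)p + size;

    munmap(guard - npages * page, (npages + 1) * page);
}
```

In this scheme every allocation, however small, costs at least two pages of address space and two system calls, and every free costs another. A server that allocates as heavily as the one described would pay that price on each call, which is consistent with the slowdown reported here.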
Re: [AOLSERVER] malloc replacement libraries?
On 2002.11.06, Andrew Piskorski <[EMAIL PROTECTED]> wrote:

> What I would really like, is a replacement malloc library which I can
> link in to protect me from the bad vendor code, by doing things like
> padding malloc buffers with extra space. [...]
>
> I've found lots of potentially useful malloc replacements (see below),
> however, it's not at all clear which would meet my needs. So before I
> go off and start fooling with these, does anyone have any experience
> with this sort of thing? Advice or recommendations?

Since I didn't see it on your list, I'll add one more that I think will do what you want:

Electric Fence (efence)
http://cs.ecs.baylor.edu/~donahoo/tools/efence/

-- Dossy

-- 
Dossy Shiobara mail: [EMAIL PROTECTED]
Panoptic Computer Network web: http://www.panoptic.com/
"He realized the fastest way to change is to laugh at your own folly -- then you can let go and quickly move on." (p. 70)
Re: [AOLSERVER] malloc replacement libraries?
In a message dated 11/6/2002 4:13:11 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

> As I've mentioned here before, I have a vendor C library, to which I do not have the source, which can corrupt the heap, eventually leading to segfaults. Purify reports the various errors nicely, but there's nothing I can do to fix the code other than reporting the problems to the vendor.
>
> What I would really like, is a replacement malloc library which I can link in to protect me from the bad vendor code, by doing things like padding malloc buffers with extra space.
>
> This would be useful to me for two reasons: One, if using this malloc band-aid library makes the segfaults go away, that adds additional proof that the errors Purify reports in the vendor code really are causing the heap corruption and segfaults, which can help convince the vendor to fix things. Two, if the vendor is slow about fixing their code, I can just run with the band-aid malloc library, to help insulate my application from their errors.
>
> I've found lots of potentially useful malloc replacements (see below), however, it's not at all clear which would meet my needs. So before I go off and start fooling with these, does anyone have any experience with this sort of thing? Advice or recommendations?

Another option to consider might be to move the vendor code into a proxy. We have some code here which is not thread-safe, for example, that we have moved into a tclsh - we call ours dcitcl. Then we wrote some code which allows the AOLserver to start the tclsh in proxy mode, mapping stdin, stdout, and stderr to pipes used for communication between the AOLserver and the tclsh with the vendor code, which provides process isolation and protection. Might be something we could look at providing back to the community.
The interface currently works something like this:

    proxy.start myProxy 1 /dci/bin/dcitcl -p

- starts a single proxy called myProxy

    set result [proxy.send myProxy [list ns_info commands]]

- sends "ns_info commands" as the command to be executed remotely in the dcitcl tclsh proxy

- n
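The pipe plumbing behind such a proxy can be sketched in C roughly as follows. This is only an illustration of the general fork/pipe/exec pattern on a POSIX system, not the dcitcl implementation, which is not shown in this thread; proxy_roundtrip is a hypothetical name, and the sketch maps only stdin and stdout (the message above also mentions stderr).

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts `prog` as a child process whose stdin and
 * stdout are pipes, sends `request`, closes the request pipe to signal
 * end of input, and reads the reply into `reply` (always NUL-terminated).
 * Returns the number of reply bytes, or -1 on error. Suitable only for
 * small messages; a real proxy would also deal with SIGPIPE, timeouts,
 * and keeping the child alive between requests. */
ssize_t proxy_roundtrip(const char *prog, const char *request,
                        char *reply, size_t cap)
{
    int to_child[2], from_child[2];

    if (cap == 0 || pipe(to_child) != 0)
        return -1;
    if (pipe(from_child) != 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout, then run the program. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execlp(prog, prog, (char *)NULL);
        _exit(127);                           /* exec failed */
    }

    /* Parent: keep only the ends it uses. */
    close(to_child[0]);
    close(from_child[1]);

    size_t len = strlen(request);
    ssize_t written = write(to_child[1], request, len);
    close(to_child[1]);                       /* child sees end of input */

    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(from_child[0], reply + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    reply[used] = '\0';

    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return written == (ssize_t)len ? (ssize_t)used : -1;
}
```

Because the vendor code would run in the child, heap corruption there can crash only the child; the server sees a short or failed reply and can start a new proxy.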
Re: [AOLSERVER] malloc replacement libraries?
On Wednesday 06 November 2002 22:12, you wrote:

I have a colleague who has some (well, good) experience with mpatrol. I'll forward your mail to him so he may share some.

Cheers
zoran

> As I've mentioned here before, I have a vendor C library, to which I
> do not have the source, which can corrupt the heap, eventually leading
> to segfaults. Purify reports the various errors nicely, but there's
> nothing I can do to fix the code other than reporting the problems to
> the vendor.
>
> What I would really like, is a replacement malloc library which I can
> link in to protect me from the bad vendor code, by doing things like
> padding malloc buffers with extra space.
>
> This would be useful to me for two reasons: One, if using this malloc
> band-aid library makes the segfaults go away, that adds additional
> proof that the errors Purify reports in the vendor code really are
> causing the heap corruption and segfaults, which can help convince the
> vendor to fix things. Two, if the vendor is slow about fixing their
> code, I can just run with the band-aid malloc library, to help
> insulate my application from their errors.
>
> I've found lots of potentially useful malloc replacements (see below),
> however, it's not at all clear which would meet my needs. So before I
> go off and start fooling with these, does anyone have any experience
> with this sort of thing? Advice or recommendations?
>
>
> Lists of malloc replacements:
>
> http://www.cs.colorado.edu/homes/zorn/public_html/MallocDebug.html
> http://www-1.ibm.com/servers/eserver/zseries/os/linux/ldt/whitepaper2.html
>
>
> Possibly useful malloc replacement libraries:
>
> http://dmalloc.com/
> http://sourceforge.net/projects/dmalloc/
> http://packages.debian.org/stable/devel/dmalloc.html
>
> http://fscked.org/proj/njamd.shtml
> http://sourceforge.net/projects/njamd/
> http://packages.debian.org/stable/devel/njamd.html
>
> http://www.cbmamiga.demon.co.uk/mpatrol/
> http://sourceforge.net/projects/mpatrol/
>
> http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html
>
> http://sourceforge.net/projects/clw/
> http://sourceforge.net/projects/libmss/
> http://sourceforge.net/projects/ansimd/
> http://packages.debian.org/stable/devel/fda.html
[AOLSERVER] malloc replacement libraries?
As I've mentioned here before, I have a vendor C library, to which I do not have the source, which can corrupt the heap, eventually leading to segfaults. Purify reports the various errors nicely, but there's nothing I can do to fix the code other than reporting the problems to the vendor.

What I would really like, is a replacement malloc library which I can link in to protect me from the bad vendor code, by doing things like padding malloc buffers with extra space.

This would be useful to me for two reasons: One, if using this malloc band-aid library makes the segfaults go away, that adds additional proof that the errors Purify reports in the vendor code really are causing the heap corruption and segfaults, which can help convince the vendor to fix things. Two, if the vendor is slow about fixing their code, I can just run with the band-aid malloc library, to help insulate my application from their errors.

I've found lots of potentially useful malloc replacements (see below), however, it's not at all clear which would meet my needs. So before I go off and start fooling with these, does anyone have any experience with this sort of thing? Advice or recommendations?


Lists of malloc replacements:

http://www.cs.colorado.edu/homes/zorn/public_html/MallocDebug.html
http://www-1.ibm.com/servers/eserver/zseries/os/linux/ldt/whitepaper2.html


Possibly useful malloc replacement libraries:

http://dmalloc.com/
http://sourceforge.net/projects/dmalloc/
http://packages.debian.org/stable/devel/dmalloc.html

http://fscked.org/proj/njamd.shtml
http://sourceforge.net/projects/njamd/
http://packages.debian.org/stable/devel/njamd.html

http://www.cbmamiga.demon.co.uk/mpatrol/
http://sourceforge.net/projects/mpatrol/

http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html

http://sourceforge.net/projects/clw/
http://sourceforge.net/projects/libmss/
http://sourceforge.net/projects/ansimd/
http://packages.debian.org/stable/devel/fda.html

-- 
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
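The "padding malloc buffers with extra space" idea can be sketched in C as below. This is a minimal illustration under stated assumptions, not one of the libraries listed above: padded_malloc, padded_damage, and padded_free are hypothetical names, and the 64-byte padding and 0xA5 fill pattern are arbitrary choices. To sit underneath vendor code without source, the same logic would have to be installed as malloc and free themselves, which this sketch does not attempt.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAD_BYTES 64      /* assumed padding on each side of a block */
#define PAD_FILL  0xA5    /* assumed pattern used to spot overwrites */

/* Layout of one allocation:
 *   [front pad: size_t size, then PAD_FILL][user block][rear pad: PAD_FILL]
 * PAD_BYTES is a multiple of the usual maximum alignment, so the user
 * pointer stays as aligned as the pointer malloc returned. */
void *padded_malloc(size_t size)
{
    if (size > SIZE_MAX - 2 * PAD_BYTES)
        return NULL;

    unsigned char *raw = malloc(size + 2 * PAD_BYTES);
    if (raw == NULL)
        return NULL;

    memset(raw, PAD_FILL, PAD_BYTES);
    memcpy(raw, &size, sizeof size);          /* remember the user size */
    memset(raw + PAD_BYTES + size, PAD_FILL, PAD_BYTES);
    return raw + PAD_BYTES;
}

/* Counts padding bytes that no longer hold PAD_FILL (0 means intact).
 * If an underrun has overwritten the stored size, the result is unreliable. */
size_t padded_damage(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *raw = user - PAD_BYTES;
    size_t size, i, damaged = 0;

    memcpy(&size, raw, sizeof size);
    for (i = sizeof size; i < PAD_BYTES; i++)
        if (raw[i] != PAD_FILL)
            damaged++;
    for (i = 0; i < PAD_BYTES; i++)
        if (user[size + i] != PAD_FILL)
            damaged++;
    return damaged;
}

/* Releases a block obtained from padded_malloc. */
void padded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - PAD_BYTES);
}
```

A small overrun or underrun lands in the padding instead of in the allocator's bookkeeping or a neighbouring block, and padded_damage gives a way to report it. Padding of this kind does nothing for double frees or for reads and writes of already-freed memory.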
Re: [AOLSERVER] alloc: invalid block errors
I'm about a month from having the current test framework operational. I'll check out the OpenACS testing functionality this weekend. Thanks for the pointer.

/s.

-----Original Message-----
From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Simon Millward
Sent: Wednesday, November 06, 2002 3:55 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] alloc: invalid block errors

You might be interested in taking a look at the automated testing package that now comes as part of the openACS 4.5/6 core.

It's actually a framework for organising, deploying and state-setting for independent auto tests (much akin to the XP-style approach to testing). Although not the same as what you're describing, again there is some crossover, and perhaps it's another cross-community opportunity.

On Tuesday, November 5, 2002, at 08:21 pm, Scott S. Goodwin wrote:

> I've been setting up a framework within which to consistently fit automated test scripts. I'm using nsopenssl as my first example. I will release the framework as soon as it is fully functional for community review and comment. Some features: It compiles all source code, configures all software, and runs all tests in a fully automated way, all without being root. Don't let this stop you from writing manual or automated tests -- it should be fairly straightforward to migrate that code into the framework.
>
> /s.
>
> -----Original Message-----
> From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Folkman
> Sent: Tuesday, November 05, 2002 1:57 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [AOLSERVER] alloc: invalid block errors
>
> In a message dated 11/5/2002 2:56:05 PM Eastern Standard Time, [EMAIL PROTECTED] writes:
>
> > That'll be cool. Set something up to ns_schedule_proc -thread and see
> > if your thread creation/cleanup shows anything funny going on. Throw in
> > lots of nsv_set and nsv_unset in there.
> >
> > I'm going to get a copy of nstelemetry.adp and see if anything funny is
> > happening.
> >
> > -- Dossy
>
> Perhaps these test scripts could form the foundation of a generic test suite. I know Scott has been thinking a lot about this.
>
> - n
Re: [AOLSERVER] alloc: invalid block errors
You might be interested in taking a look at the automated testing package that now comes as part of the openACS 4.5/6 core.

It's actually a framework for organising, deploying and state-setting for independent auto tests (much akin to the XP-style approach to testing). Although not the same as what you're describing, again there is some crossover, and perhaps it's another cross-community opportunity.

On Tuesday, November 5, 2002, at 08:21 pm, Scott S. Goodwin wrote:

> I've been setting up a framework within which to consistently fit automated test scripts. I'm using nsopenssl as my first example. I will release the framework as soon as it is fully functional for community review and comment. Some features: It compiles all source code, configures all software, and runs all tests in a fully automated way, all without being root. Don't let this stop you from writing manual or automated tests -- it should be fairly straightforward to migrate that code into the framework.
>
> /s.
>
> -----Original Message-----
> From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Folkman
> Sent: Tuesday, November 05, 2002 1:57 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [AOLSERVER] alloc: invalid block errors
>
> In a message dated 11/5/2002 2:56:05 PM Eastern Standard Time, [EMAIL PROTECTED] writes:
>
> > That'll be cool. Set something up to ns_schedule_proc -thread and see
> > if your thread creation/cleanup shows anything funny going on. Throw in
> > lots of nsv_set and nsv_unset in there.
> >
> > I'm going to get a copy of nstelemetry.adp and see if anything funny is
> > happening.
> >
> > -- Dossy
>
> Perhaps these test scripts could form the foundation of a generic test suite. I know Scott has been thinking a lot about this.
>
> - n