Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Nathan Folkman

In a message dated 11/6/02 10:57:34 PM, [EMAIL PROTECTED] writes:

> > Unfortunately, I don't have much more to report than Andy did.  I had
> > an AOLserver 3.3+ad13 running ACS, and it was periodically crashing.
>
> Lately I've been seeing more instances where AOLserver 3.3+ad13 just
> stops serving pages.  It's still running, hasn't crashed, no errors or
> anything suspicious in the error log.  I've already bumped up the stack
> sizes.  I haven't a clue what's going on in there, but it can't be good!
>
> janine

What are the major differences that would need to be bridged between the stock 3.5.1 code base and 3.3+ad13? There seems to be a large amount of email, bugs, etc. around the ad versions. Would things become easier if we were all working from a single common code base? I know this is a somewhat loaded question, but I'm curious as to what features are considered missing from the 3.5.1 code base. Thanks!

- n


Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Peter M. Jansson
On Wednesday, November 6, 2002, at 10:56 PM, Janine Sisk wrote:


> I haven't a clue what's going on in there, but it can't be good!


You could try to attach gdb to the running process, and then poke around
(start with a stack backtrace); many Unix variants allow this.  You could
also just "kill -ABRT" the process, which should generate a core dump
(assuming you have core dumps enabled -- check your ulimit -c and, if
you're running Solaris, your coreadm), then start gdb and get a stack
backtrace.
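
For a process that should always be able to dump core, the "ulimit -c" check above can also be made from inside the program itself. What follows is only a minimal sketch in C, assuming a POSIX system; the function name `enable_core_dumps` is hypothetical and is not part of AOLserver.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit, which is what "ulimit -c" reports and adjusts from the shell.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to, but not
     * beyond, the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

This only lifts the soft limit. If the hard limit is already 0, or system-wide settings (such as those managed by coreadm on Solaris) disallow core files, a "kill -ABRT" will still not leave a core behind.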



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Janine Sisk
On Wednesday, November 6, 2002, at 10:48 PM, Peter M. Jansson wrote:


> Unfortunately, I don't have much more to report than Andy did.  I had
> an AOLserver 3.3+ad13 running ACS, and it was periodically crashing.


Lately I've been seeing more instances where AOLserver 3.3+ad13 just
stops serving pages.  It's still running, hasn't crashed, no errors or
anything suspicious in the error log.  I've already bumped up the stack
sizes.  I haven't a clue what's going on in there, but it can't be good!

janine

--
Janine Sisk
President/CEO
furfly.net, LLC
Mont Vernon, NH
Phone: 603-672-1122



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Peter M. Jansson
Yes, Purify definitely seems to be the way to go.

On Wednesday, November 6, 2002, at 06:12 PM, Andrew Piskorski wrote:


> But then I got Purify, which AFAIK
> covers everything that Electric Fence can do



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Peter M. Jansson
On Wednesday, November 6, 2002, at 06:00 PM, Dossy wrote:


> > Sorry I can't be more constructive.  I just had a problem like this, and
> > didn't solve it, so the system just crashes regularly.
>
> Ouch.  That's a real drag.  Want to describe the problem in case there
> might be some ideas from the community?


Unfortunately, I don't have much more to report than Andy did.  I had an
AOLserver 3.3+ad13 running ACS, and it was periodically crashing.  At
first, no coredumps, then I figured out how to use coreadm on Solaris, so
we got the coredumps, and they all showed death inside malloc.  Different
times of day, different server loads, different uptimes.  After I failed
to achieve a diagnosis after a while, the customer decided not to pursue
the matter further.  The customer is bailing on AOLserver anyway, with low
availability of support being a contributing factor; this experience
didn't really help AOLserver's case any.

It doesn't seem to be a "clock format..." either.



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Andrew Piskorski
On Wed, Nov 06, 2002 at 06:44:06PM -0600, Rob Mayoff wrote:

> You haven't said what errors Purify reports in the vendor library. If
> the library double-frees, or writes to freed memory, then adding padding
> around allocated blocks won't help.

Hm, good point.  It's primarily Array Bounds Write, Array Bounds Read,
and Uninitialized Memory Read, but there are some Free Memory Read and
Free Memory Write errors in there too.

> On systems with GNU libc (such as Linux systems), you can easily use
> __malloc_hook and friends to do what you want. See the "Hooks for
> Malloc" node in the libc info. Perhaps you can do something similar if

Ah, interesting, I didn't know about that:

http://www.gnu.org/manual/glibc-2.2.5/html_node/Hooks-for-Malloc.html#Hooks%20for%20Malloc

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Rob Mayoff
+-- On Nov 6, Andrew Piskorski said:
> That is good advice in general, but probably isn't relevant for my
> particular problem.  I want to make the vendor code more robust by
> sliding a different malloc in underneath it, not simply wall it off in
> its own sandbox and let it corrupt itself as much as it wants there.

If you could restart the proxy and have it restore its state somehow,
then it's relevant to your problem.

You haven't said what errors Purify reports in the vendor library. If
the library double-frees, or writes to freed memory, then adding padding
around allocated blocks won't help.

On systems with GNU libc (such as Linux systems), you can easily use
__malloc_hook and friends to do what you want. See the "Hooks for
Malloc" node in the libc info. Perhaps you can do something similar if
you're on another system.
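
To make the padding idea concrete, here is a minimal sketch in C of an allocator wrapper that surrounds each block with guard bytes. The names `padded_malloc`, `padded_check`, `padded_free`, `PAD_BYTES`, `PAD_FILL` and `HDR_BYTES` are all hypothetical; none of them come from glibc or from any library named in this thread. The sketch deliberately does not use `__malloc_hook`, since those hook variables have been dropped from newer glibc releases, so the wrappers would still have to be wired in by some other means (for example by interposing on malloc at link time). As noted above, padding can only absorb small overruns and underruns; it does nothing for double frees or writes to freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAD_BYTES 64   /* hypothetical guard size on each side of a block */
#define PAD_FILL  0xA5 /* hypothetical fill pattern for the guards */
#define HDR_BYTES 16   /* header area, sized to keep 16-byte alignment */

/* Bookkeeping stored at the very start of the real allocation. */
typedef struct {
    size_t user_size;
} pad_header;

static_assert(sizeof(pad_header) <= HDR_BYTES, "header must fit");

/* Allocate n usable bytes with PAD_BYTES of filled padding before and
 * after them.  Layout: [header][front pad][user bytes][rear pad]. */
void *padded_malloc(size_t n)
{
    if (n > SIZE_MAX - (HDR_BYTES + 2 * PAD_BYTES))
        return NULL; /* total size would overflow */

    unsigned char *raw = malloc(HDR_BYTES + PAD_BYTES + n + PAD_BYTES);
    if (raw == NULL)
        return NULL;

    ((pad_header *)raw)->user_size = n;
    memset(raw + HDR_BYTES, PAD_FILL, PAD_BYTES);
    memset(raw + HDR_BYTES + PAD_BYTES + n, PAD_FILL, PAD_BYTES);
    return raw + HDR_BYTES + PAD_BYTES;
}

/* Returns 1 if both pads still hold the fill pattern, 0 if the caller
 * wrote outside its block (but still inside the padding). */
int padded_check(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *raw = user - PAD_BYTES - HDR_BYTES;
    size_t n = ((const pad_header *)raw)->user_size;

    for (size_t i = 0; i < PAD_BYTES; i++) {
        if (raw[HDR_BYTES + i] != PAD_FILL || user[n + i] != PAD_FILL)
            return 0;
    }
    return 1;
}

/* Release a block obtained from padded_malloc. */
void padded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - PAD_BYTES - HDR_BYTES);
}
```

A write up to 64 bytes past either end of a block lands in the padding instead of in the allocator's own bookkeeping or a neighbouring block, and `padded_check` can be called before freeing to log that it happened without stopping the process.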



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Andrew Piskorski
On Wed, Nov 06, 2002 at 05:44:43PM -0500, Peter M. Jansson wrote:

> I recently tried the Solaris malloc debugging facility, but the AOLserver
> (running ACS) went from taking about 5 minutes to start up to taking over
> 72 hours to start up.  AOLserver uses so much dynamic memory that any

Way back when, I tried a trial license of Insure++ with AOLserver, and
it was so slow as to be completely unusable.  Purify on the other hand
slows things down noticeably, but is very useful.

> I think electric fence is not thread-safe, which is a problem that a lot
> of the malloc debugging libraries have.

Maybe that's why I couldn't make Electric Fence work with AOLserver,
back when I tried 8 months or so ago.  It seemed to be working but
would find an "error" and halt AOLserver pretty much immediately,
while it was still starting up.  But then I got Purify, which AFAIK
covers everything that Electric Fence can do, so I didn't worry about
it.

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Andrew Piskorski
On Wed, Nov 06, 2002 at 04:21:43PM -0500, Nathan Folkman wrote:
> Another option to consider might be to move the vendor code into a proxy. We
> have some code here which is not thread-safe for example, that we have moved

That is good advice in general, but probably isn't relevant for my
particular problem.  I want to make the vendor code more robust by
sliding a different malloc in underneath it, not simply wall it off in
its own sandbox and let it corrupt itself as much as it wants there.

> vendor code, which provides process isolation and protection. Might be
> something we could look at providing back to the community.

It definitely sounds useful.

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Dossy
On 2002.11.06, Peter M. Jansson <[EMAIL PROTECTED]> wrote:
> I recently tried the Solaris malloc debugging facility, but the AOLserver
> (running ACS) went from taking about 5 minutes to start up to taking over
> 72 hours to start up.  AOLserver uses so much dynamic memory that any
> malloc debugging solutions that work by adding virtual-memory hardware
> guard buffers around the malloc'd area are going to be
> r e a l l y   s l o w.

Speaking of slow, I've heard a testimonial from someone regarding
Valgrind.  Unfortunately, it's an x86- and Linux-specific thing, but I was
told "if you're willing to tolerate it being ungodly slow, Valgrind will
help you stomp out those remaining few memory problems."

More about Valgrind here:

http://developer.kde.org/~sewardj/

At a glance, it's an awesome looking tool.  When I get the time, I
definitely want to push Tcl and AOLserver through its paces with
Valgrind.  Probably once the automated test suite for AOLserver tests a
good amount of functionality, so I can just run the test suite under
Valgrind ...

> I think electric fence is not thread-safe, which is a problem that a lot
> of the malloc debugging libraries have.

Oh, yuck.

> Sorry I can't be more constructive.  I just had a problem like this, and
> didn't solve it, so the system just crashes regularly.

Ouch.  That's a real drag.  Want to describe the problem in case there
might be some ideas from the community?

-- Dossy

--
Dossy Shiobara   mail: [EMAIL PROTECTED]
Panoptic Computer Network web: http://www.panoptic.com/
  "He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Andrew Piskorski
On Wed, Nov 06, 2002 at 04:35:10PM -0500, Dossy wrote:
> Since I didn't see it on your list, I'll add one more that I think will
> do what you want:
>
> Electric Fence (efence)
> http://cs.ecs.baylor.edu/~donahoo/tools/efence/

No, I think Electric Fence works by stopping the process when it
detects a violation, which is not what I want.  I want it to keep
running, but make the violation not hurt anything.

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Peter M. Jansson
I recently tried the Solaris malloc debugging facility, but the AOLserver
(running ACS) went from taking about 5 minutes to start up to taking over
72 hours to start up.  AOLserver uses so much dynamic memory that any
malloc debugging solutions that work by adding virtual-memory hardware
guard buffers around the malloc'd area are going to be
r e a l l y   s l o w.

I think electric fence is not thread-safe, which is a problem that a lot
of the malloc debugging libraries have.

Sorry I can't be more constructive.  I just had a problem like this, and
didn't solve it, so the system just crashes regularly.
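
The cost of guard buffers is easier to see in code. The following is a sketch in C of the general guard-page technique only; it is not the actual implementation of Electric Fence or of the Solaris facility, and the names `guard_malloc`, `guard_free` and `guard_hdr` are hypothetical. It assumes a Linux-like system where `mmap` supports `MAP_ANONYMOUS`.

```c
#define _DEFAULT_SOURCE /* assumption: glibc; exposes MAP_ANONYMOUS */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Bookkeeping stored just in front of the user pointer. */
typedef struct {
    void  *base;    /* start of the whole mapping */
    size_t map_len; /* length of the mapping, guard page included */
} guard_hdr;

/* Allocate n bytes so that the byte right after the block falls on an
 * inaccessible page: an overrun faults at once instead of corrupting
 * the heap.  The pointer is only as aligned as n allows, which is fine
 * for byte buffers but not for arbitrary types. */
void *guard_malloc(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (n > SIZE_MAX / 2)
        return NULL;

    /* Round header + data up to whole pages, then add one guard page. */
    size_t data_len = (sizeof(guard_hdr) + n + page - 1) / page * page;
    size_t map_len = data_len + page;

    unsigned char *base = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the last page of the mapping inaccessible. */
    if (mprotect(base + data_len, page, PROT_NONE) != 0) {
        munmap(base, map_len);
        return NULL;
    }

    unsigned char *user = base + data_len - n;
    guard_hdr h = { base, map_len };
    memcpy(user - sizeof h, &h, sizeof h); /* may be unaligned, so memcpy */
    return user;
}

/* Unmap a block obtained from guard_malloc. */
void guard_free(void *p)
{
    if (p == NULL)
        return;

    guard_hdr h;
    memcpy(&h, (unsigned char *)p - sizeof h, sizeof h);
    munmap(h.base, h.map_len);
}
```

Every allocation here, however small, costs at least two pages of address space plus an `mmap`, an `mprotect` and later a `munmap` system call, which is why a server that allocates as heavily as AOLserver slows down so badly under this kind of checking.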



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Dossy
On 2002.11.06, Andrew Piskorski <[EMAIL PROTECTED]> wrote:
>
> What I would really like, is a replacement malloc library which I can
> link in to protect me from the bad vendor code, by doing things like
> padding malloc buffers with extra space.
[...]
>
> I've found lots of potentially useful malloc replacements (see below),
> however, it's not at all clear which would meet my needs.  So before I
> go off and start fooling with these, does anyone have any experience
> with this sort of thing?  Advice or recommendations?

Since I didn't see it on your list, I'll add one more that I think will
do what you want:

Electric Fence (efence)
http://cs.ecs.baylor.edu/~donahoo/tools/efence/

-- Dossy

--
Dossy Shiobara   mail: [EMAIL PROTECTED]
Panoptic Computer Network web: http://www.panoptic.com/
  "He realized the fastest way to change is to laugh at your own
folly -- then you can let go and quickly move on." (p. 70)



Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Nathan Folkman
In a message dated 11/6/2002 4:13:11 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

> As I've mentioned here before, I have a vendor C library, to which I
> do not have the source, which can corrupt the heap, eventually leading
> to segfaults.  Purify reports the various errors nicely, but there's
> nothing I can do to fix the code other than reporting the problems to
> the vendor.
>
> What I would really like, is a replacement malloc library which I can
> link in to protect me from the bad vendor code, by doing things like
> padding malloc buffers with extra space.
>
> This would be useful to me for two reasons:  One, if using this malloc
> band-aid library makes the segfaults go away, that adds additional
> proof that the errors Purify reports in the vendor code really are
> causing the heap corruption and segfaults, which can help convince the
> vendor to fix things.  Two, if the vendor is slow about fixing their
> code, I can just run with the band-aid malloc library, to help
> insulate my application from their errors.
>
> I've found lots of potentially useful malloc replacements (see below),
> however, it's not at all clear which would meet my needs.  So before I
> go off and start fooling with these, does anyone have any experience
> with this sort of thing?  Advice or recommendations?


Another option to consider might be to move the vendor code into a proxy. For example, we have some code here that is not thread-safe, which we have moved into a tclsh (we call ours dcitcl). We then wrote some code that lets the AOLserver start the tclsh in proxy mode, mapping stdin, stdout, and stderr to pipes used for communication between the AOLserver and the tclsh holding the vendor code. This provides process isolation and protection. Might be something we could look at providing back to the community.

The interface currently works something like this:

proxy.start myProxy 1 /dci/bin/dcitcl -p
   
   - starts a single proxy called myProxy

set result [proxy.send myProxy [list ns_info commands]]

   - sends "ns_info commands" as the command to be executed remotely in the dcitcl tclsh proxy

- n
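
The pipe-based isolation described above can be sketched in C roughly as follows. This is not the dcitcl proxy code, which is not shown in this thread; the names `proxy`, `proxy_start`, `proxy_send` and `proxy_stop` are hypothetical, the protocol is assumed to be one line out and one line back, and only stdin and stdout are redirected (stderr is left alone for brevity). A POSIX system is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for one helper process. */
typedef struct {
    pid_t pid;
    FILE *to_child;   /* connected to the child's stdin */
    FILE *from_child; /* connected to the child's stdout */
} proxy;

/* Start argv[0] as a child whose stdin and stdout are pipes held by the
 * parent.  Returns 0 on success, -1 on failure. */
int proxy_start(proxy *px, char *const argv[])
{
    int in_pipe[2], out_pipe[2];

    if (pipe(in_pipe) != 0)
        return -1;
    if (pipe(out_pipe) != 0) {
        close(in_pipe[0]); close(in_pipe[1]);
        return -1;
    }

    /* Process-wide side effect: writing to a dead child should return
     * an error rather than kill the parent with SIGPIPE. */
    signal(SIGPIPE, SIG_IGN);

    pid_t pid = fork();
    if (pid < 0) {
        close(in_pipe[0]); close(in_pipe[1]);
        close(out_pipe[0]); close(out_pipe[1]);
        return -1;
    }
    if (pid == 0) { /* child: wire the pipes to stdin/stdout, then exec */
        if (dup2(in_pipe[0], STDIN_FILENO) < 0 ||
            dup2(out_pipe[1], STDOUT_FILENO) < 0)
            _exit(126);
        close(in_pipe[0]); close(in_pipe[1]);
        close(out_pipe[0]); close(out_pipe[1]);
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    /* parent: keep only its own ends of the two pipes */
    close(in_pipe[0]);
    close(out_pipe[1]);
    px->pid = pid;
    px->to_child = fdopen(in_pipe[1], "w");
    px->from_child = fdopen(out_pipe[0], "r");
    if (px->to_child == NULL || px->from_child == NULL)
        return -1; /* sketch: a real version would also clean up here */
    return 0;
}

/* Send one command line and read one reply line (newline stripped).
 * Returns 0 on success, -1 on failure. */
int proxy_send(proxy *px, const char *cmd, char *reply, size_t len)
{
    if (fprintf(px->to_child, "%s\n", cmd) < 0 || fflush(px->to_child) != 0)
        return -1;
    if (fgets(reply, (int)len, px->from_child) == NULL)
        return -1;
    reply[strcspn(reply, "\n")] = '\0';
    return 0;
}

/* Close both pipes, so the child sees EOF on stdin, and reap it.
 * Returns the raw wait status (0 for a clean exit) or -1 on error. */
int proxy_stop(proxy *px)
{
    int status = 0;

    fclose(px->to_child);
    fclose(px->from_child);
    if (waitpid(px->pid, &status, 0) < 0)
        return -1;
    return status;
}
```

If the helper crashes inside the vendor code, `proxy_send` returns -1 and the server process itself keeps running, so the caller can start a fresh helper. That is the isolation benefit; it does not make the vendor code itself any more robust.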


Re: [AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Zoran Vasiljevic
On Wednesday 06 November 2002 22:12, you wrote:

I have a colleague who has some (well, good) experience
with mpatrol. I'll forward your mail to him so he may share some.

Cheers
zoran

> As I've mentioned here before, I have a vendor C library, to which I
> do not have the source, which can corrupt the heap, eventually leading
> to segfaults.  Purify reports the various errors nicely, but there's
> nothing I can do to fix the code other than reporting the problems to
> the vendor.
>
> What I would really like, is a replacement malloc library which I can
> link in to protect me from the bad vendor code, by doing things like
> padding malloc buffers with extra space.
>
> This would be useful to me for two reasons:  One, if using this malloc
> band-aid library makes the segfaults go away, that adds additional
> proof that the errors Purify reports in the vendor code really are
> causing the heap corruption and segfaults, which can help convince the
> vendor to fix things.  Two, if the vendor is slow about fixing their
> code, I can just run with the band-aid malloc library, to help
> insulate my application from their errors.
>
> I've found lots of potentially useful malloc replacements (see below),
> however, it's not at all clear which would meet my needs.  So before I
> go off and start fooling with these, does anyone have any experience
> with this sort of thing?  Advice or recommendations?
>
>
> Lists of malloc replacements:
>
> http://www.cs.colorado.edu/homes/zorn/public_html/MallocDebug.html
> http://www-1.ibm.com/servers/eserver/zseries/os/linux/ldt/whitepaper2.html
>
>
> Possibly useful malloc replacement libraries:
>
> http://dmalloc.com/
> http://sourceforge.net/projects/dmalloc/
> http://packages.debian.org/stable/devel/dmalloc.html
>
> http://fscked.org/proj/njamd.shtml
> http://sourceforge.net/projects/njamd/
> http://packages.debian.org/stable/devel/njamd.html
>
> http://www.cbmamiga.demon.co.uk/mpatrol/
> http://sourceforge.net/projects/mpatrol/
>
> http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html
>
> http://sourceforge.net/projects/clw/
> http://sourceforge.net/projects/libmss/
> http://sourceforge.net/projects/ansimd/
> http://packages.debian.org/stable/devel/fda.html



[AOLSERVER] malloc replacement libraries?

2002-11-06 Thread Andrew Piskorski
As I've mentioned here before, I have a vendor C library, to which I
do not have the source, which can corrupt the heap, eventually leading
to segfaults.  Purify reports the various errors nicely, but there's
nothing I can do to fix the code other than reporting the problems to
the vendor.

What I would really like, is a replacement malloc library which I can
link in to protect me from the bad vendor code, by doing things like
padding malloc buffers with extra space.

This would be useful to me for two reasons:  One, if using this malloc
band-aid library makes the segfaults go away, that adds additional
proof that the errors Purify reports in the vendor code really are
causing the heap corruption and segfaults, which can help convince the
vendor to fix things.  Two, if the vendor is slow about fixing their
code, I can just run with the band-aid malloc library, to help
insulate my application from their errors.

I've found lots of potentially useful malloc replacements (see below),
however, it's not at all clear which would meet my needs.  So before I
go off and start fooling with these, does anyone have any experience
with this sort of thing?  Advice or recommendations?


Lists of malloc replacements:

http://www.cs.colorado.edu/homes/zorn/public_html/MallocDebug.html
http://www-1.ibm.com/servers/eserver/zseries/os/linux/ldt/whitepaper2.html


Possibly useful malloc replacement libraries:

http://dmalloc.com/
http://sourceforge.net/projects/dmalloc/
http://packages.debian.org/stable/devel/dmalloc.html

http://fscked.org/proj/njamd.shtml
http://sourceforge.net/projects/njamd/
http://packages.debian.org/stable/devel/njamd.html

http://www.cbmamiga.demon.co.uk/mpatrol/
http://sourceforge.net/projects/mpatrol/

http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html

http://sourceforge.net/projects/clw/
http://sourceforge.net/projects/libmss/
http://sourceforge.net/projects/ansimd/
http://packages.debian.org/stable/devel/fda.html

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com



Re: [AOLSERVER] alloc: invalid block errors

2002-11-06 Thread Scott S. Goodwin
I'm about a month from having the current test framework operational.
I'll check out the OpenACS testing functionality this weekend. Thanks
for the pointer.

/s.


-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Simon Millward
Sent: Wednesday, November 06, 2002 3:55 AM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] alloc: invalid block errors

You might be interested in taking a look at the automated testing
package that now comes as part of the OpenACS 4.5/6 core.

It's actually a framework for organising, deploying and state-setting of
independent auto tests (much akin to the XP-style approach to testing).
Although not the same as what you're describing, again there is some
crossover, and perhaps it's another cross-community opportunity.

On Tuesday, November 5, 2002, at 08:21 pm, Scott S. Goodwin wrote:

I've been setting up a framework within which to consistently fit
automated test scripts. I'm using nsopenssl as my first example. I will
release the framework as soon as it is fully functional for community
review and comment. Some features: It compiles all source code,
configures all software, and runs all tests in a fully automated way,
all without being root.

Don't let this stop you from writing manual or automated tests -- it
should be fairly straightforward to migrate that code into the
framework.

/s.

-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Folkman
Sent: Tuesday, November 05, 2002 1:57 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] alloc: invalid block errors

In a message dated 11/5/2002 2:56:05 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

That'll be cool.  Set something up to ns_schedule_proc -thread and see
if your thread creation/cleanup shows anything funny going on.  Throw in
lots of nsv_set and nsv_unset in there.

I'm going to get a copy of nstelemetry.adp and see if anything funny is
happening.

-- Dossy

Perhaps these test scripts could form the foundation of a generic test
suite. I know Scott has been thinking a lot about this.

- n


Re: [AOLSERVER] alloc: invalid block errors

2002-11-06 Thread Simon Millward
You might be interested in taking a look at the automated testing package that now comes as part of the OpenACS 4.5/6 core.

It's actually a framework for organising, deploying and state-setting of independent auto tests (much akin to the XP-style approach to testing). Although not the same as what you're describing, again there is some crossover, and perhaps it's another cross-community opportunity.


On Tuesday, November 5, 2002, at 08:21  pm, Scott S. Goodwin wrote:

I've been setting up a framework within which to consistently fit automated test scripts. I'm using nsopenssl as my first example. I will release the framework as soon as it is fully functional for community review and comment. Some features: It compiles all source code, configures all software, and runs all tests in a fully automated way, all without being root.
 
Don't let this stop you from writing manual or automated tests -- it should be fairly straightforward to migrate that code into the framework.
 
 
/s.


-Original Message-
From: AOLserver Discussion [mailto:[EMAIL PROTECTED]] On Behalf Of Nathan Folkman
Sent: Tuesday, November 05, 2002 1:57 PM
To: [EMAIL PROTECTED]
Subject: Re: [AOLSERVER] alloc: invalid block errors

In a message dated 11/5/2002 2:56:05 PM Eastern Standard Time, [EMAIL PROTECTED] writes:

That'll be cool.  Set something up to ns_schedule_proc -thread and see
if your thread creation/cleanup shows anything funny going on.  Throw in
lots of nsv_set and nsv_unset in there.

I'm going to get a copy of nstelemetry.adp and see if anything funny is
happening.

-- Dossy



Perhaps these test scripts could form the foundation of a generic test suite. I know Scott has been thinking a lot about this.

- n