This is most probably the best variant so far, and not complicated; such an
optimizer can do the right thing easily. Sorry for the many versions..
-gustaf
{ unsigned register int s = (size-1) >> 3;
  while (s > 1) { s >>= 1; bucket++; }
}
if (bucket > NBUCKETS) {
    bucket = NBUCKETS;
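Reconstructed as a self-contained sketch (the NBUCKETS value and the function name SizeToBucket here are ours, for illustration only): the fragment is an integer log2 of (size-1)>>3, capped at the largest bucket.

```c
#define NBUCKETS 11  /* assumed bucket count, for illustration */

/* Map an allocation size to a bucket index: buckets cover the
   size ranges (8*2^b, 8*2^(b+1)], capped at NBUCKETS. */
static int SizeToBucket(unsigned int size)
{
    int bucket = 0;
    unsigned int s = (size - 1) >> 3;

    while (s > 1) {
        s >>= 1;
        bucket++;
    }
    if (bucket > NBUCKETS) {
        bucket = NBUCKETS;
    }
    return bucket;
}
```

For the relevant small-block range this needs only a handful of shifts, which is why an optimizer handles it well.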
Am 16.01.2007 um 10:37 schrieb Stephen Deasey:
Can you import this into CVS? Top level.
You mean the tclThreadAlloc.c file on top-level
of the naviserver project?
Am 16.01.2007 um 12:18 schrieb Stephen Deasey:
vtmalloc -- add this
It's there. Everybody can now contribute, if needed.
On 1/16/07, Stephen Deasey [EMAIL PROTECTED] wrote:
On 1/16/07, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
Am 16.01.2007 um 12:18 schrieb Stephen Deasey:
vtmalloc -- add this
It's there. Everybody can now contribute, if needed.
Rocking.
I suggest putting the 0.0.3 tarball up on
Am 16.01.2007 um 15:41 schrieb Stephen Deasey:
I suggest putting the 0.0.3 tarball up on sourceforge, announcing on
Freshmeat, and cross-posting on the aolserver list. You really want
random people with their random workloads on random OS to beat on
this. I don't know if the pool of people
Yes, it is a combined version, but the Tcl version is slightly different and
Zoran took it over to maintain; in my tarball i include both, and we do
experiments in different directions and then combine the best results.
Also the intention was to try to include it in Tcl itself.
Stephen Deasey wrote:
On
Gustaf Neumann wrote:
This is most probably the best variant so far, and not complicated; such an
optimizer can do the right thing easily. Sorry for the many versions..
-gustaf
{ unsigned register int s = (size-1) >> 3;
  while (s > 1) { s >>= 1; bucket++; }
}
if (bucket
Hi Jeff,
we are aware that the function is essentially an integer log2.
The chosen C-based variant is actually faster and more general than
what you have included (it needs only max 2 shift operations for
the relevant range), but the assembler-based variant is hard to beat
and yields another 3%
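The assembler variant itself is not shown in the archive, but an equivalent approach can be sketched with GCC's count-leading-zeros builtin, which compiles to a single bsr instruction on x86 (the function name ilog2 is ours; result is undefined for v == 0):

```c
/* Integer log2 of a positive 32-bit value via GCC's
   __builtin_clz; a single bsr instruction on x86. */
static inline int ilog2(unsigned int v)
{
    return 31 - __builtin_clz(v);
}
```

This is GCC/Clang-specific; the portable shift loop above remains the fallback for other compilers.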
a)
The test program Zoran includes biases Zippy toward standard
allocator, which it does not do for VT. The following patch
corrects this behavior:
+++ memtest.c Sun Jan 14 16:43:23 2007
@@ -211,6 +211,7 @@
} else {
size = 0x3FFF; /* Limit to 16K */
Am 15.01.2007 um 22:22 schrieb Mike:
Zoran, I believe you misunderstood. The patch above limits blocks
allocated by your tester to 16000 instead of 16384 bytes. The reason
for this is that Zippy's largest bucket is configured to be
16284-sizeof(Block) bytes (note the 2 in 16_2_84 is _NOT_
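The bucket-limit point can be illustrated with a sketch; the Block struct layout and the cutoff constant here are assumptions based on the figure quoted in the discussion:

```c
#include <stddef.h>

/* Hypothetical per-allocation header, as in zippy-style
   allocators: each block is prefixed by bookkeeping data. */
typedef struct Block {
    struct Block *nextPtr;
    size_t        reqSize;
} Block;

#define MAXALLOC 16284  /* largest bucket size quoted in the thread */

/* A request fits the largest bucket only if the size plus the
   Block header does not exceed MAXALLOC; anything larger falls
   through to the system allocator. */
static int FitsLargestBucket(size_t size)
{
    return size + sizeof(Block) <= MAXALLOC;
}
```

Hence a tester that requests exactly 16384 bytes never exercises Zippy's buckets at all, while 16000-byte requests do.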
Am 13.01.2007 um 06:17 schrieb Mike:
I'm happy to offer ssh access to a test
box where you can reproduce these results.
Oh, that is very fine! Can you give me the
access data? You can post me the login-details
in a separate private mail.
Thanks,
Zoran
I downloaded the code in the previous mail. After some minor path
adjustments, I was able to get the test program to compile and link
under FreeBSD 6.1 running on a dual-processor PIII system, linked
against a threaded tcl 8.5a. I could get this program to consistently
do one of two things:
-
Am 13.01.2007 um 10:45 schrieb Gustaf Neumann:
PPS: strangely, the only thing making me suspicious is the
huge amount of improvement, especially on Mac OS X.
Look...
Running the test program unmodified (on Mac Pro box):
Test Tcl allocator with 4 threads, 16000 records ...
This allocator
Am 13.01.2007 um 10:45 schrieb Gustaf Neumann:
Fault was, that i did not read the README (i read the first one) and
compiled (a) without -DTCL_THREADS.
In that case, fault was that on FreeBSD you need to
explicitly put -pthread when linking the test program,
regardless of the fact that
I've been on a search for an allocator that will be fast
enough and not so memory hungry as the allocator being
built in Tcl. Unfortunately, as it mostly is, it turned
out that I had to write my own.
Vlad has written an allocator that uses mmap to obtain
memory for the system and munmap that
On 19.12.2006, at 01:10, Stephen Deasey wrote:
This program allocates memory in a worker thread and frees it in the
main thread. If all free()'s put memory into a thread-local cache then
you would expect this program to bloat, but it doesn't, so I guess
it's not a problem (at least not on
On 19.12.2006, at 15:57, Vlad Seryakov wrote:
Zoran, can you test it on Solaris and OSX so we'd know it is not a
Linux-related problem.
I have a Tcl library compiled with nedmalloc and when I link
against it and make
#define MemAlloc Tcl_Alloc
#define MemFree Tcl_Free
it runs fine.
Yes, please
Zoran Vasiljevic wrote:
On 19.12.2006, at 15:57, Vlad Seryakov wrote:
Zoran, can you test it on Solaris and OSX so we'd know that is not
Linux
related problem.
I have a Tcl library compiled with nedmalloc and when I link
against it and make
#define MemAlloc Tcl_Alloc
#define
On 19.12.2006, at 16:06, Vlad Seryakov wrote:
Yes, please
( I appended the code to the nedmalloc test program
and renamed their main to main1)
bash-2.03$ gcc -O3 -o tcltest tcltest.c -lpthread -DNDEBUG -
DTCL_THREADS -I/usr/local/include -L/usr/local/lib -ltcl8.4g
bash-2.03$ gdb
gdb may slow down concurrency, does it run without gdb, also does it run
with solaris malloc?
Zoran Vasiljevic wrote:
On 19.12.2006, at 16:06, Vlad Seryakov wrote:
Yes, please
( I appended the code to the nedmalloc test program
and renamed their main to main1)
bash-2.03$ gcc -O3 -o
I was suspecting Linux malloc; looks like it has problems with high
concurrency. I tried to replace MemAlloc/MemFree with mmap/munmap, and it
crashes as well.
#define MemAlloc mmalloc
#define MemFree(ptr) mfree(ptr, gSize)
void *mmalloc(size_t size) { return mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0); }
yes, it crashes when the number of threads is more than 1, with any size,
but not all the time; sometimes i need to run it several times. looks like
it is random, some combination, not sure of what.
I guess we never got that high concurrency in Naviserver, i wonder if
AOL has random crashes.
On 19.12.2006, at 16:35, Vlad Seryakov wrote:
yes, it crashes when number of threads are more than 1 with any
size but
not all the time, sometimes i need to run it several times, looks like
it is random, some combination, not sure of what.
I guess we never got that high concurrency in
On 12/19/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
yes, it crashes when number of threads are more than 1 with any size but
not all the time, sometimes i need to run it several times, looks like
it is random, some combination, not sure of what.
I guess we never got that high concurrency in
I converted all to use pthreads directly instead of Tcl wrappers, and
now it does not crash anymore. Will continue testing but it looks like
Tcl is the problem here, not ptmalloc
Stephen Deasey wrote:
On 12/19/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
yes, it crashes when number of threads
I have no idea, i spent too much time on this still without realizing
what i am doing and what to expect :-)))
Zoran Vasiljevic wrote:
On 19.12.2006, at 17:08, Vlad Seryakov wrote:
I converted all to use pthreads directly instead of Tcl wrappers, and
now it does not crash anymore. Will
On 12/19/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
On 19.12.2006, at 17:08, Vlad Seryakov wrote:
I converted all to use pthreads directly instead of Tcl wrappers, and
now it does not crash anymore. Will continue testing but it looks like
Tcl is the problem here, not ptmalloc
Where does
Right, with Ns_ functions it does not crash.
Stephen Deasey wrote:
On 12/19/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
On 19.12.2006, at 17:08, Vlad Seryakov wrote:
I converted all to use pthreads directly instead of Tcl wrappers, and
now it does not crash anymore. Will continue testing
On 12/19/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
Right, with Ns_ functions it does not crash.
Zoran will be happy... :-)
On 19.12.2006, at 20:42, Stephen Deasey wrote:
On 12/19/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
Right, with Ns_ functions it does not crash.
Zoran will be happy... :-)
Not at all!
So, I would like to know exactly how to reproduce the problem
(what OS, machine, etc).
Furthermore I
On 16.12.2006, at 19:31, Vlad Seryakov wrote:
But if speed is not important to you, you can supply Tcl without
zippy,
then no bloat, system is returned with reasonable speed, at least on
Linux, ptmalloc is not that bad
OK. I think I've reached the peace of mind with all this
alternate
I tried to run this program; it crashes with all allocators on free when
it was allocated in another thread. zippy does it as well; i am not sure
how Naviserver works then.
#include <tcl.h>
#define MemAlloc ckalloc
#define MemFree ckfree
int nbuffer = 16384;
int nloops = 5;
int nthreads = 4;
Still, even without the last free and with mutex around it, it core
dumps in free(gPtr) during the loop.
Stephen Deasey wrote:
On 12/18/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
I tried to run this program, it crashes with all allocators on free when
it was allocated in another thread. zippy
On 12/18/06, Vlad Seryakov [EMAIL PROTECTED] wrote:
Still, even without the last free and with mutex around it, it core
dumps in free(gPtr) during the loop.
OK. Still doesn't mean your program is bug free :-)
There's a lot of extra stuff going on in your example program that
makes it hard
On 18.12.2006, at 22:08, Stephen Deasey wrote:
Works for me.
I say you can allocate memory in one thread and free it in another.
Nice. Well I can say that nedmalloc works, that is, that small
program runs to end w/o coring when compiled with nedmalloc.
Does this prove anything?
On 18.12.2006, at 19:57, Stephen Deasey wrote:
Are you saying you tested your app on Linux with native malloc and
experienced no fragmentation/bloating?
No. I have seen bloating, but less than on zippy. I saw some
bloating and fragmentation on all optimizing allocators I
have tested.
I
I suspect something i am doing wrong, but still it crashes and i do not
see why
#include <tcl.h>
#include <stdlib.h>
#include <memory.h>
#include <unistd.h>
#include <signal.h>
#include <pthread.h>
#define MemAlloc malloc
#define MemFree free
static int nbuffer = 16384;
static int nloops = 5;
On 12/18/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
On 18.12.2006, at 19:57, Stephen Deasey wrote:
One thing I wonder about this is, how do requests average out across
all threads? If you set the conn threads to exit after 10,000
requests, will they all quit at roughly the same time
On 15.12.2006, at 19:59, Vlad Seryakov wrote:
http://www.nedprod.com/programs/portable/nedmalloc/index.html
Hm... not bad at all:
This was under Solaris 2.8 on a Sun Blade2500 (Sparc) 1GB memory:
Testing standard allocator with 8 threads ...
This allocator achieves
On 16.12.2006, at 15:00, Zoran Vasiljevic wrote:
On 15.12.2006, at 19:59, Vlad Seryakov wrote:
http://www.nedprod.com/programs/portable/nedmalloc/index.html
Hm... not bad at all:
This was on a iMac with Intel Dual Core 1.83 Ghz and 512 MB memory
Testing standard allocator with 8
On 12/16/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
On 15.12.2006, at 19:59, Vlad Seryakov wrote:
http://www.nedprod.com/programs/portable/nedmalloc/index.html
Hm... not bad at all:
This was under Solaris 2.8 on a Sun Blade2500 (Sparc) 1GB memory:
Testing standard allocator with 8
On 12/16/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
Hey! I think our customers will love it! I will now try to
ditch the zippy and replace it with nedmalloc... Too bad that
Tcl as-is does not allow easy snap-in of alternate memory allocators.
I think this should be lobbied for.
It would
On 16.12.2006, at 16:25, Stephen Deasey wrote:
They seem, in the end, to go for Google tcmalloc. It wasn't the
absolute fastest for their particular set of tests, but had
dramatically lower memory usage.
The down side of tcmalloc: only Linux port.
The nedmalloc does them all (win, solaris,
On 15.12.2006, at 19:59, Vlad Seryakov wrote:
Will try this one.
To aid you (and others):
http://www.archiware.com/downloads/nedmalloc_tcl.tar.gz
Download and peek at README file. This compiles on all
machines I tested and works pretty fine in terms of speed.
I haven't tested the
On 12/16/06, Zoran Vasiljevic [EMAIL PROTECTED] wrote:
Are you sure? AFAIK, we just go down to Tcl_Alloc in Tcl library.
The allocator there will not allow you that. There were some discussions
on comp.lang.tcl about it (Jeff Hobbs knows better). As they (Tcl)
just inherited what aolserver had
On 16.12.2006, at 17:15, Stephen Deasey wrote:
Yeah, pretty sure. You can only use Tcl objects within a single
interp, which is restricted to a single thread, but general
ns_malloc'd memory chunks can be passed around between threads. It
would suck pretty hard if that wasn't the case.
Instead of using threadspeed or other simple malloc/free tests, i used
naviserver and Tcl pages as a test for allocators.
Using ab from apache to stress-test with a thousand requests, i tested
several allocators. And with everything the same except LD_PRELOAD, the
difference seems pretty clear.
You can; it moves Tcl_Obj structs between thread and shared pools, and the
same goes for other memory blocks. On thread exit
all memory goes to the shared pool.
Zoran Vasiljevic wrote:
On 16.12.2006, at 17:15, Stephen Deasey wrote:
Yeah, pretty sure. You can only use Tcl objects within a single
On 16.12.2006, at 17:29, Vlad Seryakov wrote:
Instead of using threadspeed or other simple malloc/free test, i used
naviserver and Tcl pages as test for allocators.
Using ab from apache and stresstest it for thousand requests i test
several allocators. And
having everything the same except
On 16.12.2006, at 16:25, Stephen Deasey wrote:
Something to think about: does the nedmalloc test include allocating
memory in one thread and freeing it in another? Apparently this is
tough for some allocators, such as Linux ptmalloc. Naviserver does
this.
I'm still not 100% ready reading
But if speed is not important to you, you can supply Tcl without zippy;
then there is no bloat and memory is returned to the system with
reasonable speed. At least on Linux, ptmalloc is not that bad.
Zoran Vasiljevic wrote:
On 16.12.2006, at 16:25, Stephen Deasey wrote:
Something to think about: does the nedmalloc
On 16.12.2006, at 19:31, Vlad Seryakov wrote:
But if speed is not important to you, you can supply Tcl without
zippy,
then no bloat, system is returned with reasonable speed, at least on
Linux, ptmalloc is not that bad
Eh... Vlad...
On the Mac the nedmalloc outperforms the standard
On 16.12.2006, at 19:31, Vlad Seryakov wrote:
Linux, ptmalloc is not that bad
Interestingly, ptmalloc3 (http://www.malloc.de/) and
nedmalloc both derive from the dlmalloc (http://gee.cs.oswego.edu/malloc.h)
library from Doug Lea. Consequently, their performance
is similar (nedmalloc being
Hi!
I've tried libumem as Stephen suggested, but it is slower
than the regular system malloc. This (libumem) is really
geared toward the integration with the mdb (solaris modular
debugger) for memory debugging and analysis.
But, I've found:
I also tried Hoard, Google tcmalloc, umem and some other rare mallocs i
could find. Still zippy beats everybody; i ran my speed test, not
threadtest. Will try this one.
Zoran Vasiljevic wrote:
Hi!
I've tried libumem as Stephen suggested, but it is slower
than the regular system malloc. This
On 15.12.2006, at 19:59, Vlad Seryakov wrote:
I also tried Hoard, Google tcmalloc, umem and some other rare
mallocs i
could find. Still zippy beats everybody, i ran my speed test not
threadtest. Will try this one.
Important: it is not only raw speed, that is important but also
the memory