The branch, master has been updated
       via  03e880931d0 doc: Update doc about talloc vs malloc speed
       via  a66db6c16e4 lib:talloc: Use tabs to align output in speed test
       via  39c13f062a8 lib:talloc: Increase alloc size to 128 kilobytes
       via  00d5982da2f lib:talloc: Don't optimize the speed test
       via  b6bf9cba80e lib:talloc: Add talloc_zero vs calloc test
       via  8dddea2ceda lib:talloc: Use memset_s() to avoid the call gets optimized out
       via  6812e30be35 lib:talloc: Remove trailing spaces from testsuite.c
      from  20a3a94e06a lib:ldb: Document environment variables in ldb manpage

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 03e880931d0a6171826f5ffc33e7b4d86eea54a6
Author: Andreas Schneider <a...@samba.org>
Date:   Wed Mar 29 11:04:38 2023 +0200

    doc: Update doc about talloc vs malloc speed
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>
    
    Autobuild-User(master): Martin Schwenke <mart...@samba.org>
    Autobuild-Date(master): Sat Sep 28 01:20:01 UTC 2024 on atb-devel-224

commit a66db6c16e4225e200da7f6f939b756026cee61f
Author: Andreas Schneider <a...@samba.org>
Date:   Thu Apr 27 11:31:07 2023 +0200

    lib:talloc: Use tabs to align output in speed test
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 39c13f062a859d17b5be3a2a02fece8d0b1819f1
Author: Andreas Schneider <a...@samba.org>
Date:   Thu Apr 27 11:24:59 2023 +0200

    lib:talloc: Increase alloc size to 128 kilobytes
    
    We want to avoid that the optimizer will use stack allocations. This way
    the test should be a bit more realistic.
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 00d5982da2f5f2595b634b957c7d0b990e12b37b
Author: Andreas Schneider <a...@samba.org>
Date:   Mon Apr 17 09:25:48 2023 +0200

    lib:talloc: Don't optimize the speed test
    
    If the speed test gets optimized, the malloc() and free() might be
    replaced by stack allocations.
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit b6bf9cba80e55ca5aac3611e55dc952782c867c7
Author: Andreas Schneider <a...@samba.org>
Date:   Fri Apr 14 21:34:59 2023 +0200

    lib:talloc: Add talloc_zero vs calloc test
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 8dddea2ceda40f2365bd6b1a62826b84dc523b74
Author: Andreas Schneider <a...@samba.org>
Date:   Tue Feb 13 09:22:56 2024 +0100

    lib:talloc: Use memset_s() to avoid the call gets optimized out
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 6812e30be35508db3ac23406243fe08da9cf3084
Author: Andreas Schneider <a...@samba.org>
Date:   Tue Feb 6 18:03:22 2024 +0100

    lib:talloc: Remove trailing spaces from testsuite.c
    
    Signed-off-by: Andreas Schneider <a...@samba.org>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

-----------------------------------------------------------------------
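The commit messages above give the reason for the change: an optimized build might replace the malloc()/free() pairs in the speed test with stack allocations, so the test function is compiled without optimization and the allocation size is raised to 128 kilobytes. Below is a minimal sketch of that guard pattern, not the code from testsuite.c. The names NO_OPT, BUF_SIZE and alloc_touch_free() are hypothetical and chosen for illustration, and the __has_attribute fallback is an assumption for compilers that lack that builtin.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: provide a fallback when the compiler has no __has_attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* NO_OPT is a hypothetical name; it expands to nothing when the
 * "optimize" function attribute is not available. */
#if __has_attribute(optimize)
#define NO_OPT __attribute__((optimize("O0")))
#else
#define NO_OPT
#endif

/* Same order of magnitude as the ALLOC_SIZE used by the speed test. */
#define BUF_SIZE (128 * 1024)

/* The attribute goes on a separate declaration, before the definition. */
static bool alloc_touch_free(void) NO_OPT;

static bool alloc_touch_free(void)
{
	bool ok;
	unsigned char *p = malloc(BUF_SIZE);

	if (p == NULL) {
		return false;
	}

	/* Write to the buffer and read one byte back so it is really used. */
	memset(p, 0xAB, BUF_SIZE);
	ok = (p[BUF_SIZE - 1] == 0xAB);

	free(p);
	return ok;
}
```

Calling alloc_touch_free() returns true when the allocation succeeded and the buffer could be written. Where the attribute is unavailable the macro is empty and the function is compiled with the normal optimization level, so the guard is best effort only.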

Summary of changes:
 lib/talloc/doc/mainpage.dox |  10 ++--
 lib/talloc/man/talloc.3.xml |  12 ++--
 lib/talloc/talloc_guide.txt |  11 ++--
 lib/talloc/testsuite.c      | 142 ++++++++++++++++++++++++++++++--------------
 lib/talloc/wscript          |   2 +-
 5 files changed, 114 insertions(+), 63 deletions(-)


Changeset truncated at 500 lines:

diff --git a/lib/talloc/doc/mainpage.dox b/lib/talloc/doc/mainpage.dox
index ece6ccb8f0c..d881e503a43 100644
--- a/lib/talloc/doc/mainpage.dox
+++ b/lib/talloc/doc/mainpage.dox
@@ -69,11 +69,11 @@
  * @section talloc_performance Performance
  *
  * All the additional features of talloc() over malloc() do come at a price. We
- * have a simple performance test in Samba4 that measures talloc() versus
- * malloc() performance, and it seems that talloc() is about 4% slower than
- * malloc() on my x86 Debian Linux box. For Samba, the great reduction in code
- * complexity that we get by using talloc makes this worthwhile, especially as
- * the total overhead of talloc/malloc in Samba is already quite small.
+ * have a performance test in Samba that measures talloc() versus malloc()
+ * performance, and it seems that talloc() is about 50% slower than malloc()
+ * (AMD Ryzen 9 3900X). For Samba, the great reduction in code complexity that
+ * we get by using talloc makes this worthwhile, especially as the total
+ * overhead of talloc/malloc in Samba is already quite small.
  *
  * @section talloc_named Named blocks
  *
diff --git a/lib/talloc/man/talloc.3.xml b/lib/talloc/man/talloc.3.xml
index c51061fce1f..e26b16dbecf 100644
--- a/lib/talloc/man/talloc.3.xml
+++ b/lib/talloc/man/talloc.3.xml
@@ -767,12 +767,12 @@ if (ptr) memcpy(ptr, p, strlen(p)+1);</programlisting>
   <refsect1><title>PERFORMANCE</title>
     <para>
       All the additional features of talloc(3) over malloc(3) do come at a
-      price.  We have a simple performance test in Samba4 that measures
-      talloc() versus malloc() performance, and it seems that talloc() is
-      about 10% slower than malloc() on my x86 Debian Linux box.  For
-      Samba, the great reduction in code complexity that we get by using
-      talloc makes this worthwhile, especially as the total overhead of
-      talloc/malloc in Samba is already quite small.
+      price. We have a performance test in Samba that measures talloc() versus
+      malloc() performance, and it seems that talloc() is
+      about 50% slower than malloc() (AMD Ryzen 9 3900X). For Samba, the great
+      reduction in code complexity that we get by using talloc makes this
+      worthwhile, especially as the total overhead of talloc/malloc in Samba
+      is already quite small.
     </para>
   </refsect1>
   <refsect1><title>SEE ALSO</title>
diff --git a/lib/talloc/talloc_guide.txt b/lib/talloc/talloc_guide.txt
index dedda6c0678..d6e3646a1bd 100644
--- a/lib/talloc/talloc_guide.txt
+++ b/lib/talloc/talloc_guide.txt
@@ -43,12 +43,11 @@ testsuite.c to clarify how some particular situation is handled.
 Performance
 -----------
 
-All the additional features of talloc() over malloc() do come at a
-price. We have a simple performance test in Samba4 that measures
-talloc() versus malloc() performance, and it seems that talloc() is
-about 4% slower than malloc() on my x86 Debian Linux box. For Samba,
-the great reduction in code complexity that we get by using talloc
-makes this worthwhile, especially as the total overhead of
+All the additional features of talloc() over malloc() do come at a price. We
+have a performance test in Samba4 that measures talloc() versus malloc()
+performance, and it seems that talloc() is about 50% slower than malloc() (AMD
+Ryzen 9 3900X). For Samba, the great reduction in code complexity that we get by
+using talloc makes this worthwhile, especially as the total overhead of
 talloc/malloc in Samba is already quite small.
 
 
diff --git a/lib/talloc/testsuite.c b/lib/talloc/testsuite.c
index 282ebc6956d..dc0039940ff 100644
--- a/lib/talloc/testsuite.c
+++ b/lib/talloc/testsuite.c
@@ -1,14 +1,14 @@
-/* 
+/*
    Unix SMB/CIFS implementation.
 
    local testing of talloc routines.
 
    Copyright (C) Andrew Tridgell 2004
-   
+
      ** NOTE! The following LGPL license applies to the talloc
      ** library. This does NOT imply that all of Samba is released
      ** under the LGPL
-   
+
    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
@@ -42,6 +42,14 @@
 
 #include "talloc_testsuite.h"
 
+#ifndef disable_optimization
+#if __has_attribute(optimize)
+#define disable_optimization __attribute__((optimize("O0")))
+#else /* disable_optimization */
+#define disable_optimization
+#endif
+#endif /* disable_optimization */
+
 static struct timeval private_timeval_current(void)
 {
        struct timeval tv;
@@ -52,7 +60,7 @@ static struct timeval private_timeval_current(void)
 static double private_timeval_elapsed(struct timeval *tv)
 {
        struct timeval tv2 = private_timeval_current();
-       return (tv2.tv_sec - tv->tv_sec) + 
+       return (tv2.tv_sec - tv->tv_sec) +
               (tv2.tv_usec - tv->tv_usec)*1.0e-6;
 }
 
@@ -135,7 +143,7 @@ static void test_log_stdout(const char *message)
 }
 
 /*
-  test references 
+  test references
 */
 static bool test_ref1(void)
 {
@@ -150,7 +158,7 @@ static bool test_ref1(void)
        talloc_named_const(p1, 2, "x2");
        talloc_named_const(p1, 3, "x3");
 
-       r1 = talloc_named_const(root, 1, "r1"); 
+       r1 = talloc_named_const(root, 1, "r1");
        ref = talloc_reference(r1, p2);
        talloc_report_full(root, stderr);
 
@@ -192,7 +200,7 @@ static bool test_ref1(void)
 }
 
 /*
-  test references 
+  test references
 */
 static bool test_ref2(void)
 {
@@ -206,7 +214,7 @@ static bool test_ref2(void)
        talloc_named_const(p1, 1, "x3");
        p2 = talloc_named_const(p1, 1, "p2");
 
-       r1 = talloc_named_const(root, 1, "r1"); 
+       r1 = talloc_named_const(root, 1, "r1");
        ref = talloc_reference(r1, p2);
        talloc_report_full(root, stderr);
 
@@ -247,7 +255,7 @@ static bool test_ref2(void)
 }
 
 /*
-  test references 
+  test references
 */
 static bool test_ref3(void)
 {
@@ -287,7 +295,7 @@ static bool test_ref3(void)
 }
 
 /*
-  test references 
+  test references
 */
 static bool test_ref4(void)
 {
@@ -302,7 +310,7 @@ static bool test_ref4(void)
        talloc_named_const(p1, 1, "x3");
        p2 = talloc_named_const(p1, 1, "p2");
 
-       r1 = talloc_named_const(root, 1, "r1"); 
+       r1 = talloc_named_const(root, 1, "r1");
        ref = talloc_reference(r1, p2);
        talloc_report_full(root, stderr);
 
@@ -338,7 +346,7 @@ static bool test_ref4(void)
 
 
 /*
-  test references 
+  test references
 */
 static bool test_unlink1(void)
 {
@@ -353,7 +361,7 @@ static bool test_unlink1(void)
        talloc_named_const(p1, 1, "x3");
        p2 = talloc_named_const(p1, 1, "p2");
 
-       r1 = talloc_named_const(p1, 1, "r1");   
+       r1 = talloc_named_const(p1, 1, "r1");
        ref = talloc_reference(r1, p2);
        talloc_report_full(root, stderr);
 
@@ -439,11 +447,11 @@ static bool test_misc(void)
        CHECK_BLOCKS("misc", p1, 2);
        CHECK_BLOCKS("misc", root, 3);
 
-       torture_assert("misc", talloc_free(NULL) == -1, 
+       torture_assert("misc", talloc_free(NULL) == -1,
                                   "talloc_free(NULL) should give -1\n");
 
        talloc_set_destructor(p1, fail_destructor);
-       torture_assert("misc", talloc_free(p1) == -1, 
+       torture_assert("misc", talloc_free(p1) == -1,
                "Failed destructor should cause talloc_free to fail\n");
        talloc_set_destructor(p1, NULL);
 
@@ -458,10 +466,10 @@ static bool test_misc(void)
                "failed: strdup on NULL should give NULL\n");
 
        p2 = talloc_strndup(p1, "foo", 2);
-       torture_assert("misc", strcmp("fo", p2) == 0, 
+       torture_assert("misc", strcmp("fo", p2) == 0,
                                   "strndup doesn't work\n");
        p2 = talloc_asprintf_append_buffer(p2, "o%c", 'd');
-       torture_assert("misc", strcmp("food", p2) == 0, 
+       torture_assert("misc", strcmp("food", p2) == 0,
                                   "talloc_asprintf_append_buffer doesn't work\n");
        CHECK_BLOCKS("misc", p2, 1);
        CHECK_BLOCKS("misc", p1, 3);
@@ -634,7 +642,7 @@ static bool test_realloc_child(void)
        el1->list3 = talloc(el1, struct el2 *);
        el1->list3[0] = talloc(el1->list3, struct el2);
        el1->list3[0]->name = talloc_strdup(el1->list3[0], "testing2");
-       
+
        el2 = talloc(el1->list, struct el2);
        CHECK_PARENT("el2", el2, el1->list);
        el2_2 = talloc(el1->list2, struct el2);
@@ -742,7 +750,7 @@ static bool test_steal(void)
        talloc_steal(root, p2);
        CHECK_BLOCKS("steal", root, 2);
        CHECK_SIZE("steal", root, 20);
-       
+
        talloc_free(p2);
 
        CHECK_BLOCKS("steal", root, 1);
@@ -851,9 +859,14 @@ static bool test_unref_reparent(void)
        return true;
 }
 
+/* Make the size big enough to not fit into the stack */
+#define ALLOC_SIZE (128 * 1024)
+#define ALLOC_DUP_STRING "talloc talloc talloc talloc talloc talloc talloc"
+
 /*
   measure the speed of talloc versus malloc
 */
+static bool test_speed(void) disable_optimization;
 static bool test_speed(void)
 {
        void *ctx = talloc_new(NULL);
@@ -869,9 +882,9 @@ static bool test_speed(void)
        do {
                void *p1, *p2, *p3;
                for (i=0;i<loop;i++) {
-                       p1 = talloc_size(ctx, loop % 100);
-                       p2 = talloc_strdup(p1, "foo bar");
-                       p3 = talloc_size(p1, 300);
+                       p1 = talloc_size(ctx, loop % ALLOC_SIZE);
+                       p2 = talloc_strdup(p1, ALLOC_DUP_STRING);
+                       p3 = talloc_size(p1, ALLOC_SIZE);
                        (void)p2;
                        (void)p3;
                        talloc_free(p1);
@@ -879,20 +892,20 @@ static bool test_speed(void)
                count += 3 * loop;
        } while (private_timeval_elapsed(&tv) < 5.0);
 
-       fprintf(stderr, "talloc: %.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+       fprintf(stderr, "talloc:\t\t%.0f ops/sec\n", count/private_timeval_elapsed(&tv));
 
        talloc_free(ctx);
 
-       ctx = talloc_pool(NULL, 1024);
+       ctx = talloc_pool(NULL, ALLOC_SIZE * 2);
 
        tv = private_timeval_current();
        count = 0;
        do {
                void *p1, *p2, *p3;
                for (i=0;i<loop;i++) {
-                       p1 = talloc_size(ctx, loop % 100);
-                       p2 = talloc_strdup(p1, "foo bar");
-                       p3 = talloc_size(p1, 300);
+                       p1 = talloc_size(ctx, loop % ALLOC_SIZE);
+                       p2 = talloc_strdup(p1, ALLOC_DUP_STRING);
+                       p3 = talloc_size(p1, ALLOC_SIZE);
                        (void)p2;
                        (void)p3;
                        talloc_free(p1);
@@ -902,23 +915,62 @@ static bool test_speed(void)
 
        talloc_free(ctx);
 
-       fprintf(stderr, "talloc_pool: %.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+       fprintf(stderr, "talloc_pool:\t%.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+
+       tv = private_timeval_current();
+       count = 0;
+       do {
+               void *p1, *p2, *p3;
+               for (i=0;i<loop;i++) {
+                       p1 = malloc(loop % ALLOC_SIZE);
+                       p2 = strdup(ALLOC_DUP_STRING);
+                       p3 = malloc(ALLOC_SIZE);
+                       free(p1);
+                       free(p2);
+                       free(p3);
+               }
+               count += 3 * loop;
+       } while (private_timeval_elapsed(&tv) < 5.0);
+       fprintf(stderr, "malloc:\t\t%.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+
+       printf("\n# TALLOC_ZERO VS CALLOC SPEED\n");
+
+       ctx = talloc_new(NULL);
+
+       tv = private_timeval_current();
+       count = 0;
+       do {
+               void *p1, *p2, *p3;
+               for (i=0;i<loop;i++) {
+                       p1 = talloc_zero_size(ctx, loop % ALLOC_SIZE);
+                       p2 = talloc_strdup(p1, ALLOC_DUP_STRING);
+                       p3 = talloc_zero_size(p1, ALLOC_SIZE);
+                       (void)p2;
+                       (void)p3;
+                       talloc_free(p1);
+               }
+               count += 3 * loop;
+       } while (private_timeval_elapsed(&tv) < 5.0);
+
+       fprintf(stderr, "talloc_zero:\t%.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+
+       talloc_free(ctx);
 
        tv = private_timeval_current();
        count = 0;
        do {
                void *p1, *p2, *p3;
                for (i=0;i<loop;i++) {
-                       p1 = malloc(loop % 100);
-                       p2 = strdup("foo bar");
-                       p3 = malloc(300);
+                       p1 = calloc(1, loop % ALLOC_SIZE);
+                       p2 = strdup(ALLOC_DUP_STRING);
+                       p3 = calloc(1, ALLOC_SIZE);
                        free(p1);
                        free(p2);
                        free(p3);
                }
                count += 3 * loop;
        } while (private_timeval_elapsed(&tv) < 5.0);
-       fprintf(stderr, "malloc: %.0f ops/sec\n", count/private_timeval_elapsed(&tv));
+       fprintf(stderr, "calloc:\t\t%.0f ops/sec\n", count/private_timeval_elapsed(&tv));
 
        printf("success: speed\n");
 
@@ -928,15 +980,15 @@ static bool test_speed(void)
 static bool test_lifeless(void)
 {
        void *top = talloc_new(NULL);
-       char *parent, *child; 
+       char *parent, *child;
        void *child_owner = talloc_new(NULL);
 
        printf("test: lifeless\n# TALLOC_UNLINK LOOP\n");
 
        parent = talloc_strdup(top, "parent");
-       child = talloc_strdup(parent, "child");  
+       child = talloc_strdup(parent, "child");
        (void)talloc_reference(child, parent);
-       (void)talloc_reference(child_owner, child); 
+       (void)talloc_reference(child_owner, child);
        talloc_report_full(top, stderr);
        talloc_unlink(top, parent);
        talloc_unlink(top, child);
@@ -969,7 +1021,7 @@ static bool test_loop(void)
 
        parent = talloc_strdup(top, "parent");
        req1 = talloc(parent, struct req1);
-       req1->req2 = talloc_strdup(req1, "req2");  
+       req1->req2 = talloc_strdup(req1, "req2");
        talloc_set_destructor(req1->req2, test_loop_destructor);
        req1->req3 = talloc_strdup(req1, "req3");
        (void)talloc_reference(req1->req3, req1);
@@ -979,7 +1031,7 @@ static bool test_loop(void)
        talloc_report_full(NULL, stderr);
        talloc_free(top);
 
-       torture_assert("loop", loop_destructor_count == 1, 
+       torture_assert("loop", loop_destructor_count == 1,
                                   "FAILED TO FIRE LOOP DESTRUCTOR\n");
        loop_destructor_count = 0;
 
@@ -2097,7 +2149,7 @@ static bool test_magic_protection(void)
                 *
                 * Real attacks would attempt to set a real destructor.
                 */
-               memset(p1, '\0', 32);
+               memset_s(p1, 32, '\0', 32);
 
                /* Then the attack takes effect when the memory's freed. */
                talloc_free(pool);
@@ -2220,29 +2272,29 @@ bool torture_local_talloc(struct torture_context *tctx)
        test_reset();
        ret &= test_ref4();
        test_reset();
-       ret &= test_unlink1(); 
+       ret &= test_unlink1();
        test_reset();
        ret &= test_misc();
        test_reset();
        ret &= test_realloc();
        test_reset();
-       ret &= test_realloc_child(); 
+       ret &= test_realloc_child();
        test_reset();
-       ret &= test_steal(); 
+       ret &= test_steal();
        test_reset();
-       ret &= test_move(); 
+       ret &= test_move();
        test_reset();
        ret &= test_unref_reparent();
        test_reset();
-       ret &= test_realloc_fn(); 
+       ret &= test_realloc_fn();
        test_reset();
        ret &= test_type();
        test_reset();
-       ret &= test_lifeless(); 
+       ret &= test_lifeless();
        test_reset();
        ret &= test_loop();
        test_reset();
-       ret &= test_free_parent_deny_child(); 
+       ret &= test_free_parent_deny_child();
        test_reset();
        ret &= test_realloc_on_destructor_parent();
        test_reset();
diff --git a/lib/talloc/wscript b/lib/talloc/wscript
index 8b5e02d36c5..1b240ae3653 100644
--- a/lib/talloc/wscript
+++ b/lib/talloc/wscript
@@ -93,7 +93,7 @@ def build(bld):
                           public_headers=[],
                           enabled=bld.env.TALLOC_COMPAT1)
 
-        testsuite_deps = 'talloc'
+        testsuite_deps = 'talloc replace'
         if bld.CONFIG_SET('HAVE_PTHREAD'):
             testsuite_deps += ' pthread'
 

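The speed test in the diff above counts operations in batches until a fixed wall-clock time has passed and then reports operations per second. The following is a rough standalone sketch of that measurement pattern for malloc() versus calloc() only. It does not use talloc, and the helper names now_seconds() and measure_alloc_ops() are hypothetical. The volatile pointer is an assumption made here to keep the store to the buffer, and the short duration parameter replaces the 5 seconds used in the test.

```c
#include <stdlib.h>
#include <sys/time.h>

#define ALLOC_SIZE (128 * 1024)

/* Current wall-clock time in seconds, with microsecond resolution. */
static double now_seconds(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}

/*
 * Hypothetical helper: allocate and free ALLOC_SIZE bytes in batches
 * until at least `duration` seconds (must be > 0) have elapsed.
 * Uses calloc() when `zeroed` is non-zero, malloc() otherwise.
 * Returns allocations per second, or -1.0 if an allocation fails.
 */
static double measure_alloc_ops(int zeroed, double duration)
{
	const unsigned int loop = 1000;
	double count = 0.0;
	double start = now_seconds();
	double elapsed;

	do {
		unsigned int i;

		for (i = 0; i < loop; i++) {
			volatile char *p = zeroed ? calloc(1, ALLOC_SIZE)
						  : malloc(ALLOC_SIZE);
			if (p == NULL) {
				return -1.0;
			}
			/* The volatile store keeps the buffer in use. */
			p[0] = 1;
			free((void *)p);
		}
		count += loop;
		elapsed = now_seconds() - start;
	} while (elapsed < duration);

	return count / elapsed;
}
```

A caller could compare measure_alloc_ops(0, 1.0) with measure_alloc_ops(1, 1.0) and print both values. The absolute numbers depend on the machine and the C library, so only the ratio between runs on the same system is meaningful.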

-- 
Samba Shared Repository
