[Chicken-users] valgrind
I'm still asking myself why I can't run chicken program under valgrind. Since there's a lot going on at this time I'm about to forget. Hence here an update for those who care and the archive. I've traced the call coming from irregex.c down to valgrind complaining as soon as *all-chars* value is accessed. (I did not yet come around to prepare a test case where I'd only cons up a random list (or a list of char's - wild guess) and see if I can sync valgrind's output enough to verify that it will complain on code which is supposed to work at all counts.) (The back of my head now might be about to forget that a less expensive implementations of charset and their merges might be obvious; even in R5RS Scheme. Right now they are a pretty nice test case to see how much a few instructions shaved from the cons operation would speed up the program initialization over all.) Furthermore I'd know that chicken might to weird things to the stack. Maybe valgrind was just a bad choice. Wild guess again. But as far as I understand valgrind so far (which is close to nothing at all), it would instrument then executable code only. Given that I see access to uninitialized memory resulting from stack allocation - I wonder: maybe valgrind is right? Now, at this time it my memory consumption issue does not appear. Real work is pressing. I'll not find the time to dig deeper into these things any time soon. Except if I'm wrong about the "does not appear". /Jerry ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind
On Thu, Sep 29, 2011 at 09:38:44PM +0200, Jörg F. Wittenberger wrote: > I'm still asking myself why I can't run chicken program under valgrind. > > Since there's a lot going on at this time I'm about to forget. > Hence here an update for those who care and the archive. > > I've traced the call coming from irregex.c down to valgrind complaining > as soon as *all-chars* value is accessed. > > (I did not yet come around to prepare a test case where I'd only > cons up a random list (or a list of char's - wild guess) and see if > I can sync valgrind's output enough to verify that it will complain > on code which is supposed to work at all counts.) > > (The back of my head now might be about to forget that a less expensive > implementations of charset and their merges might be obvious; even in > R5RS Scheme. Right now they are a pretty nice test case to see how > much a few instructions shaved from the cons operation would speed up the > program initialization over all.) > > Furthermore I'd know that chicken might to weird things to the stack. > Maybe valgrind was just a bad choice. Wild guess again. > > But as far as I understand valgrind so far (which is close to nothing > at all), it would instrument then executable code only. > > Given that I see access to uninitialized memory resulting from stack > allocation - I wonder: maybe valgrind is right? > > Now, at this time it my memory consumption issue does not appear. > Real work is pressing. I'll not find the time to dig deeper into > these things any time soon. > > Except if I'm wrong about the "does not appear". > > /Jerry > Since it seems as good a day as any for wild-ass speculation, what about this scenario: We allocate an object, set the tag pointer, but don't fill out all of the memory yet. We're yet to need it. Oops! The gc gets called, and that object has to move. So we memcpy it to it's new home and tuck it into bed. Wouldn't valgrind complain that we just memcpy'd uninitialized memory, even though in this case it is a completely safe operation? -Alan -- .i ma'a lo bradi cu penmi gi'e du ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind
On Sep 29 2011, Alan Post wrote: Since it seems as good a day as any for wild-ass speculation, what about this scenario: We allocate an object, set the tag pointer, but don't fill out all of the memory yet. We're yet to need it. Oops! The gc gets called, and that object has to move. So we memcpy it to it's new home and tuck it into bed. Wouldn't valgrind complain that we just memcpy'd uninitialized memory, even though in this case it is a completely safe operation? Maybe. Maybe Felix could shed some light. I can't. However as far as I read it, this should not happen. The pattern is to allocate some space on the stack, fill stuff in, eventually jump to the continuation. GC will only happen during the jump, not within the time between allocation and assignment of initial values. (Beware: It's my reading of source!) ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind
Hallo, On Thu, Sep 29, 2011 at 10:25 PM, Jörg F. Wittenberger wrote: > > The pattern is to allocate some space on the stack, fill stuff in, > eventually jump to the continuation. GC will only happen during the > jump, not within the time between allocation and assignment of initial > values. (Beware: It's my reading of source!) > As I remember, at every entry point of the generated C functions there is a test for the height of the C stack so far. If the size (height * sizeof(C_word)) is bigger than the current nursery size, a GC occurs. At every other point alloca is supposed to succeed. -- -alex http://www.artisancoder.com/ ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
[Chicken-users] valgrind - more details
I managed to narrow the valgrind complaint's scope: Try as a test: valgrind --log-file=/tmp/csi.vg --track-origins=yes csi -e "(vector 'a 'b)" The result in /tmp/csi.vg looks good. However try: valgrind --log-file=/tmp/csi.vg --track-origins=yes csi -e "42" and run into an example of the problem like this: ==13112== Conditional jump or move depends on uninitialised value(s) ==13112== at 0x510393E: C_a_i_string_to_number (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EF4943: f_9002r (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EF4B06: f_9002 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EBC149: f_13581 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EEFC54: f_13566 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EED615: f_7307 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x50DF742: allocate_vector_2 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x5102975: C_allocate_vector (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4ECA36E: f_7155r (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4ED5DB9: f_7155 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EECE1E: f_7294 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== by 0x4EBC630: f_13773 (in /home/jfw/build/Scheme/chicken-askemos/debian/tmp/usr/lib/libchicken.so.6) ==13112== Uninitialised value was created by a stack allocation ==13112== at 0x4014795: _dl_runtime_resolve (dl-trampoline.S:42) ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind - more details
On Oct 5 2011, Jörg F. Wittenberger wrote: ==13112== Conditional jump or move depends on uninitialised value(s) ==13112== at 0x510393E: C_a_i_string_to_number (in While I've been following this valgrind hint I ran into some code in C_a_i_string_to_number ... as expectable this code is kinda complicated since the problem it solves is just below the level where one would consider to write a real parser and at the same time beyond complexity you want to handle ad hoc. To reduce my confusion, I allowed myself to make some changes, even if those where just for me to make clear what the code tries to do. I found two occurrences of strlen (C_strlen that is), which would for no good reason scan the memory while the result could be computed by simple pointer arithmetic: @@ -7519,19 +7515,19 @@ errno = 0; fn = C_strtod(str, &eptr2); if(fn == HUGE_VAL && errno == ERANGE) return 0; else if(eptr2 == str) return 0; - else if(*eptr2 == '\0' || (eptr != eptr2 && !C_strncmp(eptr2, ".0", C_strlen(eptr2 { + else if(*eptr2 == '\0' || (eptr != eptr2 && !C_strncmp(eptr2, ".0", len - (eptr2-str { *flo = fn; return 2; } return 0; } else if((n & C_INT_SIGN_BIT) != ((n << 1) & C_INT_SIGN_BIT)) { /* doesn't fit into fixnum? */ - if(*eptr == '\0' || !C_strncmp(eptr, ".0", C_strlen(eptr))) { + if(*eptr == '\0' || !C_strncmp(eptr, ".0", len - (eptr-str))) { *flo = (double)n; return 2; } else return 0; } The there is one thing which looks like a typo to me: @@ -7505,11 +7501,11 @@ return 0; } #endif } - if(C_strpbrk(str, "xX\0") != NULL) return 0; + if(C_strpbrk(str, "xX") != NULL) return 0; errno = 0; n = C_strtol(str, &eptr, radix); if(((n == LONG_MAX || n == LONG_MIN) && errno == ERANGE) || *eptr != '\0') { And a few spots, where the goto-dance was too confusion. (Full diff attached.) The sad news: while this might help to make the source better, it's for acaemic reason. So far the problem is the same valgrind --log-file=/tmp/csi.vg --track-origins=yes csi -e "(display (length '( n01 n02 n03 n04 n05 n06 n07 n08 n09 n10 n11 n12 n13 n14 n15 n16 n17 n18 n19 n20 n21 n22 n23 n24 n25 n26 n27 n28 n29 n30 n31 n32 n33 n34 n35 n36 n37 n38 n39 n40 n41 n42)))" will print 42 and exit with a clean valgrind report. valgrind --log-file=/tmp/csi.vg --track-origins=yes csi -e 42 however will complain about access to uninitialised values. /Jerry I'll shut up on this topic (which looks probably boring) once I either know that there is a known reason or someone taking care of it (or helps me to do so). Until then I better keep you informed, since I'm afraid it will come up again. Index: runtime.c === --- runtime.c +++ runtime.c @@ -7304,20 +7304,19 @@ if(C_immediatep(str) || C_header_bits(str) != C_STRING_TYPE) barf(C_BAD_ARGUMENT_TYPE_ERROR, "string->number", str); if((n = C_header_size(str)) == 0) { fail: -n = C_SCHEME_FALSE; -goto fini; +return C_SCHEME_FALSE; } if(n >= STRING_BUFFER_SIZE - 1) goto fail; - C_memcpy(sptr = buffer, C_c_string(str), n > (STRING_BUFFER_SIZE - 1) ? STRING_BUFFER_SIZE : n); + C_memcpy(sptr = buffer, C_c_string(str), n); buffer[ n ] = '\0'; - while(*sptr == '#') { + if(*sptr == '#') { switch(C_tolower((int)*(++sptr))) { case 'b': if(radixpf) goto fail; else { radix = 2; radixpf = 1; } break; case 'o': if(radixpf) goto fail; else { radix = 8; radixpf = 1; } break; case 'd': if(radixpf) goto fail; else { radix = 10; radixpf = 1; } break; case 'x': if(radixpf) goto fail; else { radix = 16; radixpf = 1; } break; @@ -7329,11 +7328,11 @@ ++sptr; } /* Scan for embedded special characters and do basic sanity checking: */ for(eptr = sptr, rptr = sptr; *eptr != '\0'; ++eptr) { -switch(C_tolower((int)*eptr)) { +switch(*eptr) { case '.': if(periodf || ratf || expf) goto fail; periodf = 1; break; @@ -7352,15 +7351,15 @@ sharpf = 0; /* Allow sharp signs in the denominator */ ratf = 1; rptr = eptr+1; break; -case 'e': -case 'd': -case 'f': -case 'l': -case 's': +case 'e': case 'E': +case 'd': case 'D': +case 'f': case 'F': +case 'l': case 'L': +case 's': case 'S': /* Don't set exp flag if we see the "f" in "inf.0" (preceded by 'n') */ /* Other failure modes are handled elsewhere. */ if(radix == 10 && eptr > sptr && C_tolower((int)*(eptr-1)) != 'n') { if (ratf) goto fail; @@ -7377,19 +7376,17 @@ if (eptr == rptr) goto fail; /* Disallow "empty" numbers like "#x" and "1/" */ /* check for rational representation: */ if(rptr != sptr) { if (*(rptr) == '-' || *(rptr) == '+') { - n = C_SCHEME_FALSE; - goto fini; + return C_SCHEME_FALSE; } *(rptr-1) = '\0'; switch(convert_st
Re: [Chicken-users] valgrind - more details
On Oct 5 2011, Jörg F. Wittenberger wrote: I found two occurrences of strlen (C_strlen that is), which would for no good reason scan the memory while the result could be computed by simple pointer arithmetic: @@ -7519,19 +7515,19 @@ errno = 0; ... Sorry. Somehow cut&paste made a mess out of the diff citations. Gladly I appended the full diff. But note: this change only removes unnecessary complexity from the code. The code itself is sill somewhat incomprehensible to me. I'm not sure that I'm in a position to supply incremental improvements here. Best regareds /Jörg ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind - more details
Dear Jörg, * Jörg F. Wittenberger [111005 22:29]: > While I've been following this valgrind hint I ran into some > code in C_a_i_string_to_number ... as expectable this code > is kinda complicated since the problem it solves is just below > the level where one would consider to write a real parser and > at the same time beyond complexity you want to handle ad hoc. Thanks for your analysis! I am interested in this patch, but I would like to know a couple of things: 1. What is this diff against? I could not apply it to master or 4.7.0 or even 4.7.0.3-st. This would be helpful information to get started. It is a few lines off and through all this confusion maybe you are missing something. I don't know... 2. Did you run a make check against this patch? make check has some number syntax checks in it and I believe changing the while loop below to an if statement should trigger those. If not it could be a bug (in the test and the patch). Peter confirmed that crazy stuff like #x#e123 is valid number syntax. So could I ask you about some of your precious minutes to * get a git checkout, the command is git clone git://code.call-cc.org/chicken-core * diff against master, (make changes run git diff > more-changes.diff ) * run make check and * send the patch to chicken-hack...@nongnu.org ? This would aid me and others in reviewing and understanding the changes (well in this case your intentions are clear, but the while loop is still subtle). Also it would ease the process of trying your proposals for everyone else too. Thanks again for taking your time for this! Christian -- Who can (make) the muddy water (clear)? Let it be still, and it will gradually become clear. Who can secure the condition of rest? Let movement go on, and the condition of rest will gradually arise. -- Lao Tse. ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind - more details
On Oct 5 2011, Christian Kellermann wrote: Hi Christian, Dear Jörg, * Jörg F. Wittenberger [111005 22:29]: While I've been following this valgrind hint I ran into some code in C_a_i_string_to_number ... as expectable this code is kinda complicated since the problem it solves is just below the level where one would consider to write a real parser and at the same time beyond complexity you want to handle ad hoc. Thanks for your analysis! I am interested in this patch, but I would like to know a couple of things: Wait, this must be a missunderstanding. This diff was only intended to be read, not compiled. In fact I did not even compile it. Because it does not fix a single thing. Also I'm not sure (after some sleep) that I did not introduce a fresh bug in the hash-prefix evaluation. Today I re-read the R5RS, which left me wondering. Would it be #bi101010.1 or whould it be #b#i101010.1 ? Furthermore: Also I changed my mind: I'm afraid numbers in Scheme are complex enough to use a real parser to read them. The complexity point was here (and a few line down again): else if(*eptr2 == '\0' || (eptr != eptr2 && !C_strncmp(eptr2, ".0", C_strlen(eptr2 { *flo = fn; return 2; } This use of strlen to compute the n for strncmp removes the point from using strncmp alltogether. (Which would be to provide an upper bound to the comparsion in case you don't want to compare to the string end or you string does not end with '\0'. It's even worse, since the string is now scanned twice.) So either one would want to use strcmp here, trusting that the final '\0' is there (we just put one there before, so given that eptr2 (and further down eptr) did not accidentally end up past the string end marker - or one wants to supply a upper bound computable in constant time. That's what I did like this: else if(*eptr2 == '\0' || (eptr != eptr2 && !C_strncmp(eptr2, ".0", len - (eptr2-str { *flo = fn; return 2; } Given that "len" was found to be strlen(str) and eptr2 should be left within the string, we can simply substract the difference (eptr2-str) from the counted len of str. However let me reiterate: it does not fix valgrind's compaint. And I'm not yet seeing any light in the string_number code so far. Best Regards /Jörg ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind - more details
On Thu, Oct 06, 2011 at 12:47:25PM +0200, Jörg F. Wittenberger wrote: > On Oct 5 2011, Christian Kellermann wrote: > Today I re-read the R5RS, which left me wondering. Would it be > #bi101010.1 or whould it be #b#i101010.1 ? The latter. See the Lexical Syntax section, it has these productions: -> | -> | #e | #i -> #b -> #o -> | #d -> #x > Furthermore: > Also I changed my mind: I'm afraid numbers in Scheme are complex > enough to use a real parser to read them. I agree. There's pure Scheme code for parsing numbers syntax in trunk of the numbers egg (and the latest release has it too). It's not very simple either, but it doesn't have any dependencies. When I posted my patches to fix some of the more esoteric syntax details in Chicken, I initially also tried my Scheme code, but it is about twice as slow as this C code, that's why I decided against switching. Also, this code needs to be callable from C because decode_literal() relies on it. (but that could maybe be worked around) > However let me reiterate: it does not fix valgrind's compaint. I think the patch is a good improvement in readability of the core code, so why not put it in? (of course after fixing the while/if mistake) Cheers, Peter -- http://sjamaan.ath.cx -- "The process of preparing programs for a digital computer is especially attractive, not only because it can be economically and scientifically rewarding, but also because it can be an aesthetic experience much like composing poetry or music." -- Donald Knuth ___ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] valgrind - more details
On Oct 6 2011, Peter Bex wrote: Furthermore: Also I changed my mind: I'm afraid numbers in Scheme are complex enough to use a real parser to read them. I agree. There's pure Scheme code for parsing numbers syntax in trunk of the numbers egg (and the latest release has it too). It's not very simple either, but it doesn't have any dependencies. When I posted my patches to fix some of the more esoteric syntax details in Chicken, I initially also tried my Scheme code, but it is about twice as slow as this C code, that's why I decided against switching. This sound like a good case to try http://people.csail.mit.edu/jaffer/Docupage/schlep.html Also, this code needs to be callable from C because decode_literal() relies on it. (but that could maybe be worked around) Did I admit in that other mail that I did only compile, but not test this patch? I'd better had done so! In fact the first valgrind complain is gone. "42" without the quotes is now a valid Scheme program wrt. the valgrind report. However let me reiterate: it does not fix valgrind's compaint. Wrong I was. I think the patch is a good improvement in readability of the core code, so why not put it in? (of course after fixing the while/if mistake) So let's go for it. The attached diff is against git master as of 5p.m. Berlin. However the only thing, which this patch does wrt. program logic is to avoid running afoul of a possibly missing end marker in the strlen call. This cast some doubt that the code might still need revision. /Jörg --- /home/jfw/build/Scheme/chicken-core/runtime.c 2011-10-05 00:03:16.0 +0200 +++ runtime.c 2011-10-06 16:52:21.030542814 +0200 @@ -469,7 +469,7 @@ static int C_fcall hash_string(int len, C_char *str, unsigned int m) C_regparm; static C_word C_fcall lookup(C_word key, int len, C_char *str, C_SYMBOL_TABLE *stable) C_regparm; static double compute_symbol_table_load(double *avg_bucket_len, int *total); -static C_word C_fcall convert_string_to_number(C_char *str, int radix, C_word *fix, double *flo) C_regparm; +static C_word C_fcall convert_string_to_number(C_char *str, int len, int radix, C_word *fix, double *flo) C_regparm; static void C_fcall remark_system_globals(void) C_regparm; static void C_fcall really_remark(C_word *x) C_regparm; static C_word C_fcall intern0(C_char *name) C_regparm; @@ -7270,13 +7270,12 @@ if((n = C_header_size(str)) == 0) { fail: -n = C_SCHEME_FALSE; -goto fini; +return C_SCHEME_FALSE; } if(n >= STRING_BUFFER_SIZE - 1) goto fail; - C_memcpy(sptr = buffer, C_c_string(str), n > (STRING_BUFFER_SIZE - 1) ? STRING_BUFFER_SIZE : n); + C_memcpy(sptr = buffer, C_c_string(str), n); buffer[ n ] = '\0'; while(*sptr == '#') { @@ -7295,7 +7294,7 @@ /* Scan for embedded special characters and do basic sanity checking: */ for(eptr = sptr, rptr = sptr; *eptr != '\0'; ++eptr) { -switch(C_tolower((int)*eptr)) { +switch(*eptr) { case '.': if(periodf || ratf || expf) goto fail; @@ -7318,11 +7317,11 @@ ratf = 1; rptr = eptr+1; break; -case 'e': -case 'd': -case 'f': -case 'l': -case 's': +case 'e': case 'E': +case 'd': case 'D': +case 'f': case 'F': +case 'l': case 'L': +case 's': case 'S': /* Don't set exp flag if we see the "f" in "inf.0" (preceded by 'n') */ /* Other failure modes are handled elsewhere. */ if(radix == 10 && eptr > sptr && C_tolower((int)*(eptr-1)) != 'n') { @@ -7343,15 +7342,13 @@ /* check for rational representation: */ if(rptr != sptr) { if (*(rptr) == '-' || *(rptr) == '+') { - n = C_SCHEME_FALSE; - goto fini; + return C_SCHEME_FALSE; } *(rptr-1) = '\0'; -switch(convert_string_to_number(sptr, radix, &n1, &fn1)) { +switch(convert_string_to_number(sptr, n - (sptr - buffer), radix, &n1, &fn1)) { case 0: - n = C_SCHEME_FALSE; - goto fini; + return C_SCHEME_FALSE; case 1: fn1 = (double)n1; @@ -7362,9 +7359,9 @@ sptr = rptr; } - + /* convert number and return result: */ - switch(convert_string_to_number(sptr, radix, &n, &fn)) { + switch(convert_string_to_number(sptr, n - (sptr - buffer), radix, &n, &fn)) { case 0: /* failed */ n = C_SCHEME_FALSE; break; @@ -7435,13 +7432,12 @@ } -C_regparm C_word C_fcall convert_string_to_number(C_char *str, int radix, C_word *fix, double *flo) +C_regparm C_word C_fcall convert_string_to_number(C_char *str, int len, int radix, C_word *fix, double *flo) { unsigned long ln; C_word n; C_char *eptr, *eptr2; double fn; - int len = C_strlen(str); if(radix == 10) { if (len >= 4 && len <= 6) { /* DEPRECATED, TODO: Change to (len == 4) */ @@ -7471,11 +7467,11 @@ #endif } - if(C_strpbrk(str, "xX\0") != NULL) return 0; + if(C_strpbrk(str, "xX") != NULL) return 0; errno = 0; n = C_strtol(str, &eptr, radix); - + if(((n