http://d.puremagic.com/issues/show_bug.cgi?id=2964
Summary: Reading string into associative array key garbles string
Product: D
Version: 1.043
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: DMD
AssignedTo: bugzi...@digitalmars.com
ReportedBy: d...@mailinator.com

Created an attachment (id=363)
 --> (http://d.puremagic.com/issues/attachment.cgi?id=363)
.tar.gz file with D1 code illustrating bug and one-line sample input text file

Either I'm doing something dumb, or I've found a bug where a string gets trashed between storing it as a key in an associative array and then getting it back out. The weird thing is that it only happens when the string is read in from a file. Adding the same string as a literal doesn't trigger it.

The attached D1 code simply reads in each line from a BufferedFile, storing it as a key in a uint[string] AA that counts how many times each line occurred. It verifies that the line is valid UTF-8 going in. It then loops over the keys in the AA, verifying that they're valid UTF-8 and printing them out. Only then does the string fail validation and give an error if you try to print it out.

I don't think there's anything special about the particular string that I'm using.

I verified this with three compilers on two operating systems:

    DMD 1.043 on Ubuntu 8.10 x86_64
    gcc version 4.1.3 20070831 (prerelease gdc 0.25, using dmd 1.021) (Ubuntu 0.25-4.1.2-16ubuntu1)
    gdcmac trunk r229 (based on gcc 4.0.1) on Mac OS X 10.5.5 x86_64

Here is some sample output:

    Reading data...
    Matched bad input.
    Read 1 lines, 1 unique (0 non-UTF).
    Checking...
    2nd validate: string \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\80\245\34\158\255\127\0\0\144\180\123\1\0\0\0\0\112\243\34\158\255\127 didn't validate as UTF
    Error: 4invalid UTF-8 sequence

The Unicode string printed out (as decimal chars) varies each time under Linux, perhaps suggesting it's reading some memory it oughtn't?
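
The reading and checking steps described above can be sketched roughly as follows. This is an illustrative reconstruction, not the attached code: the file name `input.txt` is hypothetical, and it assumes the Phobos modules `std.stream`, `std.utf` and `std.stdio` as shipped with D1. It differs from the description in one deliberate way: each line is copied with `.dup` before being used as a key. That rests on the assumption that the `foreach` over the stream hands out a slice of a buffer that may be reused between iterations, so a key stored without a copy could end up pointing at overwritten memory. The report itself does not confirm that this is the cause.

```d
import std.stdio;
import std.stream;
import std.utf;

void main()
{
    // In D1, string is an alias for char[].
    uint[char[]] counts;

    // "input.txt" is a hypothetical name standing in for the sample file.
    auto file = new BufferedFile("input.txt");
    scope (exit) file.close();

    foreach (char[] line; file)
    {
        // First check: throws if the line is not valid UTF-8.
        validate(line);

        // ASSUMPTION: `line` may be a slice of a buffer the stream reuses,
        // so take a private copy before storing it as a key.
        char[] key = line.dup;
        if (key in counts)
            counts[key]++;
        else
            counts[key] = 1;
    }

    foreach (key; counts.keys)
    {
        // Second check: the stage at which the report sees the failure.
        validate(key);
        writefln("%s: %s", key, counts[key]);
    }
}
```

If the garbling disappears with the copy in place and returns when `line` is stored directly, that would point at buffer reuse rather than at the associative array itself.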