https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109326

            Bug ID: 109326
           Summary: Bad assembler code generation for valid C on x86-64
           Product: gcc
           Version: og10 (devel/omp/gcc-10)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: susurrus.of.qualia at gmail dot com
  Target Milestone: ---

Created attachment 54782
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54782&action=edit
compiler output

I have a bit of code here that compiles without warnings and produces what
appear to be gross errors in the assembler output for some functions.

Pertinent info:

$ gcc10.4 -v
Using built-in specs.
COLLECT_GCC=gcc10.4
COLLECT_LTO_WRAPPER=/home/stevet/libexec/gcc/x86_64-pc-linux-gnu/10.4.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-10.4.0/configure --prefix=/home/stevet --program-suffix=10.4 --enable-shared --enable-linker-build-id --without-included-gettext --enable-threads=posix --enable-nls --enable-bootstrap --enable-clocale=gnu --with-tune=generic --enable-languages=c --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.4.0 (GCC)

$ uname -a
Linux mx 5.18.0-4mx-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1~mx21+1 (2022-08-22) x86_64 GNU/Linux

Unit compilation command:

gcc10.4 -c -D_POSIX_C_SOURCE=200112L -DOLOCK_192 -DARCH_64 -DLINUX -I./ -I./ -pthread -m64 -std=c99 -Wall -Wextra -Wno-implicit-fallthrough -Werror -falign-functions=16 -falign-loops=1 -falign-jumps=1 -fno-inline-small-functions -fdiagnostics-color=never -fverbose-asm --save-temps -O3 -ggdb -o olock.o olock.c

It should be noted that the bad code generation seems lessened, but not
eliminated, at -O2. Similarly, the problems were slightly different between
gcc-10.2.1 and the most recent 10.x release.

The first thing to note is the assembler generated for the relatively simple
olock_reset_op() function. Near as I can tell, the asm bears exactly zero
relation to the C code of that function.
The mystery constant $0xa06 seems notable and also appears in
init_olock_op_element_struct().

init_olock_op_struct() begins with an access to %fs:0x0, which is then
clobbered by an add $0x0,%rax shortly thereafter. Perhaps this is normal.

olock_fsm_event() doesn't look good either. There are three callq *%reg
instances where there should be at most one.

I'm not sure about olock_op_allocator().

olock_opcode_acqs() looks suspicious, but I'm not that well versed in x86, so
I could be wrong.

If I knew that the dynamic linker would fix up the %fs:0x0 references to
something normal, I'd have more confidence about the rest of the code, but it
looks like about half the functions aren't correct at this point.

I have not tested any of this code yet; it is still subject to revision while
I clean it up. With this type of algorithm it is unfortunately necessary to
have mostly correct code before even thinking about testing it. This version
is close to that point.

Since I see that I can only attach one file, I'll include the assembler output
for the troublesome olock_reset_op() function here for reference.

0000000000000290 <olock_reset_op>:
 290:  0f b7 57 10           movzwl 0x10(%rdi),%edx
 294:  66 85 d2              test   %dx,%dx
 297:  0f 84 f4 04 00 00     je     791 <olock_reset_op+0x501>
 29d:  8d 42 ff              lea    -0x1(%rdx),%eax
 2a0:  66 83 f8 0e           cmp    $0xe,%ax
 2a4:  0f 86 e8 04 00 00     jbe    792 <olock_reset_op+0x502>
 2aa:  89 d1                 mov    %edx,%ecx
 2ac:  48 8d 47 2c           lea    0x2c(%rdi),%rax
 2b0:  66 c1 e9 04           shr    $0x4,%cx
 2b4:  83 e9 01              sub    $0x1,%ecx
 2b7:  0f b7 c9              movzwl %cx,%ecx
 2ba:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 2be:  48 c1 e1 07           shl    $0x7,%rcx
 2c2:  48 8d 8c 0f ac 01 00  lea    0x1ac(%rdi,%rcx,1),%rcx
 2c9:  00
 2ca:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 2d0:  c7 40 f4 00 00 00 00  movl   $0x0,-0xc(%rax)
 2d7:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 2dd:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 2e3:  c7 40 0c 00 00 00 00  movl   $0x0,0xc(%rax)
 2ea:  be 06 0a 00 00        mov    $0xa06,%esi
 2ef:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 2f5:  48 05 80 01 00 00     add    $0x180,%rax
 2fb:  c7 80 a4 fe ff ff 00  movl   $0x0,-0x15c(%rax)
 302:  00 00 00
 305:  c7 80 bc fe ff ff 00  movl   $0x0,-0x144(%rax)
 30c:  00 00 00
 30f:  c7 80 d4 fe ff ff 00  movl   $0x0,-0x12c(%rax)
 316:  00 00 00
 319:  c7 80 ec fe ff ff 00  movl   $0x0,-0x114(%rax)
 320:  00 00 00
 323:  c7 80 04 ff ff ff 00  movl   $0x0,-0xfc(%rax)
 32a:  00 00 00
 32d:  c7 80 1c ff ff ff 00  movl   $0x0,-0xe4(%rax)
 334:  00 00 00
 337:  c7 80 34 ff ff ff 00  movl   $0x0,-0xcc(%rax)
 33e:  00 00 00
 341:  c7 80 4c ff ff ff 00  movl   $0x0,-0xb4(%rax)
 348:  00 00 00
 34b:  c7 80 64 ff ff ff 00  movl   $0x0,-0x9c(%rax)
 352:  00 00 00
 355:  c7 80 7c ff ff ff 00  movl   $0x0,-0x84(%rax)
 35c:  00 00 00
 35f:  c7 40 94 00 00 00 00  movl   $0x0,-0x6c(%rax)
 366:  c7 40 ac 00 00 00 00  movl   $0x0,-0x54(%rax)
 36d:  c7 40 c4 00 00 00 00  movl   $0x0,-0x3c(%rax)
 374:  c7 40 dc 00 00 00 00  movl   $0x0,-0x24(%rax)
 37b:  c6 80 7c fe ff ff 00  movb   $0x0,-0x184(%rax)
 382:  c6 80 94 fe ff ff 00  movb   $0x0,-0x16c(%rax)
 389:  c6 80 ac fe ff ff 00  movb   $0x0,-0x154(%rax)
 390:  c6 80 c4 fe ff ff 00  movb   $0x0,-0x13c(%rax)
 397:  c6 80 dc fe ff ff 00  movb   $0x0,-0x124(%rax)
 39e:  c6 80 f4 fe ff ff 00  movb   $0x0,-0x10c(%rax)
 3a5:  c6 80 0c ff ff ff 00  movb   $0x0,-0xf4(%rax)
 3ac:  c6 80 24 ff ff ff 00  movb   $0x0,-0xdc(%rax)
 3b3:  c6 80 3c ff ff ff 00  movb   $0x0,-0xc4(%rax)
 3ba:  c6 80 54 ff ff ff 00  movb   $0x0,-0xac(%rax)
 3c1:  c6 80 6c ff ff ff 00  movb   $0x0,-0x94(%rax)
 3c8:  c6 40 84 00           movb   $0x0,-0x7c(%rax)
 3cc:  c6 40 9c 00           movb   $0x0,-0x64(%rax)
 3d0:  c6 40 b4 00           movb   $0x0,-0x4c(%rax)
 3d4:  c6 40 cc 00           movb   $0x0,-0x34(%rax)
 3d8:  c6 40 e4 00           movb   $0x0,-0x1c(%rax)
 3dc:  66 44 89 88 80 fe ff  mov    %r9w,-0x180(%rax)
 3e3:  ff
 3e4:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 3ea:  66 44 89 90 98 fe ff  mov    %r10w,-0x168(%rax)
 3f1:  ff
 3f2:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 3f8:  66 44 89 98 b0 fe ff  mov    %r11w,-0x150(%rax)
 3ff:  ff
 400:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 406:  66 89 b0 c8 fe ff ff  mov    %si,-0x138(%rax)
 40d:  be 06 0a 00 00        mov    $0xa06,%esi
 412:  66 44 89 80 e0 fe ff  mov    %r8w,-0x120(%rax)
 419:  ff
 41a:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 420:  66 44 89 88 f8 fe ff  mov    %r9w,-0x108(%rax)
 427:  ff
 428:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 42e:  66 44 89 90 10 ff ff  mov    %r10w,-0xf0(%rax)
 435:  ff
 436:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 43c:  66 44 89 98 28 ff ff  mov    %r11w,-0xd8(%rax)
 443:  ff
 444:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 44a:  66 89 b0 40 ff ff ff  mov    %si,-0xc0(%rax)
 451:  be 06 0a 00 00        mov    $0xa06,%esi
 456:  66 44 89 80 58 ff ff  mov    %r8w,-0xa8(%rax)
 45d:  ff
 45e:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 464:  66 44 89 88 70 ff ff  mov    %r9w,-0x90(%rax)
 46b:  ff
 46c:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 472:  66 44 89 50 88        mov    %r10w,-0x78(%rax)
 477:  66 44 89 58 a0        mov    %r11w,-0x60(%rax)
 47c:  66 89 70 b8           mov    %si,-0x48(%rax)
 480:  66 44 89 40 d0        mov    %r8w,-0x30(%rax)
 485:  66 44 89 48 e8        mov    %r9w,-0x18(%rax)
 48a:  48 39 c8              cmp    %rcx,%rax
 48d:  0f 85 37 fe ff ff     jne    2ca <olock_reset_op+0x3a>
 493:  89 d0                 mov    %edx,%eax
 495:  83 e0 f0              and    $0xfffffff0,%eax
 498:  f6 c2 0f              test   $0xf,%dl
 49b:  0f 84 f8 02 00 00     je     799 <olock_reset_op+0x509>
 4a1:  0f b7 f0              movzwl %ax,%esi
 4a4:  8d 48 01              lea    0x1(%rax),%ecx
 4a7:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 4ab:  48 c1 e6 03           shl    $0x3,%rsi
 4af:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 4b3:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 4ba:  00
 4bb:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 4c0:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 4c6:  66 44 89 44 37 2c     mov    %r8w,0x2c(%rdi,%rsi,1)
 4cc:  66 39 d1              cmp    %dx,%cx
 4cf:  0f 83 bc 02 00 00     jae    791 <olock_reset_op+0x501>
 4d5:  0f b7 c9              movzwl %cx,%ecx
 4d8:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 4de:  8d 70 02              lea    0x2(%rax),%esi
 4e1:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 4e5:  48 c1 e1 03           shl    $0x3,%rcx
 4e9:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 4ed:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 4f4:  00
 4f5:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 4fa:  66 44 89 5c 0f 2c     mov    %r11w,0x2c(%rdi,%rcx,1)
 500:  66 39 d6              cmp    %dx,%si
 503:  0f 83 88 02 00 00     jae    791 <olock_reset_op+0x501>
 509:  0f b7 f6              movzwl %si,%esi
 50c:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 512:  8d 48 03              lea    0x3(%rax),%ecx
 515:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 519:  48 c1 e6 03           shl    $0x3,%rsi
 51d:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 521:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 528:  00
 529:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 52e:  66 44 89 54 37 2c     mov    %r10w,0x2c(%rdi,%rsi,1)
 534:  66 39 ca              cmp    %cx,%dx
 537:  0f 86 54 02 00 00     jbe    791 <olock_reset_op+0x501>
 53d:  0f b7 c9              movzwl %cx,%ecx
 540:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 546:  8d 70 04              lea    0x4(%rax),%esi
 549:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 54d:  48 c1 e1 03           shl    $0x3,%rcx
 551:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 555:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 55c:  00
 55d:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 562:  66 44 89 4c 0f 2c     mov    %r9w,0x2c(%rdi,%rcx,1)
 568:  66 39 f2              cmp    %si,%dx
 56b:  0f 86 20 02 00 00     jbe    791 <olock_reset_op+0x501>
 571:  0f b7 f6              movzwl %si,%esi
 574:  8d 48 05              lea    0x5(%rax),%ecx
 577:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 57b:  48 c1 e6 03           shl    $0x3,%rsi
 57f:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 583:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 58a:  00
 58b:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 590:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 596:  66 44 89 44 37 2c     mov    %r8w,0x2c(%rdi,%rsi,1)
 59c:  66 39 ca              cmp    %cx,%dx
 59f:  0f 86 ec 01 00 00     jbe    791 <olock_reset_op+0x501>
 5a5:  0f b7 c9              movzwl %cx,%ecx
 5a8:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 5ae:  8d 70 06              lea    0x6(%rax),%esi
 5b1:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 5b5:  48 c1 e1 03           shl    $0x3,%rcx
 5b9:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 5bd:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 5c4:  00
 5c5:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 5ca:  66 44 89 5c 0f 2c     mov    %r11w,0x2c(%rdi,%rcx,1)
 5d0:  66 39 f2              cmp    %si,%dx
 5d3:  0f 86 b8 01 00 00     jbe    791 <olock_reset_op+0x501>
 5d9:  0f b7 f6              movzwl %si,%esi
 5dc:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 5e2:  8d 48 07              lea    0x7(%rax),%ecx
 5e5:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 5e9:  48 c1 e6 03           shl    $0x3,%rsi
 5ed:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 5f1:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 5f8:  00
 5f9:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 5fe:  66 44 89 54 37 2c     mov    %r10w,0x2c(%rdi,%rsi,1)
 604:  66 39 ca              cmp    %cx,%dx
 607:  0f 86 84 01 00 00     jbe    791 <olock_reset_op+0x501>
 60d:  0f b7 c9              movzwl %cx,%ecx
 610:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 616:  8d 70 08              lea    0x8(%rax),%esi
 619:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 61d:  48 c1 e1 03           shl    $0x3,%rcx
 621:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 625:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 62c:  00
 62d:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 632:  66 44 89 4c 0f 2c     mov    %r9w,0x2c(%rdi,%rcx,1)
 638:  66 39 f2              cmp    %si,%dx
 63b:  0f 86 50 01 00 00     jbe    791 <olock_reset_op+0x501>
 641:  0f b7 f6              movzwl %si,%esi
 644:  8d 48 09              lea    0x9(%rax),%ecx
 647:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 64b:  48 c1 e6 03           shl    $0x3,%rsi
 64f:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 653:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 65a:  00
 65b:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 660:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 666:  66 44 89 44 37 2c     mov    %r8w,0x2c(%rdi,%rsi,1)
 66c:  66 39 ca              cmp    %cx,%dx
 66f:  0f 86 1c 01 00 00     jbe    791 <olock_reset_op+0x501>
 675:  0f b7 c9              movzwl %cx,%ecx
 678:  41 bb 06 0a 00 00     mov    $0xa06,%r11d
 67e:  8d 70 0a              lea    0xa(%rax),%esi
 681:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 685:  48 c1 e1 03           shl    $0x3,%rcx
 689:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 68d:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 694:  00
 695:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 69a:  66 44 89 5c 0f 2c     mov    %r11w,0x2c(%rdi,%rcx,1)
 6a0:  66 39 f2              cmp    %si,%dx
 6a3:  0f 86 e8 00 00 00     jbe    791 <olock_reset_op+0x501>
 6a9:  0f b7 f6              movzwl %si,%esi
 6ac:  41 ba 06 0a 00 00     mov    $0xa06,%r10d
 6b2:  8d 48 0b              lea    0xb(%rax),%ecx
 6b5:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 6b9:  48 c1 e6 03           shl    $0x3,%rsi
 6bd:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 6c1:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 6c8:  00
 6c9:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 6ce:  66 44 89 54 37 2c     mov    %r10w,0x2c(%rdi,%rsi,1)
 6d4:  66 39 ca              cmp    %cx,%dx
 6d7:  0f 86 b4 00 00 00     jbe    791 <olock_reset_op+0x501>
 6dd:  0f b7 c9              movzwl %cx,%ecx
 6e0:  41 b9 06 0a 00 00     mov    $0xa06,%r9d
 6e6:  8d 70 0c              lea    0xc(%rax),%esi
 6e9:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 6ed:  48 c1 e1 03           shl    $0x3,%rcx
 6f1:  4c 8d 04 0f           lea    (%rdi,%rcx,1),%r8
 6f5:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 6fc:  00
 6fd:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 702:  66 44 89 4c 0f 2c     mov    %r9w,0x2c(%rdi,%rcx,1)
 708:  66 39 f2              cmp    %si,%dx
 70b:  0f 86 80 00 00 00     jbe    791 <olock_reset_op+0x501>
 711:  0f b7 f6              movzwl %si,%esi
 714:  8d 48 0d              lea    0xd(%rax),%ecx
 717:  48 8d 34 76           lea    (%rsi,%rsi,2),%rsi
 71b:  48 c1 e6 03           shl    $0x3,%rsi
 71f:  4c 8d 04 37           lea    (%rdi,%rsi,1),%r8
 723:  41 c7 40 20 00 00 00  movl   $0x0,0x20(%r8)
 72a:  00
 72b:  41 c6 40 28 00        movb   $0x0,0x28(%r8)
 730:  41 b8 06 0a 00 00     mov    $0xa06,%r8d
 736:  66 44 89 44 37 2c     mov    %r8w,0x2c(%rdi,%rsi,1)
 73c:  66 39 ca              cmp    %cx,%dx
 73f:  76 50                 jbe    791 <olock_reset_op+0x501>
 741:  0f b7 c9              movzwl %cx,%ecx
 744:  83 c0 0e              add    $0xe,%eax
 747:  48 8d 0c 49           lea    (%rcx,%rcx,2),%rcx
 74b:  48 c1 e1 03           shl    $0x3,%rcx
 74f:  48 8d 34 0f           lea    (%rdi,%rcx,1),%rsi
 753:  c7 46 20 00 00 00 00  movl   $0x0,0x20(%rsi)
 75a:  c6 46 28 00           movb   $0x0,0x28(%rsi)
 75e:  be 06 0a 00 00        mov    $0xa06,%esi
 763:  66 89 74 0f 2c        mov    %si,0x2c(%rdi,%rcx,1)
 768:  66 39 c2              cmp    %ax,%dx
 76b:  76 24                 jbe    791 <olock_reset_op+0x501>
 76d:  0f b7 c0              movzwl %ax,%eax
 770:  48 8d 04 40           lea    (%rax,%rax,2),%rax
 774:  48 c1 e0 03           shl    $0x3,%rax
 778:  48 8d 14 07           lea    (%rdi,%rax,1),%rdx
 77c:  c7 42 20 00 00 00 00  movl   $0x0,0x20(%rdx)
 783:  c6 42 28 00           movb   $0x0,0x28(%rdx)
 787:  ba 06 0a 00 00        mov    $0xa06,%edx
 78c:  66 89 54 07 2c        mov    %dx,0x2c(%rdi,%rax,1)
 791:  c3                    retq
 792:  31 c0                 xor    %eax,%eax
 794:  e9 08 fd ff ff        jmpq   4a1 <olock_reset_op+0x211>
 799:  c3                    retq
 79a:  66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)

There seems to be some structure in the above, but in comparison to the
source it doesn't seem the slightest bit relevant.
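
For anyone comparing the listing with source, here is one possible reading of
it, offered as a sketch and not as the actual olock.c code (which is not
included in this report). The listing loads a 16-bit count from 0x10(%rdi),
scales each index by 24 (lea (%r,%r,2) followed by shl $0x3), and for each
element clears a 32-bit field at 0x20, clears a byte at 0x28 and stores the
16-bit value 0xa06 at 0x2c. The main loop advances %rax by 0x180, which is
16 elements of 24 bytes, and the long tail handles the remaining 0 to 15
elements one at a time. That shape would be consistent with a simple
per-element loop that has been unrolled, and with two adjacent byte stores
of 6 and 10 merged into one 16-bit store. All type, field and function names
below are hypothetical and were chosen only to reproduce those offsets on a
typical 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ELEMS 64            /* hypothetical capacity, not from olock.c */

/* Hypothetical 24-byte element; only the touched fields are inferred
 * from the listing, everything else is filler. */
struct op_elem {
    uint64_t untouched;         /* element offset 0: never written by the loop */
    uint32_t counter;           /* element offset 8: movl $0x0 */
    uint32_t untouched2;        /* element offset 12 */
    uint8_t  flag;              /* element offset 16: movb $0x0 */
    uint8_t  pad[3];
    uint8_t  lo;                /* element offset 20: low byte of 0x0a06 */
    uint8_t  hi;                /* element offset 21: high byte of 0x0a06 */
    uint8_t  pad2[2];
};

/* Hypothetical container: count at 0x10, array at 0x18 on a 64-bit target. */
struct op {
    uint8_t        header[16];
    uint16_t       n_elems;
    struct op_elem elems[MAX_ELEMS];
};

/* Plain per-element reset loop; an optimizer is free to unroll it and to
 * combine the two byte stores into a single 16-bit store. */
static void reset_op_sketch(struct op *op)
{
    assert(op->n_elems <= MAX_ELEMS);
    for (uint16_t i = 0; i < op->n_elems; i++) {
        op->elems[i].counter = 0;
        op->elems[i].flag = 0;
        op->elems[i].lo = 6;
        op->elems[i].hi = 10;
    }
}

/* View the two adjacent bytes as the little-endian 16-bit value that a
 * merged store would write. */
static uint16_t pair_as_u16(const struct op_elem *e)
{
    return (uint16_t)(e->lo | (e->hi << 8));
}
```

With this layout, counter, flag and the lo/hi pair of element 0 sit at 0x20,
0x28 and 0x2c from the start of struct op, matching the displacements in the
scalar tail of the listing, and pair_as_u16() yields 0xa06 after a reset.
Whether the real olock_reset_op() has this form can only be confirmed against
the source.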