[ https://issues.apache.org/jira/browse/PROTON-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955707#comment-16955707 ]
Roddie Kieley commented on PROTON-2122:
---------------------------------------

[~jdanek] I hadn't seen this comment, but I realized the same thing here in testing. I also changed the [magic 16|https://github.com/apache/qpid-proton/blob/0.29.0/c/src/sasl/sasl.c#L475] to a magic 32 here, because on this version of macOS, 10.14, it was listing 24 values, i.e. 12 mechanisms each listed twice:

{quote}
SRP SRP GS2-IAKERB GS2-KRB5 SCRAM-SHA-1 SCRAM-SHA-256 SCRAM-SHA-256 SCRAM-SHA-1 GS2-KRB5 GS2-IAKERB GSS-SPNEGO GSSAPI GSS-SPNEGO GSSAPI DIGEST-MD5 DIGEST-MD5 OTP OTP NTLM CRAM-MD5 NTLM CRAM-MD5 ANONYMOUS ANONYMOUS
{quote}

With the magic 32, the SASL-related tests pass and close the gap in the earlier reported test success rate:

{quote}
62/62 Test #62: system_tests_router_mesh .......................... Passed 37.57 sec

82% tests passed, 11 tests failed out of 62

Total Test time (real) = 945.43 sec

The following tests FAILED:
26 - system_tests_protocol_settings (Failed)
27 - system_tests_qdmanage (Failed)
29 - system_tests_sasl_plain (Failed)
37 - system_tests_log_message_components (Failed)
39 - system_tests_auth_service_plugin (Failed)
40 - system_tests_authz_service_plugin (Failed)
43 - system_tests_topology_disposition (Failed)
50 - system_tests_ssl (Failed)
54 - system_tests_http (Failed)
57 - system_tests_core_client (Failed)
60 - system_tests_multicast (Failed)
Errors while running CTest
(p3venv) i7mbp:0 rkieley$
{quote}

As to 1), why there are duplicates, I'm not sure. I don't recall it being this way when I did the original OSX work via PROTON-522, although I'd have to look at my logs to be sure. At the moment I'm on 10.14 with cyrus-sasl 2.1.27, versus 10.11 with cyrus-sasl 2.1.26 back then, so either of those could make a difference. I would need to test to find out. In neither case was there a mixture of system sasl and MacPorts sasl, as that had caused other subtle issues that were resolved by ensuring only one or the other was used. Maybe [~astitcher] knows offhand.
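
The "24 values for 12 mechanisms" observation can be checked with a small standalone sketch. This is only an illustration: {{count_mechs}} and {{count_unique_mechs}} are hypothetical helpers written for this comment, not functions from Proton or cyrus-sasl, and they assume the list is a plain space-separated string like the one quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count space-separated tokens in a mechanism list. */
static size_t count_mechs(const char *list)
{
    assert(list != NULL);
    size_t n = 0;
    const char *p = list;
    while (*p) {
        p += strspn(p, " ");            /* skip separators */
        size_t len = strcspn(p, " ");   /* length of the next token */
        if (len == 0) break;            /* only trailing spaces were left */
        n++;
        p += len;
    }
    return n;
}

/* Hypothetical helper: count distinct tokens. Quadratic, which is fine
 * for a list of a few dozen mechanism names. */
static size_t count_unique_mechs(const char *list)
{
    assert(list != NULL);
    size_t unique = 0;
    const char *p = list;
    while (*p) {
        p += strspn(p, " ");
        size_t len = strcspn(p, " ");
        if (len == 0) break;

        /* Scan the tokens before p for an identical name. */
        int seen = 0;
        const char *q = list;
        while (q < p) {
            q += strspn(q, " ");
            size_t qlen = strcspn(q, " ");
            if (q < p && qlen == len && memcmp(q, p, len) == 0) {
                seen = 1;
                break;
            }
            q += qlen;
        }
        if (!seen) unique++;
        p += len;
    }
    return unique;
}
```

Fed the list quoted above, the first helper counts every entry and the second counts each mechanism name once, so comparing the two results shows how much of the list is duplication.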

For 2), I always got the message 'Abort Trap: 6' here, and stepping through with the debugger would always yield a SIGABRT. Abort trap 6 generally indicates that memory has been corrupted, and if you step through with the debugger the failure occurs [here|https://github.com/apache/qpid-proton/blob/0.29.0/c/src/sasl/sasl.c#L484], on the call to

{code}
free(mechlist)
{code}

Ultimately the solution is to drop the magic number and handle the memory in a more sophisticated way. However, I think the can could be kicked a little further down the road by going with the magic 32 for the moment and updating the TODO to say something like

{quote}
// TODO: PROTON-2122: Replace magic number with dynamically sized memory
{quote}

which would at least eliminate these failures for now and hopefully allow the remaining test failures to be cleaned up to 100%.

> Hardcoded limit of 16 sasl mechanisms is insufficient on macOS
> --------------------------------------------------------------
>
> Key: PROTON-2122
> URL: https://issues.apache.org/jira/browse/PROTON-2122
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-c
> Affects Versions: proton-c-0.29.0
> Environment: macOS 10.14
> Reporter: Jiri Daněk
> Priority: Major
>
> On macOS, any time a SASL exchange happens in tests, e.g. {{cpp-example-container}} or qpid dispatch tests, or when {{cpp/examples/simple_send}} connects to {{cpp/examples/broker}}, the broker crashes with the following stack trace
> {noformat}
> Thread 1 Crashed:
> 0   libsystem_kernel.dylib          0x00007fff697c62c6 __pthread_kill + 10
> 1   libsystem_pthread.dylib         0x00007fff69881bf1 pthread_kill + 284
> 2   libsystem_c.dylib               0x00007fff69730745 __abort + 144
> 3   libsystem_c.dylib               0x00007fff69730ff3 __stack_chk_fail + 205
> 4   libqpid-proton-core.10.dylib    0x0000000106d81f89 pni_post_sasl_frame + 1321
> 5   ???
0x00007fd57acb59bb 0 + 140554864908731
> {noformat}
> AddressSanitizer (Valgrind does not run on most recent macOS releases) points out the reason.
> {noformat}
> $ ../cmake-build-debug/cpp/examples/broker
> broker listening on 5672
> =================================================================
> ==42793==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x70000c4bcbe0 at pc 0x0001013663e1 bp 0x70000c4bc830 sp 0x70000c4bc828
> WRITE of size 8 at 0x70000c4bcbe0 thread T3
>     #0 0x1013663e0 in pni_split_mechs sasl.c:443
>     #1 0x1013646ea in pni_post_sasl_frame sasl.c:480
>     #2 0x101357fad in pn_output_write_sasl sasl.c:677
>     #3 0x101323909 in transport_produce transport.c:2751
>     #4 0x10131ffd3 in pn_transport_pending transport.c:3030
>     #5 0x1012b8755 in pn_connection_driver_write_buffer connection_driver.c:120
>     #6 0x10120240f in leader_process_pconnection libuv.c:909
>     #7 0x1011f8b48 in leader_lead_lh libuv.c:1008
>     #8 0x1011f94f3 in pn_proactor_wait libuv.c:1062
>     #9 0x10188c55d in proton::container::impl::thread() proactor_container_impl.cpp:753
>     #10 0x1018bca31 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (proton::container::impl::*)(), proton::container::impl*> >(void*) thread:352
>     #11 0x7fff6987f2ea in _pthread_body (libsystem_pthread.dylib:x86_64+0x32ea)
>     #12 0x7fff69882248 in _pthread_start (libsystem_pthread.dylib:x86_64+0x6248)
>     #13 0x7fff6987e40c in thread_start (libsystem_pthread.dylib:x86_64+0x240c)
> Address 0x70000c4bcbe0 is located in stack of thread T3 at offset 192 in frame
>     #0 0x101363ccf in pni_post_sasl_frame sasl.c:462
> This frame has 3 object(s):
>     [32, 48) 'out' (line 464)
>     [64, 192) 'mechs' (line 475) <== Memory access at offset 192 overflows this variable
>     [224, 228) 'count' (line 478)
> HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
>       (longjmp and C++ exceptions *are* supported)
> Thread T3 created by T0 here:
>     #0 0x101f5dadd in wrap_pthread_create (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x56add)
>     #1 0x1018bc4ab in std::__1::thread::thread<void (proton::container::impl::*)(), proton::container::impl*, void>(void (proton::container::impl::*&&)(), proton::container::impl*&&) thread:368
>     #2 0x10188da97 in proton::container::impl::run(int) proactor_container_impl.cpp:802
>     #3 0x100f0223c in main broker.cpp:427
>     #4 0x7fff6968b3d4 in start (libdyld.dylib:x86_64+0x163d4)
> SUMMARY: AddressSanitizer: stack-buffer-overflow sasl.c:443 in pni_split_mechs
> Shadow bytes around the buggy address:
>   0x1e0001897920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e0001897930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e0001897940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e0001897950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e0001897960: 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2 00 00 00 00
> =>0x1e0001897970: 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2 f2
>   0x1e0001897980: 04 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e0001897990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e00018979a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e00018979b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   0x1e00018979c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:     fa
>   Freed heap region:     fd
>   Stack left redzone:    f1
>   Stack mid redzone:     f2
>   Stack right redzone:   f3
>   Stack after return:    f5
>   Stack use after scope: f8
>   Global redzone:        f9
>   Global init order:     f6
>   Poisoned by user:      f7
>   Container overflow:    fc
>   Array cookie:          ac
>   Intra object redzone:  bb
>   ASan internal:         fe
>   Left alloca redzone:   ca
>   Right alloca redzone:  cb
>   Shadow gap:            cc
> ==42793==ABORTING
> Abort trap: 6
> {noformat}
> The problem is that {{mechlist}} is (on my machine)
> {noformat}
> "SRP SRP GS2-IAKERB GS2-KRB5 SCRAM-SHA-1 SCRAM-SHA-256 SCRAM-SHA-256 SCRAM-SHA-1 GS2-KRB5 GS2-IAKERB GSS-SPNEGO GSSAPI GSS-SPNEGO GSSAPI DIGEST-MD5 DIGEST-MD5 OTP OTP NTLM CRAM-MD5 NTLM CRAM-MD5 ANONYMOUS ANONYMOUS"
> {noformat}
> which is over the limit of 16.
> This is not a security issue, because the mechanism list is created on server based on what cyrus-sasl mechs are installed. It is not based on data sent over the network.
> cc [~astitcher], [~rkieley]
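
For reference, the ASan report places {{mechs}} at frame offsets [64, 192), which is 128 bytes, or sixteen 8-byte pointers, so the 8-byte write at offset 192 is the seventeenth entry. One possible way to remove the fixed limit is to count the tokens first and then allocate an array of exactly that size. The sketch below is only an illustration under stated assumptions: {{split_mechs_dynamic}} is a hypothetical helper, not Proton's {{pni_split_mechs}} (whose actual signature is not shown in this issue), and it assumes a writable, space-separated, NUL-terminated list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Proton): split a writable,
 * space-separated mechanism list in place and return a heap-allocated
 * array of pointers into it. The caller frees the returned array; the
 * strings themselves remain owned by mechlist. Returns NULL and sets
 * *count to 0 if allocation fails. */
static char **split_mechs_dynamic(char *mechlist, size_t *count)
{
    assert(mechlist != NULL && count != NULL);

    /* Pass 1: count tokens so the array can be sized exactly. */
    size_t n = 0;
    for (const char *p = mechlist; *p; ) {
        p += strspn(p, " ");
        size_t len = strcspn(p, " ");
        if (len == 0) break;
        n++;
        p += len;
    }

    /* Allocate at least one slot so an empty list still yields a
     * non-NULL array that can be passed to free(). */
    char **mechs = malloc((n ? n : 1) * sizeof *mechs);
    if (!mechs) {
        *count = 0;
        return NULL;
    }

    /* Pass 2: record each token start and terminate it in place. */
    char *p = mechlist;
    for (size_t i = 0; i < n; i++) {
        p += strspn(p, " ");
        mechs[i] = p;
        p += strcspn(p, " ");
        if (*p) *p++ = '\0';
    }

    *count = n;
    return mechs;
}
```

Because the array is sized from the list itself, a 24-entry list needs no special handling. The trade-off against a larger fixed stack array is one extra pass over the string and a heap allocation per call.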