Pavel Moravec created QPID-6157:
-----------------------------------

             Summary: linearstore: segfault when 2 journals request new journal file from empty EFP
                 Key: QPID-6157
                 URL: https://issues.apache.org/jira/browse/QPID-6157
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker
    Affects Versions: 0.30
            Reporter: Pavel Moravec
            Assignee: Pavel Moravec

Description of problem:
A broker using the linearstore module can segfault when:
- the EFP (empty file pool) is empty
- 2 journals concurrently request a new journal file from the EFP

There is a race condition, described under "Additional info", that leads to the segfault.

Version-Release number of selected component (if applicable):
any

How reproducible:
100% within a few minutes (on faster machines)

Steps to Reproduce:
Reproducer script:

topics=10
queues_per_topic=10

rm -rf /var/lib/qpidd/* /tmp/qpidd.log
service qpidd restart

echo "$(date): creating $((topics * queues_per_topic)) queues"
for i in $(seq 1 $topics); do
  for j in $(seq 1 $queues_per_topic); do
    qpid-receive -a "Durable_${i}_${j}; {create:always, node:{durable:true, x-bindings:[{exchange:'amq.direct', queue:'Durable_${i}_${j}', key:'${i}'}] }}" &
  done
done
wait
echo "$(date): queues created"

while true; do
  echo "$(date): publishing messages.."
  for i in $(seq 1 $topics); do
    qpid-send -a "amq.direct/${i}" -m 1000000 --durable=yes --content-size=1000 &
  done
  wait
  echo "$(date): consuming messages.."
  for i in $(seq 1 $topics); do
    for j in $(seq 1 $queues_per_topic); do
      qpid-receive -a "Durable_${i}_${j}" -m 1000000 --print-content=no &
    done
  done
  wait
done
#end of the script

Actual results:
segfault with bt:

Thread 1 (Thread 0x7ff85b3f1700 (LWP 17810)):
#0  0x00007ff9927104f3 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
No symbol table info available.
#1  0x00007ff98e59d6a1 in operator= (this=0x1ab3480) at /usr/include/c++/4.4.7/bits/basic_string.h:511
No locals.
#2  qpid::linearstore::journal::EmptyFilePool::popEmptyFile (this=0x1ab3480) at /usr/src/debug/qpid-0.22/cpp/src/qpid/linearstore/journal/EmptyFilePool.cpp:213
        l = {_sm = @0x1ab34f8}
        emptyFileName = ""
        isEmpty = true
#3  0x00007ff98e59ddec in qpid::linearstore::journal::EmptyFilePool::takeEmptyFile (this=0x1ab3480, destDirectory="/var/lib/qpidd/qls/jrnl/DurableQueue") at /usr/src/debug/qpid-0.22/cpp/src/qpid/linearstore/journal/EmptyFilePool.cpp:108
        emptyFileName = ""
        newFileName = ""

Expected results:
no segfault

Additional info:
Relevant source code:

std::string EmptyFilePool::popEmptyFile() {
    std::string emptyFileName;
    bool isEmpty = false;
    {
        slock l(emptyFileListMutex_);
        isEmpty = emptyFileList_.empty();
    }
    if (isEmpty) {
        createEmptyFile();
    }
    {
        slock l(emptyFileListMutex_);
        emptyFileName = emptyFileList_.front();   // <-- line 213
        emptyFileList_.pop_front();
    }
    return emptyFileName;
}

The emptiness check and the pop happen under two separate acquisitions of emptyFileListMutex_, so the list can change between them. If two requests (R1 and R2) are made concurrently while the EFP is empty, the following interleaving is possible:
- R1 executes the function up to line 212 (the second lock)
  - this means it creates one empty file
- R2 does the same
  - but the EFP now holds one file, so R2 creates no new file
- R1 (or R2, it does not matter which) continues from line 212 onwards
  - so it takes the empty file
- the other request then tries to take an empty file from the now-empty EFP, calls front() on an empty list, and triggers the segfault
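
The race described above could be closed by checking for emptiness and popping under the same lock, and retrying when the list turns out to be empty. The following is a minimal, self-contained sketch of that idea and not necessarily the fix applied upstream. SketchFilePool, filesCreated() and popConcurrently() are hypothetical names invented for illustration, std::mutex stands in for the broker's slock/mutex types, and createEmptyFile() only records a name in memory instead of creating a real journal file.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for EmptyFilePool; it only tracks file names in memory.
class SketchFilePool {
public:
    // The emptiness check and the pop share one lock acquisition, so no other
    // thread can drain the list between the two steps.
    std::string popEmptyFile() {
        for (;;) {
            {
                std::lock_guard<std::mutex> l(emptyFileListMutex_);
                if (!emptyFileList_.empty()) {
                    std::string emptyFileName = emptyFileList_.front();
                    emptyFileList_.pop_front();
                    return emptyFileName;
                }
            }
            // The list was empty: create a file outside the pop lock, then retry.
            createEmptyFile();
        }
    }

    // Number of files this pool has created so far.
    std::size_t filesCreated() const {
        std::lock_guard<std::mutex> l(emptyFileListMutex_);
        return filesCreated_;
    }

private:
    // Assumption: the real method creates a file on disk; here we only add a name.
    void createEmptyFile() {
        std::lock_guard<std::mutex> l(emptyFileListMutex_);
        emptyFileList_.push_back("efp_file_" + std::to_string(filesCreated_++));
    }

    mutable std::mutex emptyFileListMutex_;
    std::deque<std::string> emptyFileList_;
    std::size_t filesCreated_ = 0;
};

// Starts nThreads concurrent pops against the pool and returns the names taken.
std::vector<std::string> popConcurrently(SketchFilePool& pool, std::size_t nThreads) {
    std::vector<std::string> names(nThreads);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < nThreads; ++i) {
        threads.emplace_back([&pool, &names, i] { names[i] = pool.popEmptyFile(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    for (const auto& n : names) {
        assert(!n.empty());  // every request must have received a file
    }
    return names;
}
```

With this shape, a request whose freshly created file is taken by another request simply creates one more file and tries again, so front() is never called on an empty list. The trade-off is that the pool may occasionally create more files than were strictly needed.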