Pavel Moravec created QPID-6157:
-----------------------------------

             Summary: linearstore: segfault when 2 journals request new journal file from empty EFP
                 Key: QPID-6157
                 URL: https://issues.apache.org/jira/browse/QPID-6157
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker
    Affects Versions: 0.30
            Reporter: Pavel Moravec
            Assignee: Pavel Moravec


Description of problem:
A broker using the linearstore module can segfault when:
- the EFP (empty file pool) is empty
- 2 journals concurrently request a new journal file from the EFP

The race condition that leads to the segfault is described under Additional info.


Version-Release number of selected component (if applicable):
any


How reproducible:
100%, within a few minutes (on faster machines)


Steps to Reproduce:
Reproducer script:

topics=10
queues_per_topic=10

rm -rf /var/lib/qpidd/* /tmp/qpidd.log
service qpidd restart

echo "$(date): creating $((topics * queues_per_topic)) queues"
for i in $(seq 1 $topics); do
  for j in $(seq 1 $queues_per_topic); do
    qpid-receive -a "Durable_${i}_${j}; {create:always, node:{durable:true, x-bindings:[{exchange:'amq.direct', queue:'Durable_${i}_${j}', key:'${i}'}] }}" &
  done
done
wait

echo "$(date): queues created"
while true; do
  echo "$(date): publishing messages.."
  for i in $(seq 1 $topics); do
    qpid-send -a "amq.direct/${i}" -m 1000000 --durable=yes --content-size=1000 &
  done
  wait
  echo "$(date): consuming messages.."
  for i in $(seq 1 $topics); do
    for j in $(seq 1 $queues_per_topic); do
      qpid-receive -a "Durable_${i}_${j}" -m 1000000 --print-content=no &
    done
  done
  wait
done

#end of the script


Actual results:
Segfault with the following backtrace:

Thread 1 (Thread 0x7ff85b3f1700 (LWP 17810)):
#0  0x00007ff9927104f3 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::assign(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
No symbol table info available.
#1  0x00007ff98e59d6a1 in operator= (this=0x1ab3480) at /usr/include/c++/4.4.7/bits/basic_string.h:511
No locals.
#2  qpid::linearstore::journal::EmptyFilePool::popEmptyFile (this=0x1ab3480)
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/linearstore/journal/EmptyFilePool.cpp:213
        l = {_sm = @0x1ab34f8}
        emptyFileName = ""
        isEmpty = true
#3  0x00007ff98e59ddec in qpid::linearstore::journal::EmptyFilePool::takeEmptyFile (this=0x1ab3480, destDirectory="/var/lib/qpidd/qls/jrnl/DurableQueue")
    at /usr/src/debug/qpid-0.22/cpp/src/qpid/linearstore/journal/EmptyFilePool.cpp:108
        emptyFileName = ""
        newFileName = ""


Expected results:
no segfault


Additional info:
Relevant source code:

std::string EmptyFilePool::popEmptyFile() {
    std::string emptyFileName;
    bool isEmpty = false;
    {
        slock l(emptyFileListMutex_);
        isEmpty = emptyFileList_.empty();
    }
    if (isEmpty) {
        createEmptyFile();
    }
    {
        slock l(emptyFileListMutex_);
        emptyFileName = emptyFileList_.front();    // <-- line 213
        emptyFileList_.pop_front();
    }
    return emptyFileName;
}

If two requests (R1 and R2) are made concurrently while the EFP is empty, the following interleaving is possible:
- R1 runs the function up to line 212 (the second lock)
  - it finds the EFP empty and creates one empty file
- R2 runs the same part, but the EFP now holds one file, so R2 creates no new file
- R1 (or R2, it does not matter which) continues from line 212
  - it takes the only empty file
- the other request then calls front() on the now-empty list, which is undefined behaviour, and triggers the segfault
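
One possible way to close this window, shown here only as a minimal sketch and not as the fix actually applied to Qpid, is to do the emptiness check and the pop under a single lock. Everything below is a hypothetical stand-in: EmptyFilePoolSketch, createEmptyFileLocked() and runConcurrentPops() are illustrative names, std::mutex replaces the broker's own slock, and the sketch assumes file creation can safely run while the list mutex is held.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for EmptyFilePool; not the real Qpid class.
class EmptyFilePoolSketch {
public:
    // Check and pop under ONE lock, so no other thread can drain the list
    // between the emptiness test and the call to front().
    std::string popEmptyFile() {
        std::lock_guard<std::mutex> l(emptyFileListMutex_);
        if (emptyFileList_.empty()) {
            createEmptyFileLocked();
        }
        assert(!emptyFileList_.empty());  // holds because the lock was never released
        std::string emptyFileName = emptyFileList_.front();
        emptyFileList_.pop_front();
        return emptyFileName;
    }

private:
    // Assumption: creating a file while holding the list mutex is acceptable.
    // Here "creating" only appends a made-up name; caller must hold the mutex.
    void createEmptyFileLocked() {
        emptyFileList_.push_back("efp-file-" + std::to_string(nextId_++));
    }

    std::mutex emptyFileListMutex_;
    std::deque<std::string> emptyFileList_;
    unsigned nextId_ = 0;
};

// Pops from an initially empty pool on `threads` threads at once and
// returns how many distinct file names were handed out.
std::size_t runConcurrentPops(std::size_t threads) {
    EmptyFilePoolSketch efp;
    std::vector<std::string> names(threads);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < threads; ++i) {
        workers.emplace_back([&efp, &names, i] { names[i] = efp.popEmptyFile(); });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return std::set<std::string>(names.begin(), names.end()).size();
}
```

Holding the lock during file creation serializes concurrent requests. If that turns out to be too costly, an alternative would be to keep creation outside the lock and re-check emptiness in a loop after re-acquiring it.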




