Hello,

I've added this little Pull Request 
<https://github.com/scrapy/queuelib/pull/16/files> that adds the thread 
safety note on the readme.

> Another thing is that, if my understanding is right, FifoMemoryQueue 
should be fine,

It's tricky. deque is thread safe but the way it's used isn't. It wouldn't 
be thread safe if it was resulting corrupted data, random values, crashes 
etc. This isn't what happens here. deque works fine.

What we have is a *race condition* which happens whenever anyone writes:


if <condition on an object>:
    do something on that object assuming the condition

... without holding a lock. The reason is clear if I do (I have the right 
to do):

if <condition on an object>:
    sleep(30 minutes)
    do something on that object assuming the condition

any threading model I've seen doesn't guarantee you that the state of the 
object will remain the same because you just checked an if. By the time you 
act on an if, your pre-condition, in a multi-threaded environment will 
likely change.

This happens in deque's case with this:

return q.popleft() if q else None

I can rewrite this to:

if q:
    return q.popleft()
else:
    return None

This means that this can happen:

T1: if q: <--- evaluates to True
---------------
T2: if q: <--- evaluates to True
T2:     return q.popleft() <--- pops last item
---------------
T1:     return q.popleft() <--- exception


If you change the code to:

try:
    return q.popleft()
except IndexError:
    return None

there's no race condition anymore and you have correct code.

> I supposed there is no interest in transforming queuelib to provide 
thread-safety at the expense of adding more complexity. Am I correct?

I would guess so. Actually you aren't talking little complexity here... you 
are talking tons of complexity. Part of the Twisted/async.io "manifesto" is 
that threads are very difficult to get write and don't compose well so keep 
thing simple and use a single thread (where it makes sense).




On Monday, March 21, 2016 at 8:15:54 PM UTC, Alex Railean wrote:
>
> Hi, thank you for the clear visualization of the behaviour of different 
> threads and the highlighting - your explanation is very helpful.
>
> However, it is a pity that the result is negative. Given that the current 
> design is sufficient for solving the problems scrapy uses it for - I 
> supposed there is no interest in transforming queuelib to provide 
> thread-safety at the expense of adding more complexity. Am I correct?
>
>
> My alternative is to tweak the design of the code where I wanted to apply 
> queuelib originally, by making sure that only one thread gets to use the 
> queue, thus avoiding the issue altogether.
>
>
> I do think there's at least one thing we can do - to state that somewhere 
> in a FAQ, as I suspect that I am not the first person on the planet to 
> inquire about this.
>
> Another thing is that, if my understanding is right, FifoMemoryQueue 
> should be fine, provided that T1 only pushes and T2 only pops, because 
> according to the manual {
> Deques support thread-safe, memory efficient appends and pops from either 
> side of the deque ...
> }
>
> Are there any scenarios in which this is not going to be the case?
>

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.

Reply via email to