On 2/25/2023 10:52 AM, Skip Montanaro wrote:
I have a multi-threaded program which calls out to a non-thread-safe
library (not mine) in a couple places. I guard against multiple
threads executing code there using threading.Lock. The code is
straightforward:
from threading import Lock
# Something in textblob and/or nltk doesn't play nice with no-gil, so just
# serialize all blobby accesses.
BLOB_LOCK = Lock()
def get_terms(text):
with BLOB_LOCK:
phrases = TextBlob(text, np_extractor=EXTRACTOR).noun_phrases
for phrase in phrases:
yield phrase
When I monitor the application using py-spy, that with statement is
consuming huge amounts of CPU. Does threading.Lock.acquire() sleep
anywhere? I didn't see anything obvious poking around in the C code
which implements this stuff. I'm no expert though, so could easily
have missed something.
I'm no expert on locks, but you don't usually want to keep a lock while
some long-running computation goes on. You want the computation to be
done by a separate thread, put its results somewhere, and then notify
the choreographing thread that the result is ready.
This link may be helpful -
https://anandology.com/blog/using-iterators-and-generators/
--
https://mail.python.org/mailman/listinfo/python-list