Stefan Miklosovic created CASSANDRA-17350:
---------------------------------------------

             Summary: Investigate and remove Windows-specific threading-related 
code in copyutils.py
                 Key: CASSANDRA-17350
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17350
             Project: Cassandra
          Issue Type: Improvement
          Components: Tool/cqlsh
            Reporter: Stefan Miklosovic


There are bits of the code to be removed or refactored related to how Windows 
were treating threading in copyutils.py as Windows is not longer supported.

FilesReader, FeedingProcess and ChildProcess:

[~Bowen Song] put it the best so I just copy it here from GitHub PR for 16956, 
this code relates to FilesReader, FeedingProcess and ChildProcess Python 
classes.

We agreed on the fact that 16956 may be merged without this being addressed as 
it requires further investigation in the matter which would unncessarily 
postpone and delay it.

Bowen's take on this:

I believe that we should move the code from on_fork() to __init__(), and may 
also remove the on_fork() method if it's no longer used.

The problem with Windows and Python multiprocessing is that Windows doesn't 
support fork(), therefore Python implemented a workaround. On Windows, Python 
multiprocessing library uses pickle to serialise everything in memory, spawn a 
new process, and then restores the memory content from the serialised data. The 
ReceivingChannel and SendingChannel objects are not serialisable because they 
have file descriptors (which I believe it's called a file handle on Windows) in 
them, therefore the code has to handle them after the fake fork().

However, I'm concerned that moving the code from on_fork() to __init__() may 
have other unintended side effects. For example, the file descriptors (FDs) 
will be opened before the fork, in some edge cases the fork may never happen 
(e.g.: an exception raised in or just after the init, but before the fork). 
Where's the code responsible for closing the FDs on the parent process side? 
Will this cause any FD leak? This clearly requires further work to find out.

To be honest, I don't think removing the comments without addressing the above 
is a wise move. Future developers wouldn't have the opportunity to understand 
why the code is written in this way if the comment is removed. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to