[issue35813] shared memory construct to avoid need for serialization between processes

Davin Potts Fri, 15 Feb 2019 08:45:08 -0800


Davin Potts <[email protected]> added the comment:


These questions (originally asked in comments on GH-11816) seemed more 
appropriate to discuss here:
    Why should the user want to use `SharedMemory` directly?
    Why not just go through the manager?  Also, perhaps a
    naive question: don't you _always_ need a `start()`ed
    manager in order for the processes to communicate?
    Doesn't `SharedMemoryServer` has to be involved?

I think it helps to discuss the last question first.  A SharedMemoryManager is 
*not* needed for two processes to share information across a shared memory 
block, nor is a SharedMemoryServer required.  The docs have examples 
demonstrating this but here is another meant to showcase exactly this:

    Start up a Python shell and do the following:
        >>> from multiprocessing import shared_memory
        >>> shm = shared_memory.SharedMemory(name=None, size=10)
        >>> shm.buf[:5] = b'Feb15'
        >>> shm.name  # Note this name and use it in the next steps
        'psm_26792_26631'

    Start up a second Python shell in a new window and do the following:
        >>> from multiprocessing import shared_memory
        >>> also_shm = shared_memory.SharedMemory(name='psm_26792_26631')  # 
Use that same name
        >>> bytes(also_shm.buf[:5])
        b'Feb15'

    If also_shm.buf is further modified in the second shell, those
    changes will be visible on shm.buf in the first shell.  The same
    is true of the reverse.

The key point is that there is no sending of messages between the processes at 
all.  In stark contrast, SyncManager offers and supports objects held in 
"distributed shared memory" where messages must be sent from one process to 
another to access or manipulate data; those objects held in "distributed shared 
memory" *must* have a SyncManager+Server to enable their use.  That is not 
needed at all for SharedMemory because access to and manipulation of the data 
is performed directly without the cost-delay of messaging.

This begs a new question, "so what is the SharedMemoryManager used for then?"  
The docs answer:
    To assist with the life-cycle management of shared memory
    especially across distinct processes, a BaseManager subclass,
    SharedMemoryManager, is also provided.
Because shared memory blocks are not "owned" by a single process, they are not 
destroyed/freed when a process exits.  A SharedMemoryManager is used to ensure 
the free-ing of a shared memory block when it is no longer needed.

New SharedMemory instances may be created via a SharedMemoryManager (in which 
case their birth-to-death life-cycle is being managed) or they may be created 
directly as seen in the above example.  Returning to the first question, "Why 
should the user want to use `SharedMemory` directly?", there are more use cases 
than these two:
1. In process 1, a shared memory block is created by calling 
SharedMemoryManager.SharedMemory().  In process 2, we need to attach to that 
existing shared memory block and can do so by referring to its name.  This is 
accomplished as in the above example by simply calling 
SharedMemory(name='uniquename').  We do not want to attach to it via a second 
SharedMemoryManager because only one manager should oversee the life-cycle of a 
single shared memory block.
2. Sometimes direct management of the life-cycle of a shared memory block is 
desirable.  For example, on systems supporting POSIX shared memory, it is a 
feature that shared memory blocks outlive processes.  Some services choose to 
speed a service restart by preserving state data in shared memory, saving the 
newly restarted service from rebuilding it.  The SharedMemoryManager provides 
one life-cycle strategy but can not cover all scenarios so the option to 
directly manage it is important.

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue35813>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue35813] shared memory construct to avoid need for serialization between processes

Reply via email to