Re: Parallel copy

Heikki Linnakangas Fri, 30 Oct 2020 09:37:39 -0700

On 27/10/2020 15:36, vignesh C wrote:

> Attached v9 patches have the fixes for the above comments.

I find this design to be very complicated. Why does the line-boundary information need to be in shared memory? I think this would be much simpler if each worker grabbed a fixed-size block of raw data, and processed that.

In your patch, the leader process scans the input to find out where one line ends and another begins, and because of that decision, the leader needs to make the line boundaries available in shared memory, for the worker processes. If we moved that responsibility to the worker processes, you wouldn't need to keep the line boundaries in shared memory. A worker would only need to pass enough state to the next worker to tell it where to start scanning the next block.
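
To make the hand-off concrete, here is a minimal sketch of what "enough state" could look like. It is not taken from the patch: BlockScanState and scan_block() are hypothetical names, and it assumes a simplified CSV-like input where '"' toggles quoting and '\n' ends a line only outside quotes. Real COPY parsing also has to deal with escapes, \r\n, and configurable quote characters, so the real state would likely carry more than one flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hand-off state; these names are not from the patch. */
typedef struct BlockScanState
{
    bool        in_quote;   /* does the block end inside a quoted field? */
} BlockScanState;

/*
 * Scan one fixed-size block of raw input, starting in state 'in'.
 * Returns the number of line-terminating newlines seen in this block and
 * stores the state the next block's scan must start from in *out.
 */
static size_t
scan_block(const char *buf, size_t len, BlockScanState in, BlockScanState *out)
{
    size_t      nlines = 0;
    bool        in_quote = in.in_quote;

    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == '"')
            in_quote = !in_quote;   /* a doubled "" toggles twice: no change */
        else if (buf[i] == '\n' && !in_quote)
            nlines++;               /* newline outside quotes ends a line */
    }

    out->in_quote = in_quote;
    return nlines;
}
```

In this sketch, the scan of block N+1 cannot classify its newlines until block N's scan has published its final state, which is why the boundary scan itself stays sequential even if the rest of the work is spread across workers.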

Whether the leader process finds the EOLs or the worker processes, it's pretty clear that it needs to be done ASAP, for a chunk at a time, because that cannot be done in parallel. I think some refactoring in CopyReadLine() and friends would be in order. It probably would be faster, or at least not slower, to find all the EOLs in a block in one tight loop, even when parallel copy is not used.
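
As a rough illustration of the "one tight loop" idea, the sketch below collects every EOL offset in a block with memchr(). It is only a sketch under simplifying assumptions: find_eols() is a hypothetical helper, not an existing PostgreSQL function, and it treats every '\n' as a line end, ignoring quoting, backslash escapes, and \r\n handling that CopyReadLine() has to deal with.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code.  Record the offset of every
 * '\n' in buf[0..len) into eols[], up to max_eols entries, and return the
 * number of offsets recorded.
 */
static size_t
find_eols(const char *buf, size_t len, size_t *eols, size_t max_eols)
{
    size_t      n = 0;
    const char *p = buf;
    const char *end = buf + len;

    while (n < max_eols && p < end)
    {
        const char *nl = memchr(p, '\n', (size_t) (end - p));

        if (nl == NULL)
            break;              /* no more newlines in this block */

        eols[n++] = (size_t) (nl - buf);
        p = nl + 1;             /* resume just past the newline */
    }

    return n;
}
```

Whatever follows the last recorded offset is a partial line that continues into the next block, so the caller would have to carry it over.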


- Heikki
