Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/791#issuecomment-43733545

`ensureFreeSpace` has two jobs: 1) iterate over entries and select blocks to be dropped; 2) if the to-be-dropped blocks can free enough space, mark them as dropping and return them to the caller. `ensureFreeSpace` is called while holding putLock, so each thread will see the dropping-flag modifications (flag resetting during exception handling is discussed below) and thus select different to-be-dropped blocks. Block reading doesn't need the dropping flag, so there is no conflict there.

Now consider block removal and exception handling (resetting the dropping flag).

Job 1 of `ensureFreeSpace` (selecting) and block removal are both synchronized on `entries`, so they must proceed in turn. If a block is removed first, everything is OK. If a block is removed after Job 2 of `ensureFreeSpace` (marking), which is also synchronized on `entries` (in my modification), then the block will have been dropped to disk and managed by diskStore, which I think is OK. If a block is removed between selecting and marking, the marking step checks whether the entry is null, so that case is OK too.

As for exception handling, flag resetting is also synchronized on `entries`, so it cannot run during selecting or marking. If resetting happens before selecting, selecting will be able to select those blocks and re-drop them. If resetting happens after selecting, the selected to-be-dropped blocks won't include the reset blocks, so there is no conflict.

In total there are three places that read or write the dropping flag (selecting, marking, and resetting), and they are all synchronized on `entries`, so I think we don't need to declare the flag volatile.
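To make the discussion concrete, here is a minimal, hypothetical Java sketch of the locking pattern described above. The class and method names (`MemoryStoreSketch`, `resetDropping`, etc.) and the sizes are illustrative, not Spark's actual MemoryStore code. The point is that all three accesses to the `dropping` flag (selecting, marking, resetting) happen inside `synchronized (entries)` blocks, and marking re-checks for null to handle a removal that interleaves between the two jobs:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch; not Spark's actual MemoryStore code.
public class MemoryStoreSketch {
    static class Entry {
        final long size;
        // Guarded by the `entries` lock in all three accesses
        // (selecting, marking, resetting), so it need not be volatile.
        boolean dropping = false;
        Entry(long size) { this.size = size; }
    }

    private final LinkedHashMap<String, Entry> entries = new LinkedHashMap<>();
    private final long maxMemory = 100;  // illustrative capacity
    private long currentMemory = 0;

    public void put(String id, long size) {
        synchronized (entries) {
            entries.put(id, new Entry(size));
            currentMemory += size;
        }
    }

    public void remove(String id) {
        synchronized (entries) {
            Entry e = entries.remove(id);
            if (e != null) currentMemory -= e.size;
        }
    }

    // Returns the blocks selected (and marked) to be dropped,
    // or an empty list if not enough space can be freed.
    public List<String> ensureFreeSpace(long space) {
        List<String> selected = new ArrayList<>();
        long freeable = 0;
        synchronized (entries) {  // Job 1: selecting
            for (Map.Entry<String, Entry> e : entries.entrySet()) {
                if (maxMemory - currentMemory + freeable >= space) break;
                if (!e.getValue().dropping) {  // skip blocks another thread is dropping
                    freeable += e.getValue().size;
                    selected.add(e.getKey());
                }
            }
            if (maxMemory - currentMemory + freeable < space) return new ArrayList<>();
        }
        // A block may be removed here, between selecting and marking.
        synchronized (entries) {  // Job 2: marking
            List<String> marked = new ArrayList<>();
            for (String id : selected) {
                Entry entry = entries.get(id);
                if (entry != null) {  // null check covers removal in between
                    entry.dropping = true;
                    marked.add(id);
                }
            }
            return marked;
        }
    }

    // Exception handling: reset the flag so the block can be selected again.
    public void resetDropping(String id) {
        synchronized (entries) {
            Entry entry = entries.get(id);
            if (entry != null) entry.dropping = false;
        }
    }
}
```

Because every read and write of `dropping` occurs while holding the `entries` monitor, the Java memory model's synchronization order already guarantees visibility between threads, which is why `volatile` would be redundant here.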