[ 
https://issues.apache.org/jira/browse/HDFS-11820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010962#comment-16010962
 ] 

Kihwal Lee commented on HDFS-11820:
-----------------------------------

Jira is for bug reports. Please use the relevant mailing list for discussions and 
questions.

Edit logging is almost thread safe, but not entirely. Concurrency is controlled 
by a rule: edits must be logged while the name system write lock is held, and 
logSync() must be called after the lock is released. A successful response goes 
back to the user only after logSync() succeeds.
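
The rule above can be sketched as follows. This is only a minimal illustration, 
not the actual FSEditLog or FSNamesystem code: the class EditLogSketch, its 
fields, and the rename() and demo() helpers are all hypothetical, and the 
"sync" here just moves strings between two in-memory lists. It shows why a 
reused op object is safe as long as its attributes are set and it is logged 
under the same write lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the name system plus its edit log.
class EditLogSketch {
    // Plays the role of the name system lock.
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    // Edits that were logged but are not yet durable.
    private final List<String> buffered = new ArrayList<>();
    // Edits that a sync has made "durable" (in memory only, for illustration).
    private final List<String> durable = new ArrayList<>();
    // A single reused op object, mimicking an op instance taken from a cache.
    private final StringBuilder cachedOp = new StringBuilder();

    // Must be called with the write lock held; that is what protects cachedOp.
    void logEdit(String src, String dst) {
        if (!fsLock.isWriteLockedByCurrentThread()) {
            throw new IllegalStateException("logEdit requires the write lock");
        }
        cachedOp.setLength(0);
        cachedOp.append("RENAME ").append(src).append(" -> ").append(dst);
        synchronized (this) {
            // Serialize the op into the buffer before the write lock is released.
            buffered.add(cachedOp.toString());
        }
    }

    // Called after the write lock is released; flushes everything buffered so far.
    synchronized void logSync() {
        durable.addAll(buffered);
        buffered.clear();
    }

    // The calling pattern: log under the lock, sync outside it, then respond.
    boolean rename(String src, String dst) {
        fsLock.writeLock().lock();
        try {
            logEdit(src, dst);
        } finally {
            fsLock.writeLock().unlock();
        }
        logSync();
        return true; // the "successful response" is produced only after logSync()
    }

    synchronized List<String> durableEdits() {
        return new ArrayList<>(durable);
    }

    // Runs concurrent renames and returns every edit that was synced.
    static List<String> demo(int threads, int perThread) {
        EditLogSketch log = new EditLogSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.rename("/a/" + id + "-" + i, "/b/" + id + "-" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log.durableEdits();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 500).size() + " edits synced");
    }
}
```

In this sketch a second thread cannot touch cachedOp until the first thread has 
released the write lock, and by then the first op has already been copied into 
the buffer. Calling logEdit() without the write lock is the case the rule does 
not cover, which is why the sketch rejects it.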

> Thread safety in logEdit?
> -------------------------
>
>                 Key: HDFS-11820
>                 URL: https://issues.apache.org/jira/browse/HDFS-11820
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>            Reporter: Tummy Bunny
>
> Hi there,
> I am new to Hadoop and trying to understand how things work under the hood by 
> browsing through some of the code.
> I noticed a potential thread safety issue in FSEditLog.java in version 
> 2.7.1, where the following pattern is used (the current trunk also uses the 
> same pattern):
> 1. An instance of FSEditLogOp is retrieved from a cache for reuse 
> 2. Its attributes are set (e.g. path, timestamp, etc.)
> 3. logEdit(*op*) is invoked. This method has a synchronized block in it, but 
> also has a *wait* if auto-sync is scheduled
> Now, suppose two almost simultaneous rename operations are each about to 
> write to the edit log:
> Thread #1 acquires the instance of RenameOp, sets the attributes, and invokes 
> logEdit, then waits because auto-sync is scheduled.
> Thread #2 catches up, acquires the same instance of RenameOp, sets 
> *different* attributes, and invokes logEdit. It blocks because of the 
> synchronized block inside logEdit(...), but it has already modified the 
> attributes of RenameOp.
> The second RenameOp could end up being logged twice because both RenameOps 
> are actually the same instance. 
> The fix is to use synchronized(*op*) before calling logEdit(*op*), or to clone 
> the op before using it.
> I could be wrong. Am I missing something?
> Thanks,
> Alexander Koentjara


