[ https://issues.apache.org/jira/browse/ZOOKEEPER-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kezhu Wang resolved ZOOKEEPER-3102.
-----------------------------------
    Resolution: Won't Fix

The [pr|https://github.com/apache/zookeeper/pull/584] concluded not to fix this.

From my perspective, we have no concurrent writes, but we do have concurrent reads alongside a write. The reads are {{PrepRequestProcessor}}'s read and {{ZooKeeperServer::takeSnapshot}}; the write is {{FinalRequestProcessor}}'s. So we only have to protect partial reads from a concurrent write.

> Potential race condition when create ephemeral nodes
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-3102
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3102
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.0
>         Environment: operating system: macOS High Sierra 10.13.6
> java version: 8u152
>            Reporter: LuoFucong
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The method
> {code:java}
> public void createNode(final String path, byte data[], List<ACL> acl,
>         long ephemeralOwner, int parentCVersion, long zxid, long time, Stat outputStat)
> {code}
>
> in class DataTree may conceal a potential race condition regarding the
> session ephemeral nodes map "Map<Long, HashSet<String>> ephemerals".
> Specifically, the code starting at line 455:
>
> {code:java}
> } else if (ephemeralOwner != 0) {
>     HashSet<String> list = ephemerals.get(ephemeralOwner);
>     if (list == null) {
>         list = new HashSet<String>();
>         ephemerals.put(ephemeralOwner, list);
>     }
>     synchronized (list) {
>         list.add(path);
>     }
> }{code}
>
> When an ephemeral owner tries to create nodes concurrently (under different
> parent nodes), an empty "HashSet<String>" might be created multiple times,
> and the sets may replace each other.
> The following unit test reveals the race condition:
>
> {code:java}
> @Test(timeout = 60000)
> public void testSessionEphemeralNodesConcurrentlyCreated()
>         throws InterruptedException, NodeExistsException, NoNodeException {
>     long session = 0x1234;
>     int concurrent = 10;
>     Thread[] threads = new Thread[concurrent];
>     CountDownLatch latch = new CountDownLatch(1);
>     for (int i = 0; i < concurrent; i++) {
>         String parent = "/test" + i;
>         dt.createNode(parent, new byte[0], null, 0, -1, 1, 1);
>         Thread thread = new Thread(() -> {
>             try {
>                 latch.await();
>             } catch (InterruptedException e) {
>                 throw new RuntimeException(e);
>             }
>             String path = parent + "/0";
>             try {
>                 dt.createNode(path, new byte[0], null, session, -1, 1, 1);
>             } catch (Exception e) {
>                 throw new IllegalStateException(e);
>             }
>         });
>         thread.start();
>         threads[i] = thread;
>     }
>     latch.countDown();
>     for (Thread thread : threads) {
>         thread.join();
>     }
>     int sessionEphemerals = dt.getEphemerals(session).size();
>     Assert.assertEquals(concurrent, sessionEphemerals);
> }
> {code}
> The session "0x1234" has created 10 ephemeral nodes "/test\{0~9}/0"
> concurrently (in 10 threads), so its ephemeral nodes size retrieved from
> DataTree should be 10, but it is not (the assertion fails).
>
> The fix should be easy:
>
> {code:java}
> private final ConcurrentMap<Long, HashSet<String>> ephemerals =
>         new ConcurrentHashMap<>();
> ...
> } else if (ephemeralOwner != 0) {
>     HashSet<String> list = ephemerals.get(ephemeralOwner);
>     if (list == null) {
>         list = new HashSet<String>();
>         HashSet<String> _list;
>         if ((_list = ephemerals.putIfAbsent(ephemeralOwner, list)) != null) {
>             list = _list;
>         }
>     }
>     synchronized (list) {
>         list.add(path);
>     }
> }
> {code}
>
>
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
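
For readers who want to see the check-then-act pattern from the report in isolation, here is a minimal standalone sketch. It is not ZooKeeper code: the class {{EphemeralsSketch}} and its methods {{addEphemeral}}, {{count}} and {{runConcurrently}} are hypothetical names made up for illustration. It also swaps in {{ConcurrentHashMap.computeIfAbsent}} with a concurrent key set in place of the {{putIfAbsent}} plus {{synchronized}} combination proposed in the report. The sketch only shows that an atomic "get or create" keeps every path when several threads register under the same session; it says nothing about whether DataTree needs this, given the resolution above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the session -> ephemeral paths bookkeeping.
// This is NOT ZooKeeper's DataTree; all names here are illustrative.
public class EphemeralsSketch {
    private final ConcurrentMap<Long, Set<String>> ephemerals = new ConcurrentHashMap<>();

    // Atomically fetch or create the owner's set, then add the path.
    // computeIfAbsent guarantees only one set is ever installed per owner.
    public void addEphemeral(long owner, String path) {
        ephemerals
            .computeIfAbsent(owner, k -> ConcurrentHashMap.<String>newKeySet())
            .add(path);
    }

    // Number of paths currently recorded for the owner (0 if none).
    public int count(long owner) {
        Set<String> paths = ephemerals.get(owner);
        return paths == null ? 0 : paths.size();
    }

    // Mirrors the shape of the reported test: N threads each register
    // one distinct path for the same session, released together by a latch.
    public static int runConcurrently(int concurrent) throws InterruptedException {
        EphemeralsSketch sketch = new EphemeralsSketch();
        long session = 0x1234L;
        CountDownLatch latch = new CountDownLatch(1);
        Thread[] threads = new Thread[concurrent];
        for (int i = 0; i < concurrent; i++) {
            String path = "/test" + i + "/0";
            threads[i] = new Thread(() -> {
                try {
                    latch.await();
                    sketch.addEphemeral(session, path);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        latch.countDown();
        for (Thread t : threads) {
            t.join();
        }
        return sketch.count(session);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrently(10));
    }
}
```

Because the set is created inside {{computeIfAbsent}}, two threads can never each install their own empty set for the same owner, which is the replacement described in the report. Using a concurrent key set also removes the need for the {{synchronized (list)}} block around {{add}}.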