[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Diederen updated ZOOKEEPER-4846:
---------------------------------------
    Description: 
ZooKeeper snapshots are _fuzzy_, as the server does not stop processing 
requests while ACLs and nodes are being streamed to disk.

ACLs, notably, are streamed _first_, as a mapping between the full 
serialized ACL and an "ACL ID" referenced by the node.

Consequently, a snapshot can contain nodes whose ACL IDs are missing from the 
serialized mapping. Prior to ZOOKEEPER-4799, such situations would produce 
harmless (if annoying) "Ignoring acl XYZ as it does not exist in the cache" 
INFO entries in the server logs.

With ZOOKEEPER-4799, we started "eagerly" fetching the referenced ACLs in 
{{DataTree}} operations such as {{createNode}}, {{deleteNode}}, etc., as 
opposed to just fetching them from request processors.
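
One way to picture the difference is a lookup that reports a missing ACL ID to the caller versus one that throws. The sketch below is a toy model, not ZooKeeper code: the class {{AclLookupModes}} and its methods are hypothetical names chosen for illustration, and ACLs are represented as plain strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model only: hypothetical names, not part of ZooKeeper. */
public final class AclLookupModes {
    // ACL ID -> serialized ACL (plain strings stand in for real ACL objects).
    private final Map<Long, List<String>> aclsById = new HashMap<>();

    public void register(long id, List<String> acl) {
        aclsById.put(id, acl);
    }

    /** Lenient lookup: a missing ID is reported as an empty result. */
    public Optional<List<String>> lookupLenient(long id) {
        return Optional.ofNullable(aclsById.get(id));
    }

    /** Strict lookup: a missing ID aborts the calling operation. */
    public List<String> lookupStrict(long id) {
        List<String> acl = aclsById.get(id);
        if (acl == null) {
            throw new RuntimeException("Failed to fetch acls for " + id);
        }
        return acl;
    }

    public static void main(String[] args) {
        AclLookupModes cache = new AclLookupModes();
        cache.register(1L, List.of("world:anyone:r"));

        // ID 2 was never registered, as with a dangling reference.
        System.out.println("lenient found: " + cache.lookupLenient(2L).isPresent());
        try {
            cache.lookupStrict(2L);
        } catch (RuntimeException e) {
            System.out.println("strict threw: " + e.getMessage());
        }
    }
}
```

With the lenient form the caller can log and continue; with the strict form every caller that touches a dangling ID fails unless it catches the exception.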

This can result in fatal errors during the {{fastForwardFromEdits}} phase of 
restoring a database, when transactions are replayed on top of an inconsistent 
data tree. The failure prevents the server from starting.

The errors are thrown in this code path:
{code:java}
// ReferenceCountedACLCache.java:90
List<ACL> acls = longKeyMap.get(longVal);
if (acls == null) {
    LOG.error("ERROR: ACL not available for long {}", longVal);
    throw new RuntimeException("Failed to fetch acls for " + longVal);
}
{code}
Here is a scenario leading to such a failure:
 * An existing node {{/foo}}, sporting a unique ACL, is deleted. This is 
recorded in transaction log {{$SNAP-1}}; said ACL is also deallocated;
 * Snapshot {{$SNAP}} is started;
 * The ACL map is serialized to {{$SNAP}};
 * A new node {{/foo}} sporting the same unique ACL is created in a portion of 
the data tree which still has to be serialized;
 * Node {{/foo}} is serialized to {{$SNAP}}, but its ACL isn't;
 * The server is restarted;
 * The {{DataTree}} is initialized from {{$SNAP}}, including node {{/foo}} 
with a dangling ACL reference;
 * Transaction log {{$SNAP-1}} is replayed, leading to a 
{{deleteNode("/foo")}};
 * {{getACL(node)}} throws, preventing a successful restart.
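
The scenario above can be reproduced in miniature. The following is a minimal sketch under stated assumptions, not ZooKeeper code: {{ToyAclCache}}, {{ToyTree}} and {{restartFails}} are hypothetical names, ACL IDs are passed in explicitly rather than allocated by the cache, and the snapshot is modeled as two map copies taken at different times.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a fuzzy snapshot; hypothetical names, not ZooKeeper code. */
public class FuzzySnapshotDemo {

    /** Stand-in for the ACL cache: ACL ID -> serialized ACL. */
    static final class ToyAclCache {
        final Map<Long, List<String>> longKeyMap = new HashMap<>();

        List<String> convertLong(long id) {
            List<String> acls = longKeyMap.get(id);
            if (acls == null) {
                throw new RuntimeException("Failed to fetch acls for " + id);
            }
            return acls;
        }
    }

    /** Stand-in for the data tree: path -> ACL ID. */
    static final class ToyTree {
        final ToyAclCache aclCache = new ToyAclCache();
        final Map<String, Long> nodes = new HashMap<>();

        void createNode(String path, long aclId, List<String> acl) {
            aclCache.longKeyMap.put(aclId, acl);
            nodes.put(path, aclId);
        }

        void deleteNode(String path) {
            Long aclId = nodes.get(path);
            if (aclId == null) {
                return; // nothing to delete
            }
            aclCache.convertLong(aclId);       // eager fetch: throws if dangling
            nodes.remove(path);
            aclCache.longKeyMap.remove(aclId); // the ACL was unique, so deallocate
        }
    }

    /** Runs the scenario; returns true if replay on the restored tree throws. */
    static boolean restartFails() {
        ToyTree live = new ToyTree();
        List<String> acl = List.of("digest:alice:rw");
        live.createNode("/foo", 1L, acl);
        live.deleteNode("/foo"); // logged before the snapshot; ACL deallocated

        // The snapshot starts: the ACL map is copied first...
        Map<Long, List<String>> snapAcls = new HashMap<>(live.aclCache.longKeyMap);
        // ...the server keeps processing requests...
        live.createNode("/foo", 1L, acl);
        // ...and the nodes are copied afterwards.
        Map<String, Long> snapNodes = new HashMap<>(live.nodes);

        // Restart: restore from the snapshot, then replay the logged delete.
        ToyTree restored = new ToyTree();
        restored.aclCache.longKeyMap.putAll(snapAcls);
        restored.nodes.putAll(snapNodes);
        try {
            restored.deleteNode("/foo");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("restart fails: " + restartFails()); // prints "restart fails: true"
    }
}
```

In this model the restored tree holds {{/foo}} with ACL ID 1 while the restored ACL map is empty, so the replayed delete hits the strict lookup and aborts.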

  was:(Under investigation.)


> Failure to reload database due to missing ACL
> ---------------------------------------------
>
>                 Key: ZOOKEEPER-4846
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4846
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Damien Diederen
>            Assignee: Damien Diederen
>            Priority: Major



