[
https://issues.apache.org/jira/browse/SLING-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839419#comment-17839419
]
Paul Bjorkstrand commented on SLING-12300:
------------------------------------------
I apologize if I am perceived as trivializing; I am trying to understand your
concern, but I am not able to find any reason (other than security/privacy) why
having the UUIDs addressable is a problem. I do want to understand what the
concern about having some kind of predictability is, though.
Based on part of your previous comments, I would like to address some of the
concerns strictly related to predictability.
bq. Basically, I'm not putting a server on the public internet with predictable
addresses.
I don't think that predictability is necessarily a problem in itself. In many
situations (especially for things based on Sling), predictability is not a
detriment but a feature. I know that there are Sling implementations that are
not AEM, but using AEM as an example, you already have a large swath of
partially predictable addresses. Public data primarily lives under
{{/content}}. (Mostly private) user data lives under {{/home}}, (almost
entirely private) application data/code lives under {{/apps}}. These root
segments are entirely predictable.
If you will oblige, I would like to run through an example comparing UUIDs vs.
paths in terms of predictability, starting with some assumptions:
# The "root path segments" in a given implementation are entirely predictable.
# Paths consist of characters found in the following regex: {{[A-Za-z0-9-/]}}.
This gives you 64 characters to choose from (more or fewer characters could be
used, and these seem common enough in URLs).
#* I chose this character set because it made the math easier, since 64 is a
power of 2.
#* More characters could be used, but it doesn't really change the math much.
# Paths of nodes are entirely random, using the character set above.
# Every node in a given Sling instance has {{mix:referenceable}} applied and is
provisioned a UUID.
# Paths could be any random combination of the characters from the regex above.
In theory, a path could be of unlimited length, while UUIDs are finite, that is
true. Using the assumptions above, for a path to be less predictable than a
UUID, it would need to be at least 21 characters long _beyond any
predictable root(s)_. I won't go into the math too deeply, but the point where
64^x^ crosses 2^122^ is x = 122/6, or approximately 20.33. Anything 20
characters or fewer is more prone to brute force attacks than a UUID!
_Note: even if you use all 256 values of an 8-bit byte, you get 256^x^, which
makes the length needed 16 characters (~15.25)._
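For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison. It assumes the 2^122^ figure refers to the random bits of a version 4 UUID (128 bits minus 6 fixed version/variant bits) and that every path character is chosen uniformly at random, as in the assumptions above. The function name {{min_path_length}} is only illustrative. Since each character drawn from 64 symbols contributes 6 bits, the crossover works out to 122/6, or roughly 20.33.

```python
import math

# Assumption: the 2^122 figure is the random portion of a version 4 UUID.
UUID_RANDOM_BITS = 122


def min_path_length(alphabet_size, target_bits=UUID_RANDOM_BITS):
    """Return (crossover, whole_chars) for a uniformly random path.

    crossover   -- real-valued x where alphabet_size**x == 2**target_bits
    whole_chars -- smallest whole number of characters whose search space
                   is at least 2**target_bits
    """
    crossover = target_bits / math.log2(alphabet_size)

    # Use exact integer arithmetic for the whole-character answer so that
    # floating-point rounding cannot shift the result by one.
    whole_chars = 0
    while alphabet_size ** whole_chars < 2 ** target_bits:
        whole_chars += 1
    return crossover, whole_chars


if __name__ == "__main__":
    for size in (64, 256):
        crossover, whole_chars = min_path_length(size)
        print(f"alphabet={size}: crossover={crossover:.2f}, "
              f"minimum length={whole_chars}")
```

This only models the idealized case. Real paths are far from uniformly random, so the effective search space per character is smaller than this sketch suggests.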
In practice, paths are almost never random strings of characters. Their names
carry meaning and are often subject to semantic and syntactic restrictions
(depending on the creator's language). System rules can also reduce what paths
are allowed
(e.g., {{//}} is not legal because of the sequential slashes). Additionally,
paths are usually relatively short. My experience tells me that paths are
rarely more than 50 characters long in the public part of the application.
Lastly, not every node in a given instance is going to have
{{mix:referenceable}}, and thus not every node has a UUID that could be
brute-forced.
If your concern is not security or privacy related, can you help me understand
what it is, [~enorman]?
> Provide a way to retrieve a JCR backed resource by its node identifier
> ----------------------------------------------------------------------
>
> Key: SLING-12300
> URL: https://issues.apache.org/jira/browse/SLING-12300
> Project: Sling
> Issue Type: New Feature
> Components: JCR
> Reporter: Radu Cotescu
> Assignee: Radu Cotescu
> Priority: Major
> Fix For: JCR Resource 3.3.0
>
>
> Since all {{javax.jcr.Nodes}} have an identifier [0], a useful feature would
> be {{Resource}} retrieval by node id, which could be its {{jcr:uuid}}
> property for referenceable nodes or the path. In systems that would like to
> use UUID addressing, this would reduce the need for executing JCR queries for
> resource retrieval and would avoid double-reads via the JCR and then Sling
> API to obtain the resource.
> In order to provide a unified behaviour, paths starting with the {{/jcr:id/}}
> prefix should use the resource retrieval by node identifier.
> [0] -
> https://javadoc.io/static/javax.jcr/jcr/2.0/javax/jcr/Node.html#getIdentifier()
--
This message was sent by Atlassian Jira
(v8.20.10#820010)