[ 
https://issues.apache.org/jira/browse/SLING-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839419#comment-17839419
 ] 

Paul Bjorkstrand commented on SLING-12300:
------------------------------------------

I apologize if I am perceived as trivializing; I am trying to understand your 
concern, but I am not able to find any reason (other than security/privacy) why 
having the UUIDs addressable is a problem. I do want to understand what the 
concern about having some kind of predictability is, though.

Based on part of your previous comments, I would like to address some of the 
concerns strictly related to predictability.

bq. Basically, I'm not putting a server on the public internet with predictable 
addresses.

I don't think that predictability is necessarily a problem in itself. In many 
situations (especially for things based on Sling), predictability is not a 
detriment but a feature. I know that there are Sling implementations that are 
not AEM, but using AEM as an example, you already have a large swath of 
partially predictable addresses. Public data primarily lives under 
{{/content}}, (mostly private) user data lives under {{/home}}, and (almost 
entirely private) application data/code lives under {{/apps}}. These root 
segments are entirely predictable.

If you will oblige, I would like to run through an example comparing UUIDs vs. 
paths in terms of predictability, starting with some assumptions:

# The "root path segments" in a given implementation are entirely predictable.
# Paths consist of characters found in the following regex: {{[A-Za-z0-9-/]}}. 
This gives you 64 characters to choose from (more or fewer characters could be 
used, and these seem common enough in URLs).
#* I chose this character set because it made the math easier, since 64 is a 
power of 2.
#* More characters could be used, but it doesn't really change the math much.
# Paths of nodes are entirely random, using the character set above.
# Every node in a given Sling instance has {{mix:referenceable}} applied and is 
provisioned a UUID.
# Paths could be any random combination of the characters from the regex above.

In theory, a path could be of unlimited length, while UUIDs are finite, that is 
true. Using the assumptions above, for a path to be less predictable than a 
UUID, it would need to be at least 21 characters long _beyond any 
predictable root(s)_. I won't go into the math too deeply, but the point where 
64^x^ crosses 2^122^ is x = 122/6, or approximately 20.33. Anything 20 
characters or less is more prone to brute force attacks than a UUID!

_Note: even if you use all 256 values of an 8-bit character, you get 256^x^, 
which makes the length needed 16 characters (~15.25)._
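
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes 122 random bits per UUID (the 2^122^ figure above, which matches a version 4 UUID) and uniformly random path characters. The function names are hypothetical and only for illustration; nothing here is part of Sling or JCR.

```python
import math

# Assumption: 122 random bits, matching the 2^122 figure used in the comment.
UUID_RANDOM_BITS = 122


def crossover_length(alphabet_size: int, target_bits: int = UUID_RANDOM_BITS) -> float:
    """Fractional length x at which alphabet_size**x equals 2**target_bits."""
    return target_bits / math.log2(alphabet_size)


def min_path_length(alphabet_size: int, target_bits: int = UUID_RANDOM_BITS) -> int:
    """Smallest count of uniformly random characters whose search space
    is at least 2**target_bits (exact integer arithmetic, no rounding)."""
    length = 0
    space = 1
    while space < 2 ** target_bits:
        space *= alphabet_size
        length += 1
    return length


if __name__ == "__main__":
    for size in (64, 256):
        print(size, round(crossover_length(size), 2), min_path_length(size))
    # prints:
    # 64 20.33 21
    # 256 15.25 16
```

The integer loop avoids floating-point rounding at the boundary. Real paths are far from uniformly random, so these lengths are a best case for paths, not a typical one.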

In practice, paths are almost never random strings of characters. They have 
meaning in their names, often semantic and syntactic restrictions (depending on 
the creator's language). System rules can also reduce what paths are allowed 
(e.g., {{//}} is not legal because of the sequential slashes). Additionally, 
paths are usually relatively short. My experience tells me that paths are 
rarely more than 50 characters long in the public part of the application. 
Lastly, not every node in a given instance is going to have 
{{mix:referenceable}}, and thus not every node has a UUID that could be 
brute-forced.

If your concern is not security- or privacy-related, can you help me understand 
what it is, [~enorman]?

> Provide a way to retrieve a JCR backed resource by its node identifier
> ----------------------------------------------------------------------
>
>                 Key: SLING-12300
>                 URL: https://issues.apache.org/jira/browse/SLING-12300
>             Project: Sling
>          Issue Type: New Feature
>          Components: JCR
>            Reporter: Radu Cotescu
>            Assignee: Radu Cotescu
>            Priority: Major
>             Fix For: JCR Resource 3.3.0
>
>
> Since all {{javax.jcr.Nodes}} have an identifier [0], a useful feature would 
> be {{Resource}} retrieval by node id, which could be its {{jcr:uuid}} 
> property for referenceable nodes or the path. In systems that would like to 
> use UUID addressing, this would reduce the need for executing JCR queries for 
> resource retrieval and would avoid double-reads via the JCR and then Sling 
> API to obtain the resource.
> In order to provide a unified behaviour, paths starting with the {{/jcr:id/}} 
> prefix should use the resource retrieval by node identifier.
> [0] - 
> https://javadoc.io/static/javax.jcr/jcr/2.0/javax/jcr/Node.html#getIdentifier()



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
