Re: Getting a value by its data identifier

Felix Meschberger Fri, 15 Mar 2013 00:47:04 -0700

Hi,

I think there really are two sides to the story:


(a) getting an ID
(b) getting the data for that ID

We may or may not be able -- on a large scale -- to prevent (a). After all 
"getting an ID" might just be the result of wild guessing and doing a 
brute-force attack.

We have to be able to limit (b): While restricting to "admin" sessions might be 
an option, I think that is not the right way to do it. I tend to agree with 
AlexK that a permission might be the way to do it. The real problem is that 
permission checking is hooked to a repository path (and thus related to an 
Item), whereas here we don't have an Item: The DataStore BLOB does not know 
where it belongs -- and in a shared DataStore setup, there might not even be 
an "owner" property.

In short: forget about (a). For (b) use a custom permission on / to grant access 
to the new method (denied by default, of course).
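
To make the idea concrete, here is a minimal sketch in plain Java of a lookup 
that is gated by a custom permission held on "/". It is not the Jackrabbit or 
Oak API: the class name, the privilege name "example:lookupByDataIdentifier" 
and the in-memory map standing in for the DataStore are all assumptions made 
up for illustration. The set passed in stands for whatever privileges the 
session has been granted on the root path.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: lookup by data identifier gated by a root-level permission. */
public class GuardedBinaryLookup {

    /** Made-up privilege name; not an existing Jackrabbit or Oak privilege. */
    public static final String LOOKUP_PRIVILEGE = "example:lookupByDataIdentifier";

    // In-memory stand-in for the DataStore: identifier -> binary content.
    private final Map<String, byte[]> dataStore = new HashMap<>();

    public void store(String identifier, byte[] content) {
        dataStore.put(identifier, content.clone());
    }

    /**
     * Returns the content for the identifier, or null if it is unknown.
     *
     * @param privilegesOnRoot privileges the calling session holds on "/"
     */
    public byte[] getByIdentifier(Set<String> privilegesOnRoot, String identifier) {
        // Denied by default: a session without the privilege is rejected
        // before the store is consulted, so the caller cannot learn whether
        // the identifier exists.
        if (!privilegesOnRoot.contains(LOOKUP_PRIVILEGE)) {
            throw new SecurityException("lookup by data identifier denied");
        }
        byte[] content = dataStore.get(identifier);
        return content == null ? null : content.clone();
    }
}
```

Checking the permission before touching the store also covers the "prove that 
a document is stored" case, because an unprivileged caller gets the same 
answer for existing and non-existing identifiers.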

Regards
Felix

On 12.03.2013 at 16:09, Thomas Mueller wrote:

> Hi,
> 
>> (a) Would such a method technically be possible (preventing actual large
>> binary data copy !) ?
> 
> Yes I think it's possible. Would this be needed for Oak or Jackrabbit 2.x
> or both?
> 
>> (c) Can we and if yes, how can we control access ?
> 
> Currently the content identifier is the content hash (SHA-1), so there is
> no risk of "enumeration" or "scanning" attack (not sure what is the right
> word for this - where the attacker blindly tries out many possible ids in
> the hope to find one).
> 
> One risk is that an attacker can "prove" a certain document is stored in
> the repository, where the attacker already has the document or at least
> knows the hash code. For example he could prove the "wikileaks file x" is
> stored in the repository, which might be a problem if possession of the
> "wikileaks file x" is illegal. Not sure if we need protection against
> that; if yes, we might only allow this method to be called for admin
> sessions or so.
> 
> Another risk is that an attacker that has a list of identifiers might be
> able to get the documents in that way, if they are stored in the
> repository. The question is how did the attacker get the identifier, but
> if it's a simple SHA-1 it might be a bigger risk. One way to protect
> against that might be to encrypt the SHA-1 hash code with a
> repository-wide, configurable "private key" or so.
> 
> Regards,
> Thomas
> 
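
The key-based protection suggested at the end of the quoted message could look 
roughly like the sketch below. Note that it swaps in a keyed hash (HMAC-SHA256) 
instead of actual encryption, so the exposed identifier cannot be turned back 
into the SHA-1 content hash; the repository would have to compute it when the 
binary is written and keep a mapping from the exposed identifier to the 
internal one. The class and method names are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: derive an exposed identifier from the content hash and a repository key. */
public class KeyedIdentifier {

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Plain SHA-1 content hash, as the quoted message describes for the current identifier. */
    public static String contentHash(byte[] content) {
        try {
            return hex(MessageDigest.getInstance("SHA-1").digest(content));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Keyed identifier: HMAC-SHA256 of the content hash under a repository-wide secret. */
    public static String exposedIdentifier(byte[] repositoryKey, String contentHash) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(repositoryKey, "HmacSHA256"));
            return hex(mac.doFinal(contentHash.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, someone who only knows the SHA-1 of a document cannot compute the 
identifier the repository would accept unless they also know the repository 
key, and two repositories with different keys expose different identifiers 
for the same content.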


--
Felix Meschberger | Principal Scientist | Adobe
