On 8/27/06, Shannon Baker <[EMAIL PROTECTED]> wrote:

== 1: Author's failure to handle the implications of "global" storage. ==
First, let's talk about the global store (globalStorage['']), which is
accessible from ALL domains.

This is mentioned in the "Security and privacy" section; the third
bullet point here for example suggests blocking access to "public"
storage areas:

  http://whatwg.org/specs/web-apps/current-work/#user-tracking
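For context, the draft's global store can be pictured as a map from domain
names to storage areas, with the empty string naming the area every domain
can touch. A rough simulation of those semantics (the `makeGlobalStorage`
and `area` names are mine, not the spec's, and this is a plain-object
stand-in, not the real browser API):

```javascript
// Hypothetical simulation of the draft's globalStorage semantics: each
// top-level key names a domain whose storage area it is; '' names the
// global area that any page could read and write.
function makeGlobalStorage() {
  const areas = new Map();
  return {
    area(domain) {
      // Lazily create one independent storage area per domain string.
      if (!areas.has(domain)) {
        const items = new Map();
        areas.set(domain, {
          setItem: (k, v) => items.set(k, String(v)),
          getItem: (k) => (items.has(k) ? items.get(k) : null),
        });
      }
      return areas.get(domain);
    },
  };
}

// Any page can write to the '' area -- which is exactly the objection:
const gs = makeGlobalStorage();
gs.area('').setItem('johnyizcool', 'I Kickerz Azz!!!!!!');
console.log(gs.area('').getItem('johnyizcool'));
```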


Did anyone stop to really consider the implications of this? I mean,
sure, the standard implies that UAs should deal with the security
implications of this themselves, but what if they don't? Let's say a UA
does allow access to this global storage; what would we expect to find
in this storage space? Does the author really believe that this will
only be used for sharing preferences between domains for the benefit of
the user? Hell no! It's going to look like this:

KEY                    VALUE
adsense3wd4ghgtut9jhn  kjh234kj23u4y2j34234hkj234hkj23h4k234k234  <-- Advertiser user tracking
johnyizcool            I Kickerz Azz!!!!!!                        <-- Attention freak
USconspiracy           911 was an inside job. Tell everybody!     <-- Political activist
UScitID                kh546jkh45856456h45iu6y46j45j6h54kj6h45k6  <-- Government spying
GodsLove.com           Warning! This user supports abortion.      <-- Vigilante user tracking

Yes, there's an entire section of the spec discussing this in detail,
with suggested solutions.


What possible use could this storage region ever have to a legitimate
site? Especially when sensible UAs will just block it anyway? I for one
do not want my browser becoming some sort of global 'graffiti wall'
written on by every website I visit. Truthfully, I cannot come up with a
single legitimate use for the 'global' or 'com' regions that cannot be
handled by per-domain storage or global storage with ACLs (see next point).

Indeed, the spec suggests blocking such access.


== 2: Naive access controls which will result in guaranteed privacy
violations. ==
The standard advocates the two-way sharing of data between domains and
subdomains, namely that host.example.com should share data with the
servers at 'www.host.example.com', 'example.com', and all servers rooted
at '.com'. In its own words: "Each domain and each subdomain has its own
separate storage area. Subdomains can access the storage areas of parent
domains, and domains can access the storage areas of subdomains."

My objection to this is similar to my objection to the 'global' storage
space - It's totally naive. The whole scheme is based on the unfounded
belief that there is a guaranteed trust relationship available between
the parties controlling each of these domains.

There generally is; but for the two cases where there are not, see:

  http://whatwg.org/specs/web-apps/current-work/#storage

...and:

  http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his
subdomain space, he should be careful. But this goes without saying.
The same requirement (that authors be responsible) applies to all Web
technologies; for example, CGI script authors must be careful not to
allow SQL injection attacks, must check Referer headers, must ensure
POST/GET requests are handled appropriately, and so forth.
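For illustration, the sharing rule quoted from the spec boils down to a
suffix check at dot boundaries: two hosts share an area when one is the
other's parent or subdomain. A rough sketch (the `sharesStorage` name is
mine, not the spec's):

```javascript
// Sketch of the quoted rule: a page on `accessor` may use the storage
// area of `owner` when the two are equal, or when either is a
// dot-separated suffix of the other (parent/subdomain, either direction).
function sharesStorage(accessor, owner) {
  if (accessor === owner) return true;
  return accessor.endsWith('.' + owner) || owner.endsWith('.' + accessor);
}

console.log(sharesStorage('host.example.com', 'example.com')); // true
console.log(sharesStorage('host.example.com', 'com'));         // true: the '.com' problem
console.log(sharesStorage('a.example.com', 'b.example.org'));  // false
```

The second line is precisely why a TLD list matters: without one, the
naive check grants '.com' a storage area that every .com site can reach.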


Sure, one may be reliant
on another for DNS redirection, but that hardly implies that one wishes
to share potentially confidential data with the other. As the author
themselves stated, there is no guarantee that the operators of
geocities.com sub-domains wish their users' data to be shared with GeoCities.

Indeed; users of geocities.com shouldn't be using this service, and
GeoCities itself should put its data (if any) in a private
subdomain space.


The author states that GeoCities could mitigate this risk with a fake
sub-domain, but how does that help the owner of mysite.geocities.com?

It doesn't. The solution for mysite.geocities.com is to get their own domain.


The author implies that UAs should deal with this themselves and fails to
provide any REALISTIC guidelines for them to do so (sure, let's hardcode
all the TLDs and free hosting providers).

The spec was written in conjunction with UA vendors. It is realistic
for UA vendors to provide a hardcoded list of TLDs; in fact, there is
significant work underway to create such a list (and have it be
regularly updated). That work was originally started for use in HTTP
Cookie implementations, which have similar problems, but would be very
useful for Storage API implementations (although, again as noted in
the draft, it is not imperative for a secure implementation if the
author is responsible).


What annoys me is that the
author acknowledges the issue and then passes the buck to browser
manufacturers as though it's their problem and they should solve it in
any (incompatible or non-compliant) way they like.

Any solution must be compliant, by definition; regarding
compatibility, it isn't clear to me that the suggestion in the spec
would be incompatible.


But why bother? This whole problem is easily solved by allowing data to
be stored with an access control list (ACL). For example, the site
developer should be able to specify that a data object be available to
'*.example.com' and 'fred.geocities.com' only. How this is done (as a
string or array) is irrelevant to this post, but it must be done rather
than relying on implicit trust where none exists.
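Something like the following, say -- a purely hypothetical sketch of the
proposed check (the `aclAllows` name and the '*.example.com' entry syntax
are illustrative, not part of any spec):

```javascript
// Hypothetical ACL check for the scheme proposed above: an ACL is a
// list of exact hostnames plus '*.'-prefixed wildcard entries.
function aclAllows(acl, host) {
  return acl.some((entry) => {
    if (entry.startsWith('*.')) {
      // Wildcard matches any subdomain of the suffix, not the bare domain.
      return host.endsWith(entry.slice(1)); // e.g. '.example.com'
    }
    return host === entry;
  });
}

const acl = ['*.example.com', 'fred.geocities.com'];
console.log(aclAllows(acl, 'www.example.com'));    // true
console.log(aclAllows(acl, 'fred.geocities.com')); // true
console.log(aclAllows(acl, 'mary.geocities.com')); // false
```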

One could create much more complex APIs, naturally, but I do not see
that this would solve the problems. It wouldn't solve the issue of
authors who don't understand the security implications of their code,
for instance. It also wouldn't prevent the security issue you
mentioned -- why couldn't all *.geocities.com sites cooperate to
violate the user's privacy? Or *.co.uk sites, for that matter? (Note
that it is already possible today to do such tracking with cookies; in
fact it's already possible today even without cookies if you use
Referer tracking, and even without Referer tracking one can use IP and
User-Agent fingerprinting combined with log analysis to perform quite
thorough tracking.)


== 3: Lack of privilege separation. ==
The proposal assumes that the shared data should be readable and
writable by all sub- and parent domains. I believe there is no reason why
this shouldn't be extended to provide 'access control' similar to that
implemented by standard file systems. For example, if I want to publish
an object called 'myKey' and make it accessible to other sites, it does
not automatically mean I want them to be able to modify or delete it. It
is important that global storage allow read-only access to variables if
it is to be widely adopted for information sharing between untrusting
parties.

Certainly one could add a .readonly field or some such to storage data
items, or even fully fledged ACL APIs, but I don't think that should
be available in a first version, and I'm not sure it's really useful
in later versions either.
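For what it's worth, such a flag would be a small wrapper; a minimal
sketch, purely illustrative and not in the spec (all names are mine):

```javascript
// Sketch of the '.readonly' idea: expose a view of a storage area on
// which reads succeed but writes throw.
function readOnlyView(items) {
  return {
    getItem: (k) => (items.has(k) ? items.get(k) : null),
    setItem: () => { throw new Error('read-only storage area'); },
  };
}

const items = new Map([['myKey', 'published value']]);
const view = readOnlyView(items);
console.log(view.getItem('myKey')); // 'published value'
try { view.setItem('myKey', 'overwrite'); } catch (e) { console.log(e.message); }
```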


== 4: Messy API requiring callbacks to handle concurrency. ==
The author uses a complicated method of handling concurrency by using
callbacks triggered by setItem() to interrupt processing in other open
pages (i.e., other tabs or frames) which could access the same data. Why
can I not simply lock the item during updates or long reads and force
other scripts to wait? While I'm unsure whether ECMAScript can handle
proper database-style transactions, it seems like it would be fairly easy
for the developer to implement critical sections by using shared storage
objects or metadata as mutexes and semaphores. I can't see what role the
callback mechanism would fulfill that could not be handled more easily
using traditional transactional logic.
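For example, something along these lines, using a reserved storage key as
a lock flag (a rough sketch with invented names; note that the
check-then-set is itself racy unless the UA serialises script access to
the store):

```javascript
// Sketch of a critical section built on a shared storage area (here a
// plain Map): a reserved key acts as an advisory mutex.
function withLock(storage, lockKey, fn) {
  if (storage.get(lockKey) === 'locked') return false; // another page holds it
  storage.set(lockKey, 'locked');                      // racy without UA serialisation
  try { fn(); } finally { storage.delete(lockKey); }   // always release
  return true;
}

const storage = new Map();
let ran = 0;
withLock(storage, '__mutex__', () => { ran += 1; });
console.log(ran); // 1
```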

I don't really understand what this is referring to. Could you show an
example of the transaction/callback system you refer to? The API is
intended to be really simple, just specify the item name and there you
go.


== Conclusion ==
In conclusion, it appears to me that the proposal is based on several
fundamentally flawed security assumptions and is overly complex. I see
this becoming a hiding place for viruses, malware, and tracking cookies.
Any sensible browser manufacturer would turn this feature off or limit
its scope, thus rendering it inoperable for the many beneficial uses it
would otherwise have. Those browsers that do support this proposal are
likely to do so in incompatible ways, due largely to the faults and
omissions in this proposal that it implies UAs will solve. It seems
like a large amount of browser sniffing will be required to have any
assurance that persistent storage will work as advertised. Therefore,
the global storage proposal must be fixed or removed.

While I agree that there are valid concerns, I believe they are all
addressed explicitly in the spec, with suggested solutions.

I would be interested in seeing a concrete proposal for a better
solution; I don't really see what a better solution would be.

--
Ian Hickson
