Ian Hickson wrote:

This is mentioned in the "Security and privacy" section; the third
bullet point here for example suggests blocking access to "public"
storage areas:

  http://whatwg.org/specs/web-apps/current-work/#user-tracking

I did read the suggestions and I know the authors have given these issues thought. However, my concern is that the solutions are all 'suggestions' rather than rules. I believe the standard should be more definitive to eliminate the potential for browser inconsistencies.

Yes, there's an entire section of the spec discussing this in detail,
with suggested solutions.

Again, the key word here is 'suggest'.

Indeed, the spec suggests blocking such access.

Suggest. See where I'm going with this? The spec is too loose.

There generally is; but for the two cases where there are not, see:

  http://whatwg.org/specs/web-apps/current-work/#storage

...and:

  http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his
subdomain space, he should be careful. But this goes without saying.
The same requirement (that authors be responsible) applies to all Web
technologies, for example CGI script authors must be careful not to
allow SQL injection attacks, must check Referer headers, must ensure
POST/GET requests are handled appropriately, and so forth.

As I pointed out, this only gives control to the parent domain, not the child, without regard for the real-world political relationship between the two. The implication is that the 'parent' domain is more trustworthy and important than the child - that it should always be able to read a subdomain's private user data.

The spec doesn't give the developer a chance to be responsible when it hands out user data to anybody in the domain hierarchy, regardless of whether they are a single, trusted entity or not. Don't blame the programmer when the spec dictates who can read and write the data with no regard for the author's preferences. CGI scripts generally do not have this limitation, so your analogy is irrelevant.
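
To make this concrete, here is a minimal sketch of the sharing model as I understand it, using the draft's globalStorage API (the domain names and the exact getItem/setItem semantics are assumed for illustration):

  // On http://mysite.geocities.com/ -- data placed in the shared
  // ancestor area is visible up and across the domain hierarchy:
  globalStorage['geocities.com'].setItem('prefs', 'theme=dark');

  // On http://geocities.com/, or on any sibling *.geocities.com
  // site, the same area can simply be read back:
  var prefs = globalStorage['geocities.com'].getItem('prefs');

Nothing in that API lets mysite.geocities.com mark the data as its own.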

Indeed; users at geocities.com shouldn't be using this service, and
geocities themselves should put their data (if any) in a private
subdomain space.

Geocities and other free-hosting sites generally offer a low server-side storage allowance, which means their users have a _greater_ need for persistent storage than 'real' domains.

It doesn't. The solution for mysite.geocities.com is to get their own domain.

That's a bit presumptuous; in fact, it's downright offensive. The user may have valid reasons for not buying a domain. Is it the WHATWG's role to dictate hosting requirements in a web standard?

The spec was written in conjunction with UA vendors. It is realistic
for UA vendors to provide a hardcoded list of TLDs; in fact, there is
significant work underway to create such a list (and have it be
regularly updated). That work was originally started for use with HTTP
Cookie implementations, which have similar problems, but would be very
useful for Storage API implementations (although, again as noted in
the draft, not imperative for a secure implementation if the author is
responsible).
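
For illustration, here is a minimal sketch of how a UA might consult such a hardcoded list before granting access to an ancestor storage area (the list contents and the helper function are my own invention, not from any spec):

  // Illustrative public-suffix check; the real list would be far
  // larger and regularly updated, as described above.
  var PUBLIC_SUFFIXES = ['com', 'net', 'org', 'co.uk', 'com.au'];

  function isPublicSuffix(domain) {
    for (var i = 0; i < PUBLIC_SUFFIXES.length; i++) {
      if (domain === PUBLIC_SUFFIXES[i]) return true;
    }
    return false;
  }

  // A UA would refuse globalStorage['co.uk'] because 'co.uk' is on
  // the list, while still allowing globalStorage['example.co.uk'].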

I accept that such a list is probably the answer; however, I believe the list should itself be standardised before becoming part of a web standard - otherwise we get more UA inconsistency.

One could create much more complex APIs, naturally, but I do not see
that this would solve the problems. It wouldn't solve the issue of
authors who don't understand the security implications of their code,
for instance. It also wouldn't prevent the security issue you
mentioned -- why couldn't all *.geocities.com sites cooperate to
violate the user's privacy? Or *.co.uk sites, for that matter? (Note
that it is already possible today to do such tracking with cookies; in
fact it's already possible today even without cookies if you use
Referer tracking, and even without Referer tracking one can use IP and
User-Agent fingerprinting combined with log analysis to perform quite
thorough tracking.)
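
Your point can even be illustrated with the storage proposal itself: cooperating subdomains could share one visitor identifier through the common storage area (a sketch only; API details assumed):

  // Run by any *.geocities.com page to share a visitor identifier
  // through the common 'geocities.com' storage area:
  var area = globalStorage['geocities.com'];
  var id = area.getItem('visitor-id');
  if (!id) {
    id = String(Math.random()).slice(2);  // crude unique token
    area.setItem('visitor-id', id);
  }
  // Each cooperating site can then report id to its own server.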

None of those techniques are reliable. My own weblogs show most users have the Referer field turned off. Cookies can be safely deleted after every session without a major impact on site function (I may have to log in again). IP tracking is mitigated by proxies and NATs.

The trouble with this proposal is that it would allow important data to get lumped in with tracking data, while the spec suggests that UAs should only delete the storage when explicitly asked to do so. I don't have a solution to this other than to revoke the proposal or prevent the sharing of storage between sites. I accept tracking is inevitable, but we shouldn't be making it easier either.

Certainly one could add a .readonly field or some such to storage data
items, or even fully fledged ACL APIs, but I don't think that should
be available in a first version, and I'm not sure it's really useful
in later versions either.

Is it any more complex, or any less useful, than the .secure flag? Readonly is an essential attribute in any shared data system, from databases to filesystems. Would you advocate that all websites be world-writable just to simplify the API? Not that .readonly should be hard to implement, as we already have metadata attached to each key.
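
A rough sketch of what I have in mind (the third argument and the .readonly attribute are hypothetical, modelled on the per-item metadata the draft already keeps):

  // Hypothetical: mark an item read-only for other pages in the
  // shared area when it is first set (third argument invented):
  storage.setItem('prefs', 'theme=dark', true /* readOnly */);

  // Any later reader could inspect the flag alongside the .secure
  // metadata the draft already defines for each item:
  var item = storage.getItem('prefs');
  if (item && item.readOnly) {
    // other principals may read, but not overwrite, this entry
  }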

I don't really understand what this is referring to. Could you show an
example of the transaction/callback system you refer to? The API is
intended to be really simple, just specify the item name and there you
go.

I'm referring to the "storage" event described in section 5.9.6, which is fired in all active pages as data changes. This is an unusual procedure that needs a better justification than those given in the spec. If the event pulls me out of my current function, then how am I going to do anything useful with the application state (without really knowing where execution was interrupted)?
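
For reference, this is roughly the handler shape I understand the draft to imply (the event attributes and listener target are assumed from the draft, not confirmed):

  // A "storage" event listener per section 5.9.6; fires whenever
  // another page changes the shared storage area:
  window.addEventListener('storage', function (e) {
    if (e.key === 'prefs') {
      refreshPreferences(e.newValue);  // hypothetical app function
    }
  }, false);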

While I agree that there are valid concerns, I believe they are all
addressed explicitly in the spec, with suggested solutions.

Your points are also quite valid, but they ignore the root of my concern - which is that the spec leaves too much up to the UA to resolve. I don't see how you can explicitly define something with a suggestion! The whole spec kind of 'hopes' that many disparate companies/groups will cooperate to make persistent storage work consistently across browsers. They might, but given both Microsoft's and Netscape's track records, I think things need to be more concrete in such an important spec.

I would be interested in seeing a concrete proposal for a better
solution; I don't really see what a better solution would be.

I'm not sure myself, but I don't think it can stay the way it is. I would be happy to offer a better proposal, or to update the current one, given enough time to consider it.

As a quick thought, the simplest approach might be to require that the site send a secret hash or public key in order to prove it 'owns' the key. The secret could even be a timestamp of the exact time the key was set, or just a hash of the user's site login, e.g.:

  DOMAIN     KEY    SECRET                      DATA
  foo.bar    baz    kj43h545j34h6jk534dfytyf    A string.

Just one idea.
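
In API terms the idea might look something like this (the extra argument and the rejection behaviour are invented; this is only a sketch of the concept):

  // The secret from the table above, however the site derives it:
  var secret = 'kj43h545j34h6jk534dfytyf';

  // The first write establishes the secret along with the key:
  storage.setItem('baz', 'A string.', secret);

  // Later writes must present the same secret to prove ownership:
  storage.setItem('baz', 'New value.', secret);    // accepted
  storage.setItem('baz', 'Evil value.', 'guess');  // rejected by UA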

Shannon
Web Developer
