Re: [webkit-dev] Security Origins

2009-06-02 Thread Adam Barth
On Mon, Jun 1, 2009 at 8:29 PM, Jeremy Orlow jor...@chromium.org wrote:
 If this is the only issue, the parsing code could work around it.  There are
 3 parts to the identifier: the protocol (should never have a _ in it,
 right?), the domain, and the port (once again, should never have a _).  It
 seems as though the parsing code should look for the first _, the last _,
 and then assume everything in the middle is the domain.

I don't know of any reason why a scheme can't have a _...  If you'd
like to change the parsing code to understand domains with a _ this
way,  I think that would be an improvement.

 Doesn't seem like a top priority, but maybe it's worth filing a low priority
 bug for it anyway.  Although they are 2 somewhat distinct use cases and I
 see the possibility for misuse and bad assumptions, I'm not terribly worried
 about it.

I think HTML 5 has notions of origin and effective script origin
(or some such) that separate these two concepts.  It might be worth
syncing up our internal names with the spec to make these concepts
more accessible to future developers.

Adam
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Security Origins

2009-06-02 Thread Jeremy Orlow
On Mon, Jun 1, 2009 at 11:30 PM, Adam Barth aba...@webkit.org wrote:

 On Mon, Jun 1, 2009 at 8:29 PM, Jeremy Orlow jor...@chromium.org wrote:
  If this is the only issue, the parsing code could work around it.  There
 are
  3 parts to the identifier: the protocol (should never have a _ in it,
  right?), the domain, and the port (once again, should never have a _).
  It
  seems as though the parsing code should look for the first _, the last _,
  and then assume everything in the middle is the domain.

 I don't know of any reason why a scheme can't have a _...  If you'd
 like to change the parsing code to understand domains with a _ this
 way,  I think that would be an improvement.

  Doesn't seem like a top priority, but maybe it's worth filing a low
 priority
  bug for it anyway.  Although they are 2 somewhat distinct use cases and I
  see the possibility for misuse and bad assumptions, I'm not terribly
 worried
  about it.

 I think HTML 5 has notions of origin and effective script origin
 (or some such) that separate these two concepts.  It might be worth
 syncing up our internal names with the spec to make these concepts
 more accessible to future developers.


Agreed.  Most of the new features use the simpler same origin policy which
compares only the protocol, port, and domain.  The effective script origin
 and the security around cookies are the more complex parts.  I believe
there's a fairly clean split between the two parts in the source code.  I'll
file a bug tomorrow regarding this opportunity for cleanup.

J
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Security Origins

2009-06-02 Thread Jeremy Orlow
FYI: https://bugs.webkit.org/show_bug.cgi?id=26143

On Tue, Jun 2, 2009 at 12:45 AM, Jeremy Orlow jor...@chromium.org wrote:

 On Mon, Jun 1, 2009 at 11:30 PM, Adam Barth aba...@webkit.org wrote:

 On Mon, Jun 1, 2009 at 8:29 PM, Jeremy Orlow jor...@chromium.org wrote:
  If this is the only issue, the parsing code could work around it.  There
 are
  3 parts to the identifier: the protocol (should never have a _ in it,
  right?), the domain, and the port (once again, should never have a _).
  It
  seems as though the parsing code should look for the first _, the last
 _,
  and then assume everything in the middle is the domain.

 I don't know of any reason why a scheme can't have a _...  If you'd
 like to change the parsing code to understand domains with a _ this
 way,  I think that would be an improvement.

  Doesn't seem like a top priority, but maybe it's worth filing a low
 priority
  bug for it anyway.  Although they are 2 somewhat distinct use cases and
 I
  see the possibility for misuse and bad assumptions, I'm not terribly
 worried
  about it.

 I think HTML 5 has notions of origin and effective script origin
 (or some such) that separate these two concepts.  It might be worth
 syncing up our internal names with the spec to make these concepts
 more accessible to future developers.


 Agreed.  Most of the new features use the simpler same origin policy which
 compares only the protocol, port, and domain.  The effective script origin
  and the security around cookies are the more complex parts.  I believe
 there's a fairly clean split between the two parts in the source code.  I'll
 file a bug tomorrow regarding this opportunity for cleanup.

 J

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


[webkit-dev] Security Origins

2009-06-01 Thread Jeremy Orlow
I have 2 questions about SecurityOrigins.


First of all, in SecurityOrigin::databaseIdentifier() (in
http://trac.webkit.org/browser/trunk/WebCore/page/SecurityOrigin.h) the
following comment appears: Serialize the security origin for storage in the
database. This format is deprecated and should be used only for
compatibility with old databases; use toString() and createFromString()
instead.  This was done in http://trac.webkit.org/changeset/32597

Despite the comments in the change log, I don't fully understand what the
intention here is.  DatabaseTracker and LocalStorage still use
databaseIdentifier() which makes sense since, unless I'm totally missing
something, the toString() format produces file names that are invalid on
most file systems.  At the same time, databaseIdentifier omits some of the
logic found in toString (which might be the cause of bugs like
https://bugs.webkit.org/show_bug.cgi?id=20701 ...I haven't confirmed yet).

So, maybe instead of databaseIdentifier() being deprecated, it should be
updated to be more like a toString that uses file system safe characters?


In addition, the creation of SecurityOrigin objects doesn't quite seem
right.  There are many places (like in DatabaseTracker) where security
origins are created based on the file, port, and protocol.  This is fine
when all you're doing is .equals (which the HashMap's
HashArg=SecurityOriginHash prarmeter does for you), but it seems like it
could be inefficient (many SecurityOrigin objects describing the same
origin) and could potentially lead to bugs (since not all of the properties
are serialized during toString()/databaseIdentifier()).

It's surprising to me that there isn't one central store of SecurityOrigin
objects (maybe with weak references?) to eliminate the possibility of 2
SecurityOrigin objects describing the same origin.


In general, I suppose I'm just trying to understand the thought behind
SecurityOrigin objects and how they're used in the source code...because it
doesn't seem very natural to me.

Thanks,
Jeremy
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Security Origins

2009-06-01 Thread Adam Barth
Hi Jeremy,

Some of this has evolved and could likely be cleaned up.

2009/6/1 Jeremy Orlow jor...@chromium.org:
 First of all, in SecurityOrigin::databaseIdentifier() (in 
 http://trac.webkit.org/browser/trunk/WebCore/page/SecurityOrigin.h) the 
 following comment appears: Serialize the security origin for storage in the 
 database. This format is deprecated and should be used only for compatibility 
 with old databases; use toString() and createFromString() instead.  This was 
 done in http://trac.webkit.org/changeset/32597

There are a bunch of problems with
SecurityOrigin::databaseIdentifier(), not the least of which is that
_ can't really be used as a delimiter because some intranet host
names contain the character in their names.  I'd rather we didn't used
databaseIdentifier() and had a single string representation we used
everywhere.

 Despite the comments in the change log, I don't fully understand what the 
 intention here is.  DatabaseTracker and LocalStorage still use 
 databaseIdentifier() which makes sense since, unless I'm totally missing 
 something, the toString() format produces file names that are invalid on most 
 file systems.

I don't see why these string representations need to be used for file
names.  The advantages of the toString() serialization are (1) we need
it for postMessage and cross-origin XMLHttpRequest, and (2) we already
have an awesome parser for it called KURL.

 At the same time, databaseIdentifier omits some of the logic found in 
toString (which might be the cause of bugs like 
https://bugs.webkit.org/show_bug.cgi?id=20701 ...I haven't confirmed yet).

I'll look at that bug once I have Internet access (currently on the train).

 So, maybe instead of databaseIdentifier() being deprecated, it should be 
 updated to be more like a toString that uses file system safe characters?

Late time I looked into this, I was told we couldn't change
databaseIdentifier() because users had files stored on their computer
that depended on the serialization.

 In addition, the creation of SecurityOrigin objects doesn't quite seem 
 right.  There are many places (like in DatabaseTracker) where security 
 origins are created based on the file, port, and protocol.  This is fine when 
 all you're doing is .equals (which the HashMap's HashArg=SecurityOriginHash 
 prarmeter does for you), but it seems like it could be inefficient (many 
 SecurityOrigin objects describing the same origin) and could potentially lead 
 to bugs (since not all of the properties are serialized during 
 toString()/databaseIdentifier()).

I suspect we can improve this.

 It's surprising to me that there isn't one central store of SecurityOrigin 
 objects (maybe with weak references?) to eliminate the possibility of 2 
 SecurityOrigin objects describing the same origin.

The problem is that SecurityOrigin objects are mutable.  For example,
a SecurityOrigin mutates if you set its document.domain property
(called domain).  Also, certain SecurityOrigin objects might be
granted certain privileges, and we might not be able to do that if
everyone shared the same object.

 In general, I suppose I'm just trying to understand the thought behind 
 SecurityOrigin objects and how they're used in the source code...because it 
 doesn't seem very natural to me.

It might be productive to invent a class StorageOrigin or some such
that is immutable, and therefore can be shared and moved across
threads.  That way the storage system wouldn't be tempted to use
SecurityOrigin features that don't make sense for storage (e.g.,
grantLoadLocalResources).

Adam
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Security Origins

2009-06-01 Thread Sam Weinig
On Mon, Jun 1, 2009 at 3:50 PM, Jeremy Orlow jor...@chromium.org wrote:

 I have 2 questions about SecurityOrigins.


 First of all, in SecurityOrigin::databaseIdentifier() (in
 http://trac.webkit.org/browser/trunk/WebCore/page/SecurityOrigin.h) the
 following comment appears: Serialize the security origin for storage in the
 database. This format is deprecated and should be used only for
 compatibility with old databases; use toString() and createFromString()
 instead.  This was done in http://trac.webkit.org/changeset/32597

 Despite the comments in the change log, I don't fully understand what the
 intention here is.  DatabaseTracker and LocalStorage still use
 databaseIdentifier() which makes sense since, unless I'm totally missing
 something, the toString() format produces file names that are invalid on
 most file systems.  At the same time, databaseIdentifier omits some of the
 logic found in toString (which might be the cause of bugs like
 https://bugs.webkit.org/show_bug.cgi?id=20701 ...I haven't confirmed yet).

 So, maybe instead of databaseIdentifier() being deprecated, it should be
 updated to be more like a toString that uses file system safe characters?


It is deprecated in the sense that only Database code should use it.  We
can't make Database code use toString() because there are existing clients
that depend on the identifier to remain the same.


 In addition, the creation of SecurityOrigin objects doesn't quite seem
 right.  There are many places (like in DatabaseTracker) where security
 origins are created based on the file, port, and protocol.  This is fine
 when all you're doing is .equals (which the HashMap's
 HashArg=SecurityOriginHash prarmeter does for you), but it seems like it
 could be inefficient (many SecurityOrigin objects describing the same
 origin) and could potentially lead to bugs (since not all of the properties
 are serialized during toString()/databaseIdentifier()).

 It's surprising to me that there isn't one central store of SecurityOrigin
 objects (maybe with weak references?) to eliminate the possibility of 2
 SecurityOrigin objects describing the same origin.


I suspect SecurityOrigins can be optimized, but as of yet they have not
appeared on profiles.  I would like to improve SecurityOrigins to better
represent the concepts of effective script origin and origins representing
resources more effectively.




 In general, I suppose I'm just trying to understand the thought behind
 SecurityOrigin objects and how they're used in the source code...because it
 doesn't seem very natural to me.


SecurityOrigin was created to cleanup the way we do same-origin checks in
the JavaScript bindings, and as such, carry some of that baggage (eg. domain
relaxing).  The main design constraint then was that it was a cheap compare
for two origins representing the same document (pointer compare) as that is
common case.  The code evolved from there.

-Sam
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev