Re: [whatwg] Annotating structured data that HTML has no semantics for
A cursory glance at the new section 5 raises two questions on indirection:

> (Note the <meta>s in the last example -- since sometimes the information isn't visible, rather than requiring that people put it in and hide it with display:none, which has a rather poor accessibility story, I figured we could just allow <meta> anywhere, if it has a property="" attribute.)

That seems to be a solution optimised for entirely invisible metadata, but not for metadata which differs from the human-visible data. Imagine as an example the simple act of marking up a number (and ignoring what the number denotes). For human consumption a thousands separator is often used, and the type of separator differs by language, locale and context. Just in my little world I see on a regular basis the point, the comma, the space, the thin space and sometimes the apostrophe. Parsing different representations of numbers would be a chore.

The value of textContent of the element <span itemprop="com.example.price">€ 1&thinsp;000&thinsp;000,—</span> is clearly unusable, demanding an additional invisible <meta property="com.example.price" content="100">.

My irritation lies in the element proliferation, requiring one element/attribute combination for machines and one element/text content combination for humans. Of course, any sane author would arrange both elements in a close relation, as parent/child or sibling, but there would still be two different elements to maintain, leading to a higher cognitive load. Not just for authors but also for programmers: a fluctuating price has to be updated on two different elements; tree-walking DOM scripts have to take meta elements into account.

Furthermore, it clashes with the familiar habit of other elements in HTML. A hyperlink is one element with a machine-readable attribute and human-readable text content. A citation is one element with a machine-readable reference and human-readable text content. The same model is used in , , , ... but not in user-defined objects.
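To make the point about locale-formatted numbers concrete, here is a minimal sketch in JavaScript. It assumes plain-object stand-ins for elements (no browser API is used) and a hypothetical `content` override on the visible element, which is the kind of attribute this message asks for, not something the draft defines. The value `1000000` and the names `priceElement` and `propertyValue` are illustrative assumptions.

```javascript
// Plain-object stand-in for a DOM element; no browser API is used here.
// The "content" override on the visible element is hypothetical.
const priceElement = {
  attributes: { itemprop: "com.example.price", content: "1000000" },
  // Visible text: euro sign, thin-space separators, ",\u2014" suffix.
  textContent: "\u20AC 1\u2009000\u2009000,\u2014",
};

// Return the machine-readable value of a property element: the content
// attribute when present, otherwise the human-visible text.
function propertyValue(element) {
  const content = element.attributes.content;
  return content !== undefined ? content : element.textContent;
}

// Naive numeric conversion of the visible text, for comparison.
const fromText = Number(priceElement.textContent);

// Conversion of the machine-readable value carried by the same element.
const fromAttribute = Number(propertyValue(priceElement));
```

With a single element carrying both forms, a script that updates the price touches one node, and a consumer never has to guess which separator convention the visible text follows.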
I'd prefer an additional @content-like attribute which supersedes the text content, and maybe even the default values of the other value-bearing elements, reducing the two different elements to maintain or change to just one.

> Instead, let us try using the regular "IDREF" functionality that HTML uses in a variety of other places, like . For this we'll need a new attribute, but unfortunately we can't use about="" (which would be the obvious name to use), because that would conflict with RDFa, so instead we'll use subject="":

I'm slightly irritated by the implied change from active, possessive phrasing (“The cat has the name Hedral.”) to something more passive-y (“Hedral is a name owned by that cat.”). My mental model for property relationships orients itself more on the former wording; link relationships are similar in that regard. @about/@subject are like @rev; a @resource alias @rel would feel more natural.

There are practical problems caused by the missing @resource, I think. Imagine a document documenting a household, and a household vocabulary which allows triples of s which are in an relationship to a . Given a household of two humans and one cat, how does one mark up the assumption that the cat has two owners?
Re: [whatwg] Expandos and Prototyping
On Mon, 11 May 2009, Charles Pritchard wrote:
>
> Are expando / prototype functions at all included in the HTML 5 specs?

Yes. Specifically, HTML5 uses WebIDL for all its definitions of interfaces, objects, etc, and the WebIDL spec defines how prototypes, custom properties, etc, work:

   http://dev.w3.org/2006/webapi/WebIDL/

-- 
Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
[whatwg] Expandos and Prototyping
Are expando / prototype functions at all included in the HTML 5 specs?

While we may all know what Object.prototype does, I'd like to see its use added to Section 6: Web browsers. The Prototype Expando is not necessarily a Javascript-only construct, and neither is HTML 5.

While I'm not championing full prototype inheritance, I do wonder (out loud) whether some small section of HTML 5 might describe the most basic of prototyping and expandos. Many projects use "ellipse" or other shapes for example, but this is easier:

    CanvasRenderingContext2D.prototype.funcName = function() {
        alert("Fill" + this.fillStyle);
    };
    document.createElement('canvas').getContext('2d').funcName();

I've never seen any developers attempt to use multiple inheritance within the CanvasRenderingContext2D object, nor have I tested myself to see if Firefox (the champion of such schemes) supports it. Which is why I'd be more than satisfied simply requiring single inheritance. It's already available in all implementations, and we spent a good deal of time making it available in our own.

Expando Prototype would need descriptions of: expandos, prototyped objects, for(... in ...)

All modern browsers support prototype, and so do many languages (without writing libraries). We've confirmed that expandos and prototypes work just fine in ActiveX; MS long ago created IDispatchEx. Any host language with getter / setter availability can implement prototyping and expandos on an object, at least to a depth of one.

I'd like to see ".prototype" described in the scripting section. That said, I'm more hesitant to champion ".constructor" and ".__proto__".

-Charles
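The three things the message asks to have described (expandos, prototyped objects, and for (... in ...)) can be sketched without a browser. In the JavaScript below, `Shape` is a hypothetical stand-in for a host interface such as CanvasRenderingContext2D, and `describeFill` and `label` are illustrative names, not part of any specification.

```javascript
// Shape is a hypothetical stand-in for a host interface such as
// CanvasRenderingContext2D, so the sketch runs outside a browser.
function Shape(fillStyle) {
  this.fillStyle = fillStyle;
}

// Prototype expando: every existing and future instance gains the method.
Shape.prototype.describeFill = function () {
  return "Fill " + this.fillStyle;
};

const shape = new Shape("#f00");

// Instance expando: only this one object gains the property.
shape.label = "demo";

// for (... in ...) visits the object's own enumerable properties and
// the enumerable properties inherited through the prototype chain.
const seen = [];
for (const key in shape) {
  seen.push(key);
}
```

Here `seen` contains the own properties `fillStyle` and `label` as well as the inherited `describeFill`, which is the single level of inheritance the message says would be enough.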
Re: [whatwg] innerStaticHTML
On 06.05.2009, at 17:31, Adam Barth wrote:

> WHY NOT toStaticHTML?
>
> toStaticHTML addresses the same use case by translating an untrusted string to another string that lacks active HTML content. This API has two issues:
>
> 1) The untrusted string -> static string -> HTML parser workflow requires the browser to parse the string twice, introducing a performance penalty and a security issue if the two parsings aren't identical.

That is based on the assumptions that:

1. parsing is expensive enough to warrant an API optimized for this particular case
2. browsers cannot optimize it otherwise
3. returned code will be ambiguous

In client-side scripts untrusted content comes from the network, which means that parsing time is going to be minuscule compared to the time required to fetch the content (and to render it). My guess is that parsing itself is not a bottleneck.

Second, it _is_ possible to avoid reparsing without a special API for this. toStaticHTML() may return a subclass of String that contains a reference to the parsed DOM. Roughly something like this:

    function toStaticHTML(html) {
        var cleanDOM = clean(parse(html));
        return {
            toString: function () { return unparse(cleanDOM); },
            node: cleanDOM
        };
    }

which should make the common case:

    innerHTML = toStaticHTML(html)

just as fast as:

    innerStaticHTML = html;

toStaticHTML() enables other optimisations, e.g. filtered HTML can be saved for future use (in local storage), or a string filtered once can be used in multiple places.

Alternatively there could be a toStaticDOM() method that returns a DOMDocumentFragment, avoiding the reparsing issue entirely.

> 2) The API is difficult to future-proof because future versions of HTML are likely to add new tags with active content (e.g., like the tag's event handlers).

When support for a new tag is added to a browser, it would also be added to its toStaticHTML()/innerStaticHTML, so evolution of HTML shouldn't be a problem either way. A browser doesn't need to worry about dangerous constructs it does not support.

Methods are easier to patch than properties in JavaScript, so if the implementation of an existing toStaticHTML() turned out to be insecure, the method could easily be replaced/patched on the client side, or applications could post-process the output of toStaticHTML(). It's not that easy with a property.

I dislike APIs based on magic properties. Properties cannot take arguments, and we'd have to create a new property for every combination of arguments. If innerHTML were a method, instead of creating a new property we could extend it to be innerHTML(html, static=true).

If more sophisticated filtering becomes needed in the future, we could have toStaticHTML(html, {preserve:['svg','rdf'], remove:'marquee'}), but it would be silly to create another innerStaticHTMLwithSVGandRDFbutWithoutMarquee property.

-- 
regards, Kornel
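The claim that a method can be patched or post-processed by the page can be sketched as follows. This is an illustration under loud assumptions: `ua.toStaticHTML` is a hypothetical stand-in with a deliberately introduced gap, not IE8's algorithm, and the regular expressions are toy filters that are not a safe way to sanitize real HTML.

```javascript
// Hypothetical stand-in for a UA-provided sanitizer with a known gap:
// it strips <script> elements but misses inline event handlers.
// The regular expressions are toy filters for illustration only and
// are NOT a safe way to sanitize HTML.
const ua = {
  toStaticHTML(html) {
    return html.replace(/<script\b[\s\S]*?<\/script>/gi, "");
  },
};

// Page-level patch: keep a reference to the original method, then wrap
// it so that its output is post-processed before reaching the caller.
const originalToStaticHTML = ua.toStaticHTML;
ua.toStaticHTML = function (html) {
  const filtered = originalToStaticHTML.call(this, html);
  // Remove double-quoted on* attributes that the original let through.
  return filtered.replace(/\son\w+\s*=\s*"[^"]*"/gi, "");
};

const untrusted = '<p onclick="steal()">hi</p><script>steal()</script>';
const safe = ua.toStaticHTML(untrusted);
```

Callers keep using the same name, so the patch needs no changes at call sites. A setter-style property offers no comparable place to pass options or to chain a post-processing step.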
Re: [whatwg] Custom microdata handling added to HTML5 spec
On Sun, 10 May 2009, Manu Sporny wrote:
> Shelley Powers wrote:
> > Since a new section detailing HTML5's handling of custom microdata has been added to the HTML5 spec
> >
> > http://dev.w3.org/html5/spec/Overview.html#microdata
>
> I've only had a brief chance to look over the HTML5 Microdata spec, but there is one big problem that overrides all of the other issues: The HTML5 Microdata spec is in direct conflict with planned RDFa extensions and will almost surely result in spurious triples being generated in RDFa processors in the future.

I've renamed property="" to itemprop="".

-- 
Ian Hickson
Re: [whatwg] innerStaticHTML
On Tue, May 12, 2009 at 4:16 AM, Adam Barth wrote:
> On Thu, May 7, 2009 at 3:24 AM, Kristof Zelechovski wrote:
> > If toStaticHTML prunes everything it is not sure of, the danger of a known language construct suddenly introducing active content is negligible. I am sure HTML5 specification editors bear that aspect in mind and so shall they in the future.
>
> Even if you believe that we've already committed to not introducing active content that breaks toStaticHTML (which I'm not convinced we have, especially because I don't know what algorithm it uses)

I would be shocked if we have committed to not introducing active content that breaks IE8's toStaticHTML. That would be terribly limiting. (Does it prune the and event attributes?)

When you call innerStaticHTML it should prune everything that's unsafe for *this UA*. Authors should not send that content to other UAs and expect it to be safe for those UAs.

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all." [Isaiah 53:5-6]
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Mon, May 11, 2009 at 6:15 PM, Giovanni Gentili wrote:
> * a user (or groups of users) wants to annotate items present on a generic web page with additional properties in a certain vocabulary. for example Joe wants to gather in a blog a series of personal annotation to movies (or other type of items) present in imdb.com.
>
> [...]
>
> this option require that @subject accept:
>
> 1) ID of an element with an item attribute, in the same Document
> or
> 2) valid URL of an element with an item attribute elsewhere in the web
> or
> 3) a valid URL (the item is the referred document or fragment)

For the RDF output, you can use http://subject/"> to create triples whose subject is a URL. (I believe in general you can also do: http://subject/"> http://predicate1/" href="http://object1/"> http://predicate2/" content="object2"> to represent arbitrary RDF triples.)

I don't think it would make sense for @subject to be a URL when generating JSON output, because there wouldn't be anywhere to represent that URL in the output structure. But there could be a convention that properties called "about" indicate the URLs that the item applies to, and then it would work with exactly the same markup as the RDF case.

-- 
Philip Taylor
exc...@gmail.com
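A rough sketch of the "about" convention suggested here, in JavaScript. The shape of the extracted item is an assumption made for illustration (a plain object, with `{ url: ... }` marking URL-valued properties); the draft does not define this structure, and `toTriples` is a hypothetical helper.

```javascript
// Hypothetical extracted item, as a JSON-style microdata extraction
// might produce it. URL-valued properties are wrapped as { url: ... }
// to keep them distinct from literal strings; that wrapping is an
// assumption of this sketch.
const item = {
  about: { url: "http://subject/" },
  "http://predicate1/": { url: "http://object1/" },
  "http://predicate2/": "object2",
};

// Turn an item into [subject, predicate, object] triples, treating the
// "about" property (a convention, not a spec rule) as the subject URL.
function toTriples(extracted) {
  const subject = extracted.about.url;
  const triples = [];
  for (const [predicate, object] of Object.entries(extracted)) {
    if (predicate === "about") continue; // already used as the subject
    triples.push([subject, predicate, object]);
  }
  return triples;
}

const triples = toTriples(item);
```

The same object serves both consumers: a JSON consumer reads `about` as an ordinary property, while an RDF consumer promotes it to the subject of every triple.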
Re: [whatwg] Annotating structured data that HTML has no semantics for
Ian Hickson:
> USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community.
> (..)
> SCENARIOS:

Among the scenarios, this case should also be considered:

* a user (or group of users) wants to annotate items present on a generic web page with additional properties in a certain vocabulary. For example, Joe wants to gather in a blog a series of personal annotations to movies (or other types of items) present on imdb.com.

Other examples of "external annotation" could be derived from this document [1].

This option requires that @subject accept:

1) the ID of an element with an item attribute, in the same Document
or
2) a valid URL of an element with an item attribute elsewhere on the web
or
3) a valid URL (the item is the referred document or fragment)

This raises two other questions:

a) For properties specified on an element that has no ancestor with an item attribute, should the corresponding item be the document (element body with an implicit item attribute)?

b) Do we need to require UAs to offer a standard way to visualize (at least as an option left to the user) the structured information carried in microdata? And copy & paste? See also this email [2].

[1] http://www.w3.org/TR/2009/WD-media-annot-reqs-20090119/#req-r01
[2] http://lists.w3.org/Archives/Public/public-html/2009Jan/0082.html

-- 
Giovanni Gentili
Re: [whatwg] innerStaticHTML
On Wed, May 6, 2009 at 9:40 AM, João Eiras wrote:
> The suggestion of marking content as non-executable doesn't solve anything, because after setting innerStaticHTML another script might serialize a piece of the affected DOM to string and back to a tree, and the code could then execute, which would not be wanted.

Yes, we can't make it impossible for web developers to shoot themselves in the foot. We also can't stop them from calling eval on a query string argument. However, innerStaticHTML does make it easier to display untrusted HTML to the user.

> The only viable solution, from my point of view, would be for the UA to parse the string, and remove all untrusted content from the result tree before appending to the document.

This is what I meant to suggest.

> That would mean removing all onevent attributes, all scripts elements, all plugins, etc. Basically, letting the UA implement all the filtering.

Exactly. As you say, the UA is in a much better position to do this correctly than an individual web site.

On Thu, May 7, 2009 at 3:24 AM, Kristof Zelechovski wrote:
> If toStaticHTML prunes everything it is not sure of, the danger of a known language construct suddenly introducing active content is negligible. I am sure HTML5 specification editors bear that aspect in mind and so shall they in the future.

Even if you believe that we've already committed to not introducing active content that breaks toStaticHTML (which I'm not convinced we have, especially because I don't know what algorithm it uses), that still leaves the performance and correctness issues of parsing the untrusted content twice. Parsing the content once is more efficient and more predictable.

Adam
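The parse-once approach discussed in this message (parse the untrusted string, prune the resulting tree, then append it) can be sketched on a toy tree. Everything below is an assumption for illustration: the `{ tag, attrs, children }` node shape, the `ALLOWED_TAGS` list, and `pruneTree` itself. A real UA would apply its own rules to its own parser output, and this sketch does not cover other vectors such as `javascript:` URLs.

```javascript
// Toy parse tree: { tag, attrs, children } objects, with plain strings
// for text nodes. The allowlist is an illustrative assumption.
const ALLOWED_TAGS = new Set(["div", "p", "b", "i", "a", "span"]);

// Return a pruned copy of the tree, or null if the node must be dropped.
function pruneTree(node) {
  if (typeof node === "string") return node; // text is inert
  if (!ALLOWED_TAGS.has(node.tag)) return null; // script, object, unknown
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (/^on/i.test(name)) continue; // drop event handler attributes
    attrs[name] = value;
  }
  const children = (node.children || [])
    .map(pruneTree)
    .filter((child) => child !== null);
  return { tag: node.tag, attrs, children };
}

const untrustedTree = {
  tag: "div",
  attrs: { onclick: "steal()", class: "c" },
  children: [
    "hello",
    { tag: "script", attrs: {}, children: ["steal()"] },
    { tag: "p", attrs: { onmouseover: "x" }, children: ["ok"] },
  ],
};

const cleanTree = pruneTree(untrustedTree);
```

Because the filter works on the tree that will actually be inserted, there is no second parse whose result could differ from the first. Using an allowlist also follows the "prune everything it is not sure of" rule quoted above.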
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Sun, 10 May 2009 12:32:34 +0200, Ian Hickson wrote:

> Page 3: My Cats Schrödinger Orange male. Erwin Siamese color-point.

Given the microdata solution and this example, there is now a reason other than styling to introduce , since here you duplicate the information in .

Schrödinger Orange male. ...

The styling problem is discussed at http://forums.whatwg.org/viewtopic.php?t=47

-- 
Simon Pieters
Opera Software