date:20090511

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-11 Thread Tim Tepaße

A cursory glance at the new section 5 raises two questions about
indirection:

> (Note the <meta>s in the last example -- since sometimes the
> information isn't visible, rather than requiring that people put it in
> and hide it with display:none, which has a rather poor accessibility
> story, I figured we could just allow <meta> anywhere, if it has a
> property="" attribute.)

That seems to be a solution optimised for entirely invisible metadata
but not for metadata which differs from the human-visible data.
Imagine as an example the simple act of marking up a number (and
ignoring what the number denotes). For human consumption a thousands
separator is often used; the type of separator differs by language,
locale and context. Just in my little world I regularly see the
point, the comma, the space, the thin space and sometimes the
apostrophe. Parsing different representations of numbers would be a
chore. The textContent value of the element
<span itemprop="com.example.price">€ 1&thinsp;000&thinsp;000,—</span>
is clearly unusable, demanding an additional invisible
<meta property="com.example.price" content="100">.

My irritation lies in the element proliferation: one element/attribute
combination for machines, one element/text content combination for
humans. Of course, any sane author would keep both elements close
together, as parent/child or siblings, but there would still be two
different elements to maintain, leading to a higher cognitive load.
Not just for authors but also for programmers: a fluctuating price
would have to be updated on two different elements; tree-walking DOM
scripts would have to take <meta> elements into account. Furthermore,
it clashes with the familiar behaviour of other elements in HTML. A
hyperlink is one element with a machine-readable attribute and
human-readable text content. A citation is one element with a
machine-readable reference and human-readable text content. The same
model is used in , , , ... but not in user-defined objects. I'd prefer
an additional @content-like attribute which supersedes the text content
and maybe even the default values of the other value-bearing elements,
reducing the two elements to maintain or change to just one.
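
The lookup rule asked for here can be sketched in a few lines. This is only an illustration under stated assumptions: elements are modelled as plain objects rather than DOM nodes, the "content" field stands in for the proposed @content-like attribute, and the names propertyValue and com.example.name as well as the value "1000000" are hypothetical.

```javascript
// Toy element records (assumption): plain objects, not real DOM nodes.
// The "content" field models the proposed override attribute.
function propertyValue(element) {
  // A machine-readable value, when present, supersedes the visible text.
  if (typeof element.content === "string") {
    return element.content;
  }
  return element.textContent;
}

var price = {
  itemprop: "com.example.price",
  content: "1000000", // assumed machine-readable value
  textContent: "\u20AC 1\u2009000\u2009000,\u2014" // euro sign, thin spaces, dash
};
var name = { itemprop: "com.example.name", textContent: "Hedral" };

// One element carries both representations.
var machinePrice = Number(propertyValue(price));
var machineName = propertyValue(name);

// Reading the human-oriented text directly does not yield a number.
var naiveParse = Number(price.textContent);
```

With a single element per property, a script that updates the price changes one node, and a consumer never has to guess the locale of the visible text.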

> Instead, let us try using the regular "IDREF" functionality that HTML
> uses in a variety of other places, like . For this we'll need a new
> attribute, but unfortunately we can't use about="" (which would be the
> obvious name to use), because that would conflict with RDFa, so
> instead we'll use subject="":

I'm slightly irritated by the implied change from an active, possessive
formulation (“The cat has the name Hedral.”) to something more
passive-y (“Hedral is a name owned by that cat.”). My mental model for
property relationships follows the former wording; link relationships
are similar in that regard. @about/@subject are like @rev; a @resource
alias @rel would feel more natural. There are practical consequences
of the missing @resource, I think. Imagine a document describing a
household and a household vocabulary which allows triples of s which
are in an relationship to a . Given a household of two humans and one
cat, how does one mark up the assumption that the cat has two owners?

Re: [whatwg] Expandos and Prototyping

2009-05-11 Thread Ian Hickson

On Mon, 11 May 2009, Charles Pritchard wrote:
>
> Are expando / prototype functions at all included in the HTML 5 specs?

Yes. Specifically, HTML5 uses WebIDL for all its definitions of 
interfaces, objects, etc, and the WebIDL spec defines how prototypes, 
custom properties, etc, work:

   http://dev.w3.org/2006/webapi/WebIDL/

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

[whatwg] Expandos and Prototyping

2009-05-11 Thread Charles Pritchard


Are expando / prototype functions at all included in the HTML 5 specs?
While we may all know what Object.prototype does, I'd like to see its
use added to Section 6: Web browsers.

The Prototype Expando is not necessarily a Javascript-only construct, 
and neither is HTML 5.


While I'm not championing full prototype inheritance, I do wonder
(out loud) whether some small section of HTML 5 might describe the most
basic of prototyping and expandos:


Many projects use "ellipse" or other shapes, for example, but this is easier:

CanvasRenderingContext2D.prototype.funcName = function() {
   // "this" is the 2D context the method was called on.
   alert("Fill" + this.fillStyle);
};
document.createElement('canvas').getContext('2d').funcName();

I've never seen any developers attempt to use multiple inheritance
within the CanvasRenderingContext2D object, nor have I tested myself to
see if Firefox (the champion of such schemes) supports it. Which is why
I'd be more than satisfied simply requiring single inheritance. It's
already available in all implementations, and we spent a good deal of
time making it available in our own.

Expando Prototype would need descriptions of:
expandos, prototyped objects, for(... in ...)
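
For reference, the three things listed above can be shown side by side. This is a minimal sketch that runs outside a browser, so it assumes a hypothetical constructor Context2D as a stand-in for a host interface such as CanvasRenderingContext2D; describeFill and label are made-up names.

```javascript
// Hypothetical stand-in for a host interface such as CanvasRenderingContext2D.
function Context2D() {
  this.fillStyle = "#000000";
}

// Prototyped method: available on every instance through the prototype chain.
Context2D.prototype.describeFill = function () {
  return "Fill" + this.fillStyle;
};

var ctx = new Context2D();

// Expando: an ad hoc property added to this one object only.
ctx.label = "demo";

// for (... in ...) enumerates own properties and inherited enumerable ones.
var seen = [];
for (var key in ctx) {
  seen.push(key);
}

var description = ctx.describeFill();
```

The expando lives on ctx alone, while describeFill is found through the prototype, which is the single level of inheritance the message asks for.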

All modern browsers support prototype, and so do many languages (without
writing libraries). We've confirmed that expandos and prototypes work
just fine in ActiveX; MS long ago created IDispatchEx.

Any host language with getter / setter availability can implement
prototyping and expandos on an object, at least to a depth of one.


I'd like to see  ".prototype" described in the scripting section.

That said, I'm more hesitant to champion ".constructor" and ".__proto__".

-Charles

Re: [whatwg] innerStaticHTML

2009-05-11 Thread Kornel Lesiński


On 06.05.2009, at 17:31, Adam Barth wrote:


> WHY NOT toStaticHTML?
>
> toStaticHTML addresses the same use case by translating an untrusted
> string to another string that lacks active HTML content.  This API has
> two issues:
>
> 1) The untrusted string -> static string -> HTML parser workflow
> requires the browser to parse the string twice, introducing a
> performance penalty and a security issue if the two parsings aren't
> identical.


That is based on the assumptions that:
1. parsing is expensive enough to warrant an API optimized for this
   particular case
2. browsers cannot optimize it otherwise
3. the returned code will be ambiguous

In client-side scripts untrusted content comes from the network, which
means that parsing time is going to be minuscule compared to the time
required to fetch the content (and to render it). My guess is that
parsing itself is not a bottleneck.

Second, it _is_ possible to avoid reparsing without a special API for
this. toStaticHTML() may return a subclass of String that contains a
reference to the parsed DOM. Roughly something like this:


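function toStaticHTML(html)
{
    // parse(), clean() and unparse() stand for the browser's internal
    // parser, filter and serializer; they are not existing APIs.
    var cleanDOM = clean(parse(html));
    return {
        toString: function(){ return unparse(cleanDOM); },
        node: cleanDOM
    };
}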

which should make the common case:

innerHTML = toStaticHTML(html) just as fast as innerStaticHTML = html;
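
To make the idea concrete, here is a runnable sketch of a result object that behaves like a string but keeps the already-parsed structure. Everything below the function signature is an assumption for illustration: parse(), clean() and unparse() are toy stand-ins that only handle flat, attribute-free markup, and the "DOM" is an array of plain objects. It is not a sanitizer.

```javascript
// Toy stand-ins (assumptions): a "DOM" is an array of { tag, text } records.
function parse(html) {
  var nodes = [];
  var pattern = /<(\w+)>([^<]*)<\/\1>/g; // flat elements without attributes only
  var match;
  while ((match = pattern.exec(html)) !== null) {
    nodes.push({ tag: match[1].toLowerCase(), text: match[2] });
  }
  return nodes;
}

function clean(nodes) {
  // Illustrative filter: drop script elements, keep everything else.
  return nodes.filter(function (node) { return node.tag !== "script"; });
}

function unparse(nodes) {
  return nodes.map(function (node) {
    return "<" + node.tag + ">" + node.text + "</" + node.tag + ">";
  }).join("");
}

function toStaticHTML(html) {
  var cleanDOM = clean(parse(html)); // parsed and filtered exactly once
  return {
    toString: function () { return unparse(cleanDOM); }, // string view on demand
    node: cleanDOM // parsed view, reusable without reparsing
  };
}

var result = toStaticHTML("<p>hi</p><script>alert(1)</script>");
var asString = String(result);
```

A consumer that understands the object can take result.node directly, while older code that expects a string still works through toString().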

toStaticHTML() enables other optimisations, e.g. filtered HTML can be
saved for future use (in local storage), or a string filtered once can
be used in multiple places.


Alternatively there could be a toStaticDOM() method that returns a
DocumentFragment, avoiding the reparsing issue entirely.



> 2) The API is difficult to future-proof because future versions of
> HTML are likely to add new tags with active content (e.g., like the
>  tag's event handlers).


When support for a new tag is added to a browser, it would also be added
to its toStaticHTML()/innerStaticHTML, so the evolution of HTML shouldn't
be a problem either way. A browser doesn't need to worry about dangerous
constructs it does not support.


Methods are easier to patch than properties in JavaScript, so if the
implementation of an existing toStaticHTML() turned out to be insecure,
the method could easily be replaced/patched on the client side, or
applications could post-process the output of toStaticHTML().

It's not that easy with a property.
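
A short sketch of what such client-side patching might look like. The object hostApi and both regular-expression filters are hypothetical stand-ins chosen so the example runs on its own; string replacement like this is not a safe way to filter HTML and only serves to show the wrapping pattern.

```javascript
// Hypothetical host object with a built-in filter that misses <marquee>.
var hostApi = {
  toStaticHTML: function (html) {
    return html.replace(/<script[\s\S]*?<\/script>/gi, "");
  }
};

// Keep a reference to the original, then replace the method with a wrapper.
var originalToStaticHTML = hostApi.toStaticHTML;
hostApi.toStaticHTML = function (html) {
  var filtered = originalToStaticHTML.call(this, html);
  // Post-process the built-in result with an extra rule.
  return filtered.replace(/<marquee[\s\S]*?<\/marquee>/gi, "");
};

var patched = hostApi.toStaticHTML(
  "<b>x</b><script>1</script><marquee>y</marquee>"
);
```

Every existing call site picks up the stricter behaviour without being edited, which is the advantage claimed for a method over a magic property.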

I dislike APIs based on magic properties. Properties cannot take
arguments, and we'd have to create a new property for every combination
of arguments. If innerHTML were a method, instead of creating a new
property we could extend it to be innerHTML(html, static=true).


If more sophisticated filtering becomes needed in the future, we could
have toStaticHTML(html, {preserve:['svg','rdf'], remove:'marquee'}),
but it would be silly to create another
innerStaticHTMLwithSVGandRDFbutWithoutMarquee property.


--
regards, Kornel

Re: [whatwg] Custom microdata handling added to HTML5 spec

2009-05-11 Thread Ian Hickson

On Sun, 10 May 2009, Manu Sporny wrote:
> Shelley Powers wrote:
> > Since a new section detailing HTML5's handling of custom microdata  has
> > been added to the HTML5 spec
> > 
> > http://dev.w3.org/html5/spec/Overview.html#microdata
> 
> I've only had a brief chance to look over the HTML5 Microdata spec, but
> there is one big problem that overrides all of the other issues: The
> HTML5 Microdata spec is in direct conflict with planned RDFa extensions
> and will almost surely result in spurious triples being generated in
> RDFa processors in the future.

I've renamed property="" to itemprop="".

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] innerStaticHTML

2009-05-11 Thread Robert O'Callahan

On Tue, May 12, 2009 at 4:16 AM, Adam Barth wrote:

> On Thu, May 7, 2009 at 3:24 AM, Kristof Zelechovski wrote:
> > If toStaticHTML prunes everything it is not sure of, the danger of a
> > known language construct suddenly introducing active content is
> > negligible.  I am sure HTML5 specification editors bear that aspect in
> > mind and so shall they in the future.
>
> Even if you believe that we've already committed to not introducing
> active content that breaks toStaticHTML (which I'm not convinced we
> have, especially because I don't know what algorithm it uses)

I would be shocked if we have committed to not introducing active content
that breaks IE8's toStaticHTML. That would be terribly limiting. (Does it
prune the  and  event attributes?)

When you call innerStaticHTML it should prune everything that's unsafe for
*this UA*. Authors should not send that content to other UAs and expect it
to be safe for those UAs.

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah
53:5-6]

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-11 Thread Philip Taylor

On Mon, May 11, 2009 at 6:15 PM, Giovanni Gentili wrote:
> * a user (or groups of users) wants to annotate
> items present on a generic web page with
> additional properties in a certain vocabulary.
> for example Joe wants to gather in a blog
> a series of personal annotation to movies
> (or other type of items) present in imdb.com.
>
> [...]
>
> this option requires that @subject accept:
>
> 1) ID of an element with an item attribute, in the same Document
> or
> 2) valid URL of an element with an item attribute elsewhere in the web
> or
> 3) a valid URL (the item is the referred document or fragment)

For the RDF output, you can use http://subject/"> to create triples
whose subject is a URL. (I believe in general you can also do:

  http://subject/">
  http://predicate1/" href="http://object1/">
  http://predicate2/" content="object2">
to represent arbitrary RDF triples.)

I don't think it would make sense for @subject to be a URL when
generating JSON output, because there wouldn't be anywhere to
represent that URL in the output structure. But there could be a
convention that properties called "about" indicate the URLs that the
item applies to, and then it would work with exactly the same markup
as the RDF case.

-- 
Philip Taylor
exc...@gmail.com

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-11 Thread Giovanni Gentili

Ian Hickson:
>   USE CASE: Annotate structured data that HTML has no semantics for, and
>   which nobody has annotated before, and may never again, for private use or
>   use in a small self-contained community.
> (..)
>   SCENARIOS:

Among the scenarios, this case should also be considered:

* A user (or group of users) wants to annotate
items present on a generic web page with
additional properties in a certain vocabulary.
For example, Joe wants to gather in a blog
a series of personal annotations on movies
(or other types of items) present on imdb.com.

Other examples of "external annotation" could
be derived from this document [1].

This option requires that @subject accept:

1) the ID of an element with an item attribute, in the same Document
or
2) a valid URL of an element with an item attribute elsewhere on the web
or
3) a valid URL (the item is the referred document or fragment)

This raises two other questions:

a) When properties are specified on an element that has no
ancestor with an item attribute, should the corresponding
item be the document (the body element having an implicit
item attribute)?

b) Do we need to require UAs to offer a standard
way to visualize (at least as an option left to the user)
the structured information carried in microdata?
And copy & paste? See also this email [2].

[1] http://www.w3.org/TR/2009/WD-media-annot-reqs-20090119/#req-r01
[2] http://lists.w3.org/Archives/Public/public-html/2009Jan/0082.html

-- 
Giovanni Gentili

Re: [whatwg] innerStaticHTML

2009-05-11 Thread Adam Barth

On Wed, May 6, 2009 at 9:40 AM, João Eiras  wrote:
> The suggestion of marking content as non-executable doesn't solve anything, 
> because after setting innerStaticHTML another script might serialize a piece 
> of the affected DOM to string and back to a tree, and the code could then 
> execute, which would not be wanted.

Yes, we can't make it impossible for web developers to shoot
themselves in the foot.  We also can't stop them from calling eval on
a query string argument.  However, innerStaticHTML does make it easier
to display untrusted HTML to the user.

> The only viable solution, from my point of view, would be for the UA to parse 
> the string, and remove all untrusted content from the result tree before 
> appending to the document.

This is what I meant to suggest.

> That would mean removing all onevent attributes, all scripts elements, all 
> plugins, etc. Basically, letting the UA implement all the filtering.

Exactly.  As you say, the UA is in a much better position to do this
correctly than an individual web site.
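
As a rough illustration of the filtering step agreed on above (prune the parsed tree, then append it), here is a sketch over a toy tree. The node shape, the function name pruneTree and the BLOCKED_TAGS list are assumptions made for this example; a real UA would work on its own DOM and know the full set of active content it supports.

```javascript
// Toy tree node (assumption): { tag: string, attrs: object, children: array }.
var BLOCKED_TAGS = ["script", "object", "embed", "applet"]; // illustrative, not exhaustive

function pruneTree(node) {
  var attrs = {};
  Object.keys(node.attrs || {}).forEach(function (name) {
    // Drop event-handler attributes such as onclick or onload.
    if (!/^on/i.test(name)) {
      attrs[name] = node.attrs[name];
    }
  });
  var children = (node.children || [])
    .filter(function (child) {
      return BLOCKED_TAGS.indexOf(child.tag.toLowerCase()) === -1;
    })
    .map(pruneTree);
  // Return a new tree; the untrusted input is left untouched.
  return { tag: node.tag, attrs: attrs, children: children };
}

var untrusted = {
  tag: "div",
  attrs: { "class": "comment", onclick: "steal()" },
  children: [
    { tag: "p", attrs: {}, children: [] },
    { tag: "script", attrs: {}, children: [] }
  ]
};
var safe = pruneTree(untrusted);
```

Because the result is a tree rather than a string, nothing has to be parsed a second time before it is attached to the document.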

On Thu, May 7, 2009 at 3:24 AM, Kristof Zelechovski wrote:
> If toStaticHTML prunes everything it is not sure of, the danger of a known
> language construct suddenly introducing active content is negligible.  I am
> sure HTML5 specification editors bear that aspect in mind and so shall they
> in the future.

Even if you believe that we've already committed to not introducing
active content that breaks toStaticHTML (which I'm not convinced we
have, especially because I don't know what algorithm it uses), that
still leaves the performance and correctness issues of parsing the
untrusted content twice.  Parsing the content once is more efficient
and more predictable.

Adam

Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-11 Thread Simon Pieters


On Sun, 10 May 2009 12:32:34 +0200, Ian Hickson  wrote:


   Page 3:
   My Cats
   Schrödinger
   Orange male.
   Erwin
   Siamese color-point.
 
   


Given the microdata solution and this example, there is now a reason other than styling to 
introduce , since here you duplicate the  information in .

  
   
   Schrödinger
   Orange male.
   ...


The styling problem is discussed at http://forums.whatwg.org/viewtopic.php?t=47

--
Simon Pieters
Opera Software
