Process discussion, taking off-list.
-public-html
+www-archive
On 2010-06-14 18:32, Maciej Stachowiak wrote:
When the Chairs review survey responses on an issue, we also
carefully study the Change Proposals submitted and most particularly
the rationale sections. If you look at the Working Group Decision for
ISSUE-76
(<http://lists.w3.org/Archives/Public/public-html/2010Jan/att-0218/issue-76-decision.html>),
each point of rationale in each submitted Change Proposal was
explicitly addressed. For the recent round of decisions, we also
carefully reviewed Change Proposal rationales, but we commented on
them in a somewhat more cursory way.
To the casual reader, the Microdata decision was written in a way that
gave the appearance of it being an objective analysis, and I have no
doubt that that was your sincere intention when it was written.
However, the rationale presented merely mentions the arguments and
counter arguments as if they were somehow equal, while failing to
consider the validity or technical strength of the arguments presented
in the final decision.
Not all arguments are created equal, and they don't automatically balance
out merely because they contradict each other. Yet that is how that
Microdata decision clearly evaluated the respective positions.
Effectively, the result merely counted up the number of arguments listed
for and against, saw that there appeared to be 1 more in favour of
splitting, and thus ended up with the wrong decision being made.
To be clear, I'm writing the following to highlight exactly why I
consider that to be the case with the microdata decision. I'm not
writing this in an attempt to reopen the issue at this time (and I held
back from writing this months ago because I believed that continuing to
fight this one decision would be less productive than moving on).
However, given the current dysfunctional state of the WG and the clearly
and catastrophically failing decision process that keeps making the same
mistakes, I feel it's time to speak up.
From the Microdata decision:
One argument presented was that "All good specs which integrate with
HTML5 should, ideally, be a part of HTML5." Other Working Group members
disagreed. They pointed out that this appears to contradict our position
that HTML5 enables extension specs - are we saying that any such spec is
by definition not good?
The latter argument here is wrong because it fails to take into account
the fact that extension specs are permitted merely because they allow
independent editors, who don't have access to the HTML5 spec itself, to
work on entirely separate specs. There is a big difference between
permitting and requiring. There is nothing that requires a new feature
to be written as an extension spec - indeed, many new features are
included in HTML5 and are not written as extension specs.
The argument also fails to take into account that Microdata and the rest
of HTML5 share the same editor, and are in fact edited in the same
source document. And so there is no technical reason to require a
separate specification either.
So, if we're going to keep a tally, that's 1 point to merging, 0 to
splitting.
Another way of framing the point was in terms of Conway's Law - that
by splitting the spec we will make technology reflect organization, and
thus weaken integration and lead to the specs using workarounds to work
together. Relatedly, it was mentioned that Microdata might in theory
even be moved to another Working Group. But it was pointed out that, as
applied to Microdata, the Conway's Law and separate WG arguments are
purely hypothetical - it's demonstrably possible to split Microdata in
its current form without any technical changes, to keep it in the same
working group, and indeed even to keep the same editor as with other
HTML5 spinoff specs.
There are 2 arguments intermingled here:
1. Splitting will weaken integration
2. Microdata could be moved to another WG.
The decision correctly dismisses the latter as hypothetical.
Tally: Merging: 2 points, Splitting: 0 points.
The former, however, wrongly assumes that just because it's possible to
do something on a technical level, the same level of integration
will be maintained. The fact is that the level of integration is
lowered because the separate Microdata spec has to explicitly override
specific sections (the element content models) in the HTML5 spec in
order for the integration to work. This leads to a misleading situation
for implementers, particularly validator implementers, who must read 2
independent specs to determine the content models for elements.
Tally: Merging: 3 points, Splitting: 0 points.
Indeed, some argued that Microdata may already be affected by
Conway's law, due to being part of the spec. For
example, it doesn't work with content from other namespaces such as
SVG or MathML.
The decision here completely ignored the arguments about that being by
design, and not a design flaw per se. There is also no direct
correlation shown between splitting the spec and changing that decision,
and nor was there any correlation between that design decision and the
initial drafting of the feature within the spec. In fact, even with the
current split, the design decision against being usable with MathML and
SVG elements is still in place. So again, that argument for splitting
is invalidated.
Tally: Merging: 4 points, Splitting: 0 points.
Another response was that separate specifications which are reviewed
and maintained by the HTML WG can be an equally good or even better
approach.
That argument is presented without any justification for that position.
In fact, experience shows that having features in separate specs
reduces the amount they get reviewed by people focussing on the main
spec, and so again, that argument is invalidated.
Tally: Merging: 5 points, Splitting: 0 points.
Some argued that having the Microdata specification separate from the
HTML5 specification will allow the technologies to evolve
independently from HTML5. But others pointed out that this could
actually be a problem - Microdata and HTML5 being published
separately may leave them out of sync.
The decision here correctly invalidates the first argument with the
counter argument, and yet still wrongly ends up considering these
arguments as balancing each other out. That said, we implementers are
largely insulated from this issue because the two specs share the same
editor and live together in the WHATWG copy.
Tally: Merging: 6 points, Splitting: 0 points.
Another specific point raised was that a smaller core document
facilitates better review of the parts that are truly essential to
review.
This makes the false assumption that Microdata somehow isn't an
essential part to review. It also fails to explain why having it in its own
section of the spec somehow impedes the ability to review other sections
of the spec, nor, conversely, why splitting it would somehow facilitate
it. Nevertheless, the following:
But the case was also made that inclusiveness promotes greater
attention to each part, and that for potentially split sections,
being part of the main standard will attract more review attention.
... is a successful counter argument.
Tally: Merging: 7 points, Splitting: 0 points.
A number of WG participants argued in general terms that the spec was
"bloated" or "large enough", and that it was good to split anything
that should be split. The principle of orthogonality was cited. But
other participants pointed out that modularity is not always good.
Sometimes it makes a technology more general at the expense of focus.
As a middle ground, though, creating a separate spec with the sole or
primary aim of use with another spec can still provide some of the
benefits of both.
Neither of those arguments is particularly compelling in its own
right, either for or against splitting. Still, it's difficult to see
how it was considered rational to state, based on those weak arguments,
that splitting the spec is some kind of middle ground that provides some
unexplained benefits.
On the whole, these lines of argument seemed balanced against each
other and therefore inconclusive.
With the current tally at 7-0, I find that claim to be astoundingly
inaccurate. Yet it only goes to show what I said earlier about the
validity and technical strength of the arguments not being taken into
account.
Another point raised was the idea that Microdata is an intrinsic
"part of the language", the same as any other extension mechanism in
HTML5, such as @class, @id, @title, etc. This line of argument makes
the case that it doesn't make sense to split out Microdata but not
other features, because it's just as much part of the language. But
other WG members argued that Microdata is relatively orthogonal and
separable - while it has dependencies on other parts of HTML, other
parts of HTML mostly don't depend on Microdata. Some went so far as
to call it circular reasoning to argue that Microdata should be part
of HTML because Microdata is part of HTML.
When presented as a straw man argument like that, it has the appearance
of circular reasoning. But the statement fails to distinguish between
the HTML specification itself and the HTML vocabulary.
Microdata was developed as part of the HTML vocabulary to address the
metadata related use cases and problems that HTML didn't adequately
address already. So the real reason is: "Microdata should be part of
[the HTML specification] because Microdata is part of [the HTML
vocabulary].", which is clearly not circular reasoning.
Tally: Merging: 8 points, Splitting: 0 points.
Some poll participants argued that Microdata is out of charter, at
least as Rec-track work. The argument goes that the charter doesn't
say the working group is allowed to actually add additional
vocabularies, only to develop an extensibility mechanism. However,
this does not seem to be well-founded in the charter. Even though the
charter gives RDFa as an example of a vocabulary that could be added
via an extensibility mechanism, RDFa is also an extensibility
mechanism itself, a way of adding vocabularies, and the charter does
not rule out adding RDFa, or something similar, directly. Thus, the
charter does not appear to rule out working on Microdata or HTML+RDFa
entirely, whether in the main spec or in a separate draft.
Correct.
Tally: Merging: 9 points, Splitting: 0 points.
There has been considerable discussion about the comparative
technical merits of RDFa and Microdata. Microdata advocates argue
that Microdata has many technical advantages. It is simple for common
cases and only complicated in rare in complicated cases. It has
defined conversion to XML and JSON, has a DOM API, and has various
other good properties. It avoids the potential confusion of CURIEs
and namespaces. The Microformats community has shown that there is a
demand for embedding machine reasonable metadata in HTML, but that
many of the built-in extensions are lacking. Microdata, it is said,
can fill this gap. RDFa advocates concede that Microdata has
significant strengths, but counter that many of the advantages of
Microdata can and will be replicated in RDFa, in RDFa 1.1.
While that assessment of Microdata vs. RDFa seems reasonable, neither
makes a compelling argument on the splitting issue.
Some also deny the importance of some of Microdata's features, for
instance they may argue that namespace prefixes are not in fact
confusing. Conversely, RDFa advocates argue that RDFa has some
important technical advantages. RDFa is more complete, and in some
cases Microdata may be *too* simple. For example it lacks
multi-language support. RDFa supports the follow-your-nose principle
and semantic object validation and has various other advantages. And
it's reported that much of the community that provided the use cases
driving Microdata is not satisfied with the result, and prefers
RDFa. Overall Microdata proponents say that Microdata is valuable
because it is mostly inspired by existing technologies but overcomes
their flaws. Conversely, RDFa proponents say that Microdata isn't
exciting because its main feature is not being RDFa, and RDFa is a
superior technology because it is well-established. It seems a case
can be made for technical advantages for each technology. There is
not consensus to declare either as unquestionably superior, and
declaring either to be clearly inferior would draw strong
objections.
The question of which technology is superior - RDFa or Microdata - is
independent from the issue of whether or not Microdata should be included
within the spec while it is still being maintained. (If Microdata fails in
the marketplace, then the feature would be dropped entirely, not simply
split out to a separate spec, but at this stage it's too early to tell.)
In any case, there were no relevant arguments presented here, either
for or against splitting.
It's been pointed out that if Microdata was published as a separate
spec, it might be reusable in other markup languages, it could be
used in other markup languages to provide semantic markup. The
likelihood that it would be adopted by other markup languages (like
SVG, ODF, or Docbook) might increase because it will no longer be
viewed as an HTML5-only technology. Some respondents counter that
Microdata is not reusable in non-HTML languages in its current form,
limiting the utility of a split out spec.
This is effectively a repetition of the earlier argument regarding use
with SVG and MathML.
They also argue that being HTML-specific is good; it can be focused
enough to work well for one particular domain. Making it more general
might reduce its value for HTML. However, it was pointed out that
being HTML-specific does not even make Microdata fully applicable to
HTML5 documents, which can also include SVG and MathML. Also, the
partial reliance on HTML5-specific features, such as <time> or
<meta>, does not preclude making the more generally applicable
constructs usable for other languages. It seems plausible that a
split-out Microdata could continue to have first-rate HTML
integration but also be usable from other languages. This possibility
seems like an advantage for Microdata in a separate spec.
Although this is also a repeat of the earlier argument, I just want to
stress the fact that even with the current split, Microdata is still
being maintained as an HTML specific technology, and there is no sign of
that changing. Therefore, the above argument for splitting is purely
based on a hypothetical situation and not valid.
Tally: Merging: 10 points, Splitting: 0 points.
Another important question was whether Microdata is mature enough.
It's been argued that HTML+Microdata should be allowed to become a
mature draft before consensus on inclusion or dismissal is discussed.
A productive way to enable that maturation process is to separate the
work into a separate document. Advocates of keeping Microdata in
argue that it's not currently in a state of flux, and if necessary,
Microdata can be removed at any time. They point out that HTML5 has
historically not followed the model of keeping sections separate if
they are not sufficiently mature. And indeed, Microdata is arguably
relatively mature compared to other parts of the spec, including some
parts that are not controversial for inclusion.
Correct.
The counterpoint is that true maturity would include implementation
experience, extensive feedback based on authoring and deployment, and
relative lack of strong disagreement. But at this time Microdata has
low adoption and has not seen significant adoption among authors, or
much implementation in UAs or data mining tools. Advocates for
splitting Microdata point out that while it may turn out to be the
best solution, it is currently unproven. And if it turns out to
cause problems down the road, then it would be unfortunate to have
the HTML5 spec saddled with it.
This is nonsense because all new features start out as unimplemented
drafts, and there are still features that have been in the spec for much
longer than Microdata but not yet implemented by browsers. So as an
argument for splitting, it's an extraordinarily weak argument, basically
saying that we shouldn't work on new features because the new features
haven't been implemented yet. If that were a valid reason, then we
would have stopped work on all new features long ago and we'd be getting
nowhere.
Tally: Merging: 11 points, Splitting: 0 points.
It seems debatable whether Microdata is one of the least mature
things in the spec. But it does seem clear that it does not have the
maturity level of features inherited from HTML4, or even newer but
widely implemented functionality such as <canvas> or <video>. Being
unproven is not necessarily a strong objection to inclusion by
itself, but it's worth considering along with other factors.
At least the decision admits here that it was a rather weak counter
argument, and yet somehow still seems to reach the conclusion that it
all balances out.
Many WG members have discussed the relative marketplace success of
Microdata and RDFa. As has been pointed out, RDFa has significant
deployment success - data mining tools from Google and Yahoo use it,
it is used by Drupal, and it is published by such organizations as
the UK government, the Library of Science and Best Buy. On the other
hand, Microdata has no significant deployed history or implementation
yet. Thus, Microdata may fail in the marketplace. If Microdata fails
in the marketplace, in the long-term, it may be advisable to allow it
to fail without having a negative impact on the HTML5 spec proper.
Advocates for keeping Microdata argue that while it is true that
either RDFa or Microdata (or both) may fail in the marketplace, we
should as a working group give the most support to the technology we
most believe should succeed in the marketplace. Whether we should be
picking winners in this way is controversial.
The question of RDFa vs. Microdata is not relevant to the question of
whether Microdata should continue development as part of the HTML5 spec
or independently.
So perhaps one of the most important points in this discussion is the
next one: should the W3C pick a winner in this particular
competition, or let nature take its course?
That question is a red herring because the issue isn't about picking a
winner. The real issue here was that of letting each technology reach
maturity in the way that best facilitates its development. This also
happens to be the one issue I'm aware of that was missed in this decision.
It also follows then that the arguments concerning whether or not the
competition between RDFa and Microdata should occur are not particularly
relevant to the question of splitting, since either way, they would
still be competing with each other.
[...] Still others say that a winner is not yet clear, and we don't
have consensus, so we shouldn't pick a winner. They do not want to
see a particular format locked in, and would like to see them set up
on an even footing.
There was no justification given for the belief that having both
Microdata and RDFa in their own independent specs somehow helps to
provide an even footing. It was just assumed that having Microdata
presented as part of HTML5 would somehow give it some unfair advantage
over RDFa. But in reality, the only advantages come from the design of
the technology itself, not where or how it is published.
This is also countered by the fact that, by splitting the spec, the
Microdata proponents are forced to work on the spec in a way that is not
conducive to its most effective development. While this issue is
somewhat mitigated by having Microdata in the WHATWG copy, the principle
still applies, and readers of the W3C copy are placed at a
disadvantage by it not being present.
Tally: Merging: 12 points, Splitting: 0 points.
Some argue that competition between RDFa and Microdata is not a big
deal; RDFa is not so widely adopted yet, and writing a parser for
both is not so hard. Thus, we shouldn't worry about whether the
formats are left to compete on an even footing. Others do think
competition is a problem, to the point that having both Microdata and
RDFa as W3C specifications is not in the best interests of the web
community at large. It was claimed that it's more productive for
philosophically divergent communities (RDFa/Microdata) within a
larger community (HTML WG) to have their own work products during a
period of active debate. But on the other hand, Web Forms (relative
to XForms) was cited as a possible counter-example, where we actually
took Web Forms 2 from a separate spec to part of the draft. But
others thought this was a bad example, because they felt XForms was
actually technically superior.
The claimed technical superiority of XForms, even if that is true, is
outweighed by its obvious market failure in comparison with HTML Forms,
and so that is not a valid counter argument. But the issue of whether
or not there should be competition between the two technologies is
outweighed by the previous argument that the W3C cannot pick a winner.
The only way then for the market to decide is to have a format war.
This sucks, but it's the unfortunate result of divergent communities
each pushing their own proposals and not giving in. In any case, these
arguments are not relevant to the issue at hand because the specs are
competing regardless, and neither argument addresses the real issue.
Thus, we see that, while many of the objections balance out, in some
areas keeping Microdata would draw stronger objections than splitting
it. The objections based on maturity, market success, and reusability
in other languages are stronger than their respective counterpoints.
As a result, the objections to picking a winner in this case are
stronger than the objections to not doing so.
As I have clearly demonstrated, with a final tally of 12 to 0, the
relevant arguments in favour of keeping Microdata in the spec were
technically superior to those in favour of splitting it. Yet as is
plainly obvious, this decision chose to blatantly ignore technical
superiority in favour of simply treating all arguments/counter arguments
as equal and as balancing each other out. Honestly, reading this
decision was like reading a he-said/she-said debate that ended up with
the one that was yelling the loudest winning.
So while, as I said above, I'm not asking that this issue be reopened at
this time - what's done is done - I just wanted to emphasise the fact
that the decision process so far has not been applied in a way that,
despite claims to the contrary, considers technical merit over the most
vehement objections. I do, however, hope that you will take these
issues into serious consideration for future decisions that will be made.
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/