Hey Paul:
I've done a little digging into this, and have found out a few more details.
* Using the test file you've provided, the serialized version of the parsed
DOM turns out to be (schematically):
<head>
<link rel="...">
<script type="os/data"></script>
<os:ViewerRequest.../>
</head>
<body>
<script type="os/template">
..os/template stuff
</script>
* So the problem is that <os:ViewerRequest> ends up outside of its
<script type="os/data"> block. That in turn happens due to code in
org.cyberneko.html.HTMLTagBalancer.java, line 665 and onward:
    // close previous elements
    // all elements close a <script>
    // in head, no element has children
    if ((fElementStack.top > 1
            && (fElementStack.peek().element.code == HTMLElements.SCRIPT))
            || fElementStack.top > 2
            && fElementStack.data[fElementStack.top-2].element.code == HTMLElements.HEAD) {
        final Info info = fElementStack.pop();
        if (fDocumentHandler != null) {
            callEndElement(info.qname, synthesizedAugs());
        }
    }
This conditional causes callEndElement to close the <script> tag prematurely
while the os/data tag's contents (<os:ViewerRequest>) are being parsed. As
noted in the comment, the semantic reason this occurs is that the <script>
element gets automagically hoisted into <head>, and from there the code
assumes that no child of <head> has children of its own.
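To make the effect concrete, here is a small self-contained model of that
check. It is only a sketch under my own assumptions, not NekoHTML's code: the
class TagBalancerSketch, its method names, and the rule that an end tag for an
element that is no longer on top of the stack is silently ignored are all made
up for illustration. Only the shape of the conditional is taken from the
snippet above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the element-stack check; not NekoHTML's classes. */
final class TagBalancerSketch {
    private final List<String> stack = new ArrayList<>();   // open elements, outermost first
    private final StringBuilder out = new StringBuilder();  // serialized start/end events

    /** Mirrors the quoted conditional: possibly pop the top element, then open the new one. */
    void startElement(String name) {
        int top = stack.size();
        boolean topIsScript = top > 1 && stack.get(top - 1).equals("script");
        boolean parentIsHead = top > 2 && stack.get(top - 2).equals("head");
        if (topIsScript || parentIsHead) {
            // Synthesized end tag for whatever was open.
            String closed = stack.remove(top - 1);
            out.append("</").append(closed).append(">");
        }
        stack.add(name);
        out.append("<").append(name).append(">");
    }

    /** Assumption: an end tag only takes effect if it matches the innermost open element. */
    void endElement(String name) {
        int top = stack.size();
        if (top > 0 && stack.get(top - 1).equals(name)) {
            stack.remove(top - 1);
            out.append("</").append(name).append(">");
        }
    }

    /** Replays <head><link/><script><os:ViewerRequest/></script></head>. */
    static String replayHeadCase() {
        TagBalancerSketch b = new TagBalancerSketch();
        b.startElement("html");
        b.startElement("head");
        b.startElement("link");
        b.endElement("link");
        b.startElement("script");
        b.startElement("os:ViewerRequest"); // this start tag pops <script>
        b.endElement("os:ViewerRequest");
        b.endElement("script");             // ignored: <script> is already closed
        b.endElement("head");
        b.endElement("html");
        return b.serialized();
    }

    String serialized() {
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replayHeadCase());
    }
}
```

In this model the start of <os:ViewerRequest> finds <script> on top of the
stack, so <script> is closed empty and <os:ViewerRequest> becomes its sibling
under <head>, which is the same shape as the serialized DOM above.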
That's about it. I'm still trying to figure out why
<link><script type="os/data"> causes the <script> element to be placed in
<head>, while removing <link> doesn't. Stranger yet, putting the
<script type="os/template"> element after <link> doesn't cause it to be put
in <head> (which is what would trigger this behavior) either.
--j
On Wed, Nov 11, 2009 at 10:55 AM, Paul Lindner <[email protected]> wrote:
> The problems started with this commit. Louis, could you have a look at it?
>
> commit a98b18f181df9e78c2f90f6b02483e64c3c76a2a
> Author: lryan <lr...@13f79535-47bb-0310-9956-ffa450edef68>
> Date: Tue Sep 29 22:17:23 2009 +0000
>
> Upgrade to Neko 1.9.13. Remove unused old Neko based parser
>
> git-svn-id:
>
> https://svn.apache.org/repos/asf/incubator/shindig/tr...@82010913f79535-47bb-0310-9956-ffa450edef68
>
>
> On Tue, Nov 10, 2009 at 11:03 AM, Paul Lindner <[email protected]> wrote:
>
> > If there are tags preceding the text/os-data script tags, parsing fails.
> > It appears that the DOM parsing mangles the singular tags in the block and
> > somehow sees them as an open/close tag combo.
> >
> > A patch that reproduces the problem follows. I'll do a git bisect later
> > today to see where this was introduced.
> >
> >
> > diff --git a/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html b/java/gadgets/src/test/resources/org/apache/shindig/gadgets/p
> > index c7fa769..f38663b 100644
> > --- a/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html
> > +++ b/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html
> > @@ -1,3 +1,5 @@
> > +<link href="moo"></link>
> > +
> > <script type="text/os-data" xmlns:os="http://ns.opensocial.org/2008/markup">
> > <os:ViewerRequest key="viewer"/>
> > </script>
> > @@ -14,4 +16,4 @@
> >
> > <span>Some content</span>
> >
> > -<div><!-- foo -->bar<!-- baz --></div>
> > \ No newline at end of file
> > +<div><!-- foo -->bar<!-- baz --></div>
> >
>