Re: HTML5 serializer
On 9 January 2012 11:36, Thorsten Scherler wrote:
> On Mon, 2012-01-09 at 08:32 +0100, Robby Pelssers wrote:
>> Hi Thorsten,
>>
>> Adding in general is not a concern afaik but setting the correct
>> encoding is.
>>
>> Examples are
>> for xml files
>
> That is correct for the doc declaration.
>
>> And
>> for html files
>
> nupp, that tag may be needed to be valid html5 but that is not the
> concern of the serializer but the prior transformation process.
>
>> So I was only referring to setting the correct encoding which can be
>> configured as a Serializer property.
>
> Yes but that only goes in the PI and is used for the serialization.

Not really convinced, chiefly for reasons of separation of concerns.

Given that throughout the pipeline the XML is being held in Java's Unicode strings, IMO the only component that should need to worry about the charset being used to serialise the output should be the serialiser that's doing it. Otherwise you can end up with a document using one charset that claims inside to be a different one.

If you're happy to leave it to the serialiser to insert the PI in the output (including the charset) rather than having it already in the pipeline's XML stream (e.g. inserted by xsl:processing-instruction in an XSLT template), and happy to let the HTML serialiser insert the doctype rather than having it already in the pipeline's stream, then why shouldn't the HTML/XHTML serialiser also insert the meta tag specifying the charset?

In an ideal world, we wouldn't even have to specify a particular encoding on the serialiser either. There'd be a default configured somewhere, but it would select an appropriate one dynamically at the time of output based on the Accept-Charset request header sent by the browser... and why should the earlier part of the pipeline also need to worry about that?

Andy.
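To make the Accept-Charset idea above concrete, here is a minimal sketch of how an output charset could be picked at serialisation time. It is not Cocoon code: the class name CharsetNegotiator and the method choose are hypothetical, and the sketch assumes the caller hands in the raw Accept-Charset header value plus a configured default. It only uses java.nio.charset from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of Cocoon. */
final class CharsetNegotiator {

    private CharsetNegotiator() {
    }

    /**
     * Returns the JVM-supported charset with the highest q-value listed in
     * the Accept-Charset header, or the fallback when nothing usable is listed.
     */
    static Charset choose(String acceptCharset, Charset fallback) {
        if (acceptCharset == null || acceptCharset.trim().isEmpty()) {
            return fallback;
        }
        Charset best = null;
        double bestQ = 0.0;
        for (String item : acceptCharset.split(",")) {
            String[] parts = item.trim().split(";");
            String name = parts[0].trim();
            double q = parseQ(parts);
            // Entries with q=0 are never chosen; on ties the first one listed wins.
            if (!(q > bestQ)) {
                continue;
            }
            Charset candidate = resolve(name, fallback);
            if (candidate != null) {
                best = candidate;
                bestQ = q;
            }
        }
        return best != null ? best : fallback;
    }

    /** Reads the q parameter of one header entry; defaults to 1.0 when absent. */
    private static double parseQ(String[] parts) {
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim().toLowerCase();
            if (p.startsWith("q=")) {
                try {
                    return Double.parseDouble(p.substring(2).trim());
                } catch (NumberFormatException e) {
                    return 0.0; // unparsable weight: treat the entry as not acceptable
                }
            }
        }
        return 1.0;
    }

    /** Maps a header entry to a Charset, or null if this JVM cannot encode it. */
    private static Charset resolve(String name, Charset fallback) {
        if ("*".equals(name)) {
            return fallback; // wildcard: the configured default is acceptable
        }
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalArgumentException e) {
            return null; // syntactically illegal charset name
        }
    }

    public static void main(String[] args) {
        System.out.println(choose("iso-8859-1;q=0.5, utf-8;q=0.9", StandardCharsets.UTF_8));
    }
}
```

Whatever charset comes back would then have to drive both the bytes written and any charset named inside the document, which is the consistency argument made above.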
RE: HTML5 serializer
Hi Thorsten,

I assume that with "prior transformation process" you are referring to the transformerhandler, which inserts the meta tag for the html use case.

I also just stumbled across Michael Kay's response about serialization for html5 where he mentions:

"The XSLT and XQuery WGs have taken the view that we will address serialization to HTML5 when HTML5 is finished; meanwhile WHAT WG seem to be claiming that "finished" is an obsolete concept and that HTML5 will remain under continuous change forever. Perhaps I'm misquoting them, but that's my understanding."

http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201105/msg00067.html

But to wrap up what I'm trying to achieve here: I want to be able to support the following use cases:

* XML data --> transform using xslt to html5 --> serialize to html5
* stringtemplate generator --> serialize to html5

Preferably I want to be able to do so out-of-the-box with Cocoon3. As you seem more acquainted with the topic, what would need to be done to enable this?

Kind regards,
Robby

-----Original Message-----
From: Thorsten Scherler [mailto:scher...@gmail.com]
Sent: Monday, January 09, 2012 12:37 PM
To: dev@cocoon.apache.org
Subject: RE: HTML5 serializer

On Mon, 2012-01-09 at 08:32 +0100, Robby Pelssers wrote:
> Hi Thorsten,
>
> Adding in general is not a concern afaik but setting the correct
> encoding is.
>
> Examples are
> for xml files

That is correct for the doc declaration.

> And
> for html files

nupp, that tag may be needed to be valid html5 but that is not the concern of the serializer but the prior transformation process.

> So I was only referring to setting the correct encoding which can be
> configured as a Serializer property.

Yes but that only goes in the PI and is used for the serialization.

salu2
--
Thorsten Scherler
codeBusters S.L. - web based systems
http://www.codebusters.es/
RE: HTML5 serializer
On Mon, 2012-01-09 at 08:32 +0100, Robby Pelssers wrote:
> Hi Thorsten,
>
> Adding in general is not a concern afaik but setting the correct
> encoding is.
>
> Examples are
> for xml files

That is correct for the doc declaration.

> And
> for html files

nupp, that tag may be needed to be valid html5 but that is not the concern of the serializer but the prior transformation process.

> So I was only referring to setting the correct encoding which can be
> configured as a Serializer property.

Yes but that only goes in the PI and is used for the serialization.

salu2
--
Thorsten Scherler
codeBusters S.L. - web based systems
http://www.codebusters.es/
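For readers wondering what "the prior transformation process" could look like outside XSLT, the following is one possible sketch: a SAX filter placed in front of the serializer that emits a meta element as the first child of head. It is not Cocoon code. MetaCharsetFilter and addMeta are hypothetical names, the sketch assumes unprefixed element names, and it uses the plain JAXP identity transformer with the xml output method purely so the result can be inspected.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical SAX filter for illustration only; not part of Cocoon. */
final class MetaCharsetFilter extends XMLFilterImpl {

    private final String charset;

    MetaCharsetFilter(XMLReader parent, String charset) {
        super(parent);
        this.charset = charset;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Forward the original start tag unchanged.
        super.startElement(uri, localName, qName, atts);
        if ("head".equals(localName)) {
            // Emit an empty meta element right after the head start tag.
            AttributesImpl metaAtts = new AttributesImpl();
            metaAtts.addAttribute("", "charset", "charset", "CDATA", charset);
            super.startElement(uri, "meta", "meta", metaAtts);
            super.endElement(uri, "meta", "meta");
        }
    }

    /** Parses the given markup through the filter and returns the serialized result. */
    static String addMeta(String xml, String charset) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.METHOD, "xml");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            identity.transform(
                    new SAXSource(new MetaCharsetFilter(reader, charset),
                            new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not filter the document", e);
        }
    }

    public static void main(String[] args) {
        // The sample markup below is made up for this sketch.
        System.out.println(addMeta(
                "<html><head><title>t</title></head><body><p>x</p></body></html>", "UTF-8"));
    }
}
```

The catch with this placement is that the filter has to be told which charset the serializer will use later, so the two settings can still drift apart.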
RE: HTML5 serializer
Hi Thorsten,

Adding in general is not a concern afaik but setting the correct encoding is.

Examples are
for xml files
And
for html files

So I was only referring to setting the correct encoding, which can be configured as a Serializer property.

Robby

-----Original Message-----
From: Thorsten Scherler [mailto:scher...@gmail.com]
Sent: Sunday, January 08, 2012 10:28 PM
To: dev@cocoon.apache.org
Subject: RE: HTML5 serializer

On Fri, 2012-01-06 at 19:56 +0100, Robby Pelssers wrote:
> So we’re almost there. Do you have any suggestion how to accomplish
> using the correct ?? Or do you think that’s
> not worth the effort?

Hmm, actually that is not the concern of the serializer at all. The serializer merely adds DOCTYPE PI and not much more. So is nothing the serializer should add.

salu2
--
Thorsten Scherler
codeBusters S.L. - web based systems
http://www.codebusters.es/
RE: HTML5 serializer
On Fri, 2012-01-06 at 19:56 +0100, Robby Pelssers wrote:
> So we’re almost there. Do you have any suggestion how to accomplish
> using the correct ?? Or do you think that’s
> not worth the effort?

Hmm, actually that is not the concern of the serializer at all. The serializer merely adds DOCTYPE PI and not much more. So is nothing the serializer should add.

salu2
--
Thorsten Scherler
codeBusters S.L. - web based systems
http://www.codebusters.es/
RE: HTML5 serializer
Hi Sylvain,

Thx for the pointer. Using the same test but with some changes to the html5serializer

    public static XMLSerializer createHTML5Serializer() {
        XMLSerializer serializer = new XMLSerializer();

        serializer.setContentType(TEXT_HTML_UTF_8);
        serializer.setDoctypeSystem("about:legacy-compat");
        serializer.setEncoding(UTF_8);
        serializer.setMethod(HTML);

        return serializer;
    }

now results in

serializer test
test

So we're almost there. Do you have any suggestion how to accomplish using the correct ?? Or do you think that's not worth the effort?

Robby

From: Sylvain Wallez [mailto:sylv...@apache.org]
Sent: Friday, January 06, 2012 6:13 PM
To: dev@cocoon.apache.org
Subject: Re: HTML5 serializer

On 06/01/12 15:48, Robby Pelssers wrote:
> Hi all,
>
> I've been looking at how to add a HTML5 serializer to the project.
>
> So far my investigations have led to add following code to
> org.apache.cocoon.sax.component.XMLSerializer
>
>     public static XMLSerializer createHTML5Serializer() {
>         XMLSerializer serializer = new XMLSerializer();
>
>         serializer.setContentType(TEXT_HTML_UTF_8);
>         serializer.setDoctypePublic("XSLT-compat");

Looks like "XSLT-compat" has been changed to "about:legacy-compat" in the latest HTML 5 specification. See http://dev.w3.org/html5/spec/syntax.html#doctype-legacy-string

Sylvain

--
Sylvain Wallez - http://bluxte.net
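The same output properties can be tried outside Cocoon with plain JAXP, which may help when checking what the underlying transformer actually emits. The sketch below is an assumption-laden stand-in rather than the Cocoon serializer: LegacyCompatDoctypeDemo and serialize are hypothetical names, the sample markup is made up, and it relies on the JDK's default TransformerFactory honouring OutputKeys.DOCTYPE_SYSTEM for the html output method.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical stand-alone check for illustration only; not part of Cocoon. */
final class LegacyCompatDoctypeDemo {

    private LegacyCompatDoctypeDemo() {
    }

    /** Serializes well-formed markup with the html method and a system-only doctype. */
    static String serialize(String xml) {
        try {
            // Identity transformer: copies the input and applies only the output properties.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.METHOD, "html");
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            // No DOCTYPE_PUBLIC is set; only the system identifier is supplied.
            identity.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "about:legacy-compat");

            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        System.out.println(serialize(
                "<html><head><title>serializer test</title></head>"
                        + "<body><p>test</p></body></html>"));
    }
}
```

Note that the XSLT 1.0 html output method asks the serializer to add a META element with the actual encoding right after the HEAD start tag, so the charset declaration seen in the output comes from the output method itself, in the older http-equiv form.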
Re: HTML5 serializer
On 06/01/12 15:48, Robby Pelssers wrote:
> Hi all,
>
> I've been looking at how to add a HTML5 serializer to the project.
>
> So far my investigations have led to add following code to
> org.apache.cocoon.sax.component.XMLSerializer
>
>     public static XMLSerializer createHTML5Serializer() {
>         XMLSerializer serializer = new XMLSerializer();
>
>         serializer.setContentType(TEXT_HTML_UTF_8);
>         serializer.setDoctypePublic("XSLT-compat");

Looks like "XSLT-compat" has been changed to "about:legacy-compat" in the latest HTML 5 specification. See http://dev.w3.org/html5/spec/syntax.html#doctype-legacy-string

Sylvain

--
Sylvain Wallez - http://bluxte.net
RE: HTML5 serializer
As far as I know, the serializer does not actually output anything directly. It hands over the task to the transformerhandler, and this is where the culprit resides. There is no need to extend the current serializer if I follow the current way of working. XmlSerializer, Html4Serializer and XhtmlSerializer are all just XmlSerializers with a different set of properties, and the current XMLSerializer class provides static methods to create them. It makes much more sense to add a static factory method for a html5 serializer there as well.

The problem is that, to actually have the transformerhandler output a doctype declaration, I think you currently need to pass a doctypepublic property, and from the looks of it this cannot be empty:

    public XMLSerializer setDoctypePublic(String doctypePublic) {
        if (doctypePublic == null || EMPTY.equals(doctypePublic)) {
            throw new SetupException("A doctype-public has to be passed as argument.");
        }

        this.format.put(OutputKeys.DOCTYPE_PUBLIC, doctypePublic);
        return this;
    }

    public XMLSerializer setDoctypeSystem(String doctypeSystem) {
        if (doctypeSystem == null || EMPTY.equals(doctypeSystem)) {
            throw new SetupException("A doctype-system has to be passed as argument.");
        }

        this.format.put(OutputKeys.DOCTYPE_SYSTEM, doctypeSystem);
        return this;
    }

Robby

From: Jasha Joachimsthal [mailto:j.joachimst...@onehippo.com]
Sent: Friday, January 06, 2012 5:00 PM
To: dev@cocoon.apache.org
Subject: Re: HTML5 serializer

Ok, then create an HTML5Serializer that extends the current Serializer. Another solution would be to add a boolean that will output differently for html5, but I'd prefer extension over a number of if statements.

Jasha
Re: HTML5 serializer
Ok, then create an HTML5Serializer that extends the current Serializer. Another solution would be to add a boolean that will output differently for html5, but I'd prefer extension over a number of if statements.

Jasha

On 6 January 2012 16:56, Robby Pelssers wrote:

> I am using Cocoon2.2 but am planning to switch to C3 in the upcoming
> months. And in my mail I was actually referring to C3. You are right
> about what you write but I’d prefer to have a Serializer which follows the
> spec so I can just copy the output and validate it without errors and too
> many warnings at http://validator.w3.org/
>
> Robby
RE: HTML5 serializer
I am using Cocoon2.2 but am planning to switch to C3 in the upcoming months. And in my mail I was actually referring to C3. You are right about what you write, but I'd prefer to have a Serializer which follows the spec so I can just copy the output and validate it without errors and too many warnings at http://validator.w3.org/

Robby

From: Jasha Joachimsthal [mailto:j.joachimst...@onehippo.com]
Sent: Friday, January 06, 2012 4:51 PM
To: dev@cocoon.apache.org
Subject: Re: HTML5 serializer

Hey Robby,

which Cocoon version are you using for your project? In C2.1 and C2.2 there's not only a XMLSerializer but also an HTMLSerializer and XHTMLSerializer for their specific needs. So why not create your own HTML5Serializer?

In HTML5 the specification teams tried to specify what browsers were already doing instead of making a new theoretical specification. HTML5 should be backwards compatible with previous (X)HTML versions. This is the reason why some old elements are not deprecated but considered obsolete (remember marquee, it was so cool on Geocities).

The doctype doesn't really matter, browsers generally ignore the PUBLIC part in the doctype (apart from some hacks in IE going into quirks mode).

A good presentation about HTML5 is http://vimeo.com/15755349.

Jasha Joachimsthal

Europe - Amsterdam - Oosteinde 11, 1017 WT Amsterdam - +31(0)20 522 4466
US - Boston - 1 Broadway, Cambridge, MA 02142 - +1 877 414 4776 (toll free)

www.onehippo.com
Re: HTML5 serializer
Hey Robby,

which Cocoon version are you using for your project? In C2.1 and C2.2 there's not only a XMLSerializer but also an HTMLSerializer and XHTMLSerializer for their specific needs. So why not create your own HTML5Serializer?

In HTML5 the specification teams tried to specify what browsers were already doing instead of making a new theoretical specification. HTML5 should be backwards compatible with previous (X)HTML versions. This is the reason why some old elements are not deprecated but considered obsolete (remember marquee, it was so cool on Geocities).

The doctype doesn't really matter, browsers generally ignore the PUBLIC part in the doctype (apart from some hacks in IE going into quirks mode).

A good presentation about HTML5 is http://vimeo.com/15755349.

Jasha Joachimsthal

Europe - Amsterdam - Oosteinde 11, 1017 WT Amsterdam - +31(0)20 522 4466
US - Boston - 1 Broadway, Cambridge, MA 02142 - +1 877 414 4776 (toll free)

www.onehippo.com

On 6 January 2012 15:48, Robby Pelssers wrote:

> Hi all,
>
> I’ve been looking at how to add a HTML5 serializer to the project.
>
> So far my investigations have led me to add the following code to
> org.apache.cocoon.sax.component.XMLSerializer
>
>     public static XMLSerializer createHTML5Serializer() {
>         XMLSerializer serializer = new XMLSerializer();
>
>         serializer.setContentType(TEXT_HTML_UTF_8);
>         serializer.setDoctypePublic("XSLT-compat");
>         serializer.setEncoding(UTF_8);
>         serializer.setMethod(HTML);
>
>         return serializer;
>     }
>
> Using the HTML5 serializer in a test to print the output:
>
>     @Test
>     public void testHTML5Serializer() throws Exception {
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>
>         newNonCachingPipeline()
>             .setStarter(
>                 new XMLGenerator("serializer testtest")
>             )
>             .setFinisher(XMLSerializer.createHTML5Serializer())
>             .withEmptyConfiguration()
>             .setup(baos)
>             .execute();
>
>         String data = new String(baos.toByteArray());
>         System.out.println(data);
>     }
>
> Would print
>
> serializer test
> test
>
> I read a number of articles describing the issues with serializing html5
> and so far this was the best I could come up with, which is not 100%
> conforming due to
>
> * Non matching doctype although it will not break in the browser --> should be
> * The charset should be according to html5 spec
>
> http://www.contentwithstyle.co.uk/content/xslt-and-html-5-problems/
> http://www.w3schools.com/html5/tag_meta.asp
>
> Does anyone have more knowledge on this subject?
>
> Robby