Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
* Hadrian Węgrzynowski 2014-02-21 22:16 Even if it would work, I think that web shouldn't be pixel-perfect, because we could just use some glorified-PDFs. It's utter nonsense that correct rendering of page is depending on some specific font and specific font size. It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. +1 and it's utter nonsense to force specific view on given content (in case you need this, the presentation IS the content and you probably need ps, pdf or whatever) --s.
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
* Charlie Kester 2014-02-21 23:54 Or is the trend to create a separate, mobile version of the page, which simply changes the assumption to some smaller screen size? Or are people just ignoring the problem altogether? When they do build a mobile version, it is often more usable than the full one. (Unless they decide to split the content into small chunks to “better” fit one's screen, so that one needs to click `next' all the time.) Printable versions are often more enjoyable than the normal ones, too. --s.
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Sat, 22 Feb 2014 09:40:08 +0100 sta...@cs.tu-berlin.de wrote: Printable versions are often more enjoyable than the normal ones, too. The worst thing in my humble opinion are those reflowing layouts, which are _very_ slow and choppy. At least, they provide a way to provide one content to both desktop- and mobile-visitors. However, this is no excuse to mess up the experience for everyone. Cheers FRIGN -- FRIGN d...@frign.de
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, Feb 21, 2014 at 10:18:45AM +0100, FRIGN wrote: On Fri, 21 Feb 2014 11:37:30 +0100 Anselm R Garbe garb...@gmail.com wrote: The web wouldn't be so successful if everything was strictly XML based, more the opposite IMO. Why is that? Are you referring to the fact parsing HTML as XML requires the developer to be more careful with his markup and that stricter parsing would scare off beginners? There has been a lot of discussion why strict XML parsers don't belong in a browser. There even are XHTML enthusiasts that are against it. That'd be a fair point and I agree, but on the other hand, the rule still prevails: You write once, but parse often. You only write a parser once. But you write some magnitude more markup that is going to be parsed by it. So optimizing the markup specification for authoring has a better net gain than to optimize the protocol just to get away with a simpler parser. Apart from this, XML parsing is *not* simple. And XML sucks [0]. Yes, it sucks! This is out of question. But nothing compared to SGML. The XML-standard has around 26 pages, whereas SGML takes around 600. That's why HTML uses only a subset of SGML. That said, I don't want to defend HTML and the web as such, but it would be much worse with XML IMO. At least from my perspective. -- Eckehard Berns
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, 21 Feb 2014 13:34:41 +0100 Eckehard Berns ecki-suckl...@ecki.to wrote: There has been a lot of discussion why strict XML parsers don't belong in a browser. There even are XHTML enthusiasts that are against it. Yes, I've been listening to both sides for a few years now. You only write a parser once. But you write some magnitude more markup that is going to be parsed by it. So optimizing the markup specification for authoring has a better net gain than to optimize the protocol just to get away with a simpler parser. This would be an appropriate point if the SGML-parsers weren't lossy in this regard. I've read lots of HTML-markup and often ran into problems when people didn't take care of well-formedness. Often, they run into quirks and their browser's SGML-parser fixes them. However, there's no guarantee another browser would do the same and damn, don't ever try to modify the markup later! This is not an edge-case. I run into these problems day by day. That's why HTML uses only a subset of SGML. The point is that they allow ambiguity. That said, I don't want to defend HTML and the web as such, but it would be much worse with XML IMO. At least from my perspective. I really don't see your point why exactly XML should be bad for the web. If you write proper, well-formed markup, nothing really changes for you, except that the browser _knows_ it's dealing with proper markup and doesn't have to fire up its forgiving but sloppy SGML-parser. It may not be clear here that switching from SGML to XML parsing only involves changing the MIME-type from text/html to application/xhtml+xml. If your markup is messed up, the XML parser throws an error and stops parsing (which is really helpful), instead of silently attempting to fix errors like the SGML parser, which is a real chore to implement. XML parsing is not a simple thing either, but at least you don't have to deal with bloody guesswork! Cheers FRIGN -- FRIGN d...@frign.de
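To make the difference between the two parsing models concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not how any browser is implemented: xml.etree.ElementTree stands in for a strict XML parser, and html.parser stands in for a forgiving one (it is only a tolerant tokenizer and does not repair the tree the way a browser does). The names BROKEN, parse_strict and parse_lenient are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Markup that is not well-formed: <b> is never closed.
BROKEN = "<p>Hello <b>world</p>"


class TagCollector(HTMLParser):
    """Records every start tag the tolerant parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_strict(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A strict parser refuses the whole document at the first error.
        return False


def parse_lenient(markup):
    """Return the start tags seen; the tolerant parser never complains."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("strict accepts broken markup:", parse_strict(BROKEN))
    print("lenient saw tags:", parse_lenient(BROKEN))
```

The strict path rejects the document outright, while the tolerant path carries on and leaves it to the consumer to guess what the author meant. That guess is the part that can differ between implementations.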
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
This would be an appropriate point if the SGML-parsers weren't lossy in this regard. I've read lots of HTML-markup and often ran into problems when people didn't take care of well-formedness. Often, they run into quirks and their browser's SGML-parser fixes them. However, there's no guarantee another browser would do the same and damn, don't ever try to modify the markup later! This is not an edge-case. I run into these problems day by day. I see why you wish for a stricter approach. I don't believe this will happen anytime soon. That's why HTML uses only a subset of SGML. The point is that they allow ambiguity. I'm not sure about that. SGML has DTDs that describe what you're allowed to do and what not. So in theory browsers could reject non-validating HTML pages as well. No need to switch to XML for that. But I doubt that browsers would do this. I really don't see your point why exactly XML should be bad for the web. Not bad for the web. Bad for me! :) I write lots of HTML at work. I tend to write validating HTML usually - except when encountering features that can't be described with valid HTML (HTML5 fixes this though, at least for me). If I had to write XHTML I would get very angry pretty fast. If you write proper, well-formed markup, nothing really changes for you, except that the browser _knows_ it's dealing with proper markup and doesn't have to fire up its forgiving but sloppy SGML-parser. As said before, browsers could reject non-validating HTML as well. So in the end we disagree because of personal preference. That's fine with me. -- Eckehard Berns
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
Eckehard Berns said: You only write a parser once. But you write some magnitude more markup that is going to be parsed by it. So optimizing the markup specification for authoring has a better net gain than to optimize the protocol just to get away with a simpler parser. Actually, if parser behavior is simple and easily predictable, the task of writing markup is easier. When I write correct HTML, I still have to open a browser to see how it renders, because I have no way to predict the actual result (apart from my experience with various generally unexpected results, which serves me as the basis for an educated guess). This alone is sufficient for me to be all for a simplistic strict parser with zero fault tolerance. -- Dmitrij D. Czarkoff
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, 21 Feb 2014 15:07:32 +0100 Dmitrij D. Czarkoff czark...@gmail.com wrote: Actually, if parser behavior is simple and easily predictable, the task of writing markup is easier. When I write correct HTML, I still have to open a browser to see how it renders, because I have no way to predict the actual result (apart from my experience with various generally unexpected results, which serves me as the basis for an educated guess). I'm interested. Do you have a specific case where that happened? Normally, there shouldn't be ambiguities of _rendering_ in (X)HTML, given it is just a language to represent structured data. Thus, rendering issues originate either from bad browser-defaults or from faulty CSS. If you refer to this, I totally agree: When you start styling an HTML-document, you usually can't write the CSS down and then be done, but often have to check the page by reloading. As we're discussing XML- and SGML-parsers here, this is another issue. This alone is sufficient for me to be all for a simplistic strict parser with zero fault tolerance. Definitely! -- FRIGN d...@frign.de
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, 21 Feb 2014 15:35:40 +0100 Eckehard Berns ecki-suckl...@ecki.to wrote: Fair point. For me HTML usually renders as I expected. But that's because I do this for over a decade, I guess. If it doesn't it usually is because of a misunderstanding in semantics (e.g. the broken block-model in IE until 7) and using XML wouldn't change that. I already addressed that in my previous reply, but please don't mistake HTML for what actually is CSS's competence. Of course, if there's a problem with the block-model in CSS (I just love IE!), changing the markup won't change it (of course). Discussing XML and SGML just implies the data-structure. If you like, I could give you an example. Cheers FRIGN -- FRIGN d...@frign.de
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
* FRIGN d...@frign.de [2014-02-21 12:03:00 +0100]: I really don't see your point why exactly XML should be bad for the web. If you write proper, well-formed markup, nothing really changes for you, except that the browser _knows_ it's dealing with proper markup and doesn't have to fire up its forgiving but sloppy SGML-parser. It may not be clear here that switching from SGML to XML parsing only involves changing the MIME-type from text/html to application/xhtml+xml. xml is not just markup but http://www.w3.org/TR/REC-xml/#charencoding (mandatory utf-8 and utf-16 support with bom) https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing (xml injection, unauthorized document access) https://en.wikipedia.org/wiki/Billion_laughs (DoS: exponential or quadratic blowup of entities) and various xml validation issues and implementation bugs.. it's much better to use a restricted specific language with simple well defined semantics than generic things like sgml and xml (with arbitrarily long tag and attribute names), once you do this the origin (sgml, xml,..) does not matter
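The entity blowup mentioned above is easy to quantify. The sketch below builds the classic nested-entity document as a string and computes how large the top-level entity would become if a parser expanded it fully. It deliberately never hands the document to an XML parser. The function names and the entity naming scheme (lol, lol1, lol2, ...) follow the well-known example but are otherwise just assumptions of this illustration.

```python
def build_doc(fanout, levels):
    """Build a nested-entity document as text (it is never parsed here)."""
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE lolz [",
        ' <!ENTITY lol "lol">',
    ]
    for i in range(1, levels + 1):
        prev = "lol" if i == 1 else f"lol{i - 1}"
        # Each entity refers to the previous one `fanout` times.
        refs = f"&{prev};" * fanout
        lines.append(f' <!ENTITY lol{i} "{refs}">')
    lines.append("]>")
    lines.append(f"<lolz>&lol{levels};</lolz>")
    return "\n".join(lines)


def expanded_length(base_len, fanout, levels):
    """Characters produced when the top-level entity is fully expanded."""
    return base_len * fanout ** levels


if __name__ == "__main__":
    doc = build_doc(fanout=10, levels=9)
    print("document size in characters:", len(doc))
    print("expanded size in characters:", expanded_length(3, 10, 9))
```

With a fanout of 10 over 9 levels, the three-character base entity expands to 3 * 10^9 characters, while the document itself stays tiny. A parser that caps entity expansion, or that does not process a DTD at all, avoids the problem.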
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
FRIGN said: Actually, if parser behavior is simple and easily predictable, the task of writing markup is easier. When I write correct HTML, I still have to open a browser to see how it renders, because I have no way to predict the actual result (apart from my experience with various generally unexpected results, which serves me as the basis for an educated guess). I'm interested. Do you have a specific case where that happened? I come across different issues here and there. They are mostly either bugs (which appear due to the ever-changing nature of modern web engines) or optimizations supposed to address bad HTML code. E.g. I had problems with tables, which were optimized by Firefox and Chrome differently, resulting in different numbers of rows when I used empty cells for complex table drawing. Thus, rendering issues originate either from bad browser-defaults or from faulty CSS. I don't even touch CSS. And I just can't see any valid argument for the existence of browser-defaults: for a format that is supposed to deliver pixel-perfect rendering for a given screen size, the very fact that there is something left for the browser to decide is already a complete and utter defeat for the whole markup language. -- Dmitrij D. Czarkoff
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, 21 Feb 2014 16:18:33 +0100 Szabolcs Nagy n...@port70.net wrote: xml is not just markup but http://www.w3.org/TR/REC-xml/#charencoding (mandatory utf-8 and utf-16 support with bom) What's wrong with UTF-8? https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing (xml injection, unauthorized document access) Fortunately, browsers don't allow this. https://en.wikipedia.org/wiki/Billion_laughs (DoS: exponential or quadratic blowup of entities) Also easily avoidable. it's much better to use a restricted specific language with simple well defined semantics than generic things like sgml and xml (with arbitrarily long tag and attribute names), once you do this the origin (sgml, xml,..) does not matter At the cost of modularity. Still, I'd welcome a solution like this! -- FRIGN d...@frign.de
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
Greetings. On Fri, 21 Feb 2014 17:56:58 +0100 FRIGN d...@frign.de wrote: On Fri, 21 Feb 2014 16:18:33 +0100 Szabolcs Nagy n...@port70.net wrote: xml is not just markup but http://www.w3.org/TR/REC-xml/#charencoding (mandatory utf-8 and utf-16 support with bom) What's wrong with UTF-8? BOM is wrong. it's much better to use a restricted specific language with simple well defined semantics than generic things like sgml and xml (with arbitrarily long tag and attribute names), once you do this the origin (sgml, xml,..) does not matter At the cost of modularity. Still, I'd welcome a solution like this! All this discussion sucks. The real question is how to make HTML simpler. With only a few rendering engines, all of them under Open Source control, it should be possible to push a simpler »description language« for what to display on a website. Just look at the reduced code Google is sending out; it works in nearly every browser engine. Reduce HTML to that substandard, make it easier to parse and standardize this. When old documents fail to parse, let their authors change them. In the evolution of HTML it grew to a state where humans shouldn’t write this by hand. In such a state, when only computers read what’s produced by computers, binary encodings are allowed. Make HTML easily parseable with byte separators, length descriptors and easily definable parts. If everything is just a DOM object, make the DOM easily parseable instead of having many intermediate dependencies like CSS or Javascript. I don’t think big browser vendors would choose against such an option, if it’s faster, easy to implement and allows the same features as the current state. Of course someone is needed to make a proposal and create the reference implementation. Sincerely, Christoph Lohmann
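As a rough idea of what "length descriptors" could look like, here is a small sketch of a length-prefixed binary encoding for a DOM-like tree. The layout is entirely hypothetical and not taken from any proposal: a node is a (tag, text, children) tuple, written as a 16-bit tag length, the tag bytes, a 32-bit text length, the text bytes, a 16-bit child count and then the children, all big-endian. Attributes are left out to keep it short.

```python
import struct


def encode(node):
    """Serialize a (tag, text, children) tuple with length prefixes."""
    tag, text, children = node
    tag_bytes = tag.encode("utf-8")
    text_bytes = text.encode("utf-8")
    out = struct.pack(">H", len(tag_bytes)) + tag_bytes
    out += struct.pack(">I", len(text_bytes)) + text_bytes
    out += struct.pack(">H", len(children))
    for child in children:
        out += encode(child)
    return out


def decode(buf, pos=0):
    """Read one node starting at pos; return (node, position after it)."""
    (tag_len,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    tag = buf[pos:pos + tag_len].decode("utf-8")
    pos += tag_len
    (text_len,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    text = buf[pos:pos + text_len].decode("utf-8")
    pos += text_len
    (child_count,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    children = []
    for _ in range(child_count):
        child, pos = decode(buf, pos)
        children.append(child)
    return (tag, text, children), pos


if __name__ == "__main__":
    tree = ("p", "Hello ", [("b", "world", [])])
    blob = encode(tree)
    restored, end = decode(blob)
    print(restored == tree, end == len(blob))
```

Because every field announces its own length, the decoder never scans for delimiters and has nothing to guess. The price is that such a format is no longer readable or writable by hand.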
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On 2014-02-21, at 16:21:22, Dmitrij D. Czarkoff czark...@gmail.com wrote: Thus, rendering issues originate either from bad browser-defaults or from faulty CSS. I don't even touch CSS. And I just can't see any valid argument for the existence of browser-defaults: for a format that is supposed to deliver pixel-perfect rendering for a given screen size, the very fact that there is something left for the browser to decide is already a complete and utter defeat for the whole markup language. There should be a separate stack for pixel-perfect, device-independent use and one for the semantic web (without CSS and JS), but then the semantic web would probably just die... Even if it worked, I think the web shouldn't be pixel-perfect, because then we could just use some glorified PDFs. It's utter nonsense that correct rendering of a page depends on some specific font and specific font size. It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using a maximised browser window at 1080p.
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, Feb 21, 2014 at 1:15 PM, Hadrian Węgrzynowski hadr...@hawski.com wrote: It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. 80-character paragraphs don’t sound particularly semantic.
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On 2014-02-21, at 13:27:51, Ryan O’Hara rni...@gmail.com wrote: On Fri, Feb 21, 2014 at 1:15 PM, Hadrian Węgrzynowski hadr...@hawski.com wrote: It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. 80-character paragraphs don’t sound particularly semantic. With the semantic paradigm the browser can safely wrap the lines of paragraphs and nothing will explode. Currently many pages have lines that are too long at 1080p and many pages have lines that are too short at 800p.
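A minimal sketch of that idea, assuming the page supplies only the paragraph text and the client decides the line length: wrap at the viewport width, but never wider than a readable maximum. The function name render_paragraph and the 80-column default are choices made for this example; it uses Python's textwrap module and is not how any particular browser lays out text.

```python
import textwrap


def render_paragraph(text, viewport_columns, max_columns=80):
    """Wrap text to the viewport, capped at a readable maximum width."""
    width = max(1, min(viewport_columns, max_columns))
    return textwrap.wrap(text, width=width)


if __name__ == "__main__":
    paragraph = "The page only says that this is a paragraph. " * 6
    for columns in (200, 40):
        print(f"--- viewport of {columns} columns ---")
        print("\n".join(render_paragraph(paragraph, columns)))
```

The same content reflows for a wide and a narrow viewport without the author assuming either one.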
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri, 21 Feb 2014 22:15:24 +0100 Hadrian Węgrzynowski hadr...@hawski.com wrote: There should be separate stack for pixel-perfect device independent use and for semantic web (without CSS and JS), but then semantic web would probably just die... Even if it would work, I think that web shouldn't be pixel-perfect, because we could just use some glorified-PDFs. It's utter nonsense that correct rendering of page is depending on some specific font and specific font size. It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. A semantic web-browser is a great idea. It has already been partially realized in links. If X-support is compiled in, you can test it out with links -g. It's blazing fast (!), but sadly gives insight into how unsemantic the web has become over the years. Cheers FRIGN -- FRIGN d...@frign.de
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On Fri 21 Feb 2014 at 13:15:24 PST Hadrian Węgrzynowski wrote: Even if it would work, I think that web shouldn't be pixel-perfect, because we could just use some glorified-PDFs. It's utter nonsense that correct rendering of page is depending on some specific font and specific font size. It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. I'm an old fart who never got into web programming, so forgive me if this is a stupid question: isn't the rise of smartphones and other mobile platforms forcing people to abandon those assumptions about the screen sizes they're working with? (As, for example, epub vs pdf.) Or is the trend to create a separate, mobile version of the page, which simply changes the assumption to some smaller screen size? Or are people just ignoring the problem altogether?
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On 2014-02-21, at 14:53:22, Charlie Kester corky1...@comcast.net wrote: On Fri 21 Feb 2014 at 13:15:24 PST Hadrian Węgrzynowski wrote: Even if it would work, I think that web shouldn't be pixel-perfect, because we could just use some glorified-PDFs. It's utter nonsense that correct rendering of page is depending on some specific font and specific font size. It's utter nonsense to not restrict paragraph length (at 80 characters or something). It's utter nonsense to assume that everyone is using maximised browser window at 1080p. I'm an old fart who never got into web programming, so forgive me if this is a stupid question: isn't the rise of smartphones and other mobile platforms forcing people to abandon those assumptions about the screen sizes they're working with? (As, for example, epub vs pdf.) Or is the trend to create a separate, mobile version of the page, which simply changes the assumption to some smaller screen size? Or are people just ignoring the problem altogether? For compatibility, mobile browsers pretend to be desktop browsers. They work around the problem by scaling [1]. So the problem is mostly ignored. Some sites create a special mobile version. Sometimes it's a crippled version of the site, sometimes a quite sane version which can be better than the desktop one. [1] http://upload.wikimedia.org/wikipedia/commons/2/23/Nokia_Mini_Map_Browser.png
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
On 2014-02-21, at 21:54:59, FRIGN d...@frign.de wrote: A semantic web-browser is a great idea. It has already been partially realized in links. If X-support is compiled in, you can test it out with links -g. It's blazing fast (!), but sadly gives insight into how unsemantic the web has become over the years. Cheers FRIGN Sometimes I am amazed how many sites work properly in links. I never succeeded with JavaScript-enabled VT browsers though.
Re: [dev] XML vs HTML (was: Article about suckless on root.cz)
Charlie Kester said: (As, for example, epub vs pdf.) These formats serve different functions. It would be more fair to compare PDF to PS and ePub to roff respectively. -- Dmitrij D. Czarkoff