Re: [Chicken-users] Cryptic SSAX error message
On Wed, Mar 18, 2015 at 08:56:06PM -0600, Matt Gushee wrote:
> On Tue, Mar 17, 2015 at 2:13 AM, Peter Bex airh...@users.sourceforge.net wrote:
> > You shouldn't parse HTML with an XML parser.
>
> Not in general, no. But wouldn't you agree that, regardless of what is wrong with the input file and why it is wrong, it would be good if SSAX output something that would actually be useful in troubleshooting? That was my main point.

Oh, I definitely agree. I just wanted to point out that expecting to be able to parse HTML with an XML parser is never going to work, in case you were expecting it to.

> And of course, as I mentioned, I'm well aware that desirable != doable, but I didn't (and don't) know if this is a known issue, so I thought I should say something.

And I at least appreciate the bug report :) But see my PS at the end.

> > Since you're using CHICKEN, you could try the html-parser CHICKEN egg, which is more permissive.
>
> But that's not the goal. Perhaps you recall this discussion from 2 years ago?
>
> [Matt] Finally, an idea has occurred to me. What about a templating system where what actually gets used at runtime is SXML, but designers could create templates in XHTML, then when they are satisfied with the design, use a preprocessing tool to convert them to SXML? That would at least ensure well-formed markup.
>
> [Peter] Yep, that would be good. Representation and surface syntax don't necessarily need to be equivalent, though the Lisper in me disagrees about that being a good idea :)
>
> REF: http://lists.nongnu.org/archive/html/chicken-users/2013-03/msg00058.html

That's a rather different point I was making at the time; I was arguing for using SXML directly. The point I'm making now is that if you want to parse HTML, you should use an HTML parser. But IIUC you're saying that was a bit of a mistake in one particular template you made, and you really mean the templates to be strict XML?

> So Civet is the templating system I created pursuant to that conversation.
> The templates are supposed to be well-formed XML (in practice, mainly XHTML), and presumably created by a developer who knows what they're doing - though the current issue may call that into question ;-).

:)

> I certainly don't believe my approach is ideal from a purely technical standpoint. But given that the meta-goal of my projects is to use Scheme to create web development tools that might be used by people who don't know Scheme (as opposed to using Scheme to develop websites), I think it's about as good a compromise as can be expected. If I were creating Civet today, I think I would look for a different approach - but mainly because it is now clear (maybe it was in 2013 and I just didn't know it) that HTML5 (in non-XML syntax) is becoming dominant, and the never-popular XHTML is dying, if not dead. But I still stand by the fundamental reasoning that led to Civet as it is (and BTW, it works pretty well within its limitations - you should try it ;-)

I don't create websites for nontechnical people anymore, and I'm very glad about it; it's pretty thankless and unsatisfying work. However, one of the things that made the job more painful than it had to be was the shitty CMSes out there (we used Drupal, but the other systems out there suck in a variety of different ways), so I'm sure a CMS in Scheme is going to make the lives of people like my former self more pleasant.

> > I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,
>
> I'm not 100% sure either, but if the W3C says it's XML, they most likely mean it is completely well-formed. One thing I know is that it prohibits inline CSS and JavaScript - and now I understand why.

Yeah, there are lots of hairy issues, especially with inline JavaScript where it breaks the normal parsing rules, as some people noticed in #chicken the other day. It's no wonder the web is so riddled with security issues!
PS: I replied using my sourceforge address; for some reason I thought you were reporting this to the SSAX mailing list. That would be the correct place to report this bug; we just import the upstream code, as-is.

Cheers,
Peter

signature.asc
Description: Digital signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users
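Editor's illustration of the "author in XHTML, convert to SXML" idea quoted above. This is only a rough sketch in Python, not Civet's code and not anything SSAX does: the function names `to_sxml` and `sxml_string` are made up for the example, namespaces are ignored, and surrounding whitespace in text nodes is simply stripped, which a real preprocessing tool would have to handle more carefully.

```python
import xml.etree.ElementTree as ET


def sxml_string(text):
    """Quote a text node as a Scheme string literal (hypothetical helper)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_sxml(elem):
    """Render an ElementTree element as SXML-style text.

    Simplifications (assumptions of this sketch): namespaces are ignored
    and leading/trailing whitespace in text nodes is dropped.
    """
    parts = [elem.tag]
    if elem.attrib:
        attrs = " ".join(
            "(%s %s)" % (name, sxml_string(value))
            for name, value in elem.attrib.items()
        )
        parts.append("(@ %s)" % attrs)
    if elem.text and elem.text.strip():
        parts.append(sxml_string(elem.text.strip()))
    for child in elem:
        parts.append(to_sxml(child))
        # Text that follows a child element lives in child.tail.
        if child.tail and child.tail.strip():
            parts.append(sxml_string(child.tail.strip()))
    return "(" + " ".join(parts) + ")"


# A tiny well-formed template, invented for the example.
TEMPLATE = '<div class="post"><h1>Title</h1><p>Hello <em>world</em>!</p></div>'

if __name__ == "__main__":
    print(to_sxml(ET.fromstring(TEMPLATE)))
```

Because the conversion step goes through a strict XML parser, a template that is not well-formed is rejected before it ever becomes SXML, which is the property the quoted 2013 exchange was after.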
Re: [Chicken-users] Cryptic SSAX error message
On Thu, Mar 19, 2015 at 02:26:06PM +0100, Jörg F. Wittenberger wrote:
> On Mar 19 2015, Peter Bex wrote:
> > there suck in a variety of different ways), so I'm sure a CMS in Scheme is going to make the lives of people like my former self more pleasant.
>
> 2 questions:
>
> * What features need to be present for something to count as a CMS?

That's a very tough question, as many things are counted as a CMS. Looking back at the jobs I've had before, the CMS allowed non-programmers to build a complete site from scratch, writing only HTML and CSS. On the other hand, that's perhaps not the defining feature; rather, it's allowing the users to enter content which is then presented on the front-end. But almost any web-based system could then be termed a CMS... I suppose a CMS is more flexible, allowing users to do a lot without having to request custom-built new features just to add a new page to the site's menu, for example.

I'd use Drupal as a benchmark because I know it well, and it goes very far in allowing non-programmers to define custom content types, install modules that allow them to define almost everything. There's a query builder that allows you to define custom reports/overview pages and so on. Unfortunately it's also dog slow because of this modularity and extensibility. There's a cost to everything...

> * How many CMS in Scheme are there already?

None that I know of, but I never looked too hard.

Cheers,
Peter
Re: [Chicken-users] Cryptic SSAX error message
Hi, Peter--

On Tue, Mar 17, 2015 at 2:13 AM, Peter Bex airh...@users.sourceforge.net wrote:
> On Mon, Mar 16, 2015 at 09:27:36PM -0600, Matt Gushee wrote:
> > I was building a new blog with Coq au vin, which uses Civet to process templates, which in turn uses SSAX ... and one of my XHTML templates caused [an] error.
> >
> > [error elided]
> >
> > Now that's a helpful error message. It turns out the problem was the inline JavaScript in my template (which contained the = operator). Since I was using the XHTML Transitional doctype, that's allowed per W3C specs, and it simply hadn't occurred to me that it was likely to result in non-well-formed XML.
>
> You shouldn't parse HTML with an XML parser.

Not in general, no. But wouldn't you agree that, regardless of what is wrong with the input file and why it is wrong, it would be good if SSAX output something that would actually be useful in troubleshooting? That was my main point. And of course, as I mentioned, I'm well aware that desirable != doable, but I didn't (and don't) know if this is a known issue, so I thought I should say something.

> Since you're using CHICKEN, you could try the html-parser CHICKEN egg, which is more permissive.

But that's not the goal. Perhaps you recall this discussion from 2 years ago?

[Matt] Finally, an idea has occurred to me. What about a templating system where what actually gets used at runtime is SXML, but designers could create templates in XHTML, then when they are satisfied with the design, use a preprocessing tool to convert them to SXML? That would at least ensure well-formed markup.

[Peter] Yep, that would be good. Representation and surface syntax don't necessarily need to be equivalent, though the Lisper in me disagrees about that being a good idea :)

REF: http://lists.nongnu.org/archive/html/chicken-users/2013-03/msg00058.html

So Civet is the templating system I created pursuant to that conversation.
The templates are supposed to be well-formed XML (in practice, mainly XHTML), and presumably created by a developer who knows what they're doing - though the current issue may call that into question ;-).

I certainly don't believe my approach is ideal from a purely technical standpoint. But given that the meta-goal of my projects is to use Scheme to create web development tools that might be used by people who don't know Scheme (as opposed to using Scheme to develop websites), I think it's about as good a compromise as can be expected. If I were creating Civet today, I think I would look for a different approach - but mainly because it is now clear (maybe it was in 2013 and I just didn't know it) that HTML5 (in non-XML syntax) is becoming dominant, and the never-popular XHTML is dying, if not dead. But I still stand by the fundamental reasoning that led to Civet as it is (and BTW, it works pretty well within its limitations - you should try it ;-)

> I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,

I'm not 100% sure either, but if the W3C says it's XML, they most likely mean it is completely well-formed. One thing I know is that it prohibits inline CSS and JavaScript - and now I understand why.

--
Matt Gushee
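Editor's illustration of the two points in this message: what a usable diagnostic from an XML parser can look like, and one common way to keep inline script well-formed. It is a sketch in Python using the standard library's expat-based parser, so it says nothing about what SSAX itself reports. The thread does not show the actual script, so the raw `<` below is an assumed stand-in for the offending operator, and `check_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed stand-in for the problematic template: a raw "<" inside <script>.
BROKEN = """<html><body>
<script type="text/javascript">
  if (a < b) { swap(a, b); }
</script>
</body></html>"""

# Same script wrapped in a CDATA section, hidden from JavaScript by comments.
FIXED = """<html><body>
<script type="text/javascript">//<![CDATA[
  if (a < b) { swap(a, b); }
//]]></script>
</body></html>"""


def check_well_formed(text):
    """Return None if text is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # location of the offending token
        return (line, column, str(err))
    return None


if __name__ == "__main__":
    print(check_well_formed(BROKEN))
    print(check_well_formed(FIXED))
```

The broken template is reported with the line on which parsing failed, which is the kind of pointer that makes this class of template bug quick to find. The CDATA-wrapped variant parses cleanly; moving the script into an external file would avoid the problem as well.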
Re: [Chicken-users] Cryptic SSAX error message
On Mon, Mar 16, 2015 at 09:27:36PM -0600, Matt Gushee wrote:
> I was building a new blog with Coq au vin, which uses Civet to process templates, which in turn uses SSAX ... and one of my XHTML templates caused [an] error.
>
> [error elided]
>
> Now that's a helpful error message. It turns out the problem was the inline JavaScript in my template (which contained the = operator). Since I was using the XHTML Transitional doctype, that's allowed per W3C specs, and it simply hadn't occurred to me that it was likely to result in non-well-formed XML.

You shouldn't parse HTML with an XML parser. Since you're using CHICKEN, you could try the html-parser CHICKEN egg, which is more permissive.

I *think* XHTML Strict is a proper XML application, but I'm not 100% sure, so if you insist on strict error checking you could use the strict doctype. However, of course that won't make much of a difference with regard to using an XML parser to parse it; you'd get the same error.

Cheers,
Peter
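Editor's illustration of the strict-versus-permissive distinction made in this reply. It is a minimal sketch in Python: the standard library's `xml.etree.ElementTree` and `html.parser` stand in for SSAX and the html-parser egg, whose behaviour is not shown here. The template and the raw `<` inside the script are assumptions made for the example, as are the names `parses_as_xml`, `TagCollector` and `html_tags`.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical template: the raw "<" inside <script> is fine for an HTML
# parser but makes the document non-well-formed as XML.
TEMPLATE = """<html>
  <body>
    <script type="text/javascript">
      if (a < b) { swap(a, b); }
    </script>
    <p>Hello</p>
  </body>
</html>"""


def parses_as_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient HTML parser that records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Feed text to the lenient parser and return the start tags in order."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(parses_as_xml(TEMPLATE))
    print(html_tags(TEMPLATE))
```

The XML parser rejects the whole document, while the HTML parser treats the script body as opaque text and carries on to the following paragraph. Changing the doctype would not alter either outcome, since the doctype does not change the well-formedness rules an XML parser applies.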