Re: [Chicken-users] Cryptic SSAX error message

2015-03-19 Thread Peter Bex
On Wed, Mar 18, 2015 at 08:56:06PM -0600, Matt Gushee wrote:
 On Tue, Mar 17, 2015 at 2:13 AM, Peter Bex airh...@users.sourceforge.net
 wrote:
  You shouldn't parse HTML with an XML parser.
 
 Not in general, no. But wouldn't you agree that, regardless of what is
 wrong with the input file and why it is wrong, it would be good if SSAX
 output something that would actually be useful in troubleshooting? That was
 my main point.

Oh, I definitely agree.  I just wanted to point out that expecting to be
able to parse HTML with an XML parser is never going to work, in case you
were expecting it to.

 And of course, as I mentioned, I'm well aware that desirable
 != doable, but I didn't (and don't) know if this is a known issue, so I
 thought I should say something.

And I at least appreciate the bug report :)  But see my PS at the end.

  Since you're using CHICKEN,
  you could try the html-parser CHICKEN egg, which is more permissive.
 
 But that's not the goal. Perhaps you recall this discussion from 2 years
 ago?
 
  [Matt]
  Finally, an idea has occurred to me. What about a templating system where
  what actually gets used at runtime is SXML, but designers could create
  templates in XHTML, then when they are satisfied with the design, use a
  preprocessing tool to convert them to SXML? That would at least ensure
  well-formed markup.
 
  [Peter]
  Yep, that would be good.  Representation and surface syntax don't
  necessarily need to be equivalent, though the Lisper in me disagrees
  about that being a good idea :)
 REF:
 http://lists.nongnu.org/archive/html/chicken-users/2013-03/msg00058.html

That's a rather different point I was making at the time; I was arguing
for using SXML directly.  The point I'm making now is that if you want
to parse HTML, you should use an HTML parser.  But IIUC you're saying that
was a bit of a mistake in one particular template you made, and you really
mean the templates to be strict XML?

 So Civet is the templating system I created pursuant to that conversation.
 The templates are supposed to be well-formed XML (in practice, mainly
 XHTML), and presumably created by a developer who knows what they're doing
 - though the current issue may call that into question ;-).

:)

 I certainly don't believe my approach is ideal from a purely technical
 standpoint. But given that the meta-goal of my projects is to use Scheme
 to create web development tools that might be used by people who don't know
 Scheme (as opposed to use Scheme to develop websites), I think it's
 about as good a compromise as can be expected. If I were creating Civet
 today, I think I would look for a different approach - but mainly because
 it is now clear (maybe it was in 2013 and I just didn't know it) that HTML5
 (in non-XML syntax) is becoming dominant, and the never-popular XHTML is
 dying, if not dead. But I still stand by the fundamental reasoning that led
 to Civet as it is (and BTW, it works pretty well within its limitations -
 you should try it ;-)

I don't create websites for nontechnical people anymore, and I'm very
glad about it; it's pretty thankless and unsatisfying work.  However,
one of the things that made the job more painful than it had to be was
the shitty CMSes out there (we used Drupal, but the other systems out
there suck in a variety of different ways), so I'm sure a CMS in Scheme
is going to make the lives of people like my former self more pleasant.

  I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,
 
 I'm not 100% sure either, but if the W3C says it's XML, they most likely
 mean it is completely well-formed. One thing I know is that it prohibits
 inline CSS and JavaScript - and now I understand why.

Yeah, there are lots of hairy issues, especially with inline JavaScript
where it breaks the normal parsing rules, as some people noticed
in #chicken the other day.  It's no wonder the web is so riddled with
security issues!
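
One common way to keep inline script well-formed in XML is to wrap it in a CDATA section. The sketch below is in Python rather than Scheme and uses the standard library's xml.etree.ElementTree, not SSAX; the script text is a hypothetical example, and whether Civet passes CDATA content through the way a template author would want is an assumption this thread does not confirm.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline script; the "<" is what an XML parser objects to.
SCRIPT_BODY = "if (count <= 10) { show(count); }"

# Bare script text: the "<" looks like the start of markup to an XML parser.
BARE = "<script>" + SCRIPT_BODY + "</script>"

# The same text inside a CDATA section is plain character data in XML.
WRAPPED = "<script><![CDATA[" + SCRIPT_BODY + "]]></script>"


def parse_or_none(text):
    """Return the parsed root element, or None if text is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(parse_or_none(BARE))          # bare script is rejected
    print(parse_or_none(WRAPPED).text)  # wrapped script survives intact
```

The wrapped form parses, and the element's text is the original script with the "<" preserved.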

PS: I replied using my sourceforge address; for some reason I thought
you were reporting this to the SSAX mailing list.  That would be the
correct place to report this bug; we just import the upstream code, as-is.

Cheers,
Peter


signature.asc
Description: Digital signature
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] Cryptic SSAX error message

2015-03-19 Thread Peter Bex
On Thu, Mar 19, 2015 at 02:26:06PM +0100, Jörg F. Wittenberger wrote:
 On Mar 19 2015, Peter Bex wrote:
 
 there suck in a variety of different ways), so I'm sure a CMS in Scheme
 is going to make the lives of people like my former self more pleasant.
 
 2 questions:
 
 * What features need to be present for something to count as a CMS?

That's a very tough question, as many things are counted as a CMS.
Looking back at the jobs I've had before, the CMS allowed
non-programmers to build a complete site from scratch, writing only HTML
and CSS.  On the other hand, that's perhaps not the defining feature;
rather, it's allowing the users to enter content which is then presented
on the front-end.  But almost any web-based system could then be termed
a CMS...  I suppose a CMS is more flexible, allowing users to do a lot
without having to request custom-built new features just to add a new
page to the site's menu, for example.

I'd use Drupal as a benchmark because I know it well, and it goes very
far in allowing non-programmers to define custom content types and install
modules that allow them to define almost everything.  There's a query
builder that allows you to define custom reports/overview pages and so
on.  Unfortunately it's also dog slow because of this modularity and
extensibility.  There's a cost to everything...

 * How many CMS in Scheme are there already?

None that I know of, but I never looked too hard.

Cheers,
Peter




Re: [Chicken-users] Cryptic SSAX error message

2015-03-18 Thread Matt Gushee
Hi, Peter--

On Tue, Mar 17, 2015 at 2:13 AM, Peter Bex airh...@users.sourceforge.net
wrote:

 On Mon, Mar 16, 2015 at 09:27:36PM -0600, Matt Gushee wrote:
  I was building a new blog with Coq au vin, which uses Civet to process
  templates, which in turn uses SSAX ... and one of my XHTML templates
  caused [an] error.

 [error elided]

  Now that's a helpful error message. It turns out the problem was the
  inline JavaScript in my template (which contained the = operator). Since
  I was using the XHTML Transitional doctype, that's allowed per W3C specs,
  and it simply hadn't occurred to me that it was likely to result in
  non-well-formed XML.

 You shouldn't parse HTML with an XML parser.


Not in general, no. But wouldn't you agree that, regardless of what is
wrong with the input file and why it is wrong, it would be good if SSAX
output something that would actually be useful in troubleshooting? That was
my main point. And of course, as I mentioned, I'm well aware that desirable
!= doable, but I didn't (and don't) know if this is a known issue, so I
thought I should say something.
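
To illustrate the kind of output that would help with troubleshooting, here is a minimal sketch in Python (not Scheme, and not SSAX) that turns a parse failure into a one-line diagnostic with file name, line, column and the offending source line. The function name describe_xml_error and the file name blog.xhtml are made up for the example; it says nothing about what SSAX itself can or cannot report.

```python
import xml.etree.ElementTree as ET


def describe_xml_error(text, source="<template>"):
    """Return None if text is well-formed XML, else a one-line diagnostic."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple; lines count from 1.
        line, column = err.position
        lines = text.splitlines()
        context = lines[line - 1].strip() if 0 < line <= len(lines) else ""
        return f"{source}:{line}:{column}: {err} | near: {context}"


if __name__ == "__main__":
    # Hypothetical template with a bare "<" inside an inline script.
    template = "<html>\n<script>\nif (a <= b) {}\n</script>\n</html>"
    print(describe_xml_error(template, "blog.xhtml"))
```

A message of this shape points straight at the line that needs fixing, which is all the troubleshooting really requires.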


 Since you're using CHICKEN,
 you could try the html-parser CHICKEN egg, which is more permissive.


But that's not the goal. Perhaps you recall this discussion from 2 years
ago?

 [Matt]
 Finally, an idea has occurred to me. What about a templating system where
 what actually gets used at runtime is SXML, but designers could create
 templates in XHTML, then when they are satisfied with the design, use a
 preprocessing tool to convert them to SXML? That would at least ensure
 well-formed markup.

 [Peter]
 Yep, that would be good.  Representation and surface syntax don't
 necessarily need to be equivalent, though the Lisper in me disagrees
 about that being a good idea :)
REF:
http://lists.nongnu.org/archive/html/chicken-users/2013-03/msg00058.html

So Civet is the templating system I created pursuant to that conversation.
The templates are supposed to be well-formed XML (in practice, mainly
XHTML), and presumably created by a developer who knows what they're doing
- though the current issue may call that into question ;-).
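
The preprocessing idea (well-formed XHTML in, SXML out) can be sketched roughly as follows. This is an illustration in Python using the standard library's ElementTree, not Civet's or SSAX's actual implementation; the helper names are made up, and the sketch simplifies by ignoring namespaces and trimming whitespace around text nodes.

```python
import xml.etree.ElementTree as ET


def quote(text):
    """Render text as a Scheme string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_sxml(element):
    """Render an ElementTree element as an SXML-style s-expression string."""
    parts = [element.tag]
    if element.attrib:
        attrs = " ".join(f"({name} {quote(value)})"
                         for name, value in element.attrib.items())
        parts.append(f"(@ {attrs})")
    if element.text and element.text.strip():
        parts.append(quote(element.text.strip()))
    for child in element:
        parts.append(to_sxml(child))
        # Text following a child element belongs to the parent's content.
        if child.tail and child.tail.strip():
            parts.append(quote(child.tail.strip()))
    return "(" + " ".join(parts) + ")"


def xhtml_to_sxml(text):
    """Parse well-formed markup and convert it; raises ET.ParseError otherwise."""
    return to_sxml(ET.fromstring(text))


if __name__ == "__main__":
    print(xhtml_to_sxml('<div class="post"><h1>Title</h1></div>'))
```

Because the conversion starts from a strict XML parse, a template that is not well-formed fails at preprocessing time instead of at runtime.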

I certainly don't believe my approach is ideal from a purely technical
standpoint. But given that the meta-goal of my projects is to use Scheme
to create web development tools that might be used by people who don't know
Scheme (as opposed to use Scheme to develop websites), I think it's
about as good a compromise as can be expected. If I were creating Civet
today, I think I would look for a different approach - but mainly because
it is now clear (maybe it was in 2013 and I just didn't know it) that HTML5
(in non-XML syntax) is becoming dominant, and the never-popular XHTML is
dying, if not dead. But I still stand by the fundamental reasoning that led
to Civet as it is (and BTW, it works pretty well within its limitations -
you should try it ;-)

 I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,


I'm not 100% sure either, but if the W3C says it's XML, they most likely
mean it is completely well-formed. One thing I know is that it prohibits
inline CSS and JavaScript - and now I understand why.

--
Matt Gushee


Re: [Chicken-users] Cryptic SSAX error message

2015-03-17 Thread Peter Bex
On Mon, Mar 16, 2015 at 09:27:36PM -0600, Matt Gushee wrote:
 I was building a new blog with Coq au vin, which uses Civet to process
 templates, which in turn uses SSAX ... and one of my XHTML templates caused
 [an] error.

[error elided]

 Now that's a helpful error message. It turns out the problem was the inline
 JavaScript in my template (which contained the = operator). Since I was
 using the XHTML Transitional doctype, that's allowed per W3C specs, and it
 simply hadn't occurred to me that it was likely to result in
 non-well-formed XML.

You shouldn't parse HTML with an XML parser.  Since you're using CHICKEN,
you could try the html-parser CHICKEN egg, which is more permissive.
I *think* XHTML Strict is a proper XML application, but I'm not 100% sure,
so if you insist on strict error checking you could use the strict
doctype.  However, of course that won't make much of a difference with
regard to using an XML parser to parse it; you'd get the same error.
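
A small sketch of the difference, in Python rather than Scheme and with the standard library standing in for SSAX and the html-parser egg: a strict XML parser rejects a bare "<" inside inline script whatever the doctype says, while a permissive HTML parser treats script content as raw text. The template below is a hypothetical example, not the one from the original report.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical template: inline script containing a bare "<".
TEMPLATE = """<html>
  <head>
    <script type="text/javascript">
      if (count <= 10) { show(count); }
    </script>
  </head>
  <body><p>Hello</p></body>
</html>"""

STRICT_DOCTYPE = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
                  '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n')

# Same markup with a strict doctype in front; the doctype changes nothing.
STRICT = STRICT_DOCTYPE + TEMPLATE


def xml_accepts(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Permissive HTML parser that records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(text):
    """Feed text to the permissive parser and return the start tags found."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(xml_accepts(TEMPLATE), xml_accepts(STRICT))
    print(html_tags(TEMPLATE))
```

Both XML parses fail at the same character, while the HTML parser walks the whole document and still finds the body content.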

Cheers,
Peter

