Gilles Detillieux <[EMAIL PROTECTED]> writes:
> > > Yes, if someone could add a good, efficient and reliable XML 
> > > parser to htdig, that would certainly be the way to go.
> 
> I think when it comes to ht://Dig development, we need to respect 
> the past as much as we look to the future.  All the new developments 
> are great, but we've been burned a few times before when we adhered 
> too closely to new standards and broke support for older, 
> non-standard (or loose standard) documents.

If you go the XML route, one of the first things you'll have to decide 
is whether to follow the XML philosophy that says that anything that 
doesn't strictly conform to the XML spec isn't XML, or whether to make 
your parser generous in what it accepts. I'd assume you'd probably 
want to do the latter.

(The point behind the XML philosophy is to force content developers 
into creating strictly conformant documents if they want them to 
display, which then allows making a simpler parser. In this case, I 
don't think the search engine is an appropriate place to apply 
pressure on content developers. And maintaining backwards 
compatibility with existing HTML will likely require a forgiving 
parser. This may make it a bit harder to find a suitable "off the 
shelf" XML parser, though the parser in Mozilla must deal with the 
same issues when it parses HTML.)
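
To make the strict-versus-generous distinction concrete, here is a minimal sketch in Python. It is only an illustration, not anything htdig does: the sample markup and the names SLOPPY, parses_as_strict_xml, TextCollector and index_words are made up for the example, and Python's standard xml.etree and html.parser modules stand in for whatever parser htdig would actually use.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample of loosely written legacy HTML: unclosed <p> and <br>
# elements and an unquoted attribute value.
SLOPPY = "<html><body><p>First paragraph<br><p align=center>Second</body></html>"


def parses_as_strict_xml(text):
    """Return True only if the text is well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # A conforming XML parser must reject input that is not well-formed.
        return False


class TextCollector(HTMLParser):
    """Generous parser: collects words and tolerates unbalanced tags."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        self.words.extend(data.split())


def index_words(text):
    """Return the words a forgiving parser can recover from the text."""
    collector = TextCollector()
    collector.feed(text)
    collector.close()
    return collector.words
```

With the strict parser the sloppy page yields nothing to index, while the generous one still recovers its words, which is the behavior a search engine indexing existing HTML would presumably want.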

> Meta tags may not have much of a future, but there's no denying how 
> heavily they're used today, nor how many of today's documents will 
> still need to be parsed and indexed in the future.  That means we 
> can't drop meta tag support, and if enough people still want it, it 
> may still be a good idea to extend that support to include Dublin 
> Core.

I'm not sure how meta tag support fits into this thread. I would 
assume that if you did go the XML route, it would just be a matter of 
constructing the right DTD to allow the XML parser to recognize the 
legacy meta tags.
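
As a rough sketch of what recognizing legacy meta tags involves, the following uses Python's forgiving html.parser instead of an XML parser with a DTD. That substitution is an assumption made for illustration, and the names MetaCollector and collect_meta and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags, however they are written."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        fields = dict(attrs)
        name = fields.get("name")
        content = fields.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def collect_meta(text):
    """Return a dict of meta tag names (lowercased) to their content values."""
    collector = MetaCollector()
    collector.feed(text)
    collector.close()
    return collector.meta


# Hypothetical legacy markup: upper-case tags, no closing slash, and an
# unquoted attribute value.
SAMPLE = (
    '<HEAD><META NAME="Keywords" CONTENT="search, indexing">'
    '<meta name=description content="A test page"></HEAD>'
)
```

Calling collect_meta(SAMPLE) returns both the keywords and the description entries, even though neither tag would pass a strict XML parser.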

What's Dublin Core?

 -Tom

-- 
Tom Metro
Venture Logic                                     [EMAIL PROTECTED]
Newton, MA, USA


------------------------------------
To unsubscribe from the htdig3-dev mailing list, send a message to
[EMAIL PROTECTED] 
You will receive a message to confirm this. 
