On Thu, Mar 15, 2012 at 10:33 AM, Dave Fisher <dave2w...@comcast.net> wrote:
>
> On Mar 15, 2012, at 12:22 AM, Regina Henschel wrote:
>
>> Hi,
>>
>> Joe Schaefer wrote:
>>>> ________________________________
>>>> From: Regina Henschel<rb.hensc...@t-online.de>
>>>> To: ooo-dev@incubator.apache.org
>>>> Sent: Tuesday, March 13, 2012 5:31 PM
>>>> Subject: Re: Doctype of websites
>>>>
>>>> Hi Joe,
>>>>
>>>> Joe Schaefer wrote:
>>>>> Those de.openoffice.org pages should redirect
>>>>> to www.openoffice.org/de pages; if not, your
>>>>> DNS resolver is busted.
>>>>
>>>> I had indeed set de.openoffice.org to 192.9.163.104. Removing it makes
>>>> redirecting work.
>>>>
>>>> That means the pages at de.openoffice.org were the original ones,
>>>> but will be deleted in the near future. They were imported to
>>>> ooo-site.apache.org/de, and there they got a different doctype. Right?
>>>
>>>
>>>
>>> Well sort of. If you look at the actual document on the site
>>> you will probably find it contains an XHTML doctype even now.
>>> The thing is that the CMS build system as Dave has designed it
>>> will strip most of the header matter out of the file and replace
>>> it with a generic one supplied by a template.
>>>
>>>
>>>>
>>>>> If that's not the problem,
>>>>> then you need to refresh your pages, as they
>>>>> are identical on the server.
>>>>>
>>>>> As to why the doctype is different from the original
>>>>> document, that's probably due to the way Dave worked
>>>>> out the templates for the site.  If we need to scrape
>>>>> the doctype out of each individual page, that will require
>>>>> some Perl coding work, some templating work,
>>>>> and another sledgehammer-style commit, i.e. not something
>>>>> to be taken lightly.
>>>>
>>>> Our pages were XHTML, with all its differences from HTML, and we tried
>>>> to produce valid pages (including the W3C check button). It is not
>>>> impossible to change the pages, and it can be done bit by bit while
>>>> reviewing them. But the aim should be clear.
>>>
>>>
>>> Well I can't advise you how to proceed from here, only point out
>>> that there is some impedance mismatch between how your site builds
>>> work and what's actually in these documents.  The choice seems
>>> to be either to standardize all the documents on a common doctype
>>> or to have the Perl code pull the doctype out of the original document,
>>> if it exists, and pass it along to the template as an argument.
>>>
>>>
>>> You might even be better off just not supplying a doctype at all
>>> and letting the browser figure it out.  Up to you folks.
>>>
>>
>> If we want valid pages, a common doctype is needed, because the inserted
>> part has to be written so that it fits this doctype. For example, the
>> feather logo needs an <img .../> element in XHTML, but only <img ...> in
>> HTML. So I think we need to agree on one doctype.
>>
>> Is it possible to count how many of the pages actually have an XHTML
>> doctype? (I'm not familiar with the command line.)
>>
>> Kind regards
>> Regina
>>
>> P.S. The feather img element is missing its alt attribute.
>
> I have been looking into this. In general the skeleton is the non-compliant
> part and is what should be changed. However, many of the NLC sites are very
> much HTML.
>
> One more sledgehammer will happen ... but planning needs to be careful.
>

What if we went subdomain by subdomain and ran HTML Tidy on the
content to coerce it to a single doctype? Would that butcher things?
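
To get a feel for how much work that would be per subdomain, and to answer
Regina's question about how many pages actually carry an XHTML doctype,
something like the rough Python sketch below could tally the doctypes under
one directory. It is only an illustration under stated assumptions: the
function names (classify_doctype, count_doctypes) are made up for this
example, it assumes the pages are *.html or *.htm files in a local checkout,
and it treats any doctype declaration that mentions "xhtml" as XHTML.

```python
import re
from collections import Counter
from pathlib import Path

# Matches a <!DOCTYPE ...> declaration, case-insensitively.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def classify_doctype(text):
    """Return 'xhtml', 'html', or 'none' for one page's source text.

    Assumption: a doctype whose declaration mentions "xhtml" is XHTML;
    any other doctype is counted as plain HTML.
    """
    # A doctype has to appear near the top, so only look at the start.
    match = DOCTYPE_RE.search(text[:2048])
    if match is None:
        return "none"
    return "xhtml" if "xhtml" in match.group(1).lower() else "html"


def count_doctypes(root):
    """Tally doctype kinds for every .html/.htm file below root."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".html", ".htm"}:
            # errors="replace" keeps odd encodings from aborting the scan.
            text = path.read_text(encoding="utf-8", errors="replace")
            counts[classify_doctype(text)] += 1
    return counts
```

Calling count_doctypes() on the directory for one subdomain (for example a
local checkout of the de pages, wherever that lives) returns a Counter with
"xhtml", "html" and "none" entries, which should show which subdomains are
mostly one doctype already and which are mixed.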

-Rob

> Regards,
> Dave
>
>
