On 10/22/05, Moshe Yudkowsky <[EMAIL PROTECTED]> wrote:
> I've got a raw html file that is being auto-converted -- decorated --
> by forrest.
> Although the conversion goes well for the initial sections, at one point
> the conversion stops, and the rest of the file does not appear. There
> are no error messages or warnings.
> I have validated the document using the W3C validator, and it passes
> whether I use it as 4.01 loose or XHTML strict. (The meta tags have to
> be modified, depending on the format, but the rest of the document is
> unchanged.)
> Problem 1: no conversion of XHTML strict
> If the document is XHTML strict, then forrest does not convert any of
> the body text whatsoever!
> Problem 2: partial conversion of HTML 4.01 text.
> The initial paragraphs convert with no problem. They look like this:
> <dl>
>   <dt>
>    DIALOGIC &amp; INTEL CORPORATION / 1996 - 2002<br/>
>    1996 - 2002: Speech Technology<br/>
>    Mission: Architect and Advocate for Speech Technologies.<br/>
>    <em>(Note: Dialogic was acquired by Intel in 1999.)</em><br/>
> </dt>
> <dd>
>   <ul>
>    <li>Guide technical development... </li>
>   </ul>
> </dd>
> </dl>
> etc.
> The paragraphs which do not convert look like this:
> <h2>SKILLS</h2>
>         <h4>Speech Recognition &amp; Speech Technology</h4>
>          <ul>
>           <li>Cross-industry knowledge...</li>
>          </ul>
> The only line that converts is the <h2> line, SKILLS, and the rest of
> the document is missing. I thought the "&amp;" in the <h4> might be
> throwing the system off, but I tried removing it, and that's not the problem.
> If anyone has any ideas on how to debug this, please let me know!

What if you try making the <h4> line <h3>? I could be off base (I'm
still getting to know Forrest), but there could be a problem with
parsing if you skip levels from <h2> to <h4>.
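
If you want to check the whole file for this before editing by hand,
here is a rough sketch in Python (standard library only) that reports any
heading that jumps more than one level below the previous one. This is
just my own helper, not anything Forrest provides; the names
HeadingLevelChecker and find_heading_skips are made up for illustration,
and the sample is the snippet from your message. Whether skipped levels
are actually what stops the conversion is still only a guess.

```python
from html.parser import HTMLParser


class HeadingLevelChecker(HTMLParser):
    """Records headings that skip a level, e.g. <h2> followed by <h4>."""

    def __init__(self):
        super().__init__()
        self.previous_level = 0  # 0 means no heading seen yet
        self.skips = []          # (line number, previous level, current level)

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H4" arrives as "h4".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if self.previous_level and level > self.previous_level + 1:
                line_number = self.getpos()[0]
                self.skips.append((line_number, self.previous_level, level))
            self.previous_level = level


def find_heading_skips(html_text):
    """Return a list of (line, previous_level, level) for skipped levels."""
    checker = HeadingLevelChecker()
    checker.feed(html_text)
    checker.close()
    return checker.skips


# Sample taken from the section that fails to convert.
sample = """<h2>SKILLS</h2>
<h4>Speech Recognition &amp; Speech Technology</h4>
<ul>
  <li>Cross-industry knowledge...</li>
</ul>"""

for line, prev, cur in find_heading_skips(sample):
    print(f"line {line}: <h{prev}> followed by <h{cur}>")
```

To run it against the real document, read the file into a string and
pass that to find_heading_skips() in place of the sample.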
