Hello Cafe,

I have this HTML structure:
            <a href="...">Want this</a>
            <a href="...">And this</a>
         <th>Another caption</th>
          <th>Yet another caption</th>

I'd like to extract the texts of the A elements from the row whose header is "Caption", and have come up with this:

runX $ doc
    >>> (deep (hasName "tr")                   -- filter only TRs
    >>> withTraceLevel 5 traceTree             -- shows correct TR
    deep (
        hasName "th" >>>                       -- filter THs with specified text
        getChildren >>> hasText (=="Caption")
    )                                          -- inner deep
    >>> getChildren >>> hasName "td"           -- shouldn't here be only one TR?
    >>> getChildren
    >>> getName &&& (getChildren >>> getText)  -- list has TDs from all three TRs

I also tried `guards`, but I get the same result.

I know there are other packages that might solve this in another way, but I'd like to understand what is going on here.
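
For comparison, here is a minimal, self-contained sketch of the result I am after. It rests on assumptions: the markup above lost its tags, so `sampleHtml` is only a guess at the table shape (one TH per row followed by TDs that contain the links), and `linkTexts` is a hypothetical helper name. The point it illustrates is that an arrow placed after `>>>` receives the output of the previous arrow, so a `deep (hasName "th" >>> ...)` step moves the focus to the matched node, whereas `filterA` only tests its argument and passes the original input (here the TR) through unchanged.

```haskell
import Text.XML.HXT.Core

-- Assumed table shape; the real document may differ.
sampleHtml :: String
sampleHtml = concat
  [ "<table>"
  , "<tr><th>Caption</th>"
  , "<td><a href=\"#\">Want this</a></td>"
  , "<td><a href=\"#\">And this</a></td></tr>"
  , "<tr><th>Another caption</th><td><a href=\"#\">Not this</a></td></tr>"
  , "<tr><th>Yet another caption</th><td><a href=\"#\">Nor this</a></td></tr>"
  , "</table>"
  ]

-- Hypothetical helper: texts of the A elements in every row whose TH
-- contains a text node equal to the given caption.
linkTexts :: String -> IOSArrow XmlTree String
linkTexts caption =
  deep (hasName "tr")
    -- filterA runs the test but keeps the TR itself as the result
    >>> filterA (getChildren >>> hasName "th"
                 >>> getChildren >>> hasText (== caption))
    >>> getChildren >>> hasName "td"
    >>> getChildren >>> hasName "a"
    >>> getChildren >>> getText

main :: IO ()
main = do
  -- The sample is well-formed XML, so the plain XML parser is enough here.
  texts <- runX (readString [withValidate no] sampleHtml >>> linkTexts "Caption")
  print texts
```

With the sample above I would expect only the two link texts from the first row. Whether the same filter behaves this way on the real page depends on how that page is actually nested, which the sketch does not know.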



Haskell-Cafe mailing list
