Edward Rayl wrote:
> I just tested 0.8 with my regular new sites, and the following sites
> have parsing errors:
>
> 1. http://www.fool.com/partners/avantgo/index.htm depth=2
>
> News items have partial URI garbage at the top of the text

I just downloaded the site. I see URI garbage only on this page:

http://www.fool.com/partners/avantgo/news/take/2002/mft/mft02121204.htm

I also see the garbage in IE and Mozilla, so I think it's just this
particular page.

> 2. http://wireless.metacritic.com/avantgo/ depth=3
>
> The actual film reviews (level 3) have a graphic and lots of
> space, but no review text. After that, pressing the back-arrow
> key produces documents that are blank as well, even though they
> were fine before.

This is a bug in font color handling. There is a redundant
<font color="#ffffff"> at the top that throws the parser off: the color
is not reset to black, which it should be. I think I just fixed this.
MetaCritic now renders well, but I'll have to test it some more.

The workaround for now is to set "Use Text Color" to "no"; then you will
be able to see the text. In fact I recommend setting this flag to "no" by
default, because you cannot change the background color in the Viewer. I
received a couple of reports about missing text, and it mostly turned out
to be text that was colored white, which makes it invisible against the
default Viewer background. Colored text is also harder to read on a
grayscale device, which was the original reason I put this option in.

> 3. http://www.industryweek.com/avantgo/ depth=2
>
> The document text is missing.

This page is missing a <body> tag, so the parser synthesizes one in the
event stream. However, synthesized tags appear in uppercase, and the class
that creates the Plucker document expects a lowercase tag name. I just put
in a workaround for this.

> Also, now that it appears that we have table support, are you
> considering this code to JPluck X?

Definitely. I still have to take a look at the specs that Chris sent me,
but judging from a cursory glance it shouldn't be too hard to implement.

> I use Jason Day's SlashPluck. Any chance you will do a JPluck X
> equivalent, or allow it to run as an external program?

I'll have to look into this.
I don't read Slashdot myself. What advantage does SlashPluck have over
http://slashdot.org/palm/? Anyway, JPluck will support screen-scraping and
reformatting of HTML (which is what SlashPluck seems to do) through XSL
stylesheet transformation. I have this working, but you have to edit the
JXL file manually, as this isn't available in the GUI yet.

JPluck 0.8.1 will be out later this week. It should have the fixes for the
two issues you're having, as well as some other enhancements.

Thanks
-Laurens
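
For illustration, the two parser fixes described in this reply (resetting
the font color, and matching the synthesized uppercase BODY tag) could look
roughly like the Java sketch below. This is not JPluck's actual code: the
class TagEventHandler, its methods, and the choice to reset the color when
the body starts are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;

// Hypothetical sketch of the two parser fixes; none of these names come from JPluck.
final class TagEventHandler {
    static final int BLACK = 0x000000;            // assumed default text color
    private final Deque<Integer> colors = new ArrayDeque<>();
    private boolean inBody = false;

    /** Lowercases a tag name so that a synthesized "BODY" matches "body". */
    static String normalize(String tagName) {
        return tagName == null ? "" : tagName.trim().toLowerCase(Locale.ROOT);
    }

    /** Handles a start tag; colorAttr is null when the tag has no color attribute. */
    void startTag(String tagName, String colorAttr) {
        String tag = normalize(tagName);
        if (tag.equals("body")) {
            inBody = true;
            colors.clear();                       // assumption: each body starts in black
        } else if (tag.equals("font")) {
            // A missing or malformed color inherits the enclosing color.
            colors.push(parseColor(colorAttr, currentColor()));
        }
    }

    /** Handles an end tag; an unbalanced </font> is ignored. */
    void endTag(String tagName) {
        String tag = normalize(tagName);
        if (tag.equals("font")) {
            if (!colors.isEmpty()) {
                colors.pop();                     // restore the enclosing color
            }
        } else if (tag.equals("body")) {
            inBody = false;
        }
    }

    boolean inBody() {
        return inBody;
    }

    /** Color for the next text run; black when no font element is open. */
    int currentColor() {
        return colors.isEmpty() ? BLACK : colors.peek();
    }

    private static int parseColor(String value, int fallback) {
        if (value == null || !value.matches("#[0-9a-fA-F]{6}")) {
            return fallback;
        }
        return Integer.parseInt(value.substring(1), 16);
    }
}
```

With this approach a stray <font color="#ffffff"> before the body cannot
leave later text white, because the color stack is cleared when the body
(real or synthesized) begins, and "BODY" and "body" are treated alike.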

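
To give an idea of what screen-scraping through an XSL stylesheet means,
here is a small Java sketch using the standard javax.xml.transform API. It
is only an illustration: the StoryScraper class, the stylesheet, and the
"story" class name are made up for the example, it says nothing about how
the JXL file is configured, and it assumes the input is already well-formed
XML (real-world HTML would have to be cleaned up first).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example; not part of JPluck.
final class StoryScraper {
    // Made-up stylesheet: keep only <div class="story"> blocks, one paragraph each.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<html><body>"
        + "<xsl:for-each select=\"//div[@class='story']\">"
        + "<p><xsl:value-of select='normalize-space(.)'/></p>"
        + "</xsl:for-each>"
        + "</body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to a well-formed XHTML string and returns the result. */
    static String transform(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }
}
```

Given a page with navigation and story blocks, only the story text survives
the transformation, which is the kind of reformatting a tool like SlashPluck
appears to do by other means.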