Re: Parsing issue
Sure... clean up your HTML and it'll parse fine :) Perhaps use JTidy to clean up the HTML, or switch to a more forgiving parser like NekoHTML.

Erik

On Jan 4, 2005, at 3:59 PM, Hetan Shah wrote:

> Hello All,
>
> Does anyone know how to handle the following parsing error? Thanks for any pointers or code snippets.
>
> -H
>
> While trying to parse an HTML file using IndexHTML I get:
>
> Parse Aborted: Encountered \ at line 8, column 1162.
> Was expecting one of: ArgName ... = ... TagEnd ...

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Parsing issue
Has anyone used NekoHTML? If so, how do I use it? Is it a standalone jar file that I include in my classpath and start using just like IndexHTML? Can someone share syntax and/or code if it is supposed to be used programmatically?

I am looking at http://www.apache.org/~andyc/neko/doc/html/ for more information. Is that the correct place to look?

Thanks,
-H
Re: Parsing issue
That's the correct place to look, and it includes code samples. Yes, it's a jar file that you add to the CLASSPATH and use ... hm, normally programmatically, yes :).

Otis
RE: Parsing issue
I use it and have yet to have a problem with it. It uses the Xerces API, so you parse and access HTML files just like XML files.

Very cool,
Chuck

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 04, 2005 2:05 PM
To: Lucene Users List
Subject: Re: Parsing issue

> That's the correct place to look and it includes code samples. Yes, it's a Jar file that you add to the CLASSPATH and use ... hm, normally programmatically, yes :).
>
> Otis
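
To make the "just like XML files" point concrete, here is a minimal sketch of pulling the text out of a parsed page by walking a standard org.w3c.dom tree. The class DomTextExtractor and its methods are hypothetical names invented for this illustration; they are not part of Lucene or NekoHTML. So that the sketch runs with nothing but the JDK, it uses the JDK's own XML parser as a stand-in, which only accepts well-formed markup. For real-world HTML the assumption is that you would get the Document from NekoHTML's org.cyberneko.html.parsers.DOMParser (parse, then getDocument()) and hand it to the same extractText method; check the NekoHTML docs for the exact usage.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of Lucene or NekoHTML.
final class DomTextExtractor {

    // Recursively appends the trimmed content of every non-blank text node
    // under the given node, separating the pieces with single spaces.
    static void collectText(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(text);
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    // Returns all text found below the document's root element.
    static String extractText(Document doc) {
        StringBuilder out = new StringBuilder();
        collectText(doc.getDocumentElement(), out);
        return out.toString();
    }

    // Stand-in parser: the JDK's XML parser, which rejects malformed HTML.
    // Assumption: with NekoHTML on the classpath, its DOMParser would
    // supply the Document instead of this method.
    static Document parseWellFormed(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(markup)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWellFormed(
            "<html><head><title>Hello</title></head>"
            + "<body><p>Some <b>bold</b> text</p></body></html>");
        System.out.println(extractText(doc));
    }
}
```

The extracted string is what you would put into the contents field of a Lucene document. Note that this walk does not skip script or style elements, so you would probably want to filter those out before indexing.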