According to kimsg:
> Hi,
>
> I'm using the NT version of HTDig 3.1.2, and I am developing an external
> parser for HTDig under Windows NT.
> The external parser for the NT version is a Windows console application,
> but I have found that the interaction between the external parser and
> HTDig is not simple. So I have to modify ExternalParser.cc.
>
> My proposal and question:
>
> 1. How about changing the parse logic of ExternalParser.cc to go through
>    Plaintext.cc?
>    - Get the external parser from htdig.conf
>    - Execute this program and get a temp text file.
>    - Go to the Plaintext parser.

Plaintext has restrictions which IMO rule out your method of working with
an external parser. An external document may contain hyperlinks to other
documents, including HTML docs. Plain text has none of these, nor does it
have headings or titles. So with your approach a lot of information could
get lost, either because it is never reached when digging (i.e. hyperlinks
cannot be followed) or because search results get worse when titles and
headers are unavailable.

I think that the interaction between ht://Dig and an external parser is
fairly easy and straightforward, btw., so what is your problem? If you're
able to convert a document into plain text, then about 95% of writing the
external parser is already done ;-)

>
> 2. How are excerpts displayed in the case of an external document?

I haven't worked much with external docs, but AFAIK all excerpts are taken
from the doc.db, so nobody has to care about that once a doc has been
indexed.

regards,
Torsten

--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                   Tel: +49-4101-403605
D-25474 Ellerbek                   Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED]          Internet: http://www.inwise.de
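
As a rough illustration of the "95% done" point, here is a minimal sketch of the remaining step: turning already-extracted plain text into records that ht://Dig reads from the external parser's standard output. It assumes the tab-separated record format given in the external_parsers documentation (t = title, w = word with a location of 0 to 1000 and a heading level, u = URL with description, h = excerpt text) and the argument order infile, content-type, URL, config file. Check both against the docs for your version. The function names and the URL regex are hypothetical, not part of ht://Dig.

```python
import re
import sys

# Hypothetical helper names; the record layout is an assumption taken from
# the ht://Dig external_parsers documentation and should be verified.
URL_RE = re.compile(r"https?://[^\s<>\"]+")
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def records_from_text(text, title=""):
    """Build tab-separated output records from plain text."""
    records = []
    if title:
        records.append("t\t" + title)

    # Emit links so the digger can follow them; this is exactly the
    # information a pure Plaintext pass would not provide.
    for url in URL_RE.findall(text):
        records.append("u\t%s\t%s" % (url, url))

    # One record per word: word, relative location (0..1000), heading level 0.
    words = WORD_RE.findall(text)
    total = len(words)
    for i, word in enumerate(words):
        location = i * 1000 // total
        records.append("w\t%s\t%d\t0" % (word.lower(), location))

    # Short excerpt text with whitespace collapsed.
    excerpt = " ".join(text.split())[:80]
    if excerpt:
        records.append("h\t" + excerpt)
    return records


def main(argv):
    # Assumed arguments: infile, content-type, URL, config file.
    with open(argv[1], encoding="latin-1") as handle:
        text = handle.read()
    for record in records_from_text(text):
        print(record)


if __name__ == "__main__" and len(sys.argv) >= 2:
    main(sys.argv)
```

In a real parser, the conversion of the source document into text (and the extraction of its title and links) would come before this step; the sketch only covers the output side.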
