Automatically extracted Tomcat FAQs

Stefan Henß Tue, 08 Mar 2011 13:02:24 -0800

Hi everybody,

I'm currently doing research for my bachelor thesis on how to automatically extract FAQs from unstructured data.


For this I've built a system automatically performing the following:

- Load thousands of conversations from forums and mailing lists (ignoring the categories used there and not discriminating between sources).

- Build a new categorization based solely on the conversations' texts (by clustering).

- Pick the best-modelled categories as the basis for one FAQ each.

- For each question (the first entry in a thread), find the best reply among its answers.

- Select the most relevant and well-formatted question/answer pairs for each FAQ.
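
To make the steps above more concrete, here is a minimal Python sketch of two of them: grouping threads by text similarity and picking one reply per question. It is only an illustration under assumptions and not the system described here: the thread layout (a dict with "question" and "answers"), the function names, the TF-IDF weighting, the greedy single-pass clustering, and the similarity threshold are all hypothetical choices. Choosing the best-modelled categories and judging relevance or formatting are not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    n = len(texts)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF, so terms found in every text keep a small weight.
        vectors.append({
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_threads(threads, threshold=0.2):
    """Greedy single-pass clustering (an assumed, simplistic stand-in).

    Each thread joins the first cluster whose seed thread is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of thread indices.
    """
    texts = [" ".join([t["question"]] + t["answers"]) for t in threads]
    vectors = tfidf_vectors(texts)
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def best_reply(thread):
    """Return the answer most similar to the question, or None if no answers.

    Similarity to the question is an assumed proxy for answer quality.
    """
    if not thread["answers"]:
        return None
    vectors = tfidf_vectors([thread["question"]] + thread["answers"])
    scores = [cosine(vectors[0], answer) for answer in vectors[1:]]
    return thread["answers"][scores.index(max(scores))]
```

For example, calling cluster_threads(threads) on a list of such dicts returns groups of thread indices, and best_reply(threads[0]) returns one of that thread's answers. A real system would likely need a stronger clustering method and more signals than word overlap for ranking replies.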

For the evaluation, I'm interested in experts' perceptions of the results, e.g. whether the questions are relevant, correctly answered, etc. Also, since I'll be publishing a paper about the approach, I'd be happy if you could rate one or two questions (using the stars on the details pages) so that I have some statistics to present.



Here's the direct link to the Tomcat FAQs:
http://faqcluster.com/tomcat-apache-server

(There are some other interesting FAQs as well at http://faqcluster.com/)


Thanks for your help

Stefan

