Hi, how do I index HTML documents properly? All the documents are HTML, some
containing characters encoded like ží ... Is there a character
filter for filtering these codes? Is there a way to strip the HTML tags out?
Does Solr weight the terms in a document based on where they appear? Words
in headers (H1, H2, ...) should presumably describe the document better
than words in paragraphs.

Thanks for help,

   Georg
