Hi all,

I'm just getting started with Lucy. Installation went without a hitch and I've successfully worked my way through the tutorials. Congratulations on getting the project to this level of quality.

My main interest is indexing HTML documents for web sites. It seems that if I feed the HTML file contents to the Lucy indexer, all the markup (tags and attributes) ends up in the index and consequently comes back out in the highlighted excerpts.

Is it my responsibility to strip the tags out before passing the text to the indexer? Or is there a simple option I can enable somewhere to have this happen automatically?

Regards,
Grant
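
P.S. To make the question concrete, this is roughly the pre-processing I would expect to do myself if stripping the markup turns out to be my job. It is only a sketch in Python using the standard html.parser module and has nothing to do with Lucy's own API; the names TextExtractor and strip_html are ones I made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes only; tags, attributes, and script/style bodies are dropped."""

    SKIPPED = {"script", "style"}  # elements whose content is not readable text

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space so adjacent block elements do not run together,
        # then collapse whitespace. Note this also splits a word that is broken
        # up by inline markup, which is a simplification in this sketch.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The idea would be to pass the result of strip_html(page_source) to the indexer as the content field instead of the raw file contents, so that excerpts are built from plain text.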
