At 01:41 AM 03/26/02 +0800, Stas Bekman wrote:
>> Another issue of the search I'd like to address is that the packet of
>> results that comes back is a little problematic: the mixture of code
>> which cannot be formatted and text is a mess, and you're easily turned
>> off by it. I suggest that we make it search through the code sections
>> (as it's still pretty important), but we don't show the code listing on
>> the results page.
I guess it would be helpful to know what you searched for to see what you are seeing.
The bit of content returned with each search result is just to help decide if that's the result one might be interested in. I never considered that it had to be formatted correctly, and in fact for something like perl code that would take up too much space.
Well, try searching for "registry" as an example, something I know many people would try to search for.
What you get.. is.. well, hard to get a grip of. I've been put off by similar results on other sites, and it just didn't make me want to search there.
I know we can't correctly format it in the results, but there must be some other possibility.
>I suggest the following solution: if you meet a <pre> tag and no closing
></pre> tag we add it.
>Do you think this is possible Bill? Assuming that all the HTML is proper
>(no text without enclosing <p>),
>we can always tell which text is not HTML (i.e. <pre>) am I right?
It's not that easy. Swish is what is storing the content. It's being parsed by libxml2 and it's just storing the text, not any of the tags. It's also converting \n into white space, so any formatting would be lost anyway.
For HTML in general, it's a fun task to add highlighting code around a group of words -- and still keep the HTML valid.
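To illustrate why this is fiddly, here is a rough sketch in Python (not what Swish does, and not code from the site) that wraps search words in <b> tags only inside text nodes, so tags and attribute values are left alone and the markup stays balanced. The names Highlighter and highlight are made up for this example, and it assumes a non-empty word list; comments and doctype declarations are simply dropped, and <script>/<style> content is not handled specially.

```python
import html
import re
from html.parser import HTMLParser


class Highlighter(HTMLParser):
    """Re-emit HTML, wrapping matched words in <b> inside text nodes only."""

    def __init__(self, words):
        super().__init__(convert_charrefs=True)
        alternatives = "|".join(re.escape(w) for w in words)
        self.pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Copy the tag exactly as written, so attributes are never touched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text arrives unescaped, so it is re-escaped on the way out.
        pos = 0
        for m in self.pattern.finditer(data):
            self.out.append(html.escape(data[pos:m.start()], quote=False))
            self.out.append("<b>" + html.escape(m.group(0), quote=False) + "</b>")
            pos = m.end()
        self.out.append(html.escape(data[pos:], quote=False))


def highlight(source, words):
    parser = Highlighter(words)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Even this small version has to re-escape text and copy tags verbatim. It also cannot highlight a phrase that spans two elements, which is where the real difficulty starts.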
Yes, I can understand it's very hard.
What I can suggest: as we generate our HTML from POD files, knowing what is code, could there maybe be some possibility of putting some <div> tags around the <pre> ones, and then patch Swish in some way to get it to treat those parts as searchable but not displayable? If I understood it right, it's already using some <div> tags to know what to index, so maybe it would be possible to make it a little more advanced?
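A minimal sketch of the "searchable but not displayable" idea, in Python just for illustration. It assumes the filtering happens on the generated HTML before indexing; whether Swish can be taught to store two such fields is an open question here. The names SplitIndexText and split_text are hypothetical. All text goes into the searchable string, but text inside <pre> is kept out of the string used for the result excerpt.

```python
from html.parser import HTMLParser


class SplitIndexText(HTMLParser):
    """Collect all text for indexing, but keep <pre> text out of the excerpt."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0
        self.searchable = []
        self.displayable = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth > 0:
            self.pre_depth -= 1

    def handle_data(self, data):
        # Everything is searchable; only text outside <pre> is displayable.
        self.searchable.append(data)
        if self.pre_depth == 0:
            self.displayable.append(data)


def collapse(parts):
    # Newlines and runs of spaces become single spaces, as in the stored text.
    return " ".join(" ".join(parts).split())


def split_text(source):
    parser = SplitIndexText()
    parser.feed(source)
    parser.close()
    return collapse(parser.searchable), collapse(parser.displayable)
```

Calling split_text on a page gives back a pair: the first string would feed the index, the second the excerpt shown with each hit. A wrapping <div> around code sections could be detected the same way by checking its class in handle_starttag instead of the tag name.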
Another possibility: I know it's not optimal, but maybe the search results should only display descriptions of the page in question?
This brings out another issue: if the pages were more split out (the guide pages are veeery long), maybe we could get more concise results and descriptions matching more closely.
Just some thoughts.
-- Per Einar Ellefsen [EMAIL PROTECTED]
