Title: Parsing search returns
Hi everyone,

First off, this is a great product. *All* of my troubles have turned out to be related to our new load-balancing architecture, not htdig, and when I've worked out all the kinks I'll post a workaround summary for others.

A couple questions I couldn't find the answers to in the archives:

1. Is there a way to strip out elements from the search returns? In our case, each <title> tag on the site includes the site name, so headers in the search returns look like this:

The Onion | Damn You, Hearst!
The Onion | I Miss My Old Sled
The Onion | Drop Dead, Every Last One of You!

Pretty silly, right? I'd like to parse that repeating element out, preferably without employing an auxiliary script.
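
To be concrete about what I'm after, here is a minimal Python sketch of the kind of post-processing I'd rather not bolt on. The names strip_site_prefix and SITE_PREFIX are made up for illustration and are not part of htdig; the sketch only assumes the prefix is always the literal string shown in the examples above.

```python
# Hypothetical post-processing step, not an htdig feature.
SITE_PREFIX = "The Onion | "  # the repeated site-name prefix seen in the results


def strip_site_prefix(title, prefix=SITE_PREFIX):
    """Return the title without the leading site-name prefix, if present."""
    if title.startswith(prefix):
        return title[len(prefix):]
    return title  # leave titles without the prefix untouched


titles = [
    "The Onion | Damn You, Hearst!",
    "The Onion | I Miss My Old Sled",
    "The Onion | Drop Dead, Every Last One of You!",
]
cleaned = [strip_site_prefix(t) for t in titles]
```

Ideally the same effect would come from a configuration attribute or a template change, so nothing like this has to sit between htsearch and the page.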

2. I see that there are configuration attributes translate_latin1, translate_amp, and translate_lt_gt, each of which can be set to false.

I thought translate_latin1: false might work, but I'm still getting the entity
&#151;

printing out on the page instead of the em-dash in search results. Is there a config attribute I'm missing?
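
For reference, 151 (0x97) is the Windows-1252 position of the em-dash rather than its Unicode code point, which may be why it isn't treated like an ordinary Latin-1 entity. The sketch below shows the conversion I'm expecting, using Python's standard html.unescape, which maps this reference to U+2014. The helper name decode_entities and the sample string are invented for illustration and say nothing about how htdig handles entities internally.

```python
import html


def decode_entities(text):
    """Decode HTML character references, including Windows-1252 style ones."""
    # html.unescape follows the HTML5 rules, which remap &#151; to U+2014.
    return html.unescape(text)


sample = "before&#151;after"  # made-up excerpt containing the problem entity
decoded = decode_entities(sample)
```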

Thanks!

--
Adam Powell
Web Programmer, The Onion
America's Finest News Source
[EMAIL PROTECTED] | voice: 608.256.1372 | fax: 608.256.2535
www.theonion.com
