Thanks for the ideas, Gilles, but unfortunately they won't work since there
is no set 'primary location'. Maybe the best way to explain is with a
filesystem example.
Say I have:
/usr/bin/somefile
/usr/local/bin/somefile which is a link to: /usr/bin/somefile
Joe does:
find /usr -name somefile
This will turn up two matches which are really duplicates since they're
the same file. This is what I want to avoid. I don't care which match it
returns, but it should only return one. If Joe does:
find /usr/local -name somefile
the problem goes away naturally because find restricts itself to
/usr/local and its sub-tree. This is what I can currently do with
htsearch.
The problem is that if find were told to ignore linked files (or articles
that aren't in their 'primary location'), then:
find /usr/local -name somefile
wouldn't return any matches, which would be bad.
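
To make the behaviour I'm after concrete, here is a rough Python sketch of
the idea: restrict matches to a subtree first, then keep only one URL per
article id. It is only an illustration and not something htsearch does as
far as I know. The function name dedupe_by_article_id and the loc_prefix
parameter are made up for this example, and the id=12 URL is hypothetical
test data.

```python
from urllib.parse import urlparse, parse_qs


def dedupe_by_article_id(urls, loc_prefix=None):
    """Return at most one URL per article id, optionally within a subtree.

    Hypothetical helper: assumes URLs of the form
    /articles/article.php3?id=x&loc=a.b.c.d
    """
    seen_ids = set()
    kept = []
    for url in urls:
        query = parse_qs(urlparse(url).query)
        article_id = query.get("id", [None])[0]
        loc = query.get("loc", [""])[0]

        # Restrict to the requested subtree, matching whole components
        # so that prefix "1.3" does not accidentally match "1.30.2".
        if loc_prefix is not None:
            if loc != loc_prefix and not loc.startswith(loc_prefix + "."):
                continue

        # URLs without an id cannot be duplicates of an article; keep them.
        if article_id is None:
            kept.append(url)
            continue

        # Keep whichever location of an article is seen first.
        if article_id in seen_ids:
            continue
        seen_ids.add(article_id)
        kept.append(url)
    return kept


urls = [
    "/articles/article.php3?id=11&loc=1.3.4.10",
    "/articles/article.php3?id=11&loc=1.3.5.7",
    "/articles/article.php3?id=12&loc=1.3.5.8",  # hypothetical second article
]

whole_tree = dedupe_by_article_id(urls)
subtree = dedupe_by_article_id(urls, loc_prefix="1.3.5")
```

Searching the whole tree returns article 11 once, and searching only under
1.3.5 still returns it, because the restriction is applied before the
duplicates are dropped rather than by hiding every non-'primary' location.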
--
Aaron
On Fri, 18 Jun 1999, Gilles Detillieux wrote:
>
> According to Aaron Turner:
> > On a similar note, I'm having a major dilemma. Basically I have a SQL DB
> > with content that is accessed via PHP. Each "article" in the DB has a URL
> > like:
> >
> > /articles/article.php3?id=x&loc=a.b.c.d
> >
> > where x, a, b, c, d are positive integers. Basically the id is a unique
> > identifier for the article, and loc is the location in the 'tree'. Each
> > article can be in 1 or more places in the tree. So:
> >
> > /articles/article.php3?id=11&loc=1.3.4.10
> > /articles/article.php3?id=11&loc=1.3.5.7
>
> Here are a couple more ideas. If you can produce a list of locations that
> you want to be excluded from searches, you can add them to the list in the
> exclude_urls attribute, or put them as disallow records in robots.txt.
>
> Alternatively, you could change the article.php3 script to add a noindex
> tag to its output for any article that's not at its "primary" location,
> i.e. the one where you want it to be for search results.