Thanks for the ideas, Gilles, but unfortunately they won't work, since
there is no set 'primary location'.  Maybe the best way to explain is with
a filesystem example.

Say I have:

/usr/bin/somefile
/usr/local/bin/somefile which is a link to: /usr/bin/somefile

Joe does:

find /usr -name somefile

This will turn up two matches which are really duplicates since they're
the same file.  This is what I want to avoid.  I don't care which match it
returns, but it should only return one.  If Joe does:

find /usr/local -name somefile

the problem goes away naturally because find restricts itself to
/usr/local and its sub-tree.  This is what I can currently do with
htsearch.

The problem is that if find were told to ignore linked files (or articles
that aren't in their 'primary location'), then:

find /usr/local -name somefile

wouldn't return any matches, which would be bad.
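
As a rough sketch of the behaviour described above (this is only an
illustration in Python, not anything find or htsearch provides, and the
name find_unique is made up for the example): duplicates are dropped by
comparing the identity of the underlying file, not by declaring one path
to be 'primary', so a search restricted to a sub-tree still returns its
one match.

```python
import os


def find_unique(root, name):
    """Yield one path per distinct underlying file called `name` under `root`.

    Hypothetical helper: two paths count as duplicates when they resolve to
    the same (device, inode) pair, so links to an already-seen file are
    skipped no matter which of them is encountered first.
    """
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for entry in filenames:
            if entry != name:
                continue
            path = os.path.join(dirpath, entry)
            try:
                st = os.stat(path)  # follows symlinks to the real file
            except OSError:
                continue  # dangling link or unreadable entry: skip it
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                yield path
```

With the layout above, find_unique("/usr", "somefile") would yield a single
path (whichever one is walked first), and find_unique("/usr/local",
"somefile") would still yield /usr/local/bin/somefile.  The article case
would presumably need the same idea keyed on the article id instead of an
inode.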

-- 
Aaron

On Fri, 18 Jun 1999, Gilles Detillieux wrote:

> 
> According to Aaron Turner:
> > On a similar note, I'm having a major dilemma.  Basically I have a SQL DB
> > with content that is accessed via PHP.  Each "article" in the DB has a URL
> > like:
> > 
> > /articles/article.php3?id=x&loc=a.b.c.d
> > 
> > where x, a, b, c, d are positive integers.  Basically the id is a unique
> > identifier for the article, and loc is the location in the 'tree'.  Each
> > article can be in 1 or more places in the tree.  So:
> > 
> > /articles/article.php3?id=11&loc=1.3.4.10
> > /articles/article.php3?id=11&loc=1.3.5.7
> 
> Here are a couple more ideas.  If you can produce a list of locations that
> you want to be excluded from searches, you can add them to the list in the
> exclude_urls attribute, or put them as disallow records in robots.txt.
>
> Alternatively, you could change the article.php3 script to add a noindex
> tag to its output for any article that's not at its "primary" location,
> i.e. the one where you want it to be for search results.
