RE: Indexing Urls pointing to same content

2006-01-23 Thread Gwyn Carwardine
From: Mario Alejandro M. [mailto:[EMAIL PROTECTED]] Sent: 23 January 2006 15:58 To: Otis Gospodnetic Cc: [email protected] Subject: Re: Indexing Urls pointing to same content I know Lucene is not a web indexer... maybe I explained this badly. I'm asking how to STORE the data, not how

Re: Indexing Urls pointing to same content

2006-01-23 Thread Mario Alejandro M.
I know Lucene is not a web indexer... maybe I explained this badly. I'm asking how to STORE the data, not how to locate it. If two files are the same (comparing MD5 hashes is my current approach), then I plan to STORE the content once, but it is necessary to add both locations. Example: c:\file1 Content: One c:\file2
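
A minimal sketch of the "store the content once, keep every location" idea described above. It is plain Python rather than Lucene code, and the names `DedupStore`, `add`, `content_by_digest`, and `locations_by_digest` are hypothetical, chosen only for illustration. It assumes that two files with the same MD5 digest hold the same content, and that c:\file2 has the same content as c:\file1 (the example in the message is cut off before saying so).

```python
import hashlib


class DedupStore:
    """Hypothetical store: one copy of each distinct content, many locations."""

    def __init__(self):
        # MD5 hex digest -> the content bytes, stored only once.
        self.content_by_digest = {}
        # MD5 hex digest -> list of locations (paths or URLs) with that content.
        self.locations_by_digest = {}

    def add(self, location, content):
        """Register a location; store the content only if it is new."""
        digest = hashlib.md5(content).hexdigest()
        if digest not in self.content_by_digest:
            self.content_by_digest[digest] = content
            self.locations_by_digest[digest] = []
        # Avoid listing the same location twice for one content.
        if location not in self.locations_by_digest[digest]:
            self.locations_by_digest[digest].append(location)
        return digest


store = DedupStore()
first = store.add(r"c:\file1", b"One")
# Assumption: c:\file2 holds the same bytes as c:\file1.
second = store.add(r"c:\file2", b"One")
print(first == second, store.locations_by_digest[first])
```

In an index, the same shape might correspond to a single stored document that carries one location value per duplicate, though the thread does not say whether that is the layout that was finally chosen.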

Re: Indexing Urls pointing to same content

2006-01-20 Thread Otis Gospodnetic
Fri 20 Jan 2006 05:27:01 PM EST Subject: Indexing Urls pointing to same content I found that the data I'm searching contains a lot of duplicated content. The only difference is that the URL changes, i.e., one says http://localhost/sample.html and the other http://localhost/sample2.html. However, sample1

Indexing Urls pointing to same content

2006-01-20 Thread Mario Alejandro M.
I found that the data I'm searching contains a lot of duplicated content. The only difference is that the URL changes, i.e., one says http://localhost/sample.html and the other http://localhost/sample2.html. However, sample1 and sample2 are different files; that is, there is no redirection or link