Hi Fergus,

On Tue, Apr 07, 2009 at 05:06:23PM +0100, Fergus McMenemie wrote:
> >Thank you much Fergus,
> >
> >I was considering implementing a database which would hold a path name
> >and an MD5 sum of each file.
> Snap. That is close to what we did. However, due to our previous
> duff full text search engine we had to hold this information in
> a separate checksums file. Solr is much better at allowing you
> to add extra meta information as the document is being submitted
> for indexing.
> 
> curl http://localhost...update/extract 
>    -F "myfi...@file.pdf;ext.literal.id=file.pdg;ext.literal.chksum=XXXXX"

- Great idea, simpler and cleaner!

 
> >Then as a part of Solr indexing, one could check against the DB if a
> >file path exists, if Yes, then compare MD5 and only index if different.
> Using Solr you could hold the checksum and pathname as Solr fields,
> then rather than looking up a DB you would look up Solr. Having
> everything in the one place is better for consistency and quality. You
> could also dump all checksums and pathnames from Solr if/when you wanted
> to validate your folder structure and/or indexes.

- What kind of query could I use with Solr, to check for a specific
  filename/checksum and get an answer as close to "TRUE or FALSE" as possible?
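
  To make the question concrete, here is a rough Python sketch of what I
  have in mind. It assumes the field names "id" and "chksum" from your curl
  example, a select handler whose URL is passed in by the caller (I have not
  filled in a real host or path), and the JSON response writer (wt=json).
  The helper names are made up for illustration only. With rows=0 Solr
  should return no documents, just the numFound count, which is about as
  close to TRUE/FALSE as I can think of.

```python
import hashlib
import json
from urllib.parse import urlencode


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quote_term(value):
    """Wrap a value in double quotes so Solr treats it as one phrase."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_check_url(select_url, doc_id, checksum):
    """Build a query URL matching one id/checksum pair.

    select_url is an assumption: whatever URL the select handler
    lives at in your installation. "id" and "chksum" are the field
    names taken from the curl example in this thread.
    """
    query = "id:%s AND chksum:%s" % (quote_term(doc_id), quote_term(checksum))
    # rows=0 asks for the hit count only, no stored documents.
    params = urlencode({"q": query, "rows": 0, "wt": "json"})
    return select_url + "?" + params


def is_indexed(response_text):
    """True if the JSON response reports at least one matching document."""
    return json.loads(response_text)["response"]["numFound"] > 0
```

  The indexing script would call md5_of_file() on each file, fetch the URL
  from build_check_url(), and re-submit the file only when is_indexed()
  comes back False.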

Regards,
Veselin K
