RE: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Darren Hartford
If you are going to end up either copying or moving all the data to Lucene (and even when you hook Lucene up to the existing MySQL data, it will still create its own copy of the data), you might really want to look at other options: *column oriented databases (analytical databases). If ope

RE: Lucene as a primary datastore

2010-01-20 Thread Darren Hartford
My two cents is no, do not use Lucene as a primary datastore. Although there are some datastores that look similar to Lucene and define themselves as primary datastores (the 'nosql' style datastores), I would put Lucene beside the likes of RRD and other specifically purposed information stores th

RE: Why indexing database is necessary? (RE: indexing database)

2008-03-04 Thread Darren Hartford
Indexing with Lucene/Nutch on top of/instead of DB indexing for: 1) relevance scoring 2) alias searching (i.e. a large number of aliases, like first names) 3) highlighting 4) cross-datasource searching (multi DB, DB + XML files, etc). As for the best approach to externally index, I do not have any d

RE: Solr newbe

2007-07-26 Thread Darren Hartford
One side note: various content management tools already handle a lot of data extraction (POI/PDFBox/etc). In the case of Jakarta Slide and Apache Jackrabbit, both use Lucene under the covers to index this data. Not sure if you want to take the approach of putting your documents as 'managed' und

RE: Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-07-02 Thread Darren Hartford
Thank you for the link to the previous thread, lot of information there! *Synonym use of nicknames - that sounds quite feasible. Do you specifically mean the WordNet module in the Sandbox, or something different? > -Original Message- > From: Grant Ingersoll [mailto:[EMAIL PROTECTED] >

Geneology, nicknames, levenstein, soundex/metaphone, etc

2007-06-29 Thread Darren Hartford
Hey all, As you can tell by the subject, I am interested in 'name searching' and 'nearby name' searching. Scenarios include Genealogy and Similar-Person-from-Different-Datasources matching. Assuming java-based lucene, and more than likely the Solr project. *nickname: would it be feasible to create
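
For readers skimming this thread, here is a rough, library-free sketch of two of the ideas named in the subject line: expanding a name into its nickname group (the "synonym" approach mentioned in the reply) and scoring 'nearby' names with Levenshtein edit distance. It is only an illustration under stated assumptions, not what the posters implemented: the class name NameMatch, the NICKNAMES table, and its entries are all hypothetical, and nothing here uses the Lucene or Solr APIs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class NameMatch {
    // Hypothetical nickname table: formal name -> known nicknames.
    // A real system would load this from a curated list.
    static final Map<String, Set<String>> NICKNAMES = new HashMap<>();
    static {
        NICKNAMES.put("william", new TreeSet<>(Arrays.asList("bill", "billy", "will")));
        NICKNAMES.put("robert", new TreeSet<>(Arrays.asList("bob", "bobby", "rob")));
    }

    // Expand a name to every name in the same nickname group, including itself.
    static Set<String> expand(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        Set<String> out = new TreeSet<>();
        out.add(key);
        for (Map.Entry<String, Set<String>> e : NICKNAMES.entrySet()) {
            if (e.getKey().equals(key) || e.getValue().contains(key)) {
                out.add(e.getKey());
                out.addAll(e.getValue());
            }
        }
        return out;
    }

    // Classic two-row dynamic-programming Levenshtein distance
    // (insert, delete, substitute each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(expand("Bill"));
        System.out.println(levenshtein("smith", "smyth"));
    }
}
```

In a search setup, the expanded set would typically be fed in as synonyms at index or query time, while the edit distance would rank or filter candidate names that are spelled slightly differently.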