If you are going to end up either copying or moving all the data to Lucene
(and even if you hook Lucene up to the existing MySQL data, it will still
create its own copy of the data), you might really want to look at other
options:
*column oriented databases (analytical databases). If ope
My two cents is no, do not use Lucene as a primary datastore. Although
there are some datastores that look similar to Lucene and that define
themselves as primary datastores (the 'nosql' style datastores), I would
put Lucene beside the likes of RRD and other special-purpose
information stores th
Indexing with Lucene/Nutch on top of/instead of DB indexing for:
1) relevance scoring
2) alias searching (i.e. a large number of aliases, like first names)
3) highlighting
4) cross-datasource searching (multi DB, DB + XML files, etc).
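To make item 2 a bit more concrete, here is a minimal sketch of alias expansion in plain Java, with no Lucene dependency. The class name `NicknameExpander` and the tiny alias table are hypothetical, made up purely for illustration; a real table would be much larger and loaded from a file or database. The expanded set could then be fed into the index or OR-ed together at query time, whichever fits the setup.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class NicknameExpander {
    // Hypothetical alias table (illustration only): canonical name -> nicknames.
    private static final Map<String, List<String>> ALIASES = Map.of(
        "william", List.of("bill", "will", "billy"),
        "robert", List.of("bob", "rob", "bobby"));

    /** Returns the lower-cased name plus every name in any alias group it belongs to. */
    public static Set<String> expand(String name) {
        String key = name.toLowerCase(Locale.ROOT);
        Set<String> result = new LinkedHashSet<>();
        result.add(key);
        for (Map.Entry<String, List<String>> group : ALIASES.entrySet()) {
            // A match on either the canonical name or one of its nicknames
            // pulls in the whole group.
            if (group.getKey().equals(key) || group.getValue().contains(key)) {
                result.add(group.getKey());
                result.addAll(group.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Searching for "Bill" would also look for the other names in its group.
        System.out.println(expand("Bill"));
    }
}
```

A name that is not in the table simply comes back on its own, so the expansion is safe to apply to every search term.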
As for the best approach to externally index, I do not have any d
One side-note is various content management tools already handle a lot
of data extraction (POI/PDFBox/etc).
In the case of Jakarta Slide and Apache Jackrabbit, both use Lucene
under the covers to index this data.
Not sure if you want to take the approach of putting your documents as
'managed' und
Thank you for the link to the previous thread, lots of information there!
*Synonym use of nicknames - that sounds quite feasible. Do you
specifically mean the WordNet module in the Sandbox, or something
different?
> -Original Message-
> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
>
Hey all,
As you can tell by the subject, I am interested in 'name searching' and
'nearby name' searching. Scenarios include Genealogy and
Similar-Person-from-Different-Datasources matching. Assuming
java-based Lucene, and more than likely the Solr project.
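One possible reading of 'nearby name' is names within a small edit distance of each other (Smith/Smyth, Johnson/Jonson). That reading is an assumption on my part; phonetic matching would be another. A minimal plain-Java sketch of it, with the hypothetical class name `NameDistance` and a caller-chosen `maxEdits` threshold:

```java
import java.util.Locale;

public class NameDistance {
    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Swap rows so prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True when the two names differ by at most maxEdits single-character edits. */
    public static boolean isNearby(String a, String b, int maxEdits) {
        return levenshtein(a.toLowerCase(Locale.ROOT), b.toLowerCase(Locale.ROOT)) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(isNearby("Smith", "Smyth", 1));
    }
}
```

Comparing every pair of names this way does not scale to a large index, so treat it as a way to reason about thresholds rather than as a search strategy.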
*nickname: would it be feasible to create