Lucene vs. Database indexing (RE: Indexing and Searching from within a single Document)

2008-04-08 Thread Duan, Nick
I think this should be a new thread since it's a different problem. Based on your description, I don't see any compelling reason for you to use Lucene just for indexing purposes, since, as you indicated, you are not indexing text docs. Claiming that databases lack performance is not accurate and o

RE: Lucene 2.3.0 and NFS

2008-04-03 Thread Duan, Nick
Have you looked at Nutch or Hadoop? They are subprojects of Lucene, developed specifically to support large-scale, distributed indexing. Nutch is probably more mature, whereas Hadoop supports clustering out of the box... ND

RE: Using a thesaurus/onthology

2008-03-05 Thread Duan, Nick
Nutch has an ontology plugin based on Jena (http://wiki.apache.org/nutch/OntologyPlugin). I haven't used it. Just from looking at the source code, it seems to be just an OWL parser. So apparently it only works with sources defined in OWL format, not others such as RDF. I think you need to extend the sou

RE: Why indexing database is necessary? (RE: indexing database)

2008-03-04 Thread Duan, Nick
oking at an ETL tool that can be extended for this purpose (I've started writing a plugin for Pentaho, but got pulled off and haven't finished it -- and that was for Solr, not lucene/nutch). -D

Why indexing database is necessary? (RE: indexing database)

2008-03-04 Thread Duan, Nick
Could anyone provide any insight into why someone would use Nutch/Lucene or any other search engine to index relational databases? With use cases, if possible? Shouldn't the database's own indexing mechanism be used, since it is more efficient? If there is such a need to index the database conten