documentation on Lucene
Hello folks,

I am new to the Lucene search engine. I have read about the power of Lucene in indexing and search. I just browsed through the site http://jakarta.apache.org/lucene looking for documentation on the Lucene classes, but I was unable to find the information I need (about the abstract classes and their implementations). Is there any documentation available on other sites? If so, please forward it to me.

Thanks and Regards
Suhas
--
Robosoft Technologies, Mangalore, India

--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
storing index in third party database.
Hi all,

I want to index data that I have already stored in a third-party database table and develop a search facility using Lucene. I am thinking of storing the index back in the database, in another table. I know that for this we have to create a 'directory' which does all the indexing operations, for example

IndexWriter indwriter = new IndexWriter(dirStore, null, create);

where dirStore is the directory and create is a boolean. But I don't know what format the directory (dirStore) has to follow. Please help me if anybody has done something similar.

TIA
Amith
Re: compiling lucene
JavaCC 2.1 works, too. This is how I have it set up:

[otis@linux2 otis]$ ls -al /usr/local/.version/javacc2.1/
total 44
drwxrwxr-x   6 otis otis   4096 Jan 28 06:50 .
drwxr-xr-x  20 otis otis   4096 Apr  2 23:32 ..
drwxrwxr-x   3 otis otis   4096 Jan 28 06:50 bin
-rw-rw-r--   1 otis otis   8518 Jan 28 06:50 COPYRIGHT
drwxrwxr-x   2 otis otis   4096 Jan 28 06:50 doc
drwxrwxr-x  21 otis otis   4096 Jan 28 06:50 examples
-rw-rw-r--   1 otis otis   5599 Jan 28 06:50 README
drwxrwxr-x   5 otis otis   4096 Jan 28 06:50 src

[otis@linux2 otis]$ ls -al ~/cvs-repositories/jakarta/jakarta-lucene/lib/
total 132
drwxrwxr-x   3 otis otis   4096 Jan 28 15:28 .
drwxrwxr-x   9 otis otis   4096 Mar 27 23:28 ..
drwxrwxr-x   2 otis otis   4096 Jan 28 15:29 CVS
lrwxrwxrwx   1 otis otis     36 Jan 28 06:55 JavaCC.zip -> /usr/local/javacc/bin/lib/JavaCC.zip
-rw-rw-r--   1 otis otis 117522 Jan 28 15:23 junit_37.jar

Otis

--- Victor Hadianto [EMAIL PROTECTED] wrote:
Hi list,
I'm having a problem compiling Lucene from scratch. I checked out Lucene 1.2 rc4 from CVS and I am missing one vital component: JavaCC 2.0. The latest JavaCC that I can get from WebGain is 2.1, and just dropping it into the lucene/lib directory does not work well. I had a look, and the class name expected by the Lucene build file is quite different from the one in JavaCC 2.1. Is there someplace where I can get a JavaCC 2.0 that works with Lucene?
Thanks,
--
Victor Hadianto
Re: storing index in third party database.
If you want to store indices in a database, search the mailing list archives for SqlDirectory. I once considered using it for an application at work, so I asked its author about performance. The answer, if I recall correctly, was that it doesn't perform all that well as the index grows. Consequently, we chose to use file-based indices instead.

Otis
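To make the 'directory' idea from this thread a little more concrete: the index store is essentially a flat collection of named files, so a database-backed version comes down to mapping a file name to a blob of bytes. The sketch below only illustrates that mapping. It is not the Lucene Directory API and not SqlDirectory; BlobStore and all of its method names are hypothetical, and a HashMap stands in for a table with (name, content) columns.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for a database table with columns (name, content).
// Not the Lucene Directory API; it only illustrates "named files as blobs".
class BlobStore {
    private final Map<String, byte[]> rows = new HashMap<>();

    // Roughly an INSERT or UPDATE: store a whole file under its name.
    void writeFile(String name, byte[] content) {
        rows.put(name, Arrays.copyOf(content, content.length));
    }

    // Roughly SELECT content WHERE name = ?; returns null if the file is missing.
    byte[] readFile(String name) {
        byte[] content = rows.get(name);
        return content == null ? null : Arrays.copyOf(content, content.length);
    }

    boolean fileExists(String name) {
        return rows.containsKey(name);
    }

    // Roughly a DELETE; returns true if a row was actually removed.
    boolean deleteFile(String name) {
        return rows.remove(name) != null;
    }

    // Names of all stored files, in sorted order.
    String[] list() {
        return new TreeSet<>(rows.keySet()).toArray(new String[0]);
    }
}
```

One thing the sketch makes visible is that every write replaces a whole blob, which may be part of why a database-backed store is reported above to slow down as the index grows; that connection is a guess, not something stated in the thread.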
Re: storing index in third party database.
Without having investigated the problem much, I would think that a SQL database would be a very bad match for Lucene, as most of Lucene's work is creating keys for words and documents and then creating indexes of these keys. For these purposes a SQL database is unnecessary overhead, not to mention the overhead of the SQL language parser.

For this kind of index a lower-level database would be better suited. I have had good experiences with BerkeleyDB (http://www.sleepycat.com), and a friend of mine uses gdbm successfully for such key-pair indexing tasks. The advantage of these low-level database systems is that they are more or less persistent B-tree/hashtable implementations, and thus built for key pairing. They have no SQL layer; you have to program against them directly, as they are more like subroutines than applications. But for key-pair indexes, my experience is that BerkeleyDB runs circles around any SQL database (including DB2 and Oracle!!!).

Berkeley has a Java API and a B-tree record type that could be a very good match for a key-based search tree, and it's free. Take a look at it!

Regards,
Karl Øie

(PS: I am not paid by the Sleepy Cat to write this :-)
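The key-pair access pattern described above can be sketched in plain Java. This is only an illustration under stated assumptions: TermStore and its methods are hypothetical names, TreeMap is a red-black tree rather than a B-tree, and nothing is persisted, so it shows the shape of the lookups and not BerkeleyDB's or gdbm's actual API.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory stand-in for a sorted key/value store. TreeMap is a red-black
// tree, not a B-tree, and nothing here is written to disk.
class TermStore {
    // Key: a term. Value: the sorted ids of the documents containing it.
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Record that the given term occurs in the given document.
    void add(String term, int docId) {
        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
    }

    // Exact key lookup: the documents containing the term (empty if none).
    SortedSet<Integer> docs(String term) {
        TreeSet<Integer> found = postings.get(term);
        if (found == null) {
            return Collections.<Integer>emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(found);
    }

    // Range scan over the sorted keys: every stored term starting with the prefix.
    SortedSet<String> termsWithPrefix(String prefix) {
        SortedMap<String, TreeSet<Integer>> range =
                postings.subMap(prefix, prefix + Character.MAX_VALUE);
        return new TreeSet<>(range.keySet());
    }
}
```

Both operations are plain key lookups or key-range scans, which is the kind of work a B-tree store handles without any query language in between.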
RE: Case Sensitivity
Hi,

I am using StandardAnalyzer; the problem was with wildcard queries being case sensitive. Even with StandardAnalyzer, you have to worry about case sensitivity in this case. Thanks for the tip on the example Analyzer, I will take a peek.

-----Original Message-----
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 03, 2002 1:40 PM
To: Lucene Users List
Subject: RE: Case Sensitivity

Alan, Aruna:

The built-in solution is to use LowerCaseFilter in your Analyzer. (The SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do this; see the Lucene API docs to see which filters each uses.) The FAQ includes an example implementation of an Analyzer if you want to build your own.

Joshua [EMAIL PROTECTED]
Per Obscurius...www.ics.uci.edu/~jmadden
Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.

On Wed, 3 Apr 2002, Aruna Raghavan wrote:

Hi,
I worked around the problem by converting everything to lowercase in my code prior to indexing into Lucene and also prior to searching for a string. Of course, I also had to use pattern matching to change boolean operators such as ANDs and ORs back to uppercase, because Lucene expects those to be uppercase.

-----Original Message-----
From: Alan Weissman [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 03, 2002 1:26 PM
To: Lucene Users List
Subject: Case Sensitivity

What can I do to configure Lucene to make it case insensitive?

Thanks,
Alan
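The workaround described in this thread (lowercase the query text, but leave the boolean operators in upper case) could look roughly like the sketch below. It is a variation rather than Aruna's actual code: instead of lowercasing everything and restoring the operators with pattern matching, it skips the operator tokens in the first place. QueryCase and lowerCaseTerms are hypothetical names, the operator set AND/OR/NOT is an assumption, and splitting on whitespace ignores quoted phrases, range queries and field names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; it does not call any Lucene class.
class QueryCase {
    // Tokens assumed to be query operators that must stay in upper case.
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // Lowercase every whitespace-separated token except the operators above.
    static String lowerCaseTerms(String query) {
        StringBuilder out = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(OPERATORS.contains(token)
                    ? token
                    : token.toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }
}
```

With this, a wildcard query typed as "Luc* AND Engine" is rewritten to "luc* AND engine" before it is handed to the query parser, so it lines up with terms that were lowercased at index time. An operator typed in lower case ("and") is left alone and would still be treated as an ordinary term.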
Re: Case Sensitivity
You can use the StandardAnalyzer. It lowercases all the words (it uses the LowerCaseFilter). Note that it also uses the stop word filter, so your results may vary. Also, when you index, be sure to use text instead of keyword as the field type, since a keyword field doesn't go through the filter.

--Peter

On 4/3/02 11:25 AM, Alan Weissman [EMAIL PROTECTED] wrote:
What can I do to configure Lucene to make it case insensitive?
Thanks,
Alan
Objects as search results
Here's a topic which, to my recollection (surprisingly), hasn't been brought up:

Assuming development in an object-oriented environment, it's a fair assumption that the eventual target of searching is an object. How are developers making this happen? Are all fields of the objects indexed and displayed accordingly? (This means that the Document essentially takes the place of the object for search results. Bad idea, IMHO.) Is there some way for the object to be instantiated, then populated? How are these objects then displayed as search results?

Here are some comments I have:

a) Documents shouldn't be used for displaying search results. To do so would be inflexible and would limit the type of data displayed as results to the fields in a document. This means that if you wish to display more information, more information has to be added to the document. This somewhat violates the purpose of the document, I think, which is to provide an abstraction of an atomic collection of searched/indexed fields. You may be able to get away with it for simple applications, but I don't think it's a good idea. Ideally, objects should be used to display the results, since that's what a result represents. I use Velocity, so this is easy for me. I retrieve the objects as a collection (somehow), and stuff them into the Context for rendering.

b) Different types of objects obviously have different types of metadata. How can the different fields for each object be displayed when the types of objects to be indexed aren't fixed? (I use fields and metadata interchangeably, so metadata is really a collection of fields of an object.)

c) I use Torque, so object instantiation and population is a pretty easy thing. I have no real solution for others who don't have some kind of O/R tool.

I have addressed these points to my satisfaction in a current app, but the solutions are terribly reliant on a specific combination (Torque and Velocity). I'm really interested to know how other developers have approached this.

Regards,
Kelvin Tan

Relevanz Pte Ltd
http://www.relevanz.com
180B Bencoolen St. The Bencoolen, #04-01 S(189648)
Tel: 6238 6229
Fax: 6337 4417
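One way to keep Documents out of the display layer, sketched here under assumptions, is to store only an object type and an identifier in each indexed document and to turn hits back into real objects through a per-type loader. Everything in the sketch is hypothetical: HitResolver, register and resolve are made-up names, each hit is represented as a plain {type, id} pair rather than a Lucene Document, and the loader could be backed by Torque or any other O/R tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical names throughout; no Lucene, Torque or Velocity calls are used.
class HitResolver {
    // One loader per object type, for example backed by an O/R tool or a DAO.
    private final Map<String, Function<String, Object>> loaders = new HashMap<>();

    void register(String type, Function<String, Object> loader) {
        loaders.put(type, loader);
    }

    // Each hit is a {type, id} pair, assumed to be read from two stored fields.
    List<Object> resolve(List<String[]> hits) {
        List<Object> objects = new ArrayList<>();
        for (String[] hit : hits) {
            Function<String, Object> loader = loaders.get(hit[0]);
            Object loaded = (loader == null) ? null : loader.apply(hit[1]);
            if (loaded != null) {
                objects.add(loaded); // skip unknown types and stale ids
            }
        }
        return objects;
    }
}
```

The resulting list keeps the order of the hits and can be handed to whatever renders the results, so adding a displayed attribute means changing the object or the template, not the index.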
Querying multiple fields of a index
Hi,

Is it possible to query multiple fields of a given index and get results based on the combined query? For example, I want to search for the word "lucene" in the title field and the word "engine" in the summary field, and get results based on both words. How can I achieve this?

TIA
Regards
Harpreet
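For what it's worth, Lucene's query syntax lets a term be qualified with a field name, so the search described above can be written as the query string title:lucene AND summary:engine. The plain-Java sketch below only illustrates what such a combined query means (intersecting per-field matches); it does not use the Lucene API, and FieldedIndex, add and searchAll are hypothetical names with a deliberately naive tokenizer.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy fielded index; an illustration of the semantics, not of Lucene itself.
class FieldedIndex {
    // Key: "field:term". Value: ids of documents with that term in that field.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index the text of one field of one document, lowercased and split on non-word characters.
    void add(int docId, String field, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Documents in which every "field:term" clause matches (the AND case).
    Set<Integer> searchAll(String... clauses) {
        Set<Integer> result = null;
        for (String clause : clauses) {
            Set<Integer> docs = postings.getOrDefault(clause, new TreeSet<Integer>());
            if (result == null) {
                result = new TreeSet<Integer>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<Integer>() : result;
    }
}
```

Calling searchAll("title:lucene", "summary:engine") returns only the documents that have "lucene" in the title and "engine" in the summary, which is the behaviour the question asks for.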
Re: compiling lucene
> JavaCC 2.1 works, too. This is how I have it set up:

Yes, to confirm: a list member pointed out earlier that I have to _install_ JavaCC first. Serves me right for not reading tfm.

Sorry for the noise.

--
Victor Hadianto
---
Every cloud engenders not a storm.
-- William Shakespeare, Henry VI