documentation on Lucene

2002-04-03 Thread Suhas Indra

Hello folks,

I am new to the Lucene search engine. I have read about the power of Lucene
in indexing and search. I just browsed through the site
http://jakarta.apache.org/lucene
to find documentation on the Lucene classes, but I was unable to find the
information I need (about the abstract classes and their implementations).
Is there any documentation available on other sites? If so, please forward
it to me.

Thanks and Regards

Suhas


--
Robosoft Technologies, Mangalore, India



--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




storing index in third party database.

2002-04-03 Thread amithnz

Hi all

I want to index data that I have already stored in a third-party database table and
develop a search facility using Lucene. I am thinking of storing the index back in
the database, in another table. I know that for this we have to create a 'directory'
which does all the indexing operations,

for example

IndexWriter indwriter = new IndexWriter(dirStore, null, create);

where dirStore is the directory and create is a boolean.

But I don't know the format to be followed for the directory (dirStore). Please help
me if anybody has done something similar.
TIA
Amith






Re: compiling lucene

2002-04-03 Thread Otis Gospodnetic

JavaCC 2.1 works, too.
This is how I have it set up:

[otis@linux2 otis]$ ls -al /usr/local/.version/javacc2.1/
total 44
drwxrwxr-x    6 otis otis 4096 Jan 28 06:50 .
drwxr-xr-x   20 otis otis 4096 Apr  2 23:32 ..
drwxrwxr-x    3 otis otis 4096 Jan 28 06:50 bin
-rw-rw-r--    1 otis otis 8518 Jan 28 06:50 COPYRIGHT
drwxrwxr-x    2 otis otis 4096 Jan 28 06:50 doc
drwxrwxr-x   21 otis otis 4096 Jan 28 06:50 examples
-rw-rw-r--    1 otis otis 5599 Jan 28 06:50 README
drwxrwxr-x    5 otis otis 4096 Jan 28 06:50 src
[otis@linux2 otis]$ ls -al ~/cvs-repositories/jakarta/jakarta-lucene/lib/
total 132
drwxrwxr-x    3 otis otis   4096 Jan 28 15:28 .
drwxrwxr-x    9 otis otis   4096 Mar 27 23:28 ..
drwxrwxr-x    2 otis otis   4096 Jan 28 15:29 CVS
lrwxrwxrwx    1 otis otis     36 Jan 28 06:55 JavaCC.zip -> /usr/local/javacc/bin/lib/JavaCC.zip
-rw-rw-r--    1 otis otis 117522 Jan 28 15:23 junit_37.jar

Otis


--- Victor Hadianto [EMAIL PROTECTED] wrote:
 Hi list,
 
 I'm having a problem compiling Lucene from scratch. I checked out Lucene
 1.2 rc4 from CVS and I am missing one vital component: JavaCC 2.0.
 
 The latest JavaCC that I can get from WebGain is 2.1, and just dropping it
 into the lucene/lib directory does not work well. I had a look, and the
 class name expected by the Lucene build file is quite different from
 JavaCC 2.1's.
 
 Is there someplace where I can get a JavaCC 2.0 that works with Lucene?
 
 
 Thanks,
 
 -- 
 Victor Hadianto
 
 




Re: storing index in third party database.

2002-04-03 Thread Otis Gospodnetic

If you want to store indices in a database, search the mailing list
archives for SqlDirectory.

Once I considered using it for one application at work, so I asked its
author about performance.  The answer was that it doesn't perform all
that well when the index grows, if I recall correctly.  Consequently,
we chose to use file-based indices instead.

Otis

--- [EMAIL PROTECTED] wrote:
 Hi all
 
 I want to index data that I have already stored in a third-party
 database table and develop a search facility using Lucene. I am
 thinking of storing the index back in the database, in another
 table. I know that for this we have to create a 'directory' which
 does all the indexing operations,
 
 for example
 
 IndexWriter indwriter = new IndexWriter(dirStore, null, create);
 
 where dirStore is the directory and create is a boolean.
 
 But I don't know the format to be followed for the directory
 (dirStore). Please help me if anybody has done something similar.
 TIA
 Amith
 
 




Re: storing index in third party database.

2002-04-03 Thread Karl Øie

Without having investigated the problem much, I would think that a SQL
database would be a very bad match for Lucene, as most of Lucene's work is
creating keys for words and documents and then creating indexes of these
keys. For these purposes a SQL database is unnecessary overhead, not to
mention the overhead of the SQL language parser.

For these kinds of indexes a lower-level database would be better suited. I
have had good experiences with BerkeleyDB (http://www.sleepycat.com), and a
friend of mine uses gdbm successfully for such key-pair indexing tasks. The
advantage of these low-level database systems is that they are more or less
persistent b-tree/hashtable implementations, and thus built for key-pairing.

They have no SQL layer, so you have to program against them directly; they
are more like subroutine libraries than applications. But for key-pair
indexes, my experience is that BerkeleyDB runs circles around any SQL
database (including DB2 and Oracle!!!).

Berkeley has a Java API and a b-tree record type that could be a very good
match for a key-based search tree, and it's free. Take a look at it!
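
To make the key-pair idea concrete, here is a minimal sketch of a store that maps a file name to that file's bytes, which is roughly the shape an index directory needs. It is only an illustration: java.util.TreeMap stands in for a persistent b-tree, nothing is written to disk, and the class and method names (KeyValueFileStore, write, read, and so on) are hypothetical. This is neither the BerkeleyDB API nor Lucene's Directory class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a b-tree backed key/value store.
final class KeyValueFileStore {
    // Sorted map: key = file name, value = raw file contents.
    private final TreeMap<String, byte[]> files = new TreeMap<>();

    // Store (or overwrite) a named file; the array is copied defensively.
    void write(String name, byte[] contents) {
        files.put(name, contents.clone());
    }

    // Return a copy of the file's bytes, or null if no such file exists.
    byte[] read(String name) {
        byte[] stored = files.get(name);
        return stored == null ? null : stored.clone();
    }

    boolean exists(String name) {
        return files.containsKey(name);
    }

    // Returns true if a file was actually removed.
    boolean delete(String name) {
        return files.remove(name) != null;
    }

    // File names in sorted key order, as a b-tree would return them.
    List<String> list() {
        return new ArrayList<>(files.keySet());
    }

    public static void main(String[] args) {
        KeyValueFileStore store = new KeyValueFileStore();
        store.write("b.bin", new byte[] {1, 2, 3});
        store.write("a.bin", new byte[] {9});
        System.out.println(store.list());
    }
}
```

A real version would replace the TreeMap with calls into the chosen database and would have to deal with locking and partial writes, which this sketch ignores.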

mvh karl øie

(PS: I am not paid by Sleepycat to write this :-)



On Wednesday 03 April 2002 16:12, you wrote:
 If you want to store indices in a database search the mailing list
 archives for SqlDirectory.

 Once I considered using it for one application at work, so I asked its
 author about performance.  The answer was that it doesn't perform all
 that well when the index grows, if I recall correctly.  Consequently,
 we chose to use file-based indices instead.

 Otis





RE: Case Sensitivity

2002-04-03 Thread Aruna Raghavan

Hi,
I am using StandardAnalyzer; the problem was with wildcard queries being
case sensitive. Even with StandardAnalyzer, you have to worry about case
sensitivity in this case. Thanks for the tip on the example Analyzer; I will
take a peek.
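
For reference, the lowercase-then-restore-operators workaround quoted further down in this message could look roughly like the sketch below. It is only an illustration: the class name QueryCaseNormalizer is hypothetical, it splits on whitespace only, and treating NOT as an operator alongside AND and OR is an assumption. A literal search term "and" (for example inside a quoted phrase) would be uppercased too, so a real version would need to be more careful.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: lowercases a query string but keeps boolean operators uppercase.
final class QueryCaseNormalizer {
    // Operators compared in lowercase; NOT is an assumption, AND/OR come from the thread.
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("and", "or", "not"));

    static String normalize(String rawQuery) {
        StringBuilder out = new StringBuilder();
        for (String token : rawQuery.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for an empty or all-whitespace query
            }
            String lower = token.toLowerCase(Locale.ROOT);
            if (out.length() > 0) {
                out.append(' ');
            }
            // Operators go back to uppercase; every other token stays lowercase.
            out.append(OPERATORS.contains(lower) ? lower.toUpperCase(Locale.ROOT) : lower);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Luc* AND Search or INDEX"));
    }
}
```

The same lowercasing would have to be applied to the text before indexing, as described in the quoted message, so that indexed terms and query terms agree.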

-----Original Message-----
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 03, 2002 1:40 PM
To: Lucene Users List
Subject: RE: Case Sensitivity


Alan, Aruna:

The built-in solution is to use LowerCaseFilter in your Analyzer.  (The
SimpleAnalyzer, StopAnalyzer, and StandardAnalyzer classes already do
this; see the Lucene API docs to see which filters each uses.)  The FAQ
includes an example implementation of an Analyzer if you want to build
your own.

Joshua

 [EMAIL PROTECTED] Per Obscurius...www.ics.uci.edu/~jmadden
Joshua Madden: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.

On Wed, 3 Apr 2002, Aruna Raghavan wrote:

 Hi,
 I worked around the problem by converting everything to lowercase in my
 code prior to indexing into Lucene and also prior to searching for a string.
 Of course, I also had to use pattern matching to change boolean operators
 such as AND and OR back to uppercase, because Lucene expects those to be
 uppercase.
 
 -----Original Message-----
 From: Alan Weissman [mailto:[EMAIL PROTECTED]]
 Sent: Wednesday, April 03, 2002 1:26 PM
 To: Lucene Users List
 Subject: Case Sensitivity
 
 
 What can I do to configure Lucene to make it case insensitive?
 
 Thanks,
 Alan
 
 




Re: Case Sensitivity

2002-04-03 Thread Peter Carlson

You can use the StandardAnalyzer.
This lowercases all the words (it uses the LowerCaseFilter).

Note that it also uses the stop word filter, so your results may vary.

Also, when you index, be sure to use text instead of keyword as the field
type, since keyword fields don't go through the filter.

--Peter



On 4/3/02 11:25 AM, Alan Weissman [EMAIL PROTECTED] wrote:

 What can I do to configure Lucene to make it case insensitive?
 
 Thanks,
 Alan
 
 




Objects as search results

2002-04-03 Thread Kelvin Tan

Here's a topic which to my recollection (surprisingly) hasn't been brought
up: Assuming development in an object-oriented environment, it's a fair
assumption that the eventual target of searching is an object. How are
developers making this happen?

Are all fields of the objects indexed and displayed accordingly (this means
that the Document essentially takes the place of the object for search
results. bad idea IMHO)? Is there some way for the object to be
instantiated, then populated? How are these objects then displayed as search
results?

Here are some comments I have:

a) Documents shouldn't be used for displaying search results. To do so would
be inflexible and limit the type of data displayed as results to the fields
in a document. This means that if you wish to display more information, more
information has to be added to the document. This somewhat violates the
purpose of the document, I think, which is to provide an abstraction of an
atomic collection of searched/indexed fields. You may be able to get away
with it for simple applications, but I don't think it's a good idea.

Ideally, objects should be used to display the results then, since that's
what a result represents. I use Velocity, so this is easy for me. I retrieve
the objects as a collection (somehow), and stuff them into the Context for
rendering.

b) Different types of objects obviously have different types of metadata.
How can the different fields for each object be displayed, when the types of
objects to be indexed aren't fixed? (I use fields and metadata
interchangeably, so metadata is really a collection of fields of an object)

c) I use Torque, so object instantiation and population is a pretty easy
thing. I have no real solution to others, who don't have some kind of O/R
tool of sorts.

I have addressed these points to my satisfaction in a current app, but they
are terribly reliant on a specific combination (Torque and Velocity). I'm
really interested to know how other developers have approached this.
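
One tool-independent way to approach this, sketched under assumptions: keep only an identifier per hit and turn the list of hit ids back into objects through a loader. Everything named here (SearchResultHydrator, Article, ArticleLoader) is hypothetical; the loader stands in for whatever an O/R tool or hand-written database lookup would do, and the search step that produces the ids is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turn a list of hit ids into domain objects.
final class SearchResultHydrator {

    // Hypothetical domain object that a search result represents.
    static final class Article {
        final String id;
        final String title;

        Article(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Hypothetical lookup by primary key; a real app would query its database here.
    interface ArticleLoader {
        Article load(String id);
    }

    // Preserves the order of the hits and skips ids that no longer resolve.
    static List<Article> hydrate(List<String> hitIds, ArticleLoader loader) {
        List<Article> results = new ArrayList<>();
        for (String id : hitIds) {
            Article article = loader.load(id);
            if (article != null) {
                results.add(article);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Article> table = new HashMap<>();
        table.put("1", new Article("1", "First"));
        table.put("2", new Article("2", "Second"));
        // The ids would normally come from a stored field of each matching document.
        for (Article a : hydrate(Arrays.asList("2", "missing", "1"), table::get)) {
            System.out.println(a.id + " " + a.title);
        }
    }
}
```

The resulting collection can then be handed to the view layer as objects, so the fields shown in the results are not limited to what was indexed.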

Regards,
Kelvin Tan

Relevanz Pte Ltd
http://www.relevanz.com

180B Bencoolen St.
The Bencoolen, #04-01
S(189648)

Tel: 6238 6229
Fax: 6337 4417







Querying multiple fields of a index

2002-04-03 Thread Harpreet S Walia

Hi,

Is it possible to query multiple fields of a given index and get results
based on the combined query? For example, I want to search for the word
"lucene" in the title field and the word "engine" in the summary field, and
get results based on both words.

How can I achieve this?
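
One way to express the combined query described above is as a single query string, assuming the query parser's field:term syntax with a leading + marking a required clause. The sketch below only builds that string; the class name FieldQueryBuilder is hypothetical, terms are assumed to be single words with no special characters, and handing the string to the query parser is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a query string that requires one term per field.
final class FieldQueryBuilder {

    // Produces clauses of the form "+field:term", separated by spaces.
    static String requireAll(Map<String, String> termsByField) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> entry : termsByField.entrySet()) {
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append('+').append(entry.getKey()).append(':').append(entry.getValue());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the clauses in insertion order.
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("title", "lucene");
        terms.put("summary", "engine");
        System.out.println(requireAll(terms));
    }
}
```

Dropping the + signs would make each clause optional, so documents matching either field would be returned.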

TIA

Regards
Harpreet







Re: compiling lucene

2002-04-03 Thread Victor Hadianto

 JavaCC 2.1 works, too.
 This is how I have it set up:

Yes, to confirm: a list member pointed out earlier that I have to _install_
JavaCC first. Serves me right for not reading TFM.

Sorry for the noise

-- 
Victor Hadianto
---
Every cloud engenders not a storm. -- William Shakespeare, Henry VI
