Re: Using Lucene to store document

2004-11-14 Thread Nhan Nguyen Dang
Hi,
When the index is read into memory for searching, which data in the
segment/index is loaded?
I mean, are all the indexed fields/terms loaded? Are the stored fields
loaded as well?
Thanks,


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: Using Lucene to store document

2004-11-14 Thread Otis Gospodnetic
Not all data in the index is loaded all at once.  I believe the .tii
file (if you are using multifile index format) is loaded into RAM,
maybe some other small ones, but the rest is read off the disk as it's
needed, depending on the terms used in the search.
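
To make the lazy-loading idea concrete, here is a minimal Python sketch. It
is a toy model, not Lucene's actual file format or API: the names
write_terms, contains and INDEX_INTERVAL are made up for illustration. Only
every Nth term is kept in RAM together with its byte offset (roughly the
role the .tii file is described as playing above), while the full term list
stays on disk and is read only around the term being looked up.

```python
import bisect
import os
import tempfile

INDEX_INTERVAL = 4  # hypothetical value: keep every 4th term in memory


def write_terms(path, terms):
    """Write sorted terms to disk, one per line.

    Returns the sparse in-memory index: a list of (term, byte_offset)
    pairs for every INDEX_INTERVAL-th term.
    """
    sparse = []
    with open(path, "wb") as f:
        for i, term in enumerate(sorted(set(terms))):
            if i % INDEX_INTERVAL == 0:
                sparse.append((term, f.tell()))
            f.write(term.encode("utf-8") + b"\n")
    return sparse


def contains(path, sparse, term):
    """Look up a term: binary search in RAM, then a short scan on disk."""
    keys = [t for t, _ in sparse]
    pos = bisect.bisect_right(keys, term) - 1  # last sampled term <= term
    if pos < 0:
        return False  # term sorts before everything in the file
    with open(path, "rb") as f:
        f.seek(sparse[pos][1])
        # At most INDEX_INTERVAL terms sit between two sampled entries.
        for _ in range(INDEX_INTERVAL):
            line = f.readline()
            if not line:
                break
            if line.rstrip(b"\n").decode("utf-8") == term:
                return True
    return False


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "terms.txt")
    sparse = write_terms(path, ["index", "lucene", "segment", "field", "term"])
    print(len(sparse), contains(path, sparse, "lucene"))
```

The memory cost grows with the number of sampled terms rather than with the
size of the whole index, which is why only a small part needs to sit in RAM.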

Otis


Re: Using Lucene to store document

2004-11-10 Thread Nhan Nguyen Dang
Hi Otis,
Could you tell me what the HEAD version of Lucene is?
I'm considering the advantages of storing documents in Lucene stored
fields for my search engine.
I've tested with thousands of documents and found that retrieving a
document (in this case an XML file) with Lucene is a little bit faster than
using the file system. But I cannot test with a large enough amount of data
to get an accurate comparison.
So my question is whether Lucene can support millions of documents and
still remain balanced and retrieve them at an acceptable speed.
Nhan

Re: Using Lucene to store document

2004-11-10 Thread Otis Gospodnetic
Hello,

HEAD version means that you should check out Lucene straight out of
CVS.  How to work with CVS is another story, probably described
somewhere on the jakarta.apache.org site.

Otis

Using Lucene to store document

2004-11-09 Thread Nhan Nguyen Dang
Hi all,
I'm using Lucene to index XML documents/files (possibly millions of
documents in the future, each about 5-10 KB).
Besides the index for searching, I want to use Lucene to store the whole
document content in an UnIndexed field, the content field (instead of
storing each document in an XML file). All the document content will be
stored in a separate index. Each time I want to access a document, I will
let Lucene retrieve it.

I am weighing this against another option: using the file system to store
the document content in separate XML files, which means 400K documents
will be stored in 400K XML files in the file system.

The purpose is to be able to access each document rapidly. Can anybody who
has dealt with this problem before advise me on which method is suitable?
Is it better to collect all documents into a Lucene index or to store them
separately in the file system?

Thanks,
Dang Nhan

Re: Using Lucene to store document

2004-11-09 Thread Otis Gospodnetic
It is difficult to give a general answer.  You can certainly store the
whole XML in the Lucene index, just don't tokenize it.  The HEAD
version of Lucene even has some compression that you may find handy. 
On the other hand, storing XML in the FS would allow you to store XML
files wherever you wanted, even on separate disk(s).  If there are lots
of parallel searches/reads, this can be handy.  If you want to be able
to see XML files without going through the index, this can also be
handy.  So, it depends on how you like it, but both approaches are
doable.
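
To illustrate the trade-off, here is a small Python sketch of the "one big
store" side of the comparison. It is a toy model under stated assumptions,
not Lucene's stored-field format or API: the class name PackedStore and its
methods are hypothetical, and zlib merely stands in for whatever compression
the HEAD version offers. Each document is compressed and appended to a single
data file, and an in-memory table of byte offsets lets any document be read
back with one seek, instead of opening one of 400K separate files.

```python
import os
import struct
import tempfile
import zlib


class PackedStore:
    """Toy append-only document store: one data file plus an offset table."""

    def __init__(self, path):
        self.path = path
        self.offsets = []  # doc id -> byte offset of its record
        self.end = 0       # current size of the data file in bytes
        open(path, "wb").close()  # start with an empty data file

    def add(self, xml_text):
        """Compress and append one document; return its doc id."""
        payload = zlib.compress(xml_text.encode("utf-8"))
        # Record layout: 4-byte big-endian length, then the compressed bytes.
        record = struct.pack(">I", len(payload)) + payload
        with open(self.path, "ab") as f:
            f.write(record)
        self.offsets.append(self.end)
        self.end += len(record)
        return len(self.offsets) - 1

    def get(self, doc_id):
        """Read one document back with a single seek."""
        with open(self.path, "rb") as f:
            f.seek(self.offsets[doc_id])
            (length,) = struct.unpack(">I", f.read(4))
            return zlib.decompress(f.read(length)).decode("utf-8")


if __name__ == "__main__":
    store = PackedStore(os.path.join(tempfile.mkdtemp(), "docs.dat"))
    doc_id = store.add("<doc><title>Example</title></doc>")
    print(store.get(doc_id))
```

The file-per-document approach needs no such bookkeeping and keeps every XML
file directly readable, which is the flexibility described above; the packed
approach trades that away for fewer files and optional compression.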

Otis

