Re: Not entire document being indexed?

2005-02-25 Thread [EMAIL PROTECTED]
Thanks Andrzej and Pasha for your prompt replies and suggestions. I will try everything you have suggested and report back on the findings! regards -pedja Pasha Bizhan said the following on 2/25/2005 6:32 PM: Hi, whole document was indexed or not. Luke can help you to give an answer the question

Re: Not entire document being indexed?

2005-02-25 Thread [EMAIL PROTECTED]
pedja [EMAIL PROTECTED] said the following on 2/24/2005 2:08 PM: Hi everyone I'm having a bizarre problem with a few of the documents here that do not seem to get indexed entirely. I use textmining WordExtractor to convert M$ Word to plain text and then index that text. For example one docu

Re: Not entire document being indexed?

2005-02-24 Thread [EMAIL PROTECTED]
indexed. You could also try the extreme case and set that max value to the max Integer. Otis --- "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: Hi everyone I'm having a bizarre problem with a few of the documents here that do not seem to get indexed entirely. I use textmining WordE

Not entire document being indexed?

2005-02-24 Thread [EMAIL PROTECTED]
To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: PHP-Lucene Integration

2005-02-07 Thread [EMAIL PROTECTED]
f-written command-line search program and it outputs its results to the standard output. I guess your solution must be better ;) If the "communication parts" of your code aren't top secret, can you please share them with me/us?

Re: PHP-Lucene Integration

2005-02-06 Thread [EMAIL PROTECTED]
face to that. And some similar comments. But I'm a bit surprised there's not a bit more in terms of use of the official java extension to php. Thanks for the great package! Owen

Re: English and French documents together / analysis, indexing, searching

2005-01-23 Thread [EMAIL PROTECTED]
Morus Walter said the following on 1/21/2005 2:14 AM: No. You could do a ( ( french-query ) or ( english-query ) ) construct using one query. So query construction would be a bit more complex but querying itself wouldn't change. The first thing I'd do in your case would be to look at the differen
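
The single-query construct Morus describes can be sketched at the query-string level. This is a minimal illustration, assuming the English and French sub-queries have already been built separately; the class name `BilingualQuery`, the method `either`, and the field names `contents_en` and `contents_fr` are hypothetical and do not come from the thread.

```java
final class BilingualQuery {
    /** Wraps each sub-query in parentheses and joins the two with OR. */
    static String either(String englishQuery, String frenchQuery) {
        return "(" + englishQuery + ") OR (" + frenchQuery + ")";
    }

    public static void main(String[] args) {
        // Hypothetical per-language fields, each assumed to be analyzed
        // with the analyzer for its own language at indexing time.
        System.out.println(either("contents_en:house", "contents_fr:maison"));
    }
}
```

Only query construction changes with this approach; the combined string is searched like any other query.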

Re: English and French documents together / analysis, indexing, searching

2005-01-20 Thread [EMAIL PROTECTED]

Re: English and French documents together / analysis, indexing, searching

2005-01-20 Thread [EMAIL PROTECTED]
to get the language for the particular document before creating the analyzer. regards Bernhard [EMAIL PROTECTED] schrieb: Greetings everyone I wonder is there a solution for analyzing both English and French documents using the same analyzer. Reason being is that we have predominantly English

English and French documents together / analysis, indexing, searching

2005-01-20 Thread [EMAIL PROTECTED]
thanks -pedja

Re: Lucene Book in UK

2005-01-06 Thread [EMAIL PROTECTED]
ry time and tacking on a rare book charge. Amazon.com are quoting shipping in 24hrs. Is this a new 'Boston Tea Party'? cheers David

Re: Lucene working with a DB

2004-12-21 Thread [EMAIL PROTECTED]
e if I have a forum with MySQL and a lot of files on my web site, for every search I have to select the index that I want to use in my search, true? But I don't know how to make Lucene write an index of the information in the DB

Re: How to index Windows' Compiled HTML Help (CHM) Format

2004-12-11 Thread [EMAIL PROTECTED]
supported indexable filetype-collection (XML, HTML, PDF, MSWord-DOC, RTF, Plaintext). WBR, Tom.

Re: No of docs using IndexSearcher

2004-12-10 Thread [EMAIL PROTECTED]
eader. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Sent: Friday, December 10, 2004 2:59 PM To: Lucene Users List Subject: Re: No of docs using IndexSearcher numDocs() http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#numDocs() Ravi

Re: No of docs using IndexSearcher

2004-12-10 Thread [EMAIL PROTECTED]

Re: Conditional Operator in Lucene

2004-12-08 Thread [EMAIL PROTECTED]
ne support conditional operator? Like retrieve all documents where age is greater than 21, how do I compose a query like this in Lucene? Is there a different Query object to use? Thanks, Ramon

Re: Empty/non-empty field indexing question

2004-12-08 Thread [EMAIL PROTECTED]
ectly fine technically. Otis --- "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: Here's probably a silly question, very newbish, but I had to ask. Since I have mysql documents that contain over 30 fields each and most of them are added to the index, is it a common practice to add fields

Empty/non-empty field indexing question

2004-12-07 Thread [EMAIL PROTECTED]
nks -pedja

Re: Is this a bug or a feature with addIndexes?

2004-12-06 Thread [EMAIL PROTECTED]
Hi Otis I did try, here's what I get: [EMAIL PROTECTED] tmp]# time java MemoryVsDisk 1 1 10 -r Docs in the RAM index: 1 Docs in the FS index: 0 Total time: 142 ms real 0m0.322s user 0m0.268s sys 0m0.033s I tried other combinations but they don't seem to affect the outcome e

Is this a bug or a feature with addIndexes?

2004-12-06 Thread [EMAIL PROTECTED]

Re: Problem with indexing/merging indices - documents not indexed.

2004-12-06 Thread [EMAIL PROTECTED]
cs to fsWriter directly, you were using an IndexReader you had opened prior to calling fsWriter.close() to check the number of docs ... that won't work for the same reason. -Hoss

Problem with indexing/merging indices - documents not indexed.

2004-12-06 Thread [EMAIL PROTECTED]
r = new RAMDirectory();
            IndexWriter ramWriter = new IndexWriter(ramDir, analyzer, true);
            addDoc(ramWriter, doctype, crofileno);
            System.out.println("Docs In the RAM index: " + ramWriter.docCount());
            IndexWriter fsWriter = new IndexWriter(indexDir, analyzer, true);
            //fsWriter.setUseCompoundFile(false);
            //fsWriter.mergeFactor = 1000;
            //fsWriter.maxMergeDocs = 10;
            fsWriter.addIndexes(new Directory[] { ramDir });
            //fsWriter.optimize();
            System.out.println("Docs in the FS index: " + fsWriter.docCount());
            ramWriter.close();
            fsWriter.close();
            Date end = new Date();
            System.out.println("Lucene Added OK: " + Long.toString(end.getTime() - start.getTime()) + " total milliseconds");
        } catch (IOException e) {
            throw new Exception("Something bad happened: " + e.getClass() + " with message: " + e.getMessage());
        } catch (Exception e) {
            throw new Exception(" caught a " + e.getClass() + "\n with message: " + e.getMessage());
        }
    }
}

Re: problems search number range

2004-11-18 Thread [EMAIL PROTECTED]
hi morus & company; On Thursday 18 November 2004 12:49, Morus Walter wrote: > [EMAIL PROTECTED] writes: > > i need to solve this search: > > number: -10 > > range: -50 TO 5 > > > > i need help.. > > i dont find anything using google.. > > If your

problems search number range

2004-11-18 Thread [EMAIL PROTECTED]
. but then another problem starts: I need to use negative numbers and then everything goes crazy for me... I need to solve this search: number: -10 range: -50 TO 5 I need help.. I can't find anything using Google.. thanks d2clon
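
One common way to approach the negative-number case, sketched here under assumptions: range queries of this era compare terms as strings, so numbers are usually indexed as fixed-width, zero-padded text, and negative values can be handled by shifting every value by a constant before padding. The class `SortableInt` and its methods are hypothetical helpers, not part of Lucene; only the field name `number` and the values -10, -50 and 5 come from the message above.

```java
import java.util.Locale;

final class SortableInt {
    // Shifting by 2^31 maps every int onto 0..4294967295, so no minus sign is needed.
    private static final long OFFSET = 2147483648L;

    /** Encodes an int as a 10-digit string whose string order matches numeric order. */
    static String encode(int value) {
        return String.format(Locale.ROOT, "%010d", (long) value + OFFSET);
    }

    /** Reverses encode(). */
    static int decode(String term) {
        return (int) (Long.parseLong(term) - OFFSET);
    }

    public static void main(String[] args) {
        String value = encode(-10);
        String low = encode(-50);
        String high = encode(5);
        // Plain string comparison now agrees with numeric comparison.
        boolean inRange = low.compareTo(value) <= 0 && value.compareTo(high) <= 0;
        System.out.println("number:[" + low + " TO " + high + "] matches " + value + ": " + inRange);
    }
}
```

With this scheme the encoded string would be stored in the `number` field at indexing time, and both bounds of the range would be passed through the same `encode` call before the query is built.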

Re: LUCENE and algorithm for assigning score to hits

2004-10-05 Thread [EMAIL PROTECTED]
I'm sorry, friends. I got the title wrong twice.

hibernate and algorithm for assigning score to hits

2004-10-05 Thread [EMAIL PROTECTED]
61.2372 house4 55.9017 house2 54.1266 house1 44.1942 house0 37.5

hibernate and the algorithm for assigning scores to hits

2004-10-05 Thread [EMAIL PROTECTED]
use4 55.9017 house2 54.1266 house1 44.1942 house0 37.5

Re: Implement custom score

2004-09-22 Thread [EMAIL PROTECTED]
c wrote: > > > You need your own Similarity implementation and you need to set it as > > shown in this javadoc: > > http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/ > > Similarity.html > > > > Otis > > > > --- "[EMAIL PROTECT

Re: Implement custom score

2004-09-22 Thread [EMAIL PROTECTED]
norm in Similarity. Should I do anything about it, or doesn't it matter? /William > You need your own Similarity implementation and you need to set it as > shown in this javadoc: > http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html > > Otis

Implement custom score

2004-09-22 Thread [EMAIL PROTECTED]
orted after popularity (a field) and not by anything else. How can I do this? What classes and methods do I have to change? thanks, William

Re: pdf in Chinese

2004-09-08 Thread [EMAIL PROTECTED]
It is not about the analyzer; I need to read the text from the PDF file first. - Original Message - From: "Chandan Tamrakar" <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Wednesday, September 08, 2004 4:15 PM Subject: Re: pdf in Chinese

pdf in Chinese

2004-09-07 Thread [EMAIL PROTECTED]
Hi all, I use PDFBox to parse PDF files into Lucene documents. When I parse Chinese PDF files, PDFBox does not always succeed. Does anyone have some advice?

problem in IndexSearcher

2004-09-06 Thread [EMAIL PROTECTED]
java.io.IOException: Lock obtain timed out I was trying to create two instances of IndexSearcher with different index files. Is there something I've missed? tia, buics

Re: memory leak in lucene?

2004-09-03 Thread [EMAIL PROTECTED]
I also have problems regarding my application, what would be the ideal memory allocation for lucene considering my application will serve at least 20 transactions per second? tia --buics On Fri, 3 Sep 2004 15:20:45 +0200, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > Terence, >

too many files open while lucene indexing...

2004-05-09 Thread [EMAIL PROTECTED]
2web.com/ .

thanks for your mail

2004-02-17 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

thanks for your mail

2004-02-16 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

thanks for your mail

2004-02-15 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

thanks for your mail

2004-02-13 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

thanks for your mail

2004-02-12 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

thanks for your mail

2004-02-11 Thread [EMAIL PROTECTED]
Received your mail we will get back to you shortly

Re: Italian web sites

2002-04-29 Thread [EMAIL PROTECTED]
> Date sent: Wed, 24 Apr 2002 11:02:32 +0200 > From: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > Subject: Italian web sites > To: [EMAIL PROTECTED] > Send reply to: Lucene Users List > > > Hi

RE: Lucene in action at www.mil.fi

2002-04-25 Thread [EMAIL PROTECTED]
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > > Sent: 23 April 2002 10:06 > > To: [EMAIL PROTECTED] > > Subject: Re: Lucene in action at www.mil.fi > > > > Hi Jari > > > > where do you build your index? On filesystem? Do

Re: Italian web sites

2002-04-24 Thread [EMAIL PROTECTED]
top- word list to run statistics > on a page :-) ?!? > > On Wednesday 24 April 2002 11:02, [EMAIL PROTECTED] wrote: > > Hi all, > > > > I'm using Jobo for spidering web sites and lucene for indexing. The > > problem is that I'd like spidering only I

Italian web sites

2002-04-24 Thread [EMAIL PROTECTED]
Hi all, I'm using Jobo for spidering web sites and Lucene for indexing. The problem is that I'd like to spider only Italian web sites. How can I discover the country of a web site? Do you know of a method that you can suggest? Thanks Laura

Re: Lucene in action at www.mil.fi

2002-04-22 Thread [EMAIL PROTECTED]
" at the "Powered by " > -section of the Lucene web site. > > Thanks go to all the Lucene developers - it's great stuff :D > > Jari Aarniala > > -- > Jari Aarniala > [EMAIL PROTECTED] "death is the > Vantaa, .fi last dance eternal"

Re: HTML parser

2002-04-22 Thread [EMAIL PROTECTED]
> http://www.matuschek.net/software/jobo/ > > Otis > > --- "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > > Hi Otis, > > > > thanks for your reply. I have been looking for Spindle and Mojo for 2 > > > > hours but I don

Re: HTML parser

2002-04-21 Thread [EMAIL PROTECTED]
> > > > It's easy to write a Visitor which extracts the links; should take about ten > > > lines of code.

Re: HTML parser

2002-04-20 Thread [EMAIL PROTECTED]
e following...here's a > >good example of link extraction. > > Try http://www.quiotix.com/opensource/html-parser > > It's easy to write a Visitor which extracts the links; should take about ten > lines of code. > > > > -- > Brian Goetz >

Some questions

2002-04-19 Thread [EMAIL PROTECTED]
Hi all, my name is Laura and I'm a new member of this list. I'm a long-time user of Tomcat and I'm also a member of the Tomcat user list. Yesterday, looking at the Jakarta menu, I saw Lucene and I said: "What is this?" Reading the Lucene home page I understood that Lucene is a very interesting and