Fwd: lucene.apache.org problems again

2005-02-21 Thread Erik Hatcher
"Erik Hatcher" <[EMAIL PROTECTED]> Subject: Re: lucene.apache.org problems again On Feb 20, 2005, at 9:16 AM, Noel J. Bergman wrote: The bitname.com name servers haven't updated. Surnet and Hyperreal have done so. Checking the allowed list, I see: Fixed. The process that does that h

Re: Sort Performance Problems across large dataset

2005-01-27 Thread Doug Cutting
Peter Hollas wrote: Currently we can issue a simple search query and expect a response back in about 0.2 seconds (~3,000 results) with the Lucene index that we have built. Lucene gives a much more predictable and faster average query time than using standard fulltext indexing with mySQL. This ho

RE: Sort Performance Problems across large dataset

2005-01-25 Thread Xiaohong Yang (Sharon)
- From: Xiaohong Yang (Sharon) [mailto:[EMAIL PROTECTED] Sent: 25 January 2005 00:37 To: Lucene Users List Subject: Re: Sort Performance Problems across large dataset Hi Peter, I just got on the list a few hours ago. I am still reading the source code. I am not going to send this to the list. I woul

RE: Sort Performance Problems across large dataset

2005-01-25 Thread Peter Hollas
hitcollector object to do the sorting. Many thanks, Peter. -Original Message- From: Xiaohong Yang (Sharon) [mailto:[EMAIL PROTECTED] Sent: 25 January 2005 00:37 To: Lucene Users List Subject: Re: Sort Performance Problems across large dataset Hi Peter, I just got on the list a few hours

Re: Sort Performance Problems across large dataset

2005-01-24 Thread Matt Quail
Peter, Currently we can issue a simple search query and expect a response back in about 0.2 seconds (~3,000 results) You may want to try something like the following (I do this in FishEye, seems to be performant for moderately large field-spaces). Use a custom HitCollector, and store all the ma

Re: Sort Performance Problems across large dataset

2005-01-24 Thread Erik Hatcher
On Jan 24, 2005, at 7:01 PM, Peter Hollas wrote: I am working on a public accessible Struts based Well there's the problem right there :)) (just kidding) To sort the resultset into alphabetical order, we added the species names as a separate keyword field, and sorted using it whilst querying. Thi

Re: Sort Performance Problems across large dataset

2005-01-24 Thread Xiaohong Yang (Sharon)
Hi Peter, I just got on the list a few hours ago. I am still reading the source code. I am not going to send this to the list. I would like to know the ".2 sec" query time for 2 million fields, should it display only the first page (100 or so), not the whole 3000 found? It is very fast I ag

Re: Sort Performance Problems across large dataset

2005-01-24 Thread Stefan Groschupf
Hi, do you optimize the index? Have you tried implementing your own hit collector? Stefan On 25.01.2005 at 01:01, Peter Hollas wrote: I am working on a public accessible Struts based species database project where the number of species names is currently at 2.3 million, and in the near future will be

Sort Performance Problems across large dataset

2005-01-24 Thread Peter Hollas
I am working on a public accessible Struts based species database project where the number of species names is currently at 2.3 million, and in the near future will be somewhere nearer 4 million (probably the largest there is). The species names are typically 1 to 7 words in length, and the broa

Re: Problems...

2005-01-07 Thread Chris Hostetter
: Stored = as-is value stored in the Lucene index : : Tokenized = field is analyzed using the specified Analyzer - the tokens : emitted are indexed : : Indexed = the text (either as-is with keyword fields, or the tokens : from tokenized fields) is made searchable (aka inverted) : : Vectored = term

Re: Problems...

2005-01-06 Thread Erik Hatcher
On Jan 6, 2005, at 6:23 PM, Ross Rankin wrote: Could you explain this piece further, Erik "BooleanQuery and AND in TermQuery for resellerId" Your code did a textual concatenation (and I'm paraphrasing as I don't have your previous e-mail handy) of "id:" + resellerId. And then it parsed the expre

RE: Problems...

2005-01-06 Thread Chris Hostetter
: Hoss, could you tell me what exceptions I'm missing? Thanks! Anytime you have a "catch" block, you should be doing something with that exception. If possible, you can recover from an exception, but no matter what you should log the exception in some way so that you know it happened. Your
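
The advice in this reply (never leave a "catch" block empty; recover if you can, but always log) could look roughly like the sketch below. It uses only java.util.logging. The class name SearchService, the method parseOffset, and the fallback value 0 are hypothetical and are not taken from the code discussed in the thread.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SearchService {
    private static final Logger LOG = Logger.getLogger(SearchService.class.getName());

    // Hypothetical helper: parse a paging offset, falling back to 0 on bad input.
    static int parseOffset(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Recover with a default, but record that the problem happened,
            // including the stack trace, so it can be diagnosed later.
            LOG.log(Level.WARNING, "Bad offset '" + raw + "', using 0", e);
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOffset("20"));
        System.out.println(parseOffset("abc"));
    }
}
```

The point is only that the failure leaves a trace in the log; which default to fall back to, or whether to rethrow instead, depends on the application.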

RE: Problems...

2005-01-06 Thread Ross Rankin
stetter Sent: Tuesday, January 04, 2005 6:48 PM To: Lucene Users List Subject: Re: Problems... To start with, there has to be more to the "search" side of things than what you included. this search function is not static, which means it's getting called on an object, which obviously

Re: Problems...

2005-01-04 Thread Chris Hostetter
To start with, there has to be more to the "search" side of things than what you included. this search function is not static, which means it's getting called on an object, which obviously has some internal state (paramOffset, hits, and pathToIndex are a few that jump out at me) what are the va

Re: Problems...

2005-01-04 Thread Erik Hatcher
On Jan 4, 2005, at 10:53 AM, Ross Rankin wrote: I'm not sure where or how to troubleshoot. Can I examine the indexes with anything to see what is there and that it's meaningful. Is there something simple I can do to track down what doesn't work in the process? Thanks. Echoing a previous sugge

Re: Problems...

2005-01-04 Thread Luke Shannon
://www.getopt.org/luke/ Luke - Original Message - From: "Ross Rankin" <[EMAIL PROTECTED]> To: Sent: Tuesday, January 04, 2005 10:53 AM Subject: Problems... > (Bear with me; I have inherited this system from another developer who is no > longer with the company. So

Problems...

2005-01-04 Thread Ross Rankin
(Bear with me; I have inherited this system from another developer who is no longer with the company. So I am not familiar with Lucene at all. I just have got the task of "Fixing the search".) I have a servlet that runs every 10 minutes and indexes and I can see files being created in the ind

Re: LIMO problems

2004-12-13 Thread David Spencer
Daniel Cortes wrote: Hi, I want to know what library do you use for search in PPT files? I use this ("native code"): http://chicago.sourceforge.net/xlhtml POI support this? thanks - To unsubscribe, e-mail: [EMAIL PROTECTED] For add

Re: LIMO problems

2004-12-13 Thread Luke Shannon
CTED]> Sent: Monday, December 13, 2004 10:46 AM Subject: Re: LIMO problems > > Hi, I want to know what library do you use for search in PPT files? > POI support this? > thanks > > > - > To unsubscribe

Re: LIMO problems

2004-12-13 Thread Daniel Cortes
Hi, I want to know what library do you use for search in PPT files? POI support this? thanks - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: LIMO problems

2004-12-09 Thread Luke Francl
On Thu, 2004-12-09 at 10:07, Daniel Cortes wrote: > I've the last version of LIMO. > It is running in a Tomcat and I can't add any Index and don't load the > index that I create the index before from console (java > org.apache.lucene.demo.IndexFiles ...) > This is the reasson that I demand the st

Re: LIMO problems

2004-12-09 Thread Daniel Cortes
I've the last version of LIMO. It is running in a Tomcat and I can't add any Index and don't load the index that I create the index before from console (java org.apache.lucene.demo.IndexFiles ...) This is the reasson that I demand the structure of limo.properties because this file don't exist an

Re: LIMO problems

2004-12-09 Thread Luke Francl
On Thu, 2004-12-09 at 07:32, Daniel Cortes wrote: > Hi, I'm trying Limo (Index Monitor of Lucene) and I have a problem, > obviously it will be a silly problem but now I don't > have solution. > Someone can tell me how structure it have limo.properties file? > because I have any example thanks. > If

Re: LIMO problems

2004-12-09 Thread Luke Shannon
I use "Luke". It is pretty good. http://www.getopt.org/luke/ Luke - Original Message - From: "Daniel Cortes" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, December 09, 2004 8:32 AM Subject: LIMO problems > Hi, I'm trying Limo (Index

LIMO problems

2004-12-09 Thread Daniel Cortes
Hi, I'm trying Limo (Index Monitor of Lucene) and I have a problem, obviously it will be a silly problem but now I don't have a solution. Someone can tell me how structure it have limo.properties file? because I have any example thanks. If you know another web application for administration Lucenes In

Re: problems search number range

2004-11-18 Thread Morus Walter
[EMAIL PROTECTED] writes: > > this solution was the first that i tried.. but this does not run correctly.. > because: > > when we try to sort this number in alphanumeric order we obtain that number > -0010 is higher than -0001 > right. I failed to see that. So you would have to use a complemen

Re: problems search number range

2004-11-18 Thread [EMAIL PROTECTED]
for signed integers. > E.g. > -0010 > -0001 > 0 > 1 > 00020 > should work (in the range -..9), since '0' has a higher ascii > (unicode) code than '-'. > Of course the analyzer has to preserve the '-' and the '-' should not

Re: problems search number range

2004-11-18 Thread Morus Walter
e the analyzer has to preserve the '-' and the '-' should not be eaten by the query parser in case you use it. I don't know if there are problems with that, but I suspect that at least for the query parser. Morus ---

problems search number range

2004-11-18 Thread [EMAIL PROTECTED]
ers: 5 and 130 i started to have problems.. lucene didn't find the number 20 yet¡ i solved this changing the format of the numbers and putting this: number to look for: "020" range: "005", "130" query= 'number:[005 TO 030] up to this point all correct.
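
The difficulty running through this thread is that range queries compare terms as strings, so numbers must be written in a form whose string order matches their numeric order. Zero-padding handles non-negative values; for negative values, one common workaround is to shift every number by a fixed offset before padding. The sketch below illustrates that idea in plain Java. It is not necessarily the approach the posters settled on, and the class name SortableNumber, the OFFSET of 10000, and the WIDTH of 5 are assumptions chosen for illustration.

```java
import java.util.Arrays;

public class SortableNumber {
    // Assumed bounds: values must lie strictly between -OFFSET and +OFFSET.
    static final int OFFSET = 10000;
    static final int WIDTH = 5;

    // Shift into the non-negative range, then zero-pad to a fixed width so
    // that lexicographic order of the strings equals numeric order.
    static String encode(int n) {
        if (n <= -OFFSET || n >= OFFSET) {
            throw new IllegalArgumentException("out of range: " + n);
        }
        return String.format("%0" + WIDTH + "d", n + OFFSET);
    }

    public static void main(String[] args) {
        String[] terms = {
            encode(20), encode(-10), encode(-1), encode(0), encode(130), encode(5)
        };
        Arrays.sort(terms); // plain string sort
        System.out.println(Arrays.toString(terms));
    }
}
```

With this encoding the same transformation has to be applied both when indexing the field and when building the bounds of the range query, and the field must be indexed without an analyzer that would alter the digits.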

Lucene web demo problems

2004-11-06 Thread Harald Tijink
va HotSpot(TM) Client VM (build 1.4.2_06-b03, mixed mode) - Lucene: lucene-1.4.2.jar Tested the Tomcat 4 examples, no problems here. Then I created an index in /opt/lucene/index with the command: java org.apache.lucene.demo.IndexHTML -create -index /opt/lucene/index .. Still no problems, the

RE: problems deleting documents / design question

2004-10-22 Thread Aad Nales
Paul, We are doing similar stuff. We actually do create a hash of database name, table name and id to form a unique id. So far I have not had any problems with it. Cheers, Aad Hi, I'm creating an index from several database tables. Every item within every table has a unique id which is

problems deleting documents / design question

2004-10-22 Thread Paul
Hi, I'm creating an index from several database tables. Every item within every table has a unique id which is saved in some kind of id-field and the table name in an other one. So together they form a unique identifier within the index. When deleting / updating an item I need to retrieve it. My fi

RE: lucene-??? files: are they ever deleted after? was: locking problems

2004-10-11 Thread Aad Nales
problems Aad, note that .../topic/temp is not always correct. If you use Lucene outside of Tomcat, the lock files will be stored elsewhere. If IndexWriter/Readers are properly handled, they will clean up lock files after themselves. If something breaks (e.g. IOException because you are out of disk

Re: lucene-??? files: are they ever deleted after? was: locking problems

2004-10-11 Thread Otis Gospodnetic
Aad, note that .../topic/temp is not always correct. If you use Lucene outside of Tomcat, the lock files will be stored elsewhere. If IndexWriter/Readers are properly handled, they will clean up lock files after themselves. If something breaks (e.g. IOException because you are out of disk space

lucene-??? files: are they ever deleted after? was: locking problems

2004-10-11 Thread Aad Nales
Hi, From earlier postings I understand that when lucene opens an IndexWriter or executes a number of specific methods on an IndexReader that a lock is created in .../tomcat/temp. My question is. Are these files ever deleted after something has gone wrong? For instance, when the server is restarte

Re: locking problems

2004-10-08 Thread Doug Cutting
Aad Nales wrote: 1. can I have one or multiple searchers open when I open a writer? 2. can I have one or multiple readers open when I open a writer? Yes, with one caveat: if you've called the IndexReader methods delete(), undelete() or setNorm() then you may not open an IndexWriter until you've c

locking problems

2004-10-08 Thread Aad Nales
Based on discussions in this group I figure that I should 'cache' IndexSearchers and IndexReaders, which i did. I have build an IndexSearcherPool and an IndexReaderPool. Both seem to work fine (although I am still testing). However, whenever I use these I can not create an IndexWriter. The thread f

RE: Problems with Lucene + BDB (Berkeley DB) integration

2004-09-21 Thread Crump, Michael
Try setUseCompoundFile(false) on your IndexWriter as soon as you create it or before you call optimize -Original Message- From: Christian Rodriguez [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 21, 2004 1:10 PM To: Lucene Users List Subject: Re: Problems with Lucene + BDB (Berkeley

Re: Problems with Lucene + BDB (Berkeley DB) integration

2004-09-21 Thread Christian Rodriguez
fdx, _4.fnm, > > _4.frq, _4.prx, _4.tii, segments, deletable. (9 files) > > 2nd run: contents of directory: 0:_4.f1, _4.cfs, _4.fdt, _4.fdx, > > _4.fnm, _4.frq, _4.prx, _4.tii, _4.tis, segments, deletable. (11

Re: Problems with Lucene + BDB (Berkeley DB) integration

2004-09-20 Thread Andy Goodell
e any idea why this is happening? > Has anyone been able to use the BDB + Lucene integration with no problems? > > Id appreciate any help or pointers. > Thanks! > Xtian > >

Problems with Lucene + BDB (Berkeley DB) integration

2004-09-20 Thread Christian Rodriguez
, _4.fnm, _4.frq, _4.prx, _4.tii, _4.tis, segments, deletable. (11 files) Does anyone have any idea why this is happening? Has anyone been able to use the BDB + Lucene integration with no problems? Id appreciate any help or pointers. Thanks! Xtian

Re: range query problems

2004-09-17 Thread Derek Baker
Ah, but if I escape the "0" in the term constructor: Term term2 = new Term("adzer", "\\0"); It works. And then it works for a dash as well. Seems that to pass a search string to a queryParser, the "0" has to be escaped doubly: searchString = "adzer: [# TO 0]"; Just escaping with a doub

Re: range query problems

2004-09-17 Thread Derek Baker
Thanks for your reply. If I do it manually: Term term1 = new Term("adzer", "#"); Term term2 = new Term("adzer", "0"); Query myQuery = new RangeQuery(term1, term2, true); hits = searcher.search(myQuery); I still get nothing. If I make the first term in the new RangeQuery call null:

Re: range query problems

2004-09-17 Thread Daniel Naber
On Friday 17 September 2004 19:37, Derek Baker wrote: > However, if I create a range query that I would expect to find that > value, I get nothing. The range query string is: "adzer:[# TO 0]" (minus > the quotes). As far as I can tell, this query string should find any > value in the adzer fields

range query problems

2004-09-17 Thread Derek Baker
Hi, I'm having a problem with a range query. I have a field in my documents called "adzer". In at least one of those documents, the value is: "-93" (without the quotes). I know this because if I create a search string like so: "adzer: \\-93" (again, without the qu

Re: Build problems

2004-09-03 Thread Paul Elschot
Danny, On Friday 03 September 2004 20:53, [EMAIL PROTECTED] wrote: > I'm trying to build Lucene with ant (in XP) from the prompt > I got the ant-optional.jar from > http://archive.apache.org/dist/ant/binaries/ because I > couldn't find it anywhere else. I'm running the newest > version of ant and

Build problems

2004-09-03 Thread dhatcher
I'm trying to build Lucene with ant (in XP) from the prompt I got the ant-optional.jar from http://archive.apache.org/dist/ant/binaries/ because I couldn't find it anywhere else. I'm running the newest version of ant and when I go into the lucene base directory and type 'ant' it finds the bu

RE: Problems indexing Japanese with CJKAnalyzer ... Or French with UTF-8 and MetaData

2004-07-16 Thread Jon Schuster
'Lucene Users List' Subject: RE: Problems indexing Japanese with CJKAnalyzer ... Or French with UTF-8 and MetaData I don't think I understand correctly your proposal. As a basis, I am using Demo3 with indexHTML, HTMLDocument and HTMLParser. Inside HTML parser, I am calling getMetaTags (ca

RE: Problems indexing Japanese with CJKAnalyzer ... Or French with UTF-8 and MetaData

2004-07-15 Thread Bruno Tirel
4 15:12 To: Lucene Users List Subject: Re: Problems indexing Japanese with CJKAnalyzer If it's a web application, you have to call request.setCharacterEncoding("UTF-8") before reading any parameters. Also make sure the HTML page encoding is specified as "UTF-8" in the meta tag. most web app server

Re: Problems indexing Japanese with CJKAnalyzer

2004-07-15 Thread Praveen Peddi
If you call the above method, I think it will solve your problem. Praveen - Original Message - From: "Bruno Tirel" <[EMAIL PROTECTED]> To: "'Lucene Users List'" <[EMAIL PROTECTED]> Sent: Thursday, July 15, 2004 6:15 AM Subject: RE: Problems indexin

RE: Problems indexing Japanese with CJKAnalyzer

2004-07-15 Thread Bruno Tirel
Any help available? Best regards, Bruno -Original Message- From: Jon Schuster [mailto:[EMAIL PROTECTED] Sent: Wednesday, 14 July 2004 22:51 To: 'Lucene Users List' Subject: RE: Problems indexing Japanese with CJKAnalyzer Hi all, Thanks for the help on indexing Japanes

RE: Problems indexing Japanese with CJKAnalyzer

2004-07-14 Thread Jon Schuster
Hi all, Thanks for the help on indexing Japanese documents. I eventually got things working, and here's an update so that other folks might have an easier time in similar situations. The problem I had was indeed with the encoding, but it was more than just the encoding on the initial creation of

Re: Problems indexing Japanese with CJKAnalyzer

2004-07-06 Thread Grant Ingersoll
Jon, Java expects your files to be in the encoding of the Native Locale. In most cases in the U.S., this will be English. If you want to read files in that are in a different encoding, you have to tell Java what your encoding is, in this case, Shift JIS. See the javadocs for java.io.InputStr
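
The point made in this reply, that the encoding has to be passed explicitly when the file is not in the platform default, can be sketched with java.io.InputStreamReader as below. The class name EncodedReaderDemo and the helper readAll are hypothetical, and the sample reads from an in-memory byte array rather than a real file so that it is self-contained. It also assumes the running JDK ships the Shift_JIS charset, which standard full JDK builds do.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class EncodedReaderDemo {
    // Decode the whole stream using the charset named by the caller instead
    // of the platform default.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset sjis = Charset.forName("Shift_JIS"); // assumed to be available
        String original = "\u7aef";
        byte[] bytes = original.getBytes(sjis);
        String decoded = readAll(new ByteArrayInputStream(bytes), sjis);
        System.out.println(decoded.equals(original));
    }
}
```

Decoding the same bytes with a different charset would generally yield different characters, which is the kind of mismatch that makes indexed terms unsearchable.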

Re: Problems indexing Japanese with CJKAnalyzer

2004-07-06 Thread Steven Rowe
Hi Jon, It sounds to me like you have a character encoding problem. The native2ascii tool is designed to produce input for the Java compiler; the "\u7aef" notation you're seeing is understood by Java string interpreters to mean the corresponding hexadecimal Unicode code point. Other Java progr

Problems indexing Japanese with CJKAnalyzer

2004-07-02 Thread Jon Schuster
Hi, I've gone through all of the past messages regarding the CJKAnalyzer but I still must be doing something wrong because my searches don't work. I'm using the IndexHTML application from the org.apache.lucene.demo package to do the indexing, and I've changed the analyzer to use the CJKAnalyzer.

Re: Problems with special characters

2004-07-02 Thread Cheolgoo Kang
How about creating a special-char-converting-reader like this? public class LuceneReader extends Reader { private Reader source = null; private char buffer = (char) 0; public LuceneReader( Reader sourceReader ) { this.source = sourceReader; } public int read() { char result = (char) 0;

Re: Problems with special characters

2004-07-02 Thread Marten Senkel
I had a similar problem. I don't know whether there is a more intelligent solution, but the quickest I had in mind was to convert the special characters I needed to look up into a fixed random character string. For example: prior to indexing I replace all occurrences of '+' by 'PLUSsdfaEGsgfAE'.

Problems with special characters

2004-07-02 Thread Olivier Catteau
Hi! I've got big problems with special characters such as +, -, ", ', *, ... I use a StandardAnalyzer, and when I run a search with MultiFieldQueryParser, my application crashes because I get an exception. org.apache.lucene.queryParser.ParseException: Encounter
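
A ParseException of this kind typically arises when user input containing query-syntax characters is handed to the query parser unescaped. One way to avoid it, sketched below, is to backslash-escape those characters before parsing. The class QueryEscaper is a hand-rolled, hypothetical helper and not part of Lucene's API; the character list follows the documented query syntax (+ - & | ! ( ) { } [ ] ^ " ~ * ? : \). Note that escaping only stops the parser from failing; whether the characters are then searchable still depends on the analyzer.

```java
public class QueryEscaper {
    // Characters with special meaning in Lucene's query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Prefix every special character with a backslash.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("C++ (draft)"));
    }
}
```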

RE: demo indexing problems on linux

2004-06-18 Thread Morris Mizrahi
thoughts or ideas? Thanks for any help. Morris -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Friday, June 18, 2004 10:52 AM To: Lucene Users List Subject: Re: demo indexing problems on linux On Thursday 17 June 2004 21:10, Morris Mizrahi wrote

Re: demo indexing problems on linux

2004-06-18 Thread Daniel Naber
On Thursday 17 June 2004 21:10, Morris Mizrahi wrote: > When I run org.apache.lucene.demo.IndexHTML on Linux the indexer works > fine when I am creating a new index (e.g. using -create -index option). > But when I run the indexer again (-index without the -create option) for > updates it does not

demo indexing problems on linux

2004-06-17 Thread Morris Mizrahi
Hey everyone. I am using the Lucene 1.4rc3 demo classes. When I run org.apache.lucene.demo.IndexHTML on Linux the indexer works fine when I am creating a new index (e.g. using -create -index option). But when I run the indexer again (-index without the -create option) for updates it does no

Re: problems with lucene in multithreaded environment

2004-06-08 Thread Jayant Kumar
> consider moving certain > files of the index onto a ramfs-based drive. For > example, moving the > .tis, .frq and .prx can greatly improve performance. > Also, having these > files in RAM means that the cache does not need to > be warmed. > > Hope this helps! Thanks

Re: problems with lucene in multithreaded environment

2004-06-07 Thread Doug Cutting
Jayant Kumar wrote: Thanks for the patch. It helped in increasing the search speed to a good extent. Good. I'll commit it. Thanks for testing it. But when we tried to give about 100 queries in 10 seconds, then again we found that after about 15 seconds, the response time per query increased. This

Re: problems with lucene in multithreaded environment

2004-06-05 Thread Jayant Kumar
Thanks for the patch. It helped in increasing the search speed to a good extent. But when we tried to give about 100 queries in 10 seconds, then again we found that after about 15 seconds, the response time per query increased. Enclosed is the dump which we took after about 30 seconds of starting t

Re: problems with lucene in multithreaded environment

2004-06-04 Thread Doug Cutting
Doug Cutting wrote: Please tell me if you are able to simplify your queries and if that speeds things. I'll look into a ThreadLocal-based solution too. I've attached a patch that should help with the thread contention, although I've not tested it extensively. I still don't fully understand why

Re: problems with lucene in multithreaded environment

2004-06-04 Thread Doug Cutting
Jayant Kumar wrote: Please find enclosed jvmdump.txt which contains a dump of our search program after about 20 seconds of starting the program. Also enclosed is the file queries.txt which contains few sample search queries. Thanks for the data. This is exactly what I was looking for. "Thread-14"

Re: problems with lucene in multithreaded environment

2004-06-03 Thread Jayant Kumar
We conducted a test on our search for 500 requests given in 27 seconds. We noticed that in the first 5 seconds, the results were coming in 100 to 500 ms. But as the queue size kept increasing, the response time of the search increased drastically to approx 80-100 seconds. Please find enclosed jvm

Re: problems with lucene in multithreaded environment

2004-06-03 Thread Supun Edirisinghe
I noticed delays when concurrent threads query an IndexSearcher too. Our index is about 550MB with about 850,000 docs, each doc with 20-30 fields of which only 3 are indexed. Our queries are not very complex -- just 3 required term queries. This is what my test did: initialize an array of terms

Re: problems with lucene in multithreaded environment

2004-06-02 Thread Doug Cutting
Jayant Kumar wrote: We recently tested lucene with an index size of 2 GB which has about 1,500,000 documents, each document having about 25 fields. The frequency of search was about 20 queries per second. This resulted in an average response time of about 20 seconds approx per search. That sounds s

problems with lucene in multithreaded environment

2004-06-02 Thread Jayant Kumar
We recently tested lucene with an index size of 2 GB which has about 1,500,000 documents, each document having about 25 fields. The frequency of search was about 20 queries per second. This resulted in an average response time of about 20 seconds approx per search. What we observed was that lucene

Re: asktog on search problems

2004-05-21 Thread Erik Hatcher
This is not specific advice, but an idea that I think Google leverages to build up search corrections. If a user searches for "100AW" and it doesn't match, but a moment later they try something different and immediately get to a product page, the system can make a loose connection between thei

Re: asktog on search problems

2004-05-21 Thread Jeff Wong
I don't think the first solution will work because the "100AW~" term must match either 100 or AW which are your index terms. Coincidentally, I have been trying to deal with this very problem over the past few days. In my situation, I'm trying to help users find things when the spacing of their

asktog on search problems

2004-05-21 Thread David Spencer
Haven't seen this discussed here. See 7a at the link below: http://www.asktog.com/columns/062top10ReasonsToNotShop.html 7a talks about searching on a camera site for the "Lowepro 100 AW". He says this query works:"Lowepro 100 AW" and this query does not work: "Lowepro 100AW" Cross checking

Re: Problems From the Word Go

2004-04-30 Thread Erik Hatcher
On Apr 30, 2004, at 8:52 AM, Terry Steichen wrote: Erik, Maybe you could donate some of those demo modules (and the accompanying article/text) to Lucene, so they'd be incorporated officially in the website? Sure... and in fact that has been my intention all along. One idea that I had with the Lu

Re: Problems From the Word Go

2004-04-30 Thread Terry Steichen
" <[EMAIL PROTECTED]> Sent: Friday, April 30, 2004 8:48 AM Subject: Re: Problems From the Word Go > Unfortunately the demo that comes with Lucene is harder to run than it > really should be. My suggestion is to just get the Lucene JAR, and try > out examples from the many art

Re: Problems From the Word Go

2004-04-30 Thread Erik Hatcher
Unfortunately the demo that comes with Lucene is harder to run than it really should be. My suggestion is to just get the Lucene JAR, and try out examples from the many articles available. My intro Lucene article at java.net should be easy to get up and running in only a few minutes of having

Re: Problems From the Word Go

2004-04-30 Thread Marten Senkel
IL PROTECTED]To: <[EMAIL PROTECTED]> utions.net>cc:

Re: Problems From the Word Go

2004-04-29 Thread James Dunn
Alex, Could you send along whatever error messages you are receiving? Thanks, Jim --- Alex Wybraniec <[EMAIL PROTECTED]> wrote: > I'm sorry if this is not the correct place to post > this, but I'm very > confused, and getting towards the end of my tether. > > I need to install/compile and run L

Re: Problems From the Word Go

2004-04-29 Thread Grant Ingersoll
Alex, What kind of errors are you getting? Is the Lucene JAR in your classpath? Have you read http://jakarta.apache.org/lucene/docs/gettingstarted.html? -Grant >>> [EMAIL PROTECTED] 04/29/04 11:53AM >>> I'm sorry if this is not the correct place to post this, but I'm very confused, and getti

Problems From the Word Go

2004-04-29 Thread Alex Wybraniec
I'm sorry if this is not the correct place to post this, but I'm very confused, and getting towards the end of my tether. I need to install/compile and run Lucene on a Windows XP Pro based machine, running J2SE 1.4.2, with ANT. I downloaded both the source code and the pre-compile versions, and a

Re: Locking problems with NFS

2004-04-20 Thread Doug Cutting
Francesco Bellomi wrote: The only problem is that, as lucene 1.4rc2, FSDirectory is 'final'. Please submit a patch to lucene-dev to make FSDirectory non-final. In fact, a third architectural approach would be to define an API for "pluggable" lock implementations: IMHO that would be more robust to

Re: Locking problems with NFS

2004-04-20 Thread Francesco Bellomi
Doug, thanks for your suggestions! I will implement your solution and make it public. We already use a 1.4 JVM, so we have no problems with that. The only problem is that, as lucene 1.4rc2, FSDirectory is 'final'; I would go for the subclassing solution if you can make it not final;

Re: Locking problems with NFS

2004-04-20 Thread Doug Cutting
Francesco Bellomi wrote: we are experiencing some difficulties in using Lucene with a NFS filesystem. Basically, locking seems not to work properly, since it appears that attempted concurring writing on the index (from different VMs) are not blocked, and this often causes the index to be corrupted.

Locking problems with NFS

2004-04-20 Thread Francesco Bellomi
Dear Lucene users, we are experiencing some difficulties in using Lucene with a NFS filesystem. Basically, locking seems not to work properly, since it appears that attempted concurring writing on the index (from different VMs) are not blocked, and this often causes the index to be corrupted. I f

Re: Problems indexing with keywords

2004-03-25 Thread Brisbart Franck
Hi, when you index your field as a keyword, it's not indexed and thus the analyzer is not used for this field during the indexation. But, if you make a search using the query parser with the StandardAnalyzer, it will be used for the parsing. So, I suppose that in your query 'fieldname:Rev*', the

Problems indexing with keywords

2004-03-25 Thread John Colvin
Hi I am creating an index with StandardAnalyzer and searching with the same. I have added keyword fields to the index which contain something like this: Rev_.12.04.2004 /root/navroot/10 Modules/drawer 1/drawer 2/job AK When searching with the StandardAnalyzer I get nothing back when searching f

Re: maxDoc and RemoteSearchable problems

2004-02-28 Thread DMGoodstein
all fixed...it was caused by slightly different versions of the jvm being used to compile vs execute the classes. --DMG - Original Message - From: [EMAIL PROTECTED] Date: Friday, February 27, 2004 8:50 pm Subject: maxDoc and RemoteSearchable problems > I am instantiatin

maxDoc and RemoteSearchable problems

2004-02-27 Thread DMGoodstein
I am instantiating a RemoteSearchable with a file-based IndexReader: public static void main(String args[]) throws Exception { System.setSecurityManager(new RMISecurityManager()); Searchable theLocal = new IndexSearcher(args[0]); theLocal.maxDoc(); the maxDoc() method call causes: Except

problems with maxDoc() and RemoteSearchable

2004-02-27 Thread DMGoodstein
I am instantiating a RemoteSearchable with a file-based IndexReader: public static void main(String args[]) throws Exception { System.setSecurityManager(new RMISecurityManager()); Searchable theLocal = new IndexSearcher(args[0]); -

Problems deleting documents from the index (Lock obtain timed out)

2003-12-15 Thread Hohwiller, Joerg
Hi there, I just subscribed to this list and have a little Problem: I am using lucene for incremental indexing (yes, I read the FAQ! dont try to convince me to rebuild the index periodically from scratch :) ). Now the problem seems to be that lucene is not able to perform index modifications a

Re: lucene web demo problems

2003-11-24 Thread Holger Klawitter
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, > org.apache.jasper.JasperException: Unable to compile class for JSP An error > occurred at line: 36 in the jsp file: /web/results.jsp Generated servlet > error: [javac] Compiling 1 source file [javac] > /web/jakarta-tomcat-4.1.27/work/bomEng/loca

lucene web demo problems

2003-11-23 Thread Justyna Lubkowski
I've run into trouble while trying to get the lucene demo working, included below is an error I'm getting. I think I've followed the instructions set out at http://jakarta.apache.org/lucene/docs/demo4.html The package org.apache.lucene.analysis has been placed in tomcat/webapps/search/luceneweb/

Re: HTML Parsing problems...

2003-09-22 Thread Michael Giles
Yeah, I was using HTMLParser for a few days until I tried to parse a 400K document and it spun at 100% CPU for a very long time. It is tolerant of bad HTML, but does not appear to scale. TagSoup processed the same document in a second or less at <25% CPU. -Mike At 02:42 PM 9/22/2003 +0200, y

Re: HTML Parsing problems...

2003-09-22 Thread Andrzej Bialecki
Michael Giles wrote: Erik, Probably a good idea to swap something else in, although Neko introduces a dependency on Xerces. I didn't play with Neko because I am currently using a different XML parser and didn't want to deal with the conflicts (and also find dependencies on specific parsers ann

Re: HTML Parsing problems...

2003-09-20 Thread Michael Giles
Erik, Probably a good idea to swap something else in, although Neko introduces a dependency on Xerces. I didn't play with Neko because I am currently using a different XML parser and didn't want to deal with the conflicts (and also find dependencies on specific parsers annoying). However, yes

Re: HTML Parsing problems...

2003-09-19 Thread Erik Hatcher
I'm going to swap in the neko HTML parser for the demo refactorings I'm doing. I would be all for replacing the demo HTML parser with this. If you look at the Ant task in the sandbox, you'll see that I used JTidy for it and it works well, but I've heard that neko is faster and better so I'

Re: HTML Parsing problems...

2003-09-19 Thread Michael Giles
Tatu, Thanks for the reply. See below for comments. > just ignore everything inside of

Re: HTML Parsing problems...

2003-09-18 Thread Peter Becker
Tatu Saloranta wrote: On Thursday 18 September 2003 14:50, Michael Giles wrote: I know, I know, the HTML Parser in the demo is just that (i.e. a demo), but I also know that it is updated from time to time and performs much better than the other ones that I have tested. Frustratingly, the very

Re: HTML Parsing problems...

2003-09-18 Thread Tatu Saloranta
On Thursday 18 September 2003 14:50, Michael Giles wrote: > I know, I know, the HTML Parser in the demo is just that (i.e. a demo), but > I also know that it is updated from time to time and performs much better > than the other ones that I have tested. Frustratingly, the very first page > I tried

HTML Parsing problems...

2003-09-18 Thread Michael Giles
I know, I know, the HTML Parser in the demo is just that (i.e. a demo), but I also know that it is updated from time to time and performs much better than the other ones that I have tested. Frustratingly, the very first page I tried to parse failed (
