Design Question related with Lucene

2010-05-20 Thread ilkay polat
Hello; I have a design question while developing my project. If you have time, please read my problem, and if you have a solution please let me know. Project: Our system produces a txt file every hour (e.g. at 13:00, 14:00). (These files contain logs from the network, e.g. TCP logs.) I

Removing old data from index file

2010-05-20 Thread ilkay polat
Hello; I need to learn whether there is a way to remove some records from indexed files. And is removing indexed records fast (for example, cleaning out old records whose creation dates are earlier than a given day)? Thanks

Re: Removing old data from index file

2010-05-20 Thread Ian Lea
> I need to learn whether there is a way to remove some records on indexed files.
Of course. See IndexReader and IndexWriter deleteXXX methods.
> And is it rapid for removing some indexed file records
Of course.
> (For example clean old records whose created date's are less than a definite

Re: Design Question related with Lucene

2010-05-20 Thread ilkay polat
Is it better to analyze logs with Lucene, or are other solutions better for performance?
On Thu, May 20, 2010 at 9:51 AM, ilkay polat polattechnol...@gmail.com wrote:
> Hello; I have a desing question while developing my project. If you have time, lease read my problem and if you have a

Re: Problem of getTermFrequencies()

2010-05-20 Thread manjula wijewickrema
Thanx
On Mon, May 17, 2010 at 10:19 PM, Grant Ingersoll gsing...@apache.org wrote:
> Note, depending on your downstream use, you may consider using a TermVectorMapper that allows you to construct your own data structures as needed.
> -Grant
> On May 17, 2010, at 3:16 PM, Ian Lea wrote:
>> terms

Arrange terms[i]

2010-05-20 Thread manjula wijewickrema
Hi, I wrote a program to get the frequencies and terms of an indexed document. The output comes as follows. If I print: +tfv[0] Output: array terms are:{title: capabl/1, code/2, frequenc/1, lucen/4, over/1, sampl/1, term/4, test/1} In the same way I can print terms[i] and freqs[i], but the
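The message above is cut off, so what exactly should be "arranged" is not known. Assuming the goal is to list the terms ordered by frequency, a minimal sketch in plain Java (no Lucene calls) could look like the following. The class `TermFrequencySort`, the helper `TermFreq`, and the method `sortByFrequency` are hypothetical names; the `terms` and `freqs` arrays are copied from the output quoted in the message and stand in for whatever the term frequency vector returns.

```java
import java.util.ArrayList;
import java.util.List;

public class TermFrequencySort {

    /** Pairs a term with its frequency so the two stay aligned while sorting. */
    static final class TermFreq {
        final String term;
        final int freq;

        TermFreq(String term, int freq) {
            this.term = term;
            this.freq = freq;
        }

        @Override
        public String toString() {
            return term + "/" + freq;
        }
    }

    /**
     * Returns the terms ordered by descending frequency; ties are broken
     * alphabetically. The two input arrays are parallel: freqs[i] belongs to terms[i].
     */
    static List<TermFreq> sortByFrequency(String[] terms, int[] freqs) {
        if (terms.length != freqs.length) {
            throw new IllegalArgumentException("terms and freqs must have the same length");
        }
        List<TermFreq> pairs = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            pairs.add(new TermFreq(terms[i], freqs[i]));
        }
        pairs.sort((a, b) -> a.freq != b.freq
                ? Integer.compare(b.freq, a.freq)   // higher frequency first
                : a.term.compareTo(b.term));        // then alphabetical
        return pairs;
    }

    public static void main(String[] args) {
        // Values taken from the output quoted in the message.
        String[] terms = {"capabl", "code", "frequenc", "lucen", "over", "sampl", "term", "test"};
        int[] freqs = {1, 2, 1, 4, 1, 1, 4, 1};
        // prints [lucen/4, term/4, code/2, capabl/1, frequenc/1, over/1, sampl/1, test/1]
        System.out.println(sortByFrequency(terms, freqs));
    }
}
```

Pairing each term with its count before sorting keeps the two parallel arrays from drifting out of step, which is the usual pitfall when only one of them is sorted.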

Re: How to achieve this kind of document ordering

2010-05-20 Thread Dragan Jotanovic
Thanks Frank, the idea of preparing a set of structured lists is what I initially thought I would have to do, but I'm afraid there will be a serious performance penalty, because I would have to traverse the documents until I find all distinct values of SortFieldB. But I guess there is no other

Stemming and Wildcard Queries

2010-05-20 Thread Ivan Provalov
Is there a good way to combine wildcard queries and stemming? As is, a field that is stemmed at index time won't work with some wildcard queries. We were thinking of creating two separate index fields, one stemmed and one non-stemmed, but we are having issues with our SpanNear queries

Re: Stemming and Wildcard Queries

2010-05-20 Thread Ahmet Arslan
> Is there a good way to combine the wildcard queries and stemming? As is, the field which is stemmed at index time, won't work with some wildcard queries.
org.apache.lucene.queryParser.analyzing.AnalyzingQueryParser may help?
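To illustrate the problem raised in this thread, here is a small sketch in plain Java of why a wildcard pattern can miss terms in a stemmed field yet match the same words in an unstemmed one. It does not use Lucene: `toyStem` is a deliberately crude suffix stripper (not the Porter stemmer or any Lucene analyzer), and `StemWildcardDemo`, `stemAll`, `wildcardToRegex` and `match` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class StemWildcardDemo {

    /** Toy stemmer (an assumption, NOT Porter): strips a few common English suffixes. */
    static String toyStem(String word) {
        for (String suffix : new String[] {"ing", "es", "s"}) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 2) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    /** Applies the toy stemmer to every term, as an index-time analyzer would. */
    static List<String> stemAll(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            out.add(toyStem(w));
        }
        return out;
    }

    /** Converts a simple wildcard pattern ('*' and '?') into an equivalent regex. */
    static Pattern wildcardToRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns the indexed terms that the wildcard pattern matches in full. */
    static List<String> match(String wildcard, List<String> indexedTerms) {
        Pattern p = wildcardToRegex(wildcard);
        List<String> hits = new ArrayList<>();
        for (String t : indexedTerms) {
            if (p.matcher(t).matches()) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("indexing", "queries", "logs");
        List<String> stemmed = stemAll(raw); // index, queri, log

        // The user types a prefix of the surface form, which the stemmed field no longer contains.
        System.out.println("stemmed field: " + match("indexi*", stemmed)); // prints stemmed field: []
        System.out.println("raw field:     " + match("indexi*", raw));     // prints raw field:     [indexing]
    }
}
```

The wildcard term is compared against whatever was written to the index, so once "indexing" has been reduced to "index" a pattern such as `indexi*` has nothing to match. Keeping a second, unstemmed field for wildcard queries, as proposed in the original question, sidesteps this at the cost of a larger index.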

Re: Stemming and Wildcard Queries

2010-05-20 Thread Herbert Roitblat
At a general level, we have found that stemming during indexing is not advisable. Sometimes users want the exact form and if you have removed the exact form during indexing, obviously, you cannot provide that. Rather, we have found that stemming during search is more useful, or maybe it