from:"Thomas"

Re: Searching within very large subset of documents

2025-08-05 Thread Thomas Barr

> On Aug 4, 2025, at 11:26 PM, Adrien Grand wrote: > > Hi Thomas, > > Your question suggests that you are creating a huge BooleanQuery to > identify these documents. A TermInSetQuery should perform better. > > Doing better would require to better understand what you are

Searching within very large subset of documents

2025-08-04 Thread Thomas Barr

I have a medium-sized (~10m) Lucene index and I frequently want to repeatedly search within a subset of around ~100k documents. I can increase MaxClauseCount and build up a huge TermQuery, keep that around, then build a BooleanQuery out of the result at runtime, but the resulting query is quite

java 17 and older lucene (4.x)

2022-09-26 Thread Thomas Matthijs

Hello, Just wondering if anyone has patched lucene 4.x for usage with java 17+ and willing to share their work? anything would be appreciated. No we cannot upgrade lucene, and will likely spend time to try to backport/patch it ourselves, but maybe someone already has? if anyone has interest in

How to ignore a ,

2016-11-28 Thread Thomas Johnson

; when we search for "Doe*" Thank you. Thomas W. Johnson, Senior Programmer 678-397-1663 tjohn...@paperhost.com<mailto:tjohn...@paperhost.com> [PaperHost] [asdf]<http://bit.ly/PaperHost_Twitter> Follow PaperHost on T

Re: org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-02 Thread Thomas Fuchs

Hi, I couldn't reproduce the problem in the following test case, so let's drop this. Regards - Thomas -- import org.apache.lucene.analysis.*; import org.apache.lucene.analysis.standard.*; import org.apache.lucene.document.*; import org.apache.lucene.index

org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-01 Thread Thomas Fuchs

.run(Thread.java:695) I don't think thats an expected behavior and it is a bug in org.apache.lucene.search.TopScoreDocCollector. Am I wrong? Regards - Thomas - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Search in a specific ScoreDiopoc result

2013-09-17 Thread Thomas Guttesen

Kkkutterujjjbbb hgggja On 17/09/2013 12.55, "David Miranda" wrote: > > Hi, > > I want to do a kind of 'facet search', that initial research in a field of > all documents in the Lucene index, and second search in other field of the > documents returned to the first research. > > Currently I'm do th

RE: Partial word match using n-grams

2013-07-30 Thread Becker, Thomas

other implications, of course, but you get the idea There are a zillion possibilities here in terms of combining various filterFactories Best Erick On Fri, Jul 19, 2013 at 9:06 AM, Becker, Thomas wrote: > Sorry, at indexing time it's not broken on anything. In other words > qu

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas

ad the entire string. If the string is broken on _ already, then NGramFilter already receives the individual terms and you can put a Filter in front that will pass through a padded token? Shai On Fri, Jul 19, 2013 at 3:45 PM, Becker, Thomas wrote: > In general the data for this field is tha

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas

.almost. Y. You're right. FuzzyQuery is not at all what you want. Don't know if your data is actually as simple as this example. Do you need to tokenize on whitespace? Would it make sense to replace spaces in the query with underscores and then trigramify the whole query as i

RE: Partial word match using n-grams

2013-07-18 Thread Becker, Thomas

dataset you might consider allowing leading wildcards so that you could easily find all words, for example, containing abc with *abc*. If your dataset is larger, you might consider something like ReversedWildcardFilterFactory (Solr) to speed this type of matching. I look forward to other opinion

Partial word match using n-grams

2013-07-18 Thread Becker, Thomas

One of our main use-cases for search is to find objects based on partial name matches. I've implemented this using n-grams and it works pretty well. However we're currently using trigrams and that causes an interesting problem when searching for things like "abc ab" since we first split on whi
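The short-token problem described above can be reproduced without Lucene. The following is only a minimal sketch, assuming the query is split on whitespace before each token is turned into trigrams; the class and method names (`TrigramSketch`, `ngrams`, `analyze`) are hypothetical and are not Lucene's NGram filter API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates why a two-character
// token yields no trigrams when the query is split on whitespace first.
class TrigramSketch {

    // All character n-grams of length n in a token; empty if the token is shorter than n.
    static List<String> ngrams(String token, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= token.length(); i++) {
            grams.add(token.substring(i, i + n));
        }
        return grams;
    }

    // Assumed pipeline: split on whitespace, then n-gram each token separately.
    static List<List<String>> analyze(String text, int n) {
        List<List<String>> perToken = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            perToken.add(ngrams(token, n));
        }
        return perToken;
    }

    public static void main(String[] args) {
        // The second token "ab" is shorter than 3 characters, so its gram list is empty.
        System.out.println(analyze("abc ab", 3));
    }
}
```

With this pipeline the token "ab" contributes nothing to the query, which is one plausible reading of the problem in the post; padding short tokens or using a smaller minimum gram size are the kinds of workarounds the replies in this thread discuss.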

RE: query on exact match in lucene

2013-07-17 Thread Becker, Thomas

Sounds like you need a PhraseQuery. -Original Message- From: madan mp [mailto:madan20...@gmail.com] Sent: Wednesday, July 17, 2013 7:40 AM To: java-user@lucene.apache.org Subject: query on exact match in lucene how to get exact string match ex- i am searching for file which consist of s

Re: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene42' does not exist

2013-07-13 Thread Thomas Matthijs

On Sat, Jul 13, 2013 at 10:25 AM, VIGNESH S wrote: > Hi, > > I tried indexing in Desktop..It works fine. > The above error loading error comes only in android.. > Any comments.. Don't strip META-INF/services files out of the jars

Re: Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas

is a contradiction in terms. > > If you are finding that the use of a filter is affecting the scores of > documents, then that is clearly a bug. > > -- Jack Krupansky > > -Original Message- From: Nigel V Thomas > Sent: Monday, July 01, 2013 7:38 AM > To: java-use

Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas

Hi, I would like to know if it is possible to calculate the relevance ranks of documents based on filtered document count? The current filter implementations as far as I know, seems to be applied after the query is processed and ranked against the full set of documents. Since system wide IDF value

Re: Indexing file with security problem

2013-06-26 Thread Nigel V Thomas

of any suitable solutions yet. Nigel V Thomas On 26 June 2013 20:42, lukasw wrote: > Hello > > I'll try to briefly describe my problem and task. > My name is Lukas and i am Java developer , my task is to create search > engine for different types of file (only text file types)

What to do with Lucene Version parameter on upgrade

2013-06-20 Thread Becker, Thomas

I'm relatively new to Lucene and am in the process of upgrading from 4.0 to 4.3.1. I'm trying to figure out if I need to leave my version at LUCENE_40 or if it is safe to change it to LUCENE_43. Does this parameter directly determine the index format? I have some existing indexes from 4.0 but

Re: Taking backup of a Lucene index

2013-06-06 Thread Thomas Matthijs

On Thu, Jun 6, 2013 at 7:38 AM, Lance Norskog wrote: > The simple answer (that somehow nobody gave) is that you can make a copy > of an index directory at any time. Indexes are changed in "generations". > The segment* files describe the current generation of files. All active > indexing goes on i

Request for addition of ThomasMurphy to ContributorsGroup

2013-06-04 Thread Thomas R. Murphy

Hello. I, ThomasMurphy on the wiki, would like to be a member of ContributorsGroup.

Re: RAMDirectory and expungeDeletes()/optimize()

2013-05-21 Thread Thomas Matthijs

On Tue, May 21, 2013 at 3:12 PM, Konstantyn Smirnov wrote: > I want to refresh the topic a bit. > > Using the Lucene 4.3.0, I could'n find a method like expungeDeletes() in > the > IW anymore. http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes()

Re: Taking backup of a Lucene index

2013-04-17 Thread Thomas Matthijs

On Wed, Apr 17, 2013 at 12:57 PM, Ashish Sarna wrote: > I want to take back-up of a Lucene index. I need to ensure that index files > would not change when I take their backup. > > > I am concerned about the housekeeping/merge/optimization activities which > Lucene performs internally. I am not

RE: Detecting when an index was not closed properly

2013-04-09 Thread Becker, Thomas

ginal Message- From: Becker, Thomas [mailto:thomas.bec...@netapp.com] Sent: Friday, April 05, 2013 1:33 PM To: java-user@lucene.apache.org Subject: Detecting when an index was not closed properly We are doing some crash resiliency testing of our application. One of the things we found is tha

Detecting when an index was not closed properly

2013-04-05 Thread Becker, Thomas

We are doing some crash resiliency testing of our application. One of the things we found is that the Lucene index seems to get out of sync with the database pretty easily. I suspect this is because we are using near real time readers and never actually calling IndexWriter.commit(). I'm tryin

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs

On Mon, Feb 25, 2013 at 12:19 PM, Thomas Matthijs wrote: > On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > >> >> On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: >> >>> On 20/02/2013 11:28, Paul Taylor wrote: >>> >>>> Just upd

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs

On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > > On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > >> On 20/02/2013 11:28, Paul Taylor wrote: >> >>> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >>> that use Norma

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs

On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > On 20/02/2013 11:28, Paul Taylor wrote: > >> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >> that use NormalizeCharMap for replacing characters in the anyalzers are not >> working. >> >> bump, anybody I thought a s

RE: updateDocument question

2013-02-07 Thread Becker, Thomas

estion Hi Thomas, On Wed, Feb 6, 2013 at 2:50 PM, Becker, Thomas wrote: > I've built a search prototype feature for my application using Lucene, and it > works great. The application monitors a remote system and currently indexes > just a few core attributes of the object

updateDocument question

2013-02-06 Thread Becker, Thomas

I've built a search prototype feature for my application using Lucene, and it works great. The application monitors a remote system and currently indexes just a few core attributes of the objects on that system. I get notifications when objects change, and I then update the Lucene index to kee

Lucene-MoreLikethis

2013-01-15 Thread Thomas Keller

Hey, I have a question about "MoreLikeThis" in Lucene, Java. I built up an index and want to find similar documents. But I always get no results for my query, mlt.like(1) is always empty. Can anyone find my mistake? Here is an example. (I use Lucene 4.0) public class HelloLucene { public

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs

On Mon, Oct 8, 2012 at 2:29 PM, Thomas Matthijs wrote: > On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: >> Hi, >> >> This is a known problem currently. I think there is already an issue open, >> so this was not solved for 4.0 (I don't have the issu

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs

On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: > Hi, > > This is a known problem currently. I think there is already an issue open, so > this was not solved for 4.0 (I don't have the issue no available at the > moment). > > My plan to fix this is to make Filters behave like queries (with

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs

On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: > Hi, > > This is a known problem currently. I think there is already an issue open, so > this was not solved for 4.0 (I don't have the issue no available at the > moment). > > My plan to fix this is to make Filters behave like queries (with

lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs

Hello, I have some custom queries & scorer that need to able to construct the "global" docIds (doc + docBase). But when i use these in a QueryWrapperFilter they no longer work, because QueryWrapperFilter.getDocIdSet uses a "private context" (context.reader().getContext();) which always has a docB

Re: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Thomas Matthijs

And to include the code On Thu, Oct 4, 2012 at 3:52 PM, Markus Jelsma wrote: > I forgot to add that this is with today's build of trunk. > > -Original message- >> From:Markus Jelsma >> Sent: Thu 04-Oct-2012 15:42 >> To: java-user@lucene.apache.org >> Subject: Highlighter IOOBE with modif

problem understanding the documentation for the TieredMergePolicy class

2012-06-12 Thread thomas

ePolicy.html#findMerges%28org.apache.lucene.index.SegmentInfos%29> Would somebody be so kind to explain it to me? Thanks, thanks a lot Thomas

Scoring similarity by the position of the terms

2012-03-22 Thread Thomas Rewig

Similarity. Lucene has been developed and grown and I was wondering if you can now do the same thing in a simpler and more straigth forward way. Maybe with some of the newer SpanQuerys or a other use of payloads. Does anyone have any idea where to start? Regards Thomas

Re: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas

>> -Original Message- >> From: stephen.warner.tho...@gmail.com >> [mailto:stephen.warner.tho...@gmail.com] On Behalf Of Stephen Thomas >> Sent: Tuesday, November 29, 2011 5:20 PM >> To: java-user@lucene.apache.org >> Subject: Custom Filter for Splitting CamelCase?

Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas

List, I have written my own CustomAnalyzer, as follows: public TokenStream tokenStream(String fieldName, Reader reader) { // TODO: add calls to RemovePuncation, and SplitIdentifiers here // First, convert to lower case TokenStream

Re: Scoring a document using LDA topics

2011-11-29 Thread Stephen Thomas

ote about this sometime back...maybe this would help you. > http://sujitpal.blogspot.com/2011/01/payloads-with-solr.html > > -sujit > > On Mon, 2011-11-28 at 12:29 -0500, Stephen Thomas wrote: >> List, >> >> I am trying to incorporate the Latent Dirichlet Allocation

Scoring a document using LDA topics

2011-11-28 Thread Stephen Thomas

List, I am trying to incorporate the Latent Dirichlet Allocation (LDA) topic model into Lucene. Briefly, the LDA model extracts topics (distribution over words) from a set of documents, and then represents each document with topic vectors. For example, documents could be represented as: d1 = (0,

Do duplicate documents affect term scoring?

2011-11-27 Thread Stephen Thomas

List, I am indexing a subset of Wikipedia. I have 4 years worth of data, and have taken snapshots of each document at each month in the 4 year span. Thus, I have 4*12=36 versions of each document. (I keep track of the timestamp in a field.) I have noticed that in many cases, a Wikipedia document d

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-19 Thread Thomas Rewig

re=12,2324 Doc.Id=8060id=709579name=aim溝脇しほみ 1Score=12,2324 Doc.Id=227606id=716893name=aim To avoid these problems right from the start, I need to use a different analyser for indexing? (So that the docs 'aim溝脇しほみ' and 'aim' have different scores) Thank

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-18 Thread Thomas Rewig

Doc.Id=227606 id=716893 name=aim Is there a way to guarantee the ordering among documents with the same score? Or how can I prevent documents with special characters from getting the same score as exact matches? Thanks in advance! Thomas On 18.07.2011 10:08, Ian Lea wrote: I'm not su

TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-15 Thread Thomas Rewig

I would expect if I do a 'exact matching' Term Query. Each index was indexed with its associated LuceneVersion. I tested it with luke and with my own Code - the result was always the same. Is it a new feature in Lucene 3.3.0 or a b

name matching / mapping

2011-07-06 Thread Thomas Rewig

s all names of the second id-space and the first id-space is used for the querrys. String[] suggestions = spellchecker.suggestSimilar("john w.", 5); But is there a better approach? Can someone point me in the right direction for a effective approach? Thanks in

Check Numeric Fields

2011-03-11 Thread Thomas Rewig

t the NumericRangeQuery query does not work? I use lucene v. 3.0.2. Thanks in advance! Thomas

Re: Deleted File Handles - Index Writer

2010-11-19 Thread Thomas Rewig

.0.2 Release version or do I have to wait for a future release? Thanks for your help. Thomas

Re: Deleted File Handles - Index Writer

2010-11-18 Thread Thomas Rewig

help. Thomas I've found a case, only with compound file, where IndexWriter holds open a SegmentReader on the pre-compound-file files... I'm working on a test case& fix. Mike On Fri, Nov 12, 2010 at 5:49 AM, Thomas Rewig wrote: Hello, I use the searcherManager for LiveIndexin

Deleted File Handles - Index Writer

2010-11-12 Thread Thomas Rewig

used" by the indexwriter) grows. Is that possible and if yes why does the indexwriter do it? Is there a max Value of deleted handles an IndexWriter could own, because I don't want to chrash the system because of too much open filehandles? Thanks in advance. Thomas --

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-11-10 Thread Thomas Rewig

amount of the deleted file handles will be stable - but first at a amount of 500 or so. Thanks in advance Thomas I integrated your SearchManager class into our code, but I am still seeing file handles marked deleted in the index directory. I am running the following command on Linux: sudo watch

Restore documents marked as deleted

2010-10-06 Thread Philippe Thomas

Hi, I was indexing some documents, but my program crashed after several days of work. If I reopen this index it is empty. I guess the reason is that auto-commit was not set and I never performed a commit. (Lesson learned) So probably all documents are marked as "deleted" and re-opening the i

Re: Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas

m frequency, idf and field norm > > 0.07028562 = (MATCH) fieldWeight(payload:ces in 550), product of: > > 1.0 = *tf(*termFreq(payload:ces)=1) > > 2.2491398 = *idf(*docFreq=157, maxDocs=551) > > 0.03125 = *fieldNorm*(field=payload, doc=550) > >
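The numbers in the quoted explain() output can be checked by hand. The sketch below assumes the index used Lucene's classic TF-IDF similarity of that era, where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)); the class name `ExplainSketch` is hypothetical, and the real implementation works in single-precision floats, so the last digits may differ slightly.

```java
// Hypothetical re-computation of the fieldWeight line from the quoted
// explain() output; assumes the classic TF-IDF formulas, not a custom Similarity.
class ExplainSketch {

    // tf: square root of the raw term frequency within the field.
    static double tf(double termFreq) {
        return Math.sqrt(termFreq);
    }

    // idf: 1 + ln(maxDocs / (docFreq + 1)).
    static double idf(int docFreq, int maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // fieldWeight is the product of tf, idf and the stored field norm.
    static double fieldWeight(double termFreq, int docFreq, int maxDocs, double fieldNorm) {
        return tf(termFreq) * idf(docFreq, maxDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Inputs taken from the quoted output: termFreq=1, docFreq=157, maxDocs=551, fieldNorm=0.03125.
        System.out.println("idf         = " + idf(157, 551));
        System.out.println("fieldWeight = " + fieldWeight(1, 157, 551, 0.03125));
    }
}
```

Under these assumptions the computed idf and fieldWeight agree with the 2.2491398 and 0.07028562 shown in the explanation to within rounding, which suggests the three factors are simply multiplied together.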

Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas

Hello Guys, I trying to understand how lucene score is calculated. So 'm using the searcher.explain() function. But the output it gives is really confusing for me. Below are the details of the query that I gave and o/p it gave me Query: *It is definitely a CES deal that will be over in Sep or Oct

Fielded Queries Question

2010-07-06 Thread Thomas Nguyen

Hello All, Can someone explain to me how fielded queries work with phrases? My first thought is that the phrase is broken down into terms and those terms are then fielded and separated with the AND operator. An example would be the following: name:"Tom Jones" --> name:"Tom" AND name:"Jones" I

Introduction to flexible indexing?

2010-06-14 Thread Thomas Koch

understand this page and help to get it in shape. Best regards, Thomas Koch, http://www.koch.ro

IndexSearcher - open file handles by deleted files

2010-05-26 Thread Thomas Rewig

s not automatically if i close searcher.close()? Do I have to close something else, than all IndexSearchers and Directorys? Or am I wrong with my assumption, and the problem is somewhere else? Best Thomas

[ANN] Eclipse GIT plugin beta version released

2010-03-31 Thread Thomas Koch

http://www.infoq.com/news/2010/03/egit-released http://aniszczyk.org/2010/03/22/the-start-of-an-adventure-egitjgit-0-7-1/ Maybe, one day, some apache / hadoop projects will use GIT... :-) (Yes, I know git.apache.org.) Best regards, Thomas Koch, http://www.ko

google's index layout, lucene on hbase(?)

2010-03-11 Thread Thomas Koch

; ( or http://tinyurl.com/yjr45ut ) The mail is about a lucene index{reader|writer} on top of cassandra and whether sth. like this could also be done with hbase. Best regards, Thomas Koch, http://www.koch.ro

Re: If you could have one feature in Lucene...

2010-02-25 Thread Thomas Guttesen

> > -- Best regards, Thomas Guttesen

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas

ginal questions...: commit/read does not require any external synchronization or locking. You should generally keep your IW open indefinitely and just periodically commit and/or get a new reader (IndexWriter.getReader()) as needed. Mike On Sat, Jan 9, 2010 at 10:06 AM, legrand thomas wrote: > &g

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas

McCandless wrote: From: Michael McCandless Subject: Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException To: java-user@lucene.apache.org Date: Saturday, 9 January 2010, 14:51 Can you post the full FNFE stack trace? Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: >

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas

xWriter is committing) is perfectly fine. The reader searches the point-in-time snapshot of the index as of when it was opened. But: what filesystem are you using? NFS presents challenges, for example. Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: > Hi, > > I often get a Fi

Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-08 Thread legrand thomas

Hi, I often get a FileNotFoundException when my single IndexWriter commits while the IndexReader also tries to read. My application is multithreaded (Tomcat uses the business APIs); I firstly thought the read/write access was thread-safe but I probably forget something. Please help me to unde

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Thomas Becker

be careful. Load on the DB Server will surely increase. Hope that helps. Cheers, Thomas Paul Taylor wrote: > I'm building a lucene index from a database, creating 1 about 1 million > documents, unsuprisingly this takes quite a long time. > I do this by sending a query to the db o

Re: Using TermVectorMapper to compute term frequency across documents

2009-10-15 Thread Thomas D'Silva

while to compute the document,tag probabilities. Thanks, Thomas On Wed, Oct 14, 2009 at 8:15 AM, Grant Ingersoll wrote: > > On Oct 12, 2009, at 10:46 PM, Thomas D'Silva wrote: > >> Hi, >> >> I am trying to compute the counts of terms of the documents return

Using TermVectorMapper to compute term frequency across documents

2009-10-12 Thread Thomas D'Silva

getTermFreqVector(). I do not require the term frequency within a document. Thanks, Thomas HashMap termDocCount = new HashMap(); TermQuery tagQuery = new TermQuery(tagTerm); TopDocs docs = searcher.search(tagQuery, numDocs); for (int i=0 ; i public void map(String term, int frequency

Re: Problems with ItemBasedRecommender with Lucene

2009-09-17 Thread Thomas Rewig

You use Lucene 2.9 is there a way to do this with Lucene 2.4.1 because I can't find e.g. the "PayloadEncoder" or do I have to wait for the release? Regards Thomas You might want to ask on mahout-user, but I'm guessing Ted didn't mean a new field for every item-item,

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

t was only > one search, you must have two segments and therefore no optimized index for > this to be correct? > > Uwe

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

IndexSearcher.search was called only > once. > > Uwe > -- Thomas Becker Senior JEE Deve

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

guess, based on > the 2.9 new api profiling, is that your queries may not be agreeing with > some of the changes somehow. Along with the profiling, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas B

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

ry types you are using as well? (eg qualities) >> >> And grab invocations if its possible. >> >> -- >> - Mark >> >> http://www.lucidimagination.com >> >> >> >> Thomas Becker wrote: >>> Tests run on tmpfs: >>> config: impl=Sepa

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

g, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker

> https://issues.apache.org/jira/browse/LUCENE-753 > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949 Mobile: +49 173 5146567 (private) E-Mail: mailto:thomas.bec...@net-m.de Internet: http:/

Problems with ItemBasedRecommender with Lucene

2009-09-16 Thread Thomas Rewig

e fields... I'm using lucene 2.4.1 and java version "1.6.0_16". Do anyone have an idea to avoid the growing memory. Or do somebody know an other approche for a "realtime Item based Recommender" with Lucene? Regards Thomas --

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

Hi Uwe, already done. See my last message. Cheers, Thomas Uwe Schindler wrote: > On 2.9. NIOFS is only used, if you use FSDirectory.open() instead of > FSDirectory.getDirectory (Deprecated). Can you compare when you use instead > of FSDirectory.open() the direct ctor of SimpleFSDir vs.

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

Mark Miller wrote: > Thomas Becker wrote: >> Hey Mark, >> >> yes. I'm running the app on unix. You see the difference between 2.9 and 2.4 >> here: >> >> http://ankeschwarzer.de/tmp/graph.jpg >> > Right - I know your measurements showed

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

org/jira/browse/LUCENE-753 > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949 Mobile: +49 173 5146567 (private) E-Mail: mailto:thomas.bec...@net-m.de Internet: http://www.net-m.de Registergericht: Amts

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

ry as well?! Will check that. Thanks a lot for your support! Cheers, Thomas Mark Miller wrote: > A few quick notes - > > Lucene 2.9 old api doesn't appear much worse than Lucene 2.4? > > You save a lot with the new Intern impl, because thats not a hotspot > anymore. But t

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

with lucene 2.4. I will now try a freshly build 2.9 index and see if performance improves. Maybe that already solves the issue...stupid me... We're updating the index every 30 min. at the moment and it gets optimized after each update. Mark Miller wrote: > Thomas Becker wrote: >> Hey Mar

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

nerCache is a Map containing field + parser * (contracttocontentgroup prefix) as the key and as a value yet another map. * The latter map finally contains the docIds as key and positionvalue for this * prefix as value. * * @author Thomas Becker (thomas.bec...@net-m.de) * */ pub

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

Urm and uploaded here: http://ankeschwarzer.de/tmp/graph.jpg Sorry. Thomas Becker wrote: > Missed the attachment, sorry. > > Thomas Becker wrote: >> Hi all, >> >> I'm experiencing a performance degradation after migrating to 2.9 and running >> some tests.

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

Missed the attachment, sorry. Thomas Becker wrote: > Hi all, > > I'm experiencing a performance degradation after migrating to 2.9 and running > some tests. I'm getting out of ideas and any help to identify the reasons why > 2.9 is slower than 2.4 are highly appreci

lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker

y took {} ms", durationMillis); } return docs; } I'm wondering why others are experiencing better performance with 2.9 and why our implementations performance is going bad. Maybe our way of using the 2.9 api is not the best and sorting is definetly

2.9 - leftover (deleted) filehandles after upgrade

2009-07-29 Thread Thomas Becker

mpDir); with IndexSearcher indexSearcherTmp = new IndexSearcher(tmpDir, true); No errors in the logfiles, no catched exceptions, etc. I'm a kinda out of ideas at the moment. I googled and tried couple of things (IndexWriter.setUseCompoundFile(true), etc.) but didn't find a solution. A

Re: Loading an index into memory

2009-07-24 Thread Thomas Becker

/www.windowslive.com/Online/Hotmail/Campaign/QuickAdd?ocid=TXT_TAGLM >>>> _WL_QA_HM_sports_photos_072009&cat=sports >>>> >>> - >>> To unsubscribe, e-mail: java-user-u

Re: Index and search terms containing character "-"

2009-06-02 Thread legrand thomas

d strongly recommend you get a copy of Luke, it's invaluable for questions like this because it lets you look at what's actually in your index. It'll also show you how queries get broken down when pushed through various analyzers... BTW, nice test case for demonstrating what you w

Index and search terms containing character "-"

2009-05-31 Thread legrand thomas

Hi, I have a problem using TermQuery and FuzzyQuery for terms containing the character "-". Considering I've indexed "jack" and "jack-bauer" as 2 tokenized captions, I get no result when searching for "jack-bauer". Moreover, "jack" with a TermQuery returns the two captions. What should I do t

Creating document fields by providing termvector directly (bypassing the analyzing/tokenizing stage)

2009-04-21 Thread Thomas Pönitz

] b[2] c[1]. The old discussion had no real solution but it is also a bit outdated, maybe someone has a better idea now. Greets, Thomas

Dynamic Indexing?

2009-03-11 Thread Thomas J. Buhr

Lucene, From what I have read on your website indexing does seem like a useful thing. I'm considering the possible use of Lucene in a company project and have a few research questions. What I'm considering is using Lucene as a backend data store for a graphic editor. The typical usage exa

Filtering accents

2008-12-30 Thread legrand thomas

Dear all, I'd like my lucene searches to be insensitive to (French) accents. For example, considering a indexed term "métal", I want to get it when searching for "metal" or "métal" . I use lucene-2.3.2 and the searches are performed with: IndexSearcher.search(query,filter,sorter), Another filte

Lucene and JSON

2008-12-19 Thread Thomas J. Buhr

Lucene, Is there JSON support in Lucene? JSON is more fat-free compared to XML and would be preferred. Digester works well for indexing XML but something along the same lines for JSON would be even sweeter. Best, Thom

Re: What are the best document edit options?

2008-12-17 Thread Thomas J. Buhr

same lines for JSON would be even sweeter. Cheers, Thom On 17-Dec-08, at 2:39 PM, Steven A Rowe wrote: Hi Thomas, On 12/17/2008 at 11:52 AM, Thomas J. Buhr wrote: Where can I see how IndexWriter.updateDocument works without getting into Lucene all over again until this important issue is

Re: What are the best document edit options?

2008-12-17 Thread Thomas J. Buhr

version of Lucene are you using? The more recent ones have IndexWriter.updateDocument.. Best Erick On Wed, Dec 17, 2008 at 2:20 AM, Thomas J. Buhr wrote: Hello Lucene, Looking at the document object it seems like each time I want to edit its contents I need to do the following: 1 - fetch

What are the best document edit options?

2008-12-16 Thread Thomas J. Buhr

Hello Lucene, Looking at the document object it seems like each time I want to edit its contents I need to do the following: 1 - fetch the document 2 - dump its contents into a temp container 3 - update field values in the temp container 4 - create a new document 5 - transfer my updated field

is there an histogram feature in lucene ak Magelan

2008-10-13 Thread Thomas Birnbaum

350 damage unrepaired 30 metallic 60 something like this... is there a way to do the same with lucene? thx thomas.

Re: Range Query Question

2008-07-25 Thread Thomas Becker

Btw. I tried the wildcard since I found something on google, which noted wildcards together with StartsWith queries. Thomas Becker wrote: Hi Ian, no the wild cards should not be necessary. That was just the last try out of some. I now the exact content of both fields in my range query. The

Re: Range Query Question

2008-07-25 Thread Thomas Becker

t Circle"] gives zero results. Tried it also with braces around the term and such stupid things, even if they shouldn't be needed in a range query. I'm kinda clueless. Cheers, Thomas Ian Lea wrote: Hi Are you sure your range queries should have wild card asterisks on the end? Loo

Range Query Question

2008-07-25 Thread Thomas Becker

Name fields in a range between "A Balladeer*" TO "A Perfect Circle*" and get only terms back which are starting with that terms? Is there a way to accomplish that in Java and try it in luke? And is there a way to sort resultsets in luke? Cheers, Thomas -- Thomas Be

advanced WildcardQuery

2008-07-16 Thread legrand thomas

ardQuery with the term "pretty*car". I also want to get this document when searching for "pretty*sale*". How should I do ? Is it really possible ? I use lucene 2.3.1. Thanks in advance, Thomas Legrand

1 - 100 of 231 matches