java 17 and older lucene (4.x)

2022-09-26 Thread Thomas Matthijs
Hello, Just wondering if anyone has patched lucene 4.x for usage with java 17+ and willing to share their work? anything would be appreciated. No we cannot upgrade lucene, and will likely spend time to try to backport/patch it ourselves, but maybe someone already has? if anyone has interest in

How to ignore a ,

2016-11-28 Thread Thomas Johnson
; when we search for "Doe*" Thank you. Thomas W. Johnson, Senior Programmer 678-397-1663 tjohn...@paperhost.com

Re: org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-02 Thread Thomas Fuchs
Hi, I couldn't reproduce the problem in the following test case, so let's drop this. Regards - Thomas -- import org.apache.lucene.analysis.*; import org.apache.lucene.analysis.standard.*; import org.apache.lucene.document.*; import org.apache.lucene.index

org.apache.lucene.search.TopScoreDocCollector throws NullPointerException

2013-11-01 Thread Thomas Fuchs
.run(Thread.java:695) I don't think that's an expected behavior and it is a bug in org.apache.lucene.search.TopScoreDocCollector. Am I wrong? Regards - Thomas - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org

Re: Search in a specific ScoreDiopoc result

2013-09-17 Thread Thomas Guttesen
Kkkutterujjjbbb hgggja On 17/09/2013 at 12.55, "David Miranda" wrote: > > Hi, > > I want to do a kind of 'facet search', that initial research in a field of > all documents in the Lucene index, and second search in other field of the > documents returned to the first research. > > Currently I'm do th

RE: Partial word match using n-grams

2013-07-30 Thread Becker, Thomas
other implications, of course, but you get the idea There are a zillion possibilities here in terms of combining various filterFactories Best Erick On Fri, Jul 19, 2013 at 9:06 AM, Becker, Thomas wrote: > Sorry, at indexing time it's not broken on anything. In other words > qu

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas
ad the entire string. If the string is broken on _ already, then NGramFilter already receives the individual terms and you can put a Filter in front that will pass through a padded token? Shai On Fri, Jul 19, 2013 at 3:45 PM, Becker, Thomas wrote: > In general the data for this field is tha

RE: Partial word match using n-grams

2013-07-19 Thread Becker, Thomas
.almost. Y. You're right. FuzzyQuery is not at all what you want. Don't know if your data is actually as simple as this example. Do you need to tokenize on whitespace? Would it make sense to replace spaces in the query with underscores and then trigramify the whole query as i

RE: Partial word match using n-grams

2013-07-18 Thread Becker, Thomas
dataset you might consider allowing leading wildcards so that you could easily find all words, for example, containing abc with *abc*. If your dataset is larger, you might consider something like ReversedWildcardFilterFactory (Solr) to speed this type of matching. I look forward to other opinion

Partial word match using n-grams

2013-07-18 Thread Becker, Thomas
One of our main use-cases for search is to find objects based on partial name matches. I've implemented this using n-grams and it works pretty well. However we're currently using trigrams and that causes an interesting problem when searching for things like "abc ab" since we first split on whi
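A minimal sketch of why a query such as "abc ab" is awkward with trigrams, assuming (the message is cut off at this point) that the query is split on whitespace first and each term is then broken into 3-grams. The class and method names are hypothetical, and this is plain Java rather than Lucene's own n-gram filter:

```java
import java.util.ArrayList;
import java.util.List;

public class NGramSketch {
    // Returns every contiguous substring of length n.
    // A term shorter than n yields an empty list (hypothetical helper, not a Lucene API).
    public static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Split on whitespace first, then build trigrams per term.
        for (String term : "abc ab".split("\\s+")) {
            System.out.println(term + " -> " + ngrams(term, 3));
        }
    }
}
```

Under that assumption the second term, "ab", produces no trigram at all, so it cannot contribute anything to the match.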

RE: query on exact match in lucene

2013-07-17 Thread Becker, Thomas
Sounds like you need a PhraseQuery. -Original Message- From: madan mp [mailto:madan20...@gmail.com] Sent: Wednesday, July 17, 2013 7:40 AM To: java-user@lucene.apache.org Subject: query on exact match in lucene how to get exact string match ex- i am searching for file which consist of s

Re: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene42' does not exist

2013-07-13 Thread Thomas Matthijs
On Sat, Jul 13, 2013 at 10:25 AM, VIGNESH S wrote: > Hi, > > I tried indexing in Desktop..It works fine. > The above error loading error comes only in android.. > Any comments.. Don't strip META-INF/services files out of the jars
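One way to check the advice above is to ask the class loader whether the SPI registration file is still visible after packaging. This is a rough diagnostic sketch for a desktop JVM, not part of Lucene; the class name SpiFileCheck is hypothetical, and resource lookup on Android may behave differently:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class SpiFileCheck {
    // Lists every classpath copy of META-INF/services/<serviceClassName>.
    // An empty result suggests the file was stripped during packaging.
    public static List<URL> serviceFiles(String serviceClassName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(
                loader.getResources("META-INF/services/" + serviceClassName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> found = serviceFiles("org.apache.lucene.codecs.Codec");
        if (found.isEmpty()) {
            System.out.println("No Codec SPI file on the classpath");
        } else {
            found.forEach(System.out::println);
        }
    }
}
```

If the list comes back empty inside the packaged application but not on the desktop, the build step that repackages the jars is the likely culprit.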

Re: Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas
is a contradiction in terms. > > If you are finding that the use of a filter is affecting the scores of > documents, then that is clearly a bug. > > -- Jack Krupansky > > -Original Message- From: Nigel V Thomas > Sent: Monday, July 01, 2013 7:38 AM > To: java-use

Relevance ranking calculation based on filtered document count

2013-07-01 Thread Nigel V Thomas
Hi, I would like to know if it is possible to calculate the relevance ranks of documents based on filtered document count? The current filter implementations as far as I know, seems to be applied after the query is processed and ranked against the full set of documents. Since system wide IDF value

Re: Indexing file with security problem

2013-06-26 Thread Nigel V Thomas
of any suitable solutions yet. Nigel V Thomas On 26 June 2013 20:42, lukasw wrote: > Hello > > I'll try to briefly describe my problem and task. > My name is Lukas and i am Java developer , my task is to create search > engine for different types of file (only text file types)

What to do with Lucene Version parameter on upgrade

2013-06-20 Thread Becker, Thomas
I'm relatively new to Lucene and am in the process of upgrading from 4.0 to 4.3.1. I'm trying to figure out if I need to leave my version at LUCENE_40 or if it is safe to change it to LUCENE_43. Does this parameter directly determine the index format? I have some existing indexes from 4.0 but

Re: Taking backup of a Lucene index

2013-06-06 Thread Thomas Matthijs
On Thu, Jun 6, 2013 at 7:38 AM, Lance Norskog wrote: > The simple answer (that somehow nobody gave) is that you can make a copy > of an index directory at any time. Indexes are changed in "generations". > The segment* files describe the current generation of files. All active > indexing goes on i

Request for addition of ThomasMurphy to ContributorsGroup

2013-06-04 Thread Thomas R. Murphy
Hello. I, ThomasMurphy on the wiki, would like to be a member of ContributorsGroup.

Re: RAMDirectory and expungeDeletes()/optimize()

2013-05-21 Thread Thomas Matthijs
On Tue, May 21, 2013 at 3:12 PM, Konstantyn Smirnov wrote: > I want to refresh the topic a bit. > > Using the Lucene 4.3.0, I couldn't find a method like expungeDeletes() in > the > IW anymore. http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/IndexWriter.html#forceMergeDeletes()

Re: Taking backup of a Lucene index

2013-04-17 Thread Thomas Matthijs
On Wed, Apr 17, 2013 at 12:57 PM, Ashish Sarna wrote: > I want to take back-up of a Lucene index. I need to ensure that index files > would not change when I take their backup. > > > I am concerned about the housekeeping/merge/optimization activities which > Lucene performs internally. I am not

RE: Detecting when an index was not closed properly

2013-04-09 Thread Becker, Thomas
ginal Message- From: Becker, Thomas [mailto:thomas.bec...@netapp.com] Sent: Friday, April 05, 2013 1:33 PM To: java-user@lucene.apache.org Subject: Detecting when an index was not closed properly We are doing some crash resiliency testing of our application. One of the things we found is tha

Detecting when an index was not closed properly

2013-04-05 Thread Becker, Thomas
We are doing some crash resiliency testing of our application. One of the things we found is that the Lucene index seems to get out of sync with the database pretty easily. I suspect this is because we are using near real time readers and never actually calling IndexWriter.commit(). I'm tryin

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 12:19 PM, Thomas Matthijs wrote: > On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > >> >> On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: >> >>> On 20/02/2013 11:28, Paul Taylor wrote: >>> >>>> Just upd

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:30 AM, Thomas Matthijs wrote: > > On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > >> On 20/02/2013 11:28, Paul Taylor wrote: >> >>> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >>> that use Norma

Re: Not getting matches for analyzers using CharMappingFilter with Lucene 4.1

2013-02-25 Thread Thomas Matthijs
On Mon, Feb 25, 2013 at 11:24 AM, Paul Taylor wrote: > On 20/02/2013 11:28, Paul Taylor wrote: > >> Just updating codebase from Lucene 3.6 to Lucene 4.1 and seems my tests >> that use NormalizeCharMap for replacing characters in the analyzers are not >> working. >> >> bump, anybody I thought a s

RE: updateDocument question

2013-02-07 Thread Becker, Thomas
estion Hi Thomas, On Wed, Feb 6, 2013 at 2:50 PM, Becker, Thomas wrote: > I've built a search prototype feature for my application using Lucene, and it > works great. The application monitors a remote system and currently indexes > just a few core attributes of the object

updateDocument question

2013-02-06 Thread Becker, Thomas
I've built a search prototype feature for my application using Lucene, and it works great. The application monitors a remote system and currently indexes just a few core attributes of the objects on that system. I get notifications when objects change, and I then update the Lucene index to kee

Lucene-MoreLikethis

2013-01-15 Thread Thomas Keller
Hey, I have a question about "MoreLikeThis" in Lucene, Java. I built up an index and want to find similar documents. But I always get no results for my query, mlt.like(1) is always empty. Can anyone find my mistake? Here is an example. (I use Lucene 4.0) public class HelloLucene { public

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
On Mon, Oct 8, 2012 at 2:29 PM, Thomas Matthijs wrote: > On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: >> Hi, >> >> This is a known problem currently. I think there is already an issue open, >> so this was not solved for 4.0 (I don't have the issu

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: > Hi, > > This is a known problem currently. I think there is already an issue open, so > this was not solved for 4.0 (I don't have the issue no available at the > moment). > > My plan to fix this is to make Filters behave like queries (with

Re: lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
On Mon, Oct 8, 2012 at 11:28 AM, Uwe Schindler wrote: > Hi, > > This is a known problem currently. I think there is already an issue open, so > this was not solved for 4.0 (I don't have the issue no available at the > moment). > > My plan to fix this is to make Filters behave like queries (with

lucene-4.0: QueryWrapperFilter & docBase

2012-10-08 Thread Thomas Matthijs
Hello, I have some custom queries & scorer that need to be able to construct the "global" docIds (doc + docBase). But when I use these in a QueryWrapperFilter they no longer work, because QueryWrapperFilter.getDocIdSet uses a "private context" (context.reader().getContext();) which always has a docB

Re: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Thomas Matthijs
And to include the code On Thu, Oct 4, 2012 at 3:52 PM, Markus Jelsma wrote: > I forgot to add that this is with today's build of trunk. > > -Original message- >> From:Markus Jelsma >> Sent: Thu 04-Oct-2012 15:42 >> To: java-user@lucene.apache.org >> Subject: Highlighter IOOBE with modif

problem understanding the documentation for the TieredMergePolicy class

2012-06-12 Thread thomas
ePolicy.html#findMerges%28org.apache.lucene.index.SegmentInfos%29> Would somebody be so kind to explain it to me? Thanks, thanks a lot Thomas

Scoring similarity by the position of the terms

2012-03-22 Thread Thomas Rewig
Similarity. Lucene has been developed and grown and I was wondering if you can now do the same thing in a simpler and more straightforward way. Maybe with some of the newer SpanQueries or another use of payloads. Does anyone have any idea where to start? Regards Thomas

Re: Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
>> -Original Message- >> From: stephen.warner.tho...@gmail.com >> [mailto:stephen.warner.tho...@gmail.com] On Behalf Of Stephen Thomas >> Sent: Tuesday, November 29, 2011 5:20 PM >> To: java-user@lucene.apache.org >> Subject: Custom Filter for Splitting CamelCase?

Custom Filter for Splitting CamelCase?

2011-11-29 Thread Stephen Thomas
List, I have written my own CustomAnalyzer, as follows: public TokenStream tokenStream(String fieldName, Reader reader) { // TODO: add calls to RemovePuncation, and SplitIdentifiers here // First, convert to lower case TokenStream

Re: Scoring a document using LDA topics

2011-11-29 Thread Stephen Thomas
ote about this sometime back...maybe this would help you. > http://sujitpal.blogspot.com/2011/01/payloads-with-solr.html > > -sujit > > On Mon, 2011-11-28 at 12:29 -0500, Stephen Thomas wrote: >> List, >> >> I am trying to incorporate the Latent Dirichlet Allocation

Scoring a document using LDA topics

2011-11-28 Thread Stephen Thomas
List, I am trying to incorporate the Latent Dirichlet Allocation (LDA) topic model into Lucene. Briefly, the LDA model extracts topics (distribution over words) from a set of documents, and then represents each document with topic vectors. For example, documents could be represented as: d1 = (0,

Do duplicate documents affect term scoring?

2011-11-27 Thread Stephen Thomas
List, I am indexing a subset of Wikipedia. I have 4 years worth of data, and have taken snapshots of each document at each month in the 4 year span. Thus, I have 4*12=36 versions of each document. (I keep track of the timestamp in a field.) I have noticed that in many cases, a Wikipedia document d

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-19 Thread Thomas Rewig
re=12,2324 Doc.Id=8060id=709579name=aim溝脇しほみ 1Score=12,2324 Doc.Id=227606id=716893name=aim To avoid these problems right from the start, I need to use a different analyser for indexing? (So that the docs 'aim溝脇しほみ' and 'aim' have different scores) Thank

Re: TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-18 Thread Thomas Rewig
Doc.Id=227606 id=716893 name=aim Is there a way to guarantee the inner sorting of same scores? Or how can I avoid that documents with special characters have the same score as documents of exact matches? Thanks in advance! Thomas On 18.07.2011 at 10:08, Ian Lea wrote: I'm not su

TermQuery - ExactMatching, Lucene 3.1.0 vs. 3.3.0, special character behavior

2011-07-15 Thread Thomas Rewig
I would expect if I do a 'exact matching' Term Query. Each index was indexed with its associated LuceneVersion. I tested it with luke and with my own Code - the result was always the same. Is it a new feature in Lucene 3.3.0 or a b

name matching / mapping

2011-07-06 Thread Thomas Rewig
s all names of the second id-space and the first id-space is used for the queries. String[] suggestions = spellchecker.suggestSimilar("john w.", 5); But is there a better approach? Can someone point me in the right direction for an effective approach? Thanks in

Check Numeric Fields

2011-03-11 Thread Thomas Rewig
t the NumericRangeQuery query does not work? I use lucene v. 3.0.2. Thanks in advance! Thomas

Re: Deleted File Handles - Index Writer

2010-11-19 Thread Thomas Rewig
.0.2 Release version or do I have to wait for a future release? Thanks for your help. Thomas

Re: Deleted File Handles - Index Writer

2010-11-18 Thread Thomas Rewig
help. Thomas I've found a case, only with compound file, where IndexWriter holds open a SegmentReader on the pre-compound-file files... I'm working on a test case& fix. Mike On Fri, Nov 12, 2010 at 5:49 AM, Thomas Rewig wrote: Hello, I use the searcherManager for LiveIndexin

Deleted File Handles - Index Writer

2010-11-12 Thread Thomas Rewig
used" by the indexwriter) grows. Is that possible and if yes why does the indexwriter do it? Is there a max value of deleted handles an IndexWriter could own, because I don't want to crash the system because of too many open file handles? Thanks in advance. Thomas --

Re: File Handle Leaks During Lucene 3.0.2 Merge

2010-11-10 Thread Thomas Rewig
amount of the deleted file handles will be stable - but first at a amount of 500 or so. Thanks in advance Thomas I integrated your SearchManager class into our code, but I am still seeing file handles marked deleted in the index directory. I am running the following command on Linux: sudo watch

Restore documents marked as deleted

2010-10-06 Thread Philippe Thomas
Hi, I was indexing some documents, but my program crashed after several days of work. If I reopen this index it is empty. I guess the reason is that auto-commit was not set and I never performed a commit. (Lesson learned) So probably all documents are marked as "deleted" and re-opening the i

Re: Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas
m frequency, idf and field norm > > 0.07028562 = (MATCH) fieldWeight(payload:ces in 550), product of: > > 1.0 = tf(termFreq(payload:ces)=1) > > 2.2491398 = idf(docFreq=157, maxDocs=551) > > 0.03125 = fieldNorm(field=payload, doc=550) > >
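The numbers in this explain() excerpt can be re-derived by hand. The sketch below assumes the classic default similarity of Lucene releases from that period (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))); the class and method names are hypothetical, and Lucene computes in float, so the last digits may differ slightly:

```java
public class ExplainSketch {
    // Assumed classic tf: square root of the raw term frequency.
    public static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1)).
    public static double idf(int docFreq, int maxDocs) {
        return 1.0 + Math.log(maxDocs / (double) (docFreq + 1));
    }

    // fieldWeight is the product of the three factors listed by explain().
    public static double fieldWeight(double tf, double idf, double fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        double tf = tf(1);            // termFreq(payload:ces)=1
        double idf = idf(157, 551);   // docFreq=157, maxDocs=551
        double fieldNorm = 0.03125;   // taken from the explain output as-is
        System.out.println("tf=" + tf + " idf=" + idf
            + " fieldWeight=" + fieldWeight(tf, idf, fieldNorm));
    }
}
```

With these inputs the idf lands on roughly 2.24914 and the product on roughly 0.070286, which agrees with the 2.2491398 and 0.07028562 shown in the excerpt.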

Need help in understanding output of searcher.explain() function

2010-08-07 Thread Soby Thomas
Hello Guys, I am trying to understand how the Lucene score is calculated, so I'm using the searcher.explain() function. But the output it gives is really confusing for me. Below are the details of the query that I gave and the output it gave me Query: *It is definitely a CES deal that will be over in Sep or Oct

Fielded Queries Question

2010-07-06 Thread Thomas Nguyen
Hello All, Can someone explain to me how fielded queries work with phrases? My first thought is that the phrase is broken down into terms and those terms are then fielded and separated with the AND operator. An example would be the following: name:"Tom Jones" --> name:"Tom" AND name:"Jones" I

Introduction to flexible indexing?

2010-06-14 Thread Thomas Koch
understand this page and help to get it in shape. Best regards, Thomas Koch, http://www.koch.ro

IndexSearcher - open file handles by deleted files

2010-05-26 Thread Thomas Rewig
s not automatically if i close searcher.close()? Do I have to close something else, than all IndexSearchers and Directorys? Or am I wrong with my assumption, and the problem is somewhere else? Best Thomas

[ANN] Eclipse GIT plugin beta version released

2010-03-31 Thread Thomas Koch
http://www.infoq.com/news/2010/03/egit-released http://aniszczyk.org/2010/03/22/the-start-of-an-adventure-egitjgit-0-7-1/ Maybe, one day, some apache / hadoop projects will use GIT... :-) (Yes, I know git.apache.org.) Best regards, Thomas Koch, http://www.ko

google's index layout, lucene on hbase(?)

2010-03-11 Thread Thomas Koch
; ( or http://tinyurl.com/yjr45ut ) The mail is about a lucene index{reader|writer} on top of cassandra and whether sth. like this could also be done with hbase. Best regards, Thomas Koch, http://www.koch.ro

Re: If you could have one feature in Lucene...

2010-02-25 Thread Thomas Guttesen
> > -- Kind regards, Thomas Guttesen

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
ginal questions...: commit/read does not require any external synchronization or locking.  You should generally keep your IW open indefinitely and just periodically commit and/or get a new reader (IndexWriter.getReader()) as needed. Mike On Sat, Jan 9, 2010 at 10:06 AM, legrand thomas wrote: > &g

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
McCandless wrote: From: Michael McCandless Subject: Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException To: java-user@lucene.apache.org Date: Saturday, 9 January 2010, 14:51 Can you post the full FNFE stack trace? Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: >

Re: Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-09 Thread legrand thomas
xWriter is committing) is perfectly fine.  The reader searches the point-in-time snapshot of the index as of when it was opened. But: what filesystem are you using?  NFS presents challenges, for example. Mike On Fri, Jan 8, 2010 at 5:35 AM, legrand thomas wrote: > Hi, > > I often get a Fi

Concurrent access IndexReader / IndexWriter - FileNotFoundException

2010-01-08 Thread legrand thomas
Hi, I often get a FileNotFoundException when my single IndexWriter commits while the IndexReader also tries to read. My application is multithreaded (Tomcat uses the business APIs); I firstly thought the read/write access was thread-safe but I probably forget something.  Please help me to unde

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Thomas Becker
be careful. Load on the DB Server will surely increase. Hope that helps. Cheers, Thomas Paul Taylor wrote: > I'm building a lucene index from a database, creating about 1 million > documents, unsurprisingly this takes quite a long time. > I do this by sending a query to the db o

Re: Using TermVectorMapper to compute term frequency across documents

2009-10-15 Thread Thomas D'Silva
while to compute the document,tag probabilities. Thanks, Thomas On Wed, Oct 14, 2009 at 8:15 AM, Grant Ingersoll wrote: > > On Oct 12, 2009, at 10:46 PM, Thomas D'Silva wrote: > >> Hi, >> >> I am trying to compute the counts of terms of the documents return

Using TermVectorMapper to compute term frequency across documents

2009-10-12 Thread Thomas D'Silva
getTermFreqVector(). I do not require the term frequency within a document. Thanks, Thomas HashMap termDocCount = new HashMap(); TermQuery tagQuery = new TermQuery(tagTerm); TopDocs docs = searcher.search(tagQuery, numDocs); for (int i=0 ; i public void map(String term, int frequency

Re: Problems with ItemBasedRecommender with Lucene

2009-09-17 Thread Thomas Rewig
You use Lucene 2.9 is there a way to do this with Lucene 2.4.1 because I can't find e.g. the "PayloadEncoder" or do I have to wait for the release? Regards Thomas You might want to ask on mahout-user, but I'm guessing Ted didn't mean a new field for every item-item,

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
t was only > one search, you must have two segments and therefore no optimized index for > this to be correct? > > Uwe

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
IndexSearcher.search was called only > once. > > Uwe > -- Thomas Becker Senior JEE Deve

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
guess, based on > the 2.9 new api profiling, is that your queries may not be agreeing with > some of the changes somehow. Along with the profiling, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas B

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
ry types you are using as well? (eg qualities) >> >> And grab invocations if its possible. >> >> -- >> - Mark >> >> http://www.lucidimagination.com >> >> >> >> Thomas Becker wrote: >>> Tests run on tmpfs: >>> config: impl=Sepa

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
g, can you fill us > in on the query types you are using as well? (eg qualities) > > And grab invocations if its possible. > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-16 Thread Thomas Becker
> https://issues.apache.org/jira/browse/LUCENE-753 > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949 Mobile: +49 173 5146567 (private) E-Mail: mailto:thomas.bec...@net-m.de Internet: http:/

Problems with ItemBasedRecommender with Lucene

2009-09-16 Thread Thomas Rewig
e fields... I'm using lucene 2.4.1 and java version "1.6.0_16". Does anyone have an idea how to avoid the growing memory? Or does somebody know another approach for a "realtime Item based Recommender" with Lucene? Regards Thomas --

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Hi Uwe, already done. See my last message. Cheers, Thomas Uwe Schindler wrote: > On 2.9. NIOFS is only used, if you use FSDirectory.open() instead of > FSDirectory.getDirectory (Deprecated). Can you compare when you use instead > of FSDirectory.open() the direct ctor of SimpleFSDir vs.

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Mark Miller wrote: > Thomas Becker wrote: >> Hey Mark, >> >> yes. I'm running the app on unix. You see the difference between 2.9 and 2.4 >> here: >> >> http://ankeschwarzer.de/tmp/graph.jpg >> > Right - I know your measurements showed

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
org/jira/browse/LUCENE-753 > -- Thomas Becker Senior JEE Developer net mobile AG Zollhof 17 40221 Düsseldorf GERMANY Phone:+49 211 97020-195 Fax: +49 211 97020-949 Mobile: +49 173 5146567 (private) E-Mail: mailto:thomas.bec...@net-m.de Internet: http://www.net-m.de Registergericht: Amts

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
ry as well?! Will check that. Thanks a lot for your support! Cheers, Thomas Mark Miller wrote: > A few quick notes - > > Lucene 2.9 old api doesn't appear much worse than Lucene 2.4? > > You save a lot with the new Intern impl, because thats not a hotspot > anymore. But t

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
with lucene 2.4. I will now try a freshly build 2.9 index and see if performance improves. Maybe that already solves the issue...stupid me... We're updating the index every 30 min. at the moment and it gets optimized after each update. Mark Miller wrote: > Thomas Becker wrote: >> Hey Mar

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
nerCache is a Map containing field + parser * (contracttocontentgroup prefix) as the key and as a value yet another map. * The latter map finally contains the docIds as key and positionvalue for this * prefix as value. * * @author Thomas Becker (thomas.bec...@net-m.de) * */ pub

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Urm and uploaded here: http://ankeschwarzer.de/tmp/graph.jpg Sorry. Thomas Becker wrote: > Missed the attachment, sorry. > > Thomas Becker wrote: >> Hi all, >> >> I'm experiencing a performance degradation after migrating to 2.9 and running >> some tests.

Re: lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
Missed the attachment, sorry. Thomas Becker wrote: > Hi all, > > I'm experiencing a performance degradation after migrating to 2.9 and running > some tests. I'm getting out of ideas and any help to identify the reasons why > 2.9 is slower than 2.4 are highly appreci

lucene 2.9.0RC4 slower than 2.4.1?

2009-09-15 Thread Thomas Becker
y took {} ms", durationMillis); } return docs; } I'm wondering why others are experiencing better performance with 2.9 and why our implementation's performance is going bad. Maybe our way of using the 2.9 api is not the best and sorting is definitely

2.9 - leftover (deleted) filehandles after upgrade

2009-07-29 Thread Thomas Becker
mpDir); with IndexSearcher indexSearcherTmp = new IndexSearcher(tmpDir, true); No errors in the logfiles, no caught exceptions, etc. I'm kind of out of ideas at the moment. I googled and tried a couple of things (IndexWriter.setUseCompoundFile(true), etc.) but didn't find a solution. A

Re: Loading an index into memory

2009-07-24 Thread Thomas Becker

Re: Index and search terms containing character "-"

2009-06-02 Thread legrand thomas
d strongly recommend you get a copy of Luke, it's invaluable for questions like this because it lets you look at what's actually in your index. It'll also show you how queries get broken down when pushed through various analyzers... BTW, nice test case for demonstrating what you w

Index and search terms containing character "-"

2009-05-31 Thread legrand thomas
Hi, I have a problem using TermQuery and FuzzyQuery for terms containing the character "-". Considering I've indexed "jack" and "jack-bauer" as 2 tokenized captions, I get no result when searching for "jack-bauer". Moreover, "jack" with a TermQuery returns the two captions.   What should I do t

Creating document fields by providing termvector directly (bypassing the analyzing/tokenizing stage)

2009-04-21 Thread Thomas Pönitz
] b[2] c[1]. The old discussion had no real solution but it is also a bit outdated, maybe someone has a better idea now. Greets, Thomas

Dynamic Indexing?

2009-03-11 Thread Thomas J. Buhr
Lucene, From what I have read on your website indexing does seem like a useful thing. I'm considering the possible use of Lucene in a company project and have a few research questions. What I'm considering is using Lucene as a backend data store for a graphic editor. The typical usage exa

Filtering accents

2008-12-30 Thread legrand thomas
Dear all, I'd like my lucene searches to be insensitive to (French) accents. For example, considering a indexed term "métal", I want to get it when searching for "metal" or "métal" . I use lucene-2.3.2 and the searches are performed with: IndexSearcher.search(query,filter,sorter), Another filte
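A minimal sketch of the accent-folding idea, assuming the same folding is applied to both the indexed text and the query text so that "métal" and "metal" end up as the same term. It uses only the JDK's java.text.Normalizer; the class name is hypothetical, and in a Lucene setup this step would normally live in a token filter inside the analyzer rather than in a helper like this:

```java
import java.text.Normalizer;

public class AccentFoldSketch {
    // Decomposes accented characters (NFD) and then drops the combining marks.
    public static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "m\u00e9tal" is "métal" written with a Unicode escape.
        System.out.println(fold("m\u00e9tal")); // prints "metal"
        System.out.println(fold("metal"));      // prints "metal"
    }
}
```

Folding on both sides keeps the index and the query consistent; folding only the query would miss terms that were indexed with their accents.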

Lucene and JSON

2008-12-19 Thread Thomas J. Buhr
Lucene, Is there JSON support in Lucene? JSON is more fat-free compared to XML and would be preferred. Digester works well for indexing XML but something along the same lines for JSON would be even sweeter. Best, Thom

Re: What are the best document edit options?

2008-12-17 Thread Thomas J. Buhr
same lines for JSON would be even sweeter. Cheers, Thom On 17-Dec-08, at 2:39 PM, Steven A Rowe wrote: Hi Thomas, On 12/17/2008 at 11:52 AM, Thomas J. Buhr wrote: Where can I see how IndexWriter.updateDocument works without getting into Lucene all over again until this important issue is

Re: What are the best document edit options?

2008-12-17 Thread Thomas J. Buhr
version of Lucene are you using? The more recent ones have IndexWriter.updateDocument.. Best Erick On Wed, Dec 17, 2008 at 2:20 AM, Thomas J. Buhr wrote: Hello Lucene, Looking at the document object it seems like each time I want to edit its contents I need to do the following: 1 - fetch

What are the best document edit options?

2008-12-16 Thread Thomas J. Buhr
Hello Lucene, Looking at the document object it seems like each time I want to edit its contents I need to do the following: 1 - fetch the document 2 - dump its contents into a temp container 3 - update field values in the temp container 4 - create a new document 5 - transfer my updated field

is there an histogram feature in lucene ak Magelan

2008-10-13 Thread Thomas Birnbaum
350 damage unrepaired 30 metallic 60 something like this... is there a way to do the same with lucene? thx thomas.

Re: Range Query Question

2008-07-25 Thread Thomas Becker
Btw. I tried the wildcard since I found something on google, which noted wildcards together with StartsWith queries. Thomas Becker wrote: Hi Ian, no the wild cards should not be necessary. That was just the last try out of some. I know the exact content of both fields in my range query. The

Re: Range Query Question

2008-07-25 Thread Thomas Becker
t Circle"] gives zero results. Tried it also with braces around the term and such stupid things, even if they shouldn't be needed in a range query. I'm kinda clueless. Cheers, Thomas Ian Lea wrote: Hi Are you sure your range queries should have wild card asterisks on the end? Loo

Range Query Question

2008-07-25 Thread Thomas Becker
Name fields in a range between "A Balladeer*" TO "A Perfect Circle*" and get only terms back which are starting with that terms? Is there a way to accomplish that in Java and try it in luke? And is there a way to sort resultsets in luke? Cheers, Thomas -- Thomas Be

advanced WildcardQuery

2008-07-16 Thread legrand thomas
ardQuery with the term "pretty*car". I also want to get this document when searching for "pretty*sale*". How should I do ? Is it really possible ? I use lucene 2.3.1. Thanks in advance, Thomas Legrand

Re: Question about indexing (BrazilianAnalyzer)

2008-06-04 Thread Thomas Arni
; c). Probably the problem is with these accents. You can check this if you adapt the method tokenStream() in the BrazilianAnalyzer by including the ISOLatin1AccentFilter in the filter chain. Thomas Vinicius Carvalho said the following on 03/06/08 20:51: Hello there! I'm indexing documents u

Search for long titles - wildcard queries

2008-05-10 Thread legrand thomas
Dear all, I'm a recent Lucene user and I'm looking for the best way to perform searches over long titles (ad titles on a website). For example, if the following documents exist: - TITLE, "Fender telecaster" - TITLE, "Land rover defender" - TITLE, "I sale a wonderful fender st
