Re: Punctuation marks in documents prevent recognition of synonyms at indexing?

2009-09-26 Thread AHMET ARSLAN
You lose the WordDelimiterFilterFactory functionality: Syn.txt has: ADC, HIV-dementie. Search on ADC doesn't find document with HIV-dementie. The synonym filter can handle multi-word synonyms. Change the entry in Syn.txt to: ADC, HIV dementie. And search on ADC will find document with

Re: Solr + Jboss + Custom Transformers

2009-09-26 Thread Shalin Shekhar Mangar
On Fri, Sep 25, 2009 at 11:24 PM, Papiya Misra pmi...@pinkotc.com wrote: I could use the source code to create solr.war that includes the CustomTransformer class. Is there any other option - one that preferably does not include re-packaging solr.war ? You should add your custom transformers

Re: DIH RSS 1.4 nightly 2009-09-25 full-import&clean=false always clean and import command do nothing

2009-09-26 Thread Shalin Shekhar Mangar
On Sat, Sep 26, 2009 at 9:41 PM, Brahim Abdesslam brahim.abdess...@maecia.com wrote: on a Linux system the command : curl http://192.168.0.14:8983/solr/dataimport?command=full-import&clean=false just doesn't work like this command : curl
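
The reply is cut off here, so what follows is a guess and not the answer given in the thread. Assuming the two parameters are meant to be joined with `&`, one thing worth checking is shell quoting: an unquoted `&` ends the command and runs it in the background, so only command=full-import would reach Solr. A minimal Python sketch that builds the URL and a safely quoted curl command line (the helper names are made up for illustration):

```python
import shlex
from urllib.parse import urlencode

# Host, port and handler path come from the command quoted above.
# Joining the two parameters with "&" is an assumption.
BASE_URL = "http://192.168.0.14:8983/solr/dataimport"


def dataimport_url(command, clean):
    """Build the DataImportHandler URL with properly joined parameters."""
    query = urlencode({"command": command, "clean": "true" if clean else "false"})
    return f"{BASE_URL}?{query}"


def curl_command(url):
    """Return a curl command line in which the URL is quoted for the shell."""
    return "curl " + shlex.quote(url)


url = dataimport_url("full-import", clean=False)
print(url)
print(curl_command(url))
```

The second line printed wraps the URL in single quotes, which keeps the shell from interpreting `&` and `?`.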

Re: Punctuation marks in documents prevent recognition of synonyms at indexing?

2009-09-26 Thread G.S.J. Lobbestael
The wiki uses the example: <fieldtype name="syn" class="solr.TextField"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="syn.txt" ignoreCase="true" expand="false"/> </analyzer> </fieldtype>

Re: Punctuation marks in documents prevent recognition of synonyms at indexing?

2009-09-26 Thread AHMET ARSLAN
Hi, The wiki uses the example: <fieldtype name="syn" class="solr.TextField"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="syn.txt" ignoreCase="true" expand="false"/> </analyzer> </fieldtype>

Re: DIH RSS 1.4 nightly 2009-09-25 full-import&clean=false always clean and import command do nothing

2009-09-26 Thread Shalin Shekhar Mangar
On Fri, Sep 25, 2009 at 6:48 PM, Brahim Abdesslam brahim.abdess...@maecia.com wrote: Hello everybody, we are using Solr to index some RSS feeds for a news aggregator application. We've got some difficulties with the publication date of each item because each site uses a homemade date

Re: Solr and Garbage Collection

2009-09-26 Thread Mark Miller
Jonathan Ariel wrote: I have around 8M documents. That's actually not so bad - I take it you are faceting/sorting on quite a few unique fields? I set up my server to use a different collector and it seems like it decreased from 11% to 4%, of course I need to wait a bit more because it is

Punctuation marks in documents prevent recognition of synonyms at indexing?

2009-09-26 Thread G.S.J. Lobbestael
Hi, The wiki uses the example: <fieldtype name="syn" class="solr.TextField"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.SynonymFilterFactory" synonyms="syn.txt" ignoreCase="true" expand="false"/> </analyzer> </fieldtype> With dog, canine
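
To illustrate the question in the subject line, here is a rough Python sketch of whitespace tokenization followed by exact-match synonym lookup. It is a toy model, not Solr's implementation: the function names are hypothetical, and the mapping assumes that a synonyms line "dog, canine" with expand=false rewrites every listed term to the first one.

```python
import string

# Toy stand-in for a synonyms file line "dog, canine" with expand=false:
# every listed term is rewritten to the first term (assumption, see above).
SYNONYMS = {"dog": "dog", "canine": "dog"}


def whitespace_tokenize(text):
    """Split on whitespace only, so punctuation stays attached to tokens."""
    return text.split()


def apply_synonyms(tokens, synonyms=SYNONYMS):
    """Replace a token only when it matches a synonym entry exactly,
    ignoring case (similar in spirit to ignoreCase=true)."""
    return [synonyms.get(tok.lower(), tok) for tok in tokens]


def strip_edge_punctuation(tokens):
    """Remove leading/trailing punctuation and drop tokens left empty."""
    stripped = (tok.strip(string.punctuation) for tok in tokens)
    return [tok for tok in stripped if tok]


raw = whitespace_tokenize("A canine, barking.")
print(apply_synonyms(raw))                          # ['A', 'canine,', 'barking.']
print(apply_synonyms(strip_edge_punctuation(raw)))  # ['A', 'dog', 'barking']
```

In the first call the token "canine," keeps its comma and therefore never matches the entry "canine"; once the punctuation is stripped before the lookup, the synonym is applied.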

Re: Using two Solr documents to represent one logical document/file

2009-09-26 Thread Matt Weber
Check out the field collapsing patch: http://wiki.apache.org/solr/FieldCollapsing https://issues.apache.org/jira/browse/SOLR-236 Thanks, Matt Weber On Sep 25, 2009, at 3:15 AM, Peter Ledbrook wrote: Hi, I want to index both the contents of a document/file and metadata associated with

Re: Solr and Garbage Collection

2009-09-26 Thread Jonathan Ariel
Ok. After the server ran for more than 12 hours, the time spent on GC decreased from 11% to 3,4%, but 5 hours later it crashed. This is the thread dump, maybe you can help identify what happened? # # An unexpected error has been detected by Java Runtime Environment: # # SIGSEGV (0xb) at

Re: Solr and Garbage Collection

2009-09-26 Thread Mark Miller
Jonathan Ariel wrote: Ok. After the server ran for more than 12 hours, the time spent on GC decreased from 11% to 3,4%, but 5 hours later it crashed. This is the thread dump, maybe you can help identify what happened? Well, that's a tough one ;) My guess is it's a bug :) Your two survivor spaces

Re: Solr and Garbage Collection

2009-09-26 Thread Mark Miller
Also, in case the info might help track something down: It's pretty darn odd that both your survivor spaces are full. I've never seen that in one of these dumps. Always one is empty. When one is filled, it's moved to the other. Then back. And forth. For a certain number of times until its