Re: suggestions for DIH batchSize

2009-12-23 Thread Marc Sturlese

If you want to retrieve a huge volume of rows you will end up with an
OutOfMemoryException from the jdbc driver. Setting batchSize to -1 in your
data-config.xml (which internally sets it to Integer.MIN_VALUE) makes the
driver execute the query in streaming mode, avoiding the memory exception.
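
For illustration, a minimal data-config.xml with streaming enabled might look
like this (a sketch only; the driver, URL, credentials, and query are
placeholders):

---8<---
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="solr" password="secret"
              batchSize="-1"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
---8<---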

Joel Nylund wrote:
 
 Hi,
 
 looking at the code, it seems the default is 500 - is that the
 recommended setting?
 
 Has anyone noticed any significant performance/memory tradeoffs from
 making this much bigger?
 
 thanks
 Joel
 
 
 




Re: Search both diacritics and non-diacritics

2009-12-23 Thread Yurish



Olala wrote:
 
 Hi all!
 
 I am developing a search engine with Solr, and now I want to search both
 with and without diacritics. For example: if I query kho, it should return
 kho, khó, khò, ... but if I query khó, it should return only khó.
 
 Does anyone have a solution? I have used
 <filter class="solr.ISOLatin1AccentFilterFactory"/> but it is not correct :(
 

How about using <filter class="solr.PatternReplaceFilterFactory"/>? There you
can define a regexp: if a term has some diacritics, convert it to the
non-diacritic form, then concatenate the original term to this non-diacritic
one.
Put this in the index analyzer only; in the query analyzer, don't convert the
query. Then you should be able to search for kho and get both forms, with and
without diacritics, but when querying khó with diacritics, get only the
diacritic form.
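
Another way to get the same behaviour is to index the text twice - once raw,
once accent-folded - and pick the field to query on the application side. A
rough schema sketch (field and type names are placeholders):

---8<---
<fieldType name="text_raw" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="body" type="text_raw" indexed="true" stored="true"/>
<field name="body_folded" type="text_folded" indexed="true" stored="false"/>
<copyField source="body" dest="body_folded"/>
---8<---

A query without diacritics then goes against body_folded (matches everything);
a query with diacritics goes against body (matches only the exact form).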



Re: Multi Solr

2009-12-23 Thread Yurish



Olala wrote:
 
 Hi all!
 
 I have deployed Solr on Tomcat, but now I want to run many Solr instances on
 a single Tomcat server. Can that be done?
 

I configured my Solr to use multiple cores. For more information, see the
multicore example in the example folder of the Solr distribution.
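
A minimal multicore solr.xml, modelled on that example (core names and paths
are placeholders):

---8<---
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
---8<---

Each core gets its own conf/ directory (schema.xml, solrconfig.xml) under its
instanceDir.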



SynonymFilterFactory parseRules

2009-12-23 Thread Peter A. Kirk
Hi

I am still looking at synonyms, and the possibility of having synonyms loaded 
via another mechanism than reading from a text file.

A few questions:
Does anyone know why, in SynonymFilterFactory, the method parseRules is
package-private? (Actually I'm not sure of the terminology - what I mean is, I
can't call this method from outside the package, so I can't use it if I extend
SynonymFilterFactory.)

And why are parseRules and several other methods static?

Also, where does the ResourceLoader supplied to the inform method come
from? That is, who instantiates the resource loader and the filter factory,
and calls the inform method? Can I influence this at all -
for instance, can I inject my own ResourceLoader into this call?

Thanks very much for your help.
Peter


Re: SynonymFilterFactory parseRules

2009-12-23 Thread Kevin Jackson
Hi

 I am still looking at synonyms, and the possibility of having synonyms 
 loaded via another mechanism than reading from a text file.

I am also looking at creating a DBSynonymFilterFactory which will
allow us to load the synonyms from a db.

I haven't done much apart from getting solr-trunk and creating the
class - I was going to work on it a little over the holidays as my
holiday project.

Would you like to collaborate?  Not sure how we'd manage it, but there
are enough ways of sharing code now that we could work something out?
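
As a starting point, here's the rough shape I had in mind - a sketch only,
against the Solr 1.4 analysis classes (the DB access is stubbed out, and the
SynonymMap constructor signature should be double-checked against your
version's SynonymFilterFactory source):

---8<---
import org.apache.lucene.analysis.TokenStream;
import org.apache.solr.analysis.BaseTokenFilterFactory;
import org.apache.solr.analysis.SynonymFilter;
import org.apache.solr.analysis.SynonymMap;
import org.apache.solr.common.ResourceLoader;
import org.apache.solr.util.plugin.ResourceLoaderAware;

public class DBSynonymFilterFactory extends BaseTokenFilterFactory
    implements ResourceLoaderAware {

  private SynonymMap synMap;

  public void inform(ResourceLoader loader) {
    // ignoreCase=true; constructor signature as of Solr 1.4 - check your version
    synMap = new SynonymMap(true);
    // TODO: load rows from the database and add them to synMap here,
    // the same way SynonymFilterFactory.parseRules() does for the text file.
  }

  public TokenStream create(TokenStream input) {
    return new SynonymFilter(input, synMap);
  }
}
---8<---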


 A few questions:
  Does anyone know why, in SynonymFilterFactory, the method parseRules is
  package-private? (Actually I'm not sure of the terminology - what I mean
  is, I can't call this method from outside the package, so I can't use it
  if I extend SynonymFilterFactory.)
 
  And why are parseRules and several other methods static?
 
  Also, where does the ResourceLoader supplied to the inform method come
  from? That is, who instantiates the resource loader and the filter
  factory, and calls the inform method? Can I influence this at all - for
  instance, can I inject my own ResourceLoader into this call?
 
  Thanks very much for your help.
  Peter


Thanks,
Kev


More terms than documents, impossible to sort on tokenized fields

2009-12-23 Thread Pascal Bleser
Using Solr 1.4 (release)
My complete schema is here; it is basically a somewhat stripped-down version of
the example's schema.xml, plus a few additional fields: http://pastebin.be/22596

I've read past posts on this issue and believe I mostly understand what causes
it, but I cannot find that problem in my configuration, even after implementing
the workarounds: when I perform a search (*), I get a 500 stating: "there are
more terms than documents in field 'text', but it's impossible to sort on
tokenized fields"

The complete stack trace is here: http://pastebin.be/22597

(*) The search query is as follows:
/solr/select?sort=alphaNameSort+desc&q=java

* the dismax SearchHandler is configured as default
* the documents are PDFs that have been uploaded with curl into /update/extract
* there are 1398 documents in the index (per numDocs)

Now, the field text is multi-valued and tokenized:
---8<---
 <field name="text" type="text" indexed="true" stored="false"
        multiValued="true"/>
 <copyField source="title" dest="text"/>
 <copyField source="subject" dest="text"/>
 <copyField source="description" dest="text"/>
 <copyField source="comments" dest="text"/>
 <copyField source="content" dest="text"/>
---8<---
(it's the text fieldType as in the example configuration)

But I do try to resort to the alphaOnlySort trick, in order to perform the
sort on a non-multivalued field:
---8<---
 <fieldType name="alphaOnlySort" class="solr.TextField"
            sortMissingLast="true" omitNorms="true">
   <analyzer>
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.TrimFilterFactory"/>
   </analyzer>
 </fieldType>
 <!-- ... -->
 <field name="alphaNameSort" type="alphaOnlySort" indexed="true"
        stored="false"/>
 <copyField source="title" dest="alphaNameSort"/>
---8<---

I even explicitly specify that the sorting must be done on the field
alphaNameSort (using the sort=alphaNameSort+desc query parameter), and Solr
still complains about the field text.

The same error happens if I specify other fields to sort on, such as id.

I'm seriously puzzled at this point. Am I hitting an obscure bug in 1.4? A
misleading error message? Or maybe a bug in my brain? :)

cheers
-- 
  -o) Pascal Bleser     l...@fosdem.org     http://www.fosdem.org
  /\\   FOSDEM 2010 :: 6+7 February 2010 in Brussels
 _\_v Free and Opensource Software Developers European Meeting



Re: Profiling Solr

2009-12-23 Thread Grant Ingersoll
I usually use YourKit or JProfiler, but there are free ones too, like VisualVM.

Check out: 
http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/
 and 
http://www.lucidimagination.com/blog/2009/02/09/investigating-oom-and-other-jvm-issues/
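
If the problem only shows up under load, it can also help to run with GC
logging and heap-dump-on-OOM switched on; for example, with standard HotSpot
flags (shown with the Jetty example launcher - the same flags go in Tomcat's
JAVA_OPTS):

---8<---
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError \
     -jar start.jar
---8<---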


On Dec 22, 2009, at 9:38 PM, Maduranga Kannangara wrote:

 Hi All,
 
 Recently we noticed that some of our heavily loaded Solr instances are facing 
 memory-leak-like situations.
 The JVM goes into full GC and, as it is unable to release any memory, broken 
 pipe and socket errors happen.
 
 (This happens both in Solr 1.3 and 1.4 for us.)
 
 Is there a good tool (preferably open source) that we could use to profile 
 the Tomcat instance Solr is deployed in and to figure out what is happening - 
 whether it is a connection keep-alive issue, some wrong queries, a bad schema 
 configuration, etc.?
 
 Sorry about the layman's language..
 
 Thanks in advance for all the responses!
 Madu
 



Re: Field Collapsing - disable cache

2009-12-23 Thread rob


I'm currently trying to patch against trunk, using SOLR-236.patch from 
18/12/2009, but I'm getting the following error...


[...@intelcompute solr]$ patch -p0 < SOLR-236.patch
patching file src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file src/test/test-files/solr/conf/schema-fieldcollapse.xml
patching file src/test/test-files/solr/conf/solrconfig.xml
patching file src/test/test-files/fieldcollapse/testResponse.xml
can't find file to patch at input line 787
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--
|
|Property changes on: src/test/test-files/fieldcollapse/testResponse.xml
|___
|Added: svn:keywords
|   + Date Author Id Revision HeadURL
|Added: svn:eol-style
|   + native
|
|Index: src/test/org/apache/solr/BaseDistributedSearchTestCase.java
|===================================================================
|--- src/test/org/apache/solr/BaseDistributedSearchTestCase.java (revision 891214)
|+++ src/test/org/apache/solr/BaseDistributedSearchTestCase.java (working copy)
--------------------------
File to patch:



Any suggestions, or should I check out the 1.4 branch instead?

I can't remember what I did last time to get field-collapse-5.patch working 
successfully.






On Tue 22/12/09 22:43 , Lance Norskog goks...@gmail.com wrote:

 To avoid this possible bug, you could change the cache to only have a
 few entries.

 On Tue, Dec 22, 2009 at 6:34 AM, Martijn v Groningen wrote:
  In the latest patch some changes were made on the configuration side,
  but if you add the CollapseComponent to the conf no field collapse
  cache should be enabled. If not let me know.
 
  Martijn
 
  2009/12/22  :
 
   On Tue 22/12/09 12:28 , Martijn v Groningen wrote:
    Hi Rob,
    What patch are you actually using from SOLR-236?
    Martijn
 
   I've tried both: the whole fieldCollapsing tag, and just the
   fieldCollapseCache tag inside it. Both cause the error.
   I guess I can just set size, initialSize, and autowarmCount to 0??
 
   On Tue 22/12/09 11:17 , Toby Cole wrote:
    Which elements did you comment out? It could be the case that you
    need to get rid of the entire fieldCollapsing element, not just the
    fieldCollapsingCache element.
    (Disclaimer: I've not used field collapsing in anger before :)
    Toby.
 
    On 22 Dec 2009, at 11:09,  wrote:
 
     That's what I assumed, but I'm getting the following error with it
     commented out:
 
     MESSAGE null java.lang.NullPointerException
       at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.createDocumentCollapseResult(AbstractDocumentCollapser.java:276)
       at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.executeCollapse(AbstractDocumentCollapser.java:249)
       at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:172)
       at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:173)
       at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
       at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
       at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
       at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
       [remaining servlet-container frames trimmed]

Re: Field Collapsing - disable cache

2009-12-23 Thread rob

Sorry, that was when trying to patch the 1.4 branch.

Attempting to patch the trunk gives...

patching file src/test/test-files/fieldcollapse/testResponse.xml
patching file src/test/org/apache/solr/BaseDistributedSearchTestCase.java
Hunk #2 FAILED at 502.
1 out of 2 hunks FAILED -- saving rejects to file 
src/test/org/apache/solr/BaseDistributedSearchTestCase.java.rej
patching file 
src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java


btw, when is trunk actually updated?





On Wed 23/12/09 11:53 , r...@intelcompute.com wrote:

 I'm currently trying to patch against trunk, using SOLR-236.patch
 from 18/12/2009, but I'm getting the following error...

 [ solr]$ patch -p0 < SOLR-236.patch
 patching file src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
 patching file src/test/test-files/solr/conf/schema-fieldcollapse.xml
 patching file src/test/test-files/solr/conf/solrconfig.xml
 patching file src/test/test-files/fieldcollapse/testResponse.xml
 can't find file to patch at input line 787
 Perhaps you used the wrong -p or --strip option?
 File to patch:

 Any suggestions, or should I check out the 1.4 branch instead?
 I can't remember what I did last time to get field-collapse-5.patch
 working successfully.

 [patch output details and earlier quoted messages trimmed]

Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
The latest SOLR-236.patch is for the trunk; I have updated the patch so it
should now apply without conflicts. If I remember correctly the latest
field-collapse-5.patch should work for 1.4, but it doesn't for the trunk.

2009/12/23 r...@intelcompute.com:

 Sorry, that was when trying to patch the 1.4 branch.

 Attempting to patch the trunk gives...

 patching file src/test/test-files/fieldcollapse/testResponse.xml
 patching file src/test/org/apache/solr/BaseDistributedSearchTestCase.java
 Hunk #2 FAILED at 502.
 1 out of 2 hunks FAILED -- saving rejects to file
 src/test/org/apache/solr/BaseDistributedSearchTestCase.java.rej
 patching file
 src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java

 btw, when is trunk actually updated?

 [earlier quoted messages trimmed]

Re: Field Collapsing - disable cache

2009-12-23 Thread rob

Thanks, that latest update to the patch works fine now.



On Wed 23/12/09 13:13 , Martijn v Groningen martijn.is.h...@gmail.com wrote:

 The latest SOLR-236.patch is for the trunk; I have updated the patch so it
 should now apply without conflicts. If I remember correctly the latest
 field-collapse-5.patch should work for 1.4, but it doesn't for the trunk.

 [earlier quoted messages trimmed]

Re: Field Collapsing - disable cache

2009-12-23 Thread rob

Still seeing the same error when trying to comment out the fieldCollapsing 
block, or even just the fieldCollapseCache block inside it, to disable the cache.

Fresh trunk and latest patch.



message null java.lang.NullPointerException
  at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.createDocumentCollapseResult(AbstractDocumentCollapser.java:276)
  at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.executeCollapse(AbstractDocumentCollapser.java:249)
  at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:172)
  at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:173)
  at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
  at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
  at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
  at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
  at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
  at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
  at java.lang.Thread.run(Thread.java:636)







On Wed 23/12/09 13:26 , r...@intelcompute.com wrote:

 Thanks, that latest update to the patch works fine now.

 [earlier quoted messages trimmed]
Re: SCHEMA-INDEX-MISMATCH

2009-12-23 Thread Yonik Seeley
On Tue, Dec 22, 2009 at 11:41 PM, johnson hong
hong.jinch...@goodhope.net wrote:
 I use Lucene's NumericField to index the price field, and query with
 solr.TrieDoubleField.
 When I use price:[1 TO 5000] to search, it returns all results whose
 price is between 1 and 5000,
 but the price value returned is:
 ERROR:SCHEMA-INDEX-MISMATCH,stringValue=2000.0</str>
 Anybody know why?

The index format is compatible, but Solr stores the values in binary.
Use Solr to index, and the issues should go away.
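
For reference, the Solr-side equivalent would be something like this in
schema.xml (the tdouble type is taken from the 1.4 example schema; the field
name is a placeholder):

---8<---
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>
<field name="price" type="tdouble" indexed="true" stored="true"/>
---8<---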

-Yonik
http://www.lucidimagination.com


RE: Implementing Autocomplete/Query Suggest using Solr

2009-12-23 Thread Ankit Bhatnagar
In addition to what Shalin said, you could use the TermsComponent.
However, you will be better off using the Dismax request handler.
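
For the TermsComponent route, prefix completion is a single request - for
example, assuming the /terms handler wired to the TermsComponent as in the
1.4 example solrconfig.xml (the field name is a placeholder):

---8<---
http://localhost:8983/solr/terms?terms.fl=name&terms.prefix=ka&terms.limit=10
---8<---

which returns the top indexed terms in the name field that start with "ka".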

Ankit

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Wednesday, December 23, 2009 2:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Implementing Autocomplete/Query Suggest using Solr

On Wed, Dec 23, 2009 at 6:14 AM, Prasanna R plistma...@gmail.com wrote:


  I am curious how an approach that simply uses the wildcard query
 functionality on an indexed field would work.


It works fine as long as the terms are not repeated across documents.


 While Solr does not support wildcard queries out of the box currently, it
 will definitely be included in the future and I believe the edismax parser
 already lets you do that.


Solr supports prefix queries and there's a reverse wild card filter in trunk
too.

We do auto-complete through prefix searches on shingles.

-- 
Regards,
Shalin Shekhar Mangar.


Solr 1.4 - stats page slow

2009-12-23 Thread Stephen Weiss
We've been using Solr 1.4 for a few days now and one slight downside  
we've noticed is the stats page comes up very slowly for some reason -  
sometimes more than 10 seconds.  We call this programmatically to  
retrieve the last commit date so that we can keep users from  
committing too frequently.  This means some of our administration  
pages are now taking a long time to load.  Is there anything we should  
be doing to ensure that this page comes up quickly?  I see some notes  
on this back in October but it looks like that update should already  
be applied by now.  Or, better yet, is there now a better way to just  
retrieve the last commit date from Solr without pulling all of the  
statistics?


Thanks in advance.

--
Steve


SolrCore has a large number of SolrIndexSearchers retained in infoRegistry

2009-12-23 Thread Jon Poulton
Hi there,
I'm looking at some problems we are having with some legacy code which uses 
Solr (1.3) under the hood. We seem to get repeated OutOfMemory errors on a 
24-48 hour basis searching a relatively small index of 70,000 documents. This 
may be an error in the way Solr is configured, or it may be a problem with the 
way it's being invoked - I don't know, as I'm fairly new to Solr.

The OutOfMemoryError is an unusual one:

java.lang.OutOfMemoryError: GC overhead limit exceeded

There are also a few stack traces in the solr logs. The first is:

 SEVERE: Error during auto-warming of 
key:root_show_stop_range:[2009-12-22T08:59:37.254 TO 
*]:java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:3209)
        at java.lang.String.<init>(String.java:215)
        at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
        at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:167)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:251)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218)
        at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:55)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermDocs.termDocs(MultiSegmentReader.java:609)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermDocs.next(MultiSegmentReader.java:560)
        at org.apache.lucene.search.RangeFilter.getDocIdSet(RangeFilter.java:268)
        at org.apache.lucene.search.ConstantScoreQuery$ConstantScorer.<init>(ConstantScoreQuery.java:116)
        at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:81)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:131)
        at org.apache.lucene.search.Searcher.search(Searcher.java:126)
        at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:601)
        at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:507)
        at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:482)
        at org.apache.solr.search.SolrIndexSearcher$1.regenerateItem(SolrIndexSearcher.java:224)
        at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
        at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1518)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1018)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

I took a memory dump of the running application, and found that vast amounts of 
memory (over 400MB) were being consumed by nine or ten large SolrIndexSearcher 
objects, references to which are held within a LinkedHashMap in SolrCore 
called infoRegistry.

I had a quick look at the Solr 1.3.0 source code to try and figure out what was 
going wrong and whether SolrCore was being used incorrectly in our own source. 
It looks like whenever a new Searcher is created, it registers itself with 
SolrCore, and this registration places a reference to the Searcher in a 
LinkedHashMap (the infoRegistry).

What is puzzling me is why so many SolrIndexSearcher objects are being created, 
and what the conditions are for their creation and removal. The code I can see 
in our own product does not use SolrIndexSearcher directly; it simply makes 
calls to SolrCore's execute() method, so I would normally expect that class 
to manage the Searcher life cycle internally.

Does anyone have any idea as to what may be going on here? I have had a look at 
the solrconfig.xml file, but it does not seem to depart significantly from the 
defaults provided.

Thanks in advance for any help.

Jon


Re: Questions on compound file format

2009-12-23 Thread KennyN


Yonik Seeley wrote:
 
 Compound was a *lot* slower indexing in past versions of Lucene...
 

I've noticed a difference with Lucene 2.4.1 and Solr 1.3 of a ~40% speed
improvement on a RHEL 5.1 system, while processing a fresh index of ~500,000
files, from turning off the compound file format.

However, if you process a lot of files, you will inevitably get the
FileNotFound (too many open files) exception.
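
Turning it off is a one-line change in solrconfig.xml (inside the
indexDefaults / mainIndex sections):

---8<---
<useCompoundFile>false</useCompoundFile>
---8<---

Just be prepared to raise the open-file limit (ulimit -n) for the Solr
process if you do.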

-Kenny



Re: SolrCore has a large number of SolrIndexSearchers retained in infoRegistry

2009-12-23 Thread Grant Ingersoll

On Dec 23, 2009, at 10:15 AM, Jon Poulton wrote:

 Hi there,
 I'm looking at some problems we are having with some legacy code which uses 
 Solr (1.3) under the hood. We seem to get repeated OutOfMemory errors on a 
 24-48 hour basis searching a relatively small index of 70,000 documents. This 
 may be an error in the way Solr is configured, or it may be a problem with 
 the way it's being invoked, I don't know, as I'm fairly new to Solr.
 
 The OutOfMemoryError is an unusual one:
 
 java.lang.OutOfMemoryError: GC overhead limit exceeded
 
 There are also a few stack traces in the solr logs. The first is:
 
 SEVERE: Error during auto-warming of 
 key:root_show_stop_range:[2009-12-22T08:59:37.254 TO 
 *]:java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOfRange(Arrays.java:3209)
        at java.lang.String.<init>(String.java:215)
        at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
        at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:167)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:251)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218)
        at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:55)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermDocs.termDocs(MultiSegmentReader.java:609)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermDocs.next(MultiSegmentReader.java:560)
        at org.apache.lucene.search.RangeFilter.getDocIdSet(RangeFilter.java:268)
        at org.apache.lucene.search.ConstantScoreQuery$ConstantScorer.<init>(ConstantScoreQuery.java:116)
        at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:81)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:131)
        at org.apache.lucene.search.Searcher.search(Searcher.java:126)
        at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:601)
        at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:507)
        at org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:482)
        at org.apache.solr.search.SolrIndexSearcher$1.regenerateItem(SolrIndexSearcher.java:224)
        at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
        at org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1518)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1018)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
 
 I took a memory dump of the running application, and found that vast amounts 
 of memory (over 400MB) was being consumed by nine or ten large 
 SolrIndexSearcher objects; references to which are being held within a 
 LinkedHashMap in SolrCore called infoRegistry.
 
 I had a quick look at the Solr 1.3.0 source code to try and figure out what 
 was going wrong and whether SolrCore was being used incorrectly in our own 
 source. It looks like whenever a new Searcher is created, it registers 
 itself with SolrCore, and this registration places a reference to the 
 Searcher in a LinkedHashMap (the infoRegistry).
 
 What is puzzling me is why so many SolrIndexSearcher objects are being 
 created, and what the conditions are for their creation and removal. The code 
 I can see in our own product does not use SolrIndexSearcher directly, it 
 simply makes calls to execute on SolrCore; so I would normally expect that 
 that class would be managing the Searcher life cycle internally.

It sounds like you are either using embedded mode or you have some custom code. 
 Are you sure you are releasing your resources correctly?
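
For reference, when driving SolrCore directly the searchers are reference
counted, and every reference you take has to be released - a minimal sketch
against the 1.3 API:

---8<---
// assumes an org.apache.solr.core.SolrCore instance named core
RefCounted<SolrIndexSearcher> ref = core.getSearcher();
try {
  SolrIndexSearcher searcher = ref.get();
  // ... run the search ...
} finally {
  ref.decref();  // skipping this keeps old searchers (and their caches) alive
}
---8<---

The same goes for SolrQueryRequest objects, which must be close()d after use.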

 
 Does anyone have any idea as to what may be going on here? I have had a look 
 at the solrconfig.xml file, but it does not seem to depart significantly from 
 the defaults provided.
 
 Thanks in advance for any help.
 
 Jon



Re: query parsing ( expansion ) in solr

2009-12-23 Thread gudumba l
Hi,
 I have explored the DisMaxRequestHandler. It could serve some
of my purposes, but not all.
1) It seems we have to decide on the alternative field list beforehand
and declare it in the config XML. But the field list for which synonyms
are to be considered is not fixed (at least not in the sense of declaring
it manually in the XML); it gets updated frequently depending on the
indexed fields. In any case, if the list is too big this approach is hard
to follow.

2) I have another issue too. We could mention city, place, town in the
dismax declaration, but what if there is another list of synonyms, e.g.
if the query is organisation:xyz, for which I would like to convert the
query to
  organisation:xyz OR company:xyz OR institution:xyz.

   As far as I have explored, it is not possible to link city and
organisation to their corresponding synonyms separately; we can only
declare one default set of field names to be searched.
 If I am wrong at any point, please let me know.
  Any other suggestions?
Thanks.


2009/12/22 AHMET ARSLAN iori...@yahoo.com:

  Hello All,
              I have been trying to find out the right place to parse
  the query submitted. To be brief, I need to expand the query. For
  example, let the query be
         city:paris
  then I would like to expand the query as follows:
      city:paris OR place:paris OR town:paris

       I guess the synonym support is provided only for values but not
  field names.

 Why not use DisMaxRequestHandler?
 ...search for the individual words across several fields...
 http://wiki.apache.org/solr/DisMaxRequestHandler






Re: query parsing ( expansion ) in solr

2009-12-23 Thread AHMET ARSLAN
 Hi,
      I have explored the DisMaxRequestHandler. It could serve some of my
 purposes, but not all.
 1) It seems we have to decide on the alternative field list beforehand
 and declare it in the config XML. But the field list for which synonyms
 are to be considered is not fixed (at least not in the sense of declaring
 it manually in the XML); it gets updated frequently depending on the
 indexed fields. In any case, if the list is too big this approach is hard
 to follow.

 2) I have another issue too. We could mention city, place, town in the
 dismax declaration, but what if there is another list of synonyms, e.g.
 if the query is organisation:xyz, for which I would like to convert the
 query to
       organisation:xyz OR company:xyz OR institution:xyz.

    As far as I have explored, it is not possible to link city and
 organisation to their corresponding synonyms separately; we can only
 declare one default set of field names to be searched.
      If I am wrong at any point, please let me know.
       Any other suggestions?
 Thanks.

If you want field synonyms handled separately, then you can extend 
org.apache.solr.handler.component.SearchHandler and override:

public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {

  final String q = req.getParams().get(CommonParams.Q);

  // process the incoming query with string operations,
  // e.g. if (q.startsWith("organisation:")) { ... expand it ... }
  String expandedQuery = expand(q);

  ModifiableSolrParams solrParams = new ModifiableSolrParams(req.getParams());
  solrParams.set(CommonParams.Q, expandedQuery);
  req.setParams(solrParams);

  super.handleRequestBody(req, rsp);
}

Then register this new request handler in solrconfig.xml and use it.
Does this approach serve your purposes?
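
For example (the handler name and class here are placeholders):

---8<---
<requestHandler name="/expandedsearch"
                class="com.example.ExpandingSearchHandler"/>
---8<---

and then send queries to it (e.g. with qt=/expandedsearch).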





Re: query parsing ( expansion ) in solr

2009-12-23 Thread gudumba l
Hello,
 Thanks. This would absolutely serve. I thought of doing it in the
query-parser part, which I mentioned in my first mail, but if the query is
a complex one it becomes complicated. That's why I wanted to know whether
there is any other way, similar to the second point in my first mail:

"2) I could first pass the incoming query string to a default parser
provided by Solr and then retrieve all the Terms (and then add synonym
terms) by calling Query.extractTerms() on the returned Query object, but I
am unable to work out the relations among the Terms, e.g. whether it's
Term1 OR Term2 AND Term3, or Term1 AND Term2 AND Term3, or something else."

Thanks.

2009/12/23 AHMET ARSLAN iori...@yahoo.com:

 If you want field synonyms handled separately, then you can extend
 org.apache.solr.handler.component.SearchHandler and override
 handleRequestBody() - see the sketch in the previous message.

 Then register this new request handler in solrconfig.xml and use it.
 Does this approach serve your purposes?






Re: highlighting and external storage

2009-12-23 Thread Erik Hatcher
Thomas - this is a common need that deserves some implementation.  I  
have a personal interest in seeing this implemented and will do so  
myself eventually if no one beats me to it.


There's a Solr JIRA issue to track this:  
https://issues.apache.org/jira/browse/SOLR-1397

Erik

On Dec 22, 2009, at 12:06 PM, Thomas Koch wrote:

 Hi,

 I'm working on a news crawler with continuous indexing. Thus indexes are
 merged frequently, and older documents aren't as important as recent ones.

 Therefore I'd like to store the fulltext of documents in an external
 storage (HBase?) so that merging of indexes isn't as IO-intensive. This
 would give me the additional benefit that I could selectively delete the
 fulltext of older articles when running out of disc space while keeping
 the url of the document in the index.

 Do you know whether something like this would be possible?

 Best regards,

 Thomas Koch, http://www.koch.ro



NOT highlighting synonym

2009-12-23 Thread darniz

Hi guys,
I have a requirement where we don't want to highlight synonym matches.
For example, I search for caddy and I don't want to highlight a matched
synonym like cadillac.
Looking at the highlighting parameters I didn't find any support for this.
Can anyone offer any advice?

darniz



Re: NOT highlighting synonym

2009-12-23 Thread Erik Hatcher


On Dec 23, 2009, at 2:26 PM, darniz wrote:

 I have a requirement where we don't want to highlight synonym matches.
 For example, I search for caddy and I don't want to highlight a matched
 synonym like cadillac.
 Looking at the highlighting parameters I didn't find any support for this.
 Can anyone offer any advice?


You can control what gets highlighted by which analyzer is used. You may
need a different field for highlighting than the one you use for searching
in this case - but you can just create another field type without the
synonym filter in it and use that field for highlighting.
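
A rough sketch of that setup (field and type names are placeholders):

---8<---
<fieldType name="text_syn" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_plain" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="body" type="text_syn" indexed="true" stored="false"/>
<field name="body_hl" type="text_plain" indexed="true" stored="true"/>
<copyField source="body" dest="body_hl"/>
---8<---

and then query with something like q=body:caddy&hl=true&hl.fl=body_hl.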


Erik



Re: Implementing Autocomplete/Query Suggest using Solr

2009-12-23 Thread Prasanna R
On Tue, Dec 22, 2009 at 11:49 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:


   I am curious how an approach that simply uses the wildcard query
  functionality on an indexed field would work.


 It works fine as long as the terms are not repeated across documents.


 I do not follow why terms repeating across documents would be an issue. As
long as you can differentiate between multiple matches and rank them
properly, it should work, right?



  While Solr does not support
  wildcard queries out of the box currently, it will definitely be included
  in
  the future and I believe the edismax parser already lets you do that.


 Solr supports prefix queries and there's a reverse wild card filter in
 trunk
 too.


Are you referring to facet prefix queries as prefix queries? I looked at the
reversed wild card filter but think that regular wild card matching, as
opposed to leading wild card matching, is better suited to an
auto-completion feature.


 We do auto-complete through prefix searches on shingles.


Just to confirm, do you mean using the EdgeNGram filter to produce letter
ngrams of the tokens in the chosen field?

Assuming the regular wild card query would also work, any thoughts on how it
compares to the EdgeNGram approach in terms of added indexing cost,
performance, etc.?
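
For concreteness, the kind of fieldType I have in mind would be something
like this (a sketch using Solr 1.4 factory names; the gram sizes are
arbitrary):

---8<---
<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
---8<---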

Thanks a lot for your valuable inputs/comments.

Prasanna.


Re: Field Collapsing - disable cache

2009-12-23 Thread Martijn v Groningen
How have you configured field collapsing?
I have field collapsing configured without caching like this:

<searchComponent name="collapse"
                 class="org.apache.solr.handler.component.CollapseComponent"/>

and with caching like this:

<searchComponent name="collapse"
                 class="org.apache.solr.handler.component.CollapseComponent">

  <fieldCollapseCache
      class="solr.FastLRUCache"
      size="512"
      initialSize="512"
      autowarmCount="128"/>

</searchComponent>

In both situations I don't get an NPE. Also, the line (in the patch) where
the exception occurs seems an unlikely place for an NPE:

if (fieldCollapseCache != null) {
  fieldCollapseCache.put(currentCacheKey, new CacheValue(result,
      collectors, collapseContext)); // line 276
}

Does the exception occur immediately after the first search?

Martijn

2009/12/23 r...@intelcompute.com:

 Still seeing the same error when trying to comment out the fieldCollapsing
 block, or even just the fieldCollapseCache block inside it, to disable the
 cache.

 Fresh trunk and latest patch.

 message null java.lang.NullPointerException
   at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.createDocumentCollapseResult(AbstractDocumentCollapser.java:276)
   at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.executeCollapse(AbstractDocumentCollapser.java:249)
   at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:172)
   at org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:173)
   at org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:127)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   [remaining frames identical to the trace earlier in the thread; trimmed]

 [earlier quoted messages trimmed]
Re: Field Collapsing - disable cache

2009-12-23 Thread rob

Here's the solrconfig, with a few fields removed from the request (which 
shouldn't make any difference):

http://www.intelcompute.com/solrconfig.xml

Perhaps it's an old solrconfig to start with? Though it came no later than a 1.3 
standard config file.




On Wed 23/12/09 21:33 , Martijn v Groningen martijn.is.h...@gmail.com wrote:

 How have you configured field collapsing?
 I have field collapsing configured without caching like this:
 and with caching like this:
 In both situations I don't get an NPE. Also, the line (in the patch)
 where the exception occurs seems an unlikely place for an NPE:
 if (fieldCollapseCache != null) {
   fieldCollapseCache.put(currentCacheKey,
       new CacheValue(result, collectors, collapseContext)); // line 276
 }
 Does the exception occur immediately after the first search?
 Martijn
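
 (The inline config snippets above were stripped by the list archive.
 As a rough sketch, assuming the SOLR-236 patch follows the standard
 Solr cache attribute syntax, a fieldCollapsing block with the cache
 enabled might look like:

   <fieldCollapsing>
     <!-- optional cache of collapse results; remove this element
          to run field collapsing without caching -->
     <fieldCollapseCache
       class="solr.FastLRUCache"
       size="512"
       initialSize="512"
       autowarmCount="128"/>
   </fieldCollapsing>

 Commenting out just the inner fieldCollapseCache element is the
 "without caching" variant.)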
 2009/12/23  :
 
  Still seeing the same error when trying to comment out the
 fieldCollapsing block, or even just the fieldCollapseCache block
 inside it to disable the cache.
 
  fresh trunk and latest patch.
 
 
 
  message null java.lang.NullPointerException
      at org.apache.solr.search.fieldcollapse.AbstractDocumentCollapser.createDocumentCollapseResult(AbstractDocumentCollapser.java:276)
  [snip: rest of stack trace identical to the one shown earlier]
 
 
 
 
 
 
 
  On Wed 23/12/09 13:26 ,  wrote:
 
  Thanks, that latest update to the patch works fine now.
  On Wed 23/12/09 13:13 , Martijn v Groningen  wrote:
    Latest SOLR-236.patch is for the trunk; I have updated the latest
    patch so it should apply now without conflicts. If I remember
    correctly the latest field-collapse-5.patch should work for 1.4,
    but it doesn't for the trunk.
   2009/12/23  :
   
Sorry, that was when trying to patch the 1.4 branch
   
attempting to patch the trunk gives...
   
patching file
 src/test/test-files/fieldcollapse/testResponse.xml
patching file
   src/test/org/apache/solr/BaseDistributedSearchTestCase.java
Hunk #2 FAILED at 502.
1 out of 2 hunks FAILED -- saving rejects to file
   src/test/org/apache/solr/BaseDistributedSearchTestCase.java.rej
    patching file src/test/org/apache/solr/search/fieldcollapse/FieldCollapsingIntegrationTest.java
   
btw, when is trunk actually updated?
   
   
   
   
   
On Wed 23/12/09 11:53 ,  wrote:
   
    I'm currently trying to patch against trunk, using
  SOLR-236.patch
from 18/12/2009 but getting the following error...
    [ solr]$ patch -p0 < SOLR-236.patch
patching file
src/test/test-files/solr/conf/solrconfig-fieldcollapse.xml
patching file
   src/test/test-files/solr/conf/schema-fieldcollapse.xml
patching file src/test/test-files/solr/conf/solrconfig.xml
patching file
  src/test/test-files/fieldcollapse/testResponse.xml
can't find file to patch at input line 787
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--
|
|Property changes on:

Re: SCHEMA-INDEX-MISMATCH

2009-12-23 Thread johnson hong



Yonik Seeley-2 wrote:
 
 On Tue, Dec 22, 2009 at 11:41 PM, johnson hong
 hong.jinch...@goodhope.net wrote:
 I use Lucene's NumericField to index the price field, and query with
 solr.TrieDoubleField.
 When I use price:[1 TO 5000] to search, it returns all results whose
 price is between 1 and 5000,
 but the price value returned is:
 <str name="price">ERROR:SCHEMA-INDEX-MISMATCH,stringValue=2000.0</str>
 Anybody know why?
 
 The index format is compatible, but solr stores the values in binary.
 Use Solr to index, and the issues should go away.
 
 -Yonik
 http://www.lucidimagination.com
 
 

Thank you, Yonik.
I don't know why Solr stores NumericField values in binary; it may consume
less storage, but take more time to index.
Then I must copy the code from TrieField to index the way Solr does.
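
For reference, the Solr-side route Yonik suggests would look roughly like
this against the stock 1.4 example schema (field names here are
illustrative): declare the price field with a trie type in schema.xml,

  <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
             omitNorms="true" positionIncrementGap="0"/>
  <field name="price" type="tdouble" indexed="true" stored="true"/>

and post documents through Solr's update handler instead of writing the
index with raw Lucene:

  <add>
    <doc>
      <field name="id">1</field>
      <field name="price">2000.0</field>
    </doc>
  </add>

That way the stored value is written in the binary form TrieDoubleField
expects, and range queries like price:[1 TO 5000] still work.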


-- 
View this message in context: 
http://old.nabble.com/get-SCHEMA-INDEX-MISMATCH-when-doing-range-query-on-lucene%27s-index.-tp26897605p26909633.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Implementing Autocomplete/Query Suggest using Solr

2009-12-23 Thread Shalin Shekhar Mangar
On Thu, Dec 24, 2009 at 2:39 AM, Prasanna R plistma...@gmail.com wrote:

 On Tue, Dec 22, 2009 at 11:49 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 
I am curious how an approach that simply uses the wildcard query
   functionality on an indexed field would work.
 
 
  It works fine as long as the terms are not repeated across documents.
 
 
  I do not follow why terms repeating across documents would be an issue. As
 long as you can differentiate between multiple matches and rank them
 properly, it should work, right?


A prefix search would return documents. If a field X being used for
auto-complete has the same value in two documents then the user will see the
same value being suggested twice.



 
   While Solr does not support
   wildcard queries out of the box currently, it will definitely be
   included in the future and I believe the edismax parser already
   lets you do that.
 
 
  Solr supports prefix queries and there's a reverse wild card filter in
  trunk
  too.
 

 Are you referring to facet prefix queries as prefix queries? I looked at
 the reversed wildcard filter but think that regular wildcard matching, as
 opposed to leading wildcard matching, is better suited for an
 auto-completion feature.


No, I'm talking about regular prefix search, e.g. field:val*



  We do auto-complete through prefix searches on shingles.
 

 Just to confirm, do you mean using EdgeNgram filter to produce letter
 ngrams
 of the tokens in the chosen field?


No, I'm talking about prefix search on tokens produced by a ShingleFilter.
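
As an illustration (not the exact schema we use; analyzer choices are
assumed), a shingle field type for this could look like:

  <fieldType name="shingleText" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- emit word n-grams ("shingles") up to 3 words long,
           plus the single words themselves -->
      <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
              outputUnigrams="true"/>
    </analyzer>
  </fieldType>

Suggestions then come from an ordinary prefix query against that field,
e.g. suggest:new\ yo* to complete "new york" (the space is escaped so the
whole shingle is treated as one term).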


 Assuming the regular wild card query would also work, any thoughts on how
 it
 compares to the EdgeNGram approach in terms of added indexing cost,
 performance, etc.?


With EdgeNGram, you can do phrase (exact) matches, which are faster. But if
you have a big corpus of terms then EdgeNGramFilter can produce too many
tokens. In some places we are using phrase search on n-grams; in other
places (with more terms) we opted for prefix search on shingles.
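
For comparison, a sketch of the EdgeNGram variant (field type name and
gram sizes are illustrative):

  <fieldType name="edgyText" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- index every leading prefix of the value, 1 to 25 chars -->
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
              maxGramSize="25" side="front"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

The user's partial input then matches the indexed grams exactly, which is
the fast phrase/exact match; the cost is the extra grams per value, which
is what blows the index up on a big corpus.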

-- 
Regards,
Shalin Shekhar Mangar.