Re: Auto Suggest

2010-09-03 Thread Luke Tebbs

What about if you do something like this? -

facet=true&facet.mincount=1&q=apple&facet.limit=10&facet.prefix=mou&facet.field=term_suggest&qt=basic&wt=javabin&rows=0&version=1
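A minimal sketch of how a client could build the request above from what the user has typed so far. It assumes that term_suggest is a tokenized field whose indexed terms are single words; the function name and the way the input is split are illustrative, not part of Solr.

```python
from urllib.parse import urlencode


def build_suggest_params(typed, field="term_suggest", limit=10):
    """Completed words become the query; the word being typed becomes facet.prefix."""
    words = typed.lower().split()
    completed, prefix = words[:-1], (words[-1] if words else "")
    return {
        "facet": "true",
        "facet.mincount": 1,
        # With no completed word yet, match everything and rely on the prefix alone.
        "q": " ".join(completed) if completed else "*:*",
        "facet.limit": limit,
        "facet.prefix": prefix,
        "facet.field": field,
        "rows": 0,
    }


if __name__ == "__main__":
    print(urlencode(build_suggest_params("apple mou")))
```

Because the facet counts are computed over the documents matching q (and any fq added to the request), the suggestions stay sensitive to the current query and filters.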


Jason Rutherglen wrote:

To clarify, the query analyzer returns that.  Variations such as
"apple mou" also do not return anything.  Maybe Jay can comment and
then we can amend the article?

On Fri, Sep 3, 2010 at 6:12 AM, Jason Rutherglen
 wrote:
  

Analysis returns "app mou".

On Thu, Sep 2, 2010 at 6:12 PM, Lance Norskog  wrote:


What does analysis.jsp show?

On Thu, Sep 2, 2010 at 5:53 AM, Jason Rutherglen
 wrote:
  

I'm having a different issue with the EdgeNGram technique described
here: 
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

That is, one-word queries such as q=app on the query_text field work fine,
however "q=app mou" does not.  Why would this be, or is there a
configuration that could be missing?

On Wed, Sep 1, 2010 at 3:53 PM, Eric Grobler  wrote:


Thanks for your feedback Robert,

I will try that and see how Solr performs on my data - I think I will create
a field that contains only important key/product terms from the text.

Regards
Johan

On Wed, Sep 1, 2010 at 9:12 PM, Robert Petersen  wrote:

  

We don't have that many, just a hundred thousand, and solr response
times (since the index's docs are small and not complex) are logged as
typically 1 ms if not 0 ms.  It's funny but sometimes it is so fast no
milliseconds have elapsed.  Incredible if you ask me...  :)

Once you get SOLR to consider the whole phrase as just one big term, the
wildcard is very fast.

-Original Message-
From: Eric Grobler [mailto:impalah...@googlemail.com]
Sent: Wednesday, September 01, 2010 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Auto Suggest

Hi Robert,

Interesting approach, how many documents do you have in Solr?
I have about 2 million and I just wonder if it might be a bit slow.

Regards
Johan

On Wed, Sep 1, 2010 at 7:38 PM, Robert Petersen 
wrote:



I do this by replacing the spaces with a '%' in a separate search field
which is not parsed nor tokenized and then you can wildcard across the
whole phrase like you want and the spaces don't mess you up.  Just store
the original phrase with spaces in a separate field for returning to the
front end for display.
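A rough sketch of the approach described above. The field names suggest_key and suggest_display are hypothetical, and the lowercasing is an added assumption (wildcard terms are typically not analyzed, so the index side and the query side have to be normalized the same way by hand).

```python
def to_search_key(phrase, filler="%"):
    """Collapse whitespace and join the words with the filler character."""
    return filler.join(phrase.lower().split())


def make_doc(phrase):
    """One untokenized key for matching, plus the original phrase for display."""
    return {"suggest_key": to_search_key(phrase), "suggest_display": phrase}


def prefix_query(typed, field="suggest_key"):
    """Build a wildcard query across the whole phrase typed so far."""
    return f"{field}:{to_search_key(typed)}*"


def matches(typed, phrases):
    """Local stand-in for what the wildcard query would return."""
    key = to_search_key(typed)
    return [p for p in phrases if to_search_key(p).startswith(key)]


if __name__ == "__main__":
    print(prefix_query("mp3 n"))
    print(matches("mp3 n", ["mp3 player", "mp3 nano", "mp3 sony"]))
```

Characters that are special in the query syntax would still need escaping before the query is sent; that is left out here to keep the sketch short.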

-Original Message-
From: Jazz Globe [mailto:jazzgl...@hotmail.com]
Sent: Wednesday, September 01, 2010 7:33 AM
To: solr-user@lucene.apache.org
Subject: Auto Suggest


Hallo

How would one implement a multiple term auto-suggest feature in Solr
that is filter sensitive?
For example, a user enters :
"mp3"
 and solr might suggest:
 ->   "mp3 player"
 ->   "mp3 nano"
 ->   "mp3 sony"
and then the user starts the second word :
"mp3 n"
and that narrows it down to:
 -> "mp3 nano"

I had a quick look at the Terms Component.
I suppose it just returns term totals for the entire index and cannot be
used with a filter or query?

Thanks
Johan



  


--
Lance Norskog
goks...@gmail.com

  




Re: Localsolr with Dismax => workaround using spatial solr

2010-09-03 Thread Luke Tebbs
I finally managed to get spatial searching working in combination with 
dismax so I'm sending this should anyone else have the same problem.


I gave up using localsolr in the end - one of the resultsets of the two 
it returned was correct (dismax+spatial) but I don't trust this enough 
to depend upon it, don't want the surplus data and want to be able to 
sort by distance.


In the end I switched to using the spatial solr module. On my setup the 
default search is a dismax and the basedOn parameter for the spatial 
query parser is set to dismax. If you search without the spatial 
criteria it gives a dismax search as expected; however, when I try to use 
spatial criteria it reverts to using a normal search despite the 
settings - much the same as with localsolr.


I seem to however have a working solution, albeit a not particularly 
pretty one, using filter queries -



   term
   {!spatial lat=50 long=-3 radius=100}*:*


However, this does not allow you to sort by distance - presumably 
the filter is applied after the sort. Instead, I managed to get this 
working by swapping the two around and explicitly telling the filter to 
use dismax.



   distance asc
   {!spatial lat=50 long=-3 radius=100}*:*
   {!dismax}term
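For anyone wanting to reproduce the two variants above from a client, here is a small sketch that builds both parameter sets. The markup around the values was lost in this archive, so the mapping onto the q, fq and sort parameters is inferred from the description (filter query first, then swapped around); the function names are illustrative only.

```python
from urllib.parse import urlencode


def spatial_clause(lat, lng, radius):
    """Local-params clause for the spatial query parser, matching all documents in range."""
    return f"{{!spatial lat={lat} long={lng} radius={radius}}}*:*"


def filter_only_params(term, lat, lng, radius):
    """First variant: text search as the main query, spatial restriction as a filter."""
    return {"q": term, "fq": spatial_clause(lat, lng, radius)}


def sortable_params(term, lat, lng, radius):
    """Second variant: spatial search as the main query, dismax forced on the filter."""
    return {
        "sort": "distance asc",
        "q": spatial_clause(lat, lng, radius),
        "fq": "{!dismax}" + term,
    }


if __name__ == "__main__":
    print(urlencode(sortable_params("term", 50, -3, 100)))
```

Note that a filter query only restricts the result set and does not contribute to scoring, so in the second variant the dismax relevance ordering is not available; the results are ordered by distance only.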


I don't know if this is suboptimal as the spatial search will likely be 
more expensive than the dismax (I think) but using the 20,000 odd 
records I'm testing with this is still ninja-quick. I'm going to up the 
dataset to a couple of million records and see if it is still acceptably 
fast.


Anyway, does anyone know if there is something I could be doing wrong 
that is causing dismax to not play nice with the two spatial searching 
methods, or is this one for the JIRA?


Luke



Luke Tebbs wrote:

Thanks Dan,
That seems to have moved things forwards, however if I do this I get 
two result sets, presumably one from localsolr and one from dismax.


e.g -


0
116


...


...



Also it seems to explode with a NullPointerException if I dare to try 
and sort by distance -
INFO: [testCore] webapp=/solr path=/select 
params={sort=geo_distance+asc&q=some+phrase&radius=30&long=-0.1262362&qt=geo&wt=javabin&lat=51.5001524&rows=0&version=1} 
status=500 QTime=123

02-Sep-2010 16:44:50 org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
   at org.apache.lucene.spatial.tier.DistanceFieldComparatorSource$DistanceScoreDocLookupComparator.copy(DistanceFieldComparatorSource.java:105)
   at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.collect(TopFieldCollector.java:84)
   at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:292)

   

I don't know if these are related - perhaps it's trying to compare 
against the dismax records that don't have a geo_distance?

Did you get anything like this?

Luke

dan whelan wrote:
 I experienced the same issue. The localsolr site says to configure 
like this:



localsolr
facet
mlt
highlight
debug


but the default solr components are (note the above config is missing 
query):


query
facet
mlt
highlight
stats
debug

I fixed it by doing this instead


localsolr






On 9/2/10 4:15 AM, Luke Tebbs wrote:

Anyone?

I'm really lost as to what to do here... if anyone has any experience 
with this or even ideas of things to try I'd really appreciate your input.

It seems like what I'm trying to do should work but for some reason 
'defType' seems to be ignored.

Thank you

Luke

-------- Original Message --------

Does anyone have any experience with getting dismax to work with a 
geospatial (localsolr) search?


I have the following configuration -


default="true">


dismax
title description^0.5
title description^0.5
0%
0.1





dismax
title description^0.5
title description^0.5
0%
0.1


localsolr facet
mlt
highlight
debug




All of the location searching works fine, as does the normal search, 
but when using the "geo" handler the textual search seems to be 
using the standard search handler and only the title field is searched.


I'm a bit stumped on this one, any help would be greatly appreciated.

Luke









Re: Localsolr with Dismax

2010-09-02 Thread Luke Tebbs

Thanks Dan,
That seems to have moved things forwards, however if I do this I get two 
result sets, presumably one from localsolr and one from dismax.


e.g -


0
116


...


...



Also it seems to explode with a NullPointerException if I dare to try 
and sort by distance - 

INFO: [testCore] webapp=/solr path=/select 
params={sort=geo_distance+asc&q=some+phrase&radius=30&long=-0.1262362&qt=geo&wt=javabin&lat=51.5001524&rows=0&version=1} 
status=500 QTime=123

02-Sep-2010 16:44:50 org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
   at org.apache.lucene.spatial.tier.DistanceFieldComparatorSource$DistanceScoreDocLookupComparator.copy(DistanceFieldComparatorSource.java:105)
   at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.collect(TopFieldCollector.java:84)
   at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:292)

   

I don't know if these are related - perhaps it's trying to compare 
against the dismax records that don't have a geo_distance?

Did you get anything like this?

Luke

dan whelan wrote:
 I experienced the same issue. The localsolr site says to configure 
like this:



localsolr
facet
mlt
highlight
debug


but the default solr components are (note the above config is missing 
query):


query
facet
mlt
highlight
stats
debug

I fixed it by doing this instead


localsolr






On 9/2/10 4:15 AM, Luke Tebbs wrote:

Anyone?

I'm really lost as to what to do here... if anyone has any experience 
with this or even ideas of things to try I'd really appreciate your input.

It seems like what I'm trying to do should work but for some reason 
'defType' seems to be ignored.

Thank you

Luke

-------- Original Message --------

Does anyone have any experience with getting dismax to work with a 
geospatial (localsolr) search?


I have the following configuration -


default="true">


dismax
title description^0.5
title description^0.5
0%
0.1





dismax
title description^0.5
title description^0.5
0%
0.1


localsolr facet
mlt
highlight
debug




All of the location searching works fine, as does the normal search, 
but when using the "geo" handler the textual search seems to be using 
the standard search handler and only the title field is searched.


I'm a bit stumped on this one, any help would be greatly appreciated.

Luke







Re: java.lang.OutOfMemoryError: PermGen space when reopening solr server

2010-09-02 Thread Luke Tebbs

Antonio Calo' wrote:

 On 02/09/2010 8.51, Lance Norskog wrote:

Loading a servlet creates a bunch of classes via reflection. These are
in PermGen and never go away. If you load&unload over and over again,
any PermGen setting will fill up.

I agree. Having looked at all the links suggested by Peter, it seems that 
this exception could be caused by a memory leak. It also seems that 
CGLIB, which manages the .class loading used by Spring, has a big 
issue with this.

I looked into this about 6 months ago (with regard to this problem 
occurring with tomcat + spring, not jetty + solr) and found quite a bit 
of information on the Spring Community Forums. There was (at the time) 
no conclusive answer - it seemed CGLIB is used by just about everything 
(hibernate, AOP, tomcat, jetty) and that it has this latent defect.
Spring developers were basically saying it is due to CGLIB and there was 
nothing that they were able to do about it.


In the end I switched to jetty (for quicker startup) and just accepted 
restarting after every handful of redeploys.


CGLIB mailing lists (http://cglib.sourceforge.net/mail-lists.html) might 
be a good place to start.


Maybe it is just an accident that it happens while opening a new 
solr instance.


I'll investigate the general PermGen fault, but if someone has a 
suggestion on how to close the solr server in a safe manner, you are 
welcome!

Many thanks for your feedback.

Antonio




Localsolr with Dismax

2010-09-02 Thread Luke Tebbs

Anyone?

I'm really lost as to what to do here... if anyone has any experience 
with this or even ideas of things to try I'd really appreciate your input.

It seems like what I'm trying to do should work but for some reason 
'defType' seems to be ignored.

Thank you

Luke

-------- Original Message --------

Does anyone have any experience with getting dismax to work with a 
geospatial (localsolr) search?


I have the following configuration -



  
dismax
title description^0.5
title description^0.5
0%
0.1
  



  
dismax
title description^0.5
title description^0.5
0%
0.1
  
  
localsolr 
facet

mlt
highlight
debug
  



All of the location searching works fine, as does the normal search, but 
when using the "geo" handler the textual search seems to be using the 
standard search handler and only the title field is searched.


I'm a bit stumped on this one, any help would be greatly appreciated.

Luke



Re: java.lang.OutOfMemoryError: PermGen space when reopening solr server

2010-09-02 Thread Luke Tebbs

I agree.

I wasn't proposing it as a fix, merely as a means to reduce the time 
between restarts.



Luke

Lance Norskog wrote:

Loading a servlet creates a bunch of classes via reflection. These are
in PermGen and never go away. If you load&unload over and over again,
any PermGen setting will fill up.

On Wed, Sep 1, 2010 at 2:23 PM, Luke Tebbs  wrote:
  

Have you tried to up the MaxHeapSize?

I tend to run solr and the development instance in a separate jetty (on a
separate port) and actually restart the web server for the dev application
every now and again.
It doesn't take too long if you only have one webapp on jetty - I tend to
use mvn jetty:run on the CLI rather than launch jetty in eclipse. I also use
JRebel to reduce the number of restarts needed during dev.

As for a production instance, should you need to redeploy that often?

Luke

Antonio Calo' wrote:


 Hi guys

I'm facing an error in our production environment with our search
application based on maven with spring + solrj.

When I try to change a class, or try to redeploy/restart an application, I
catch a java.lang.OutOfMemoryError: PermGen

I've tried to understand the cause of this and I've also succeeded in
reproducing this issue in my local development environment by just restarting
jetty several times (I'm using eclipse + maven plugin).

The logs obtained are those:

  [...]
  1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
  /admin/: org.apache.solr.handler.admin.AdminHandlers
  1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
  /admin/ping: PingRequestHandler
  1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
  /debug/dump: solr.DumpRequestHandler
  32656 [Finalizer] INFO org.apache.solr.core.SolrCore - []  CLOSING
  SolrCore org.apache.solr.core.solrc...@1409c28
  17:43:19 ERROR InvertedIndexEngine:124 open -
  java.lang.OutOfMemoryError: PermGen space
  java.lang.RuntimeException: java.lang.OutOfMemoryError: PermGen space
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:579)
   at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
   at com.intellisemantic.intellifacet.resource.invertedIndex.InvertedIndexEngine.open(InvertedIndexEngine.java:113)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1536)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1477)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1409)
  [...]

The exception is always thrown while solr init is performed after a
restart (this is the reason why I'm asking your support ;) )

It seems that while solr is trying to be set up (by [Timer-1]), another
thread ([Finalizer]) is trying to close it. I can see from the Solr code
that this exception is thrown always in the same place: SolrCore.java:1068.
Here there is a comment that says:

    // need to close the searcher here??? we shouldn't have to.
    throw new RuntimeException(th);
  } finally {
    if (newestSearcher != null) {
      newestSearcher.decref();
    }
  }

I'm using the solrj lib in a Spring container, so I'm supposing that Spring
will manage the release of all the singleton classes. Should I do something
else, like force closing solr?

Thanks in advance for your support.

Best regards

Antonio

  





  




Re: java.lang.OutOfMemoryError: PermGen space when reopening solr server

2010-09-01 Thread Luke Tebbs


Have you tried to up the MaxHeapSize?

I tend to run solr and the development instance in a separate jetty (on 
a separate port) and actually restart the web server for the dev 
application every now and again.
It doesn't take too long if you only have one webapp on jetty - I tend 
to use mvn jetty:run on the CLI rather than launch jetty in eclipse. I 
also use JRebel to reduce the number of restarts needed during dev.


As for a production instance, should you need to redeploy that often?

Luke

Antonio Calo' wrote:

 Hi guys

I'm facing an error in our production environment with our search 
application based on maven with spring + solrj.


When I try to change a class, or try to redeploy/restart an 
application, I catch a java.lang.OutOfMemoryError: PermGen


I've tried to understand the cause of this and I've also succeeded in 
reproducing this issue in my local development environment by just 
restarting jetty several times (I'm using eclipse + maven plugin).


The logs obtained are those:

   [...]
   1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
   /admin/: org.apache.solr.handler.admin.AdminHandlers
   1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
   /admin/ping: PingRequestHandler
   1078 [Timer-1] INFO org.apache.solr.core.RequestHandlers - created
   /debug/dump: solr.DumpRequestHandler
   32656 [Finalizer] INFO org.apache.solr.core.SolrCore - []  CLOSING
   SolrCore org.apache.solr.core.solrc...@1409c28
   17:43:19 ERROR InvertedIndexEngine:124 open -
   java.lang.OutOfMemoryError: PermGen space
   java.lang.RuntimeException: java.lang.OutOfMemoryError: PermGen space
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1068)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:579)
   at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
   at com.intellisemantic.intellifacet.resource.invertedIndex.InvertedIndexEngine.open(InvertedIndexEngine.java:113)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1536)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1477)
   at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1409)
   [...]

The exception is always thrown while solr init is performed after a 
restart (this is the reason why I'm asking your support ;) )


It seems that while solr is trying to be set up (by [Timer-1]), 
another thread ([Finalizer]) is trying to close it. I can see from the 
Solr code that this exception is thrown always in the same place: 
SolrCore.java:1068.

Here there is a comment that says:

    // need to close the searcher here??? we shouldn't have to.
    throw new RuntimeException(th);
  } finally {
    if (newestSearcher != null) {
      newestSearcher.decref();
    }
  }

I'm using the solrj lib in a Spring container, so I'm supposing that 
Spring will manage the release of all the singleton classes. Should I 
do something else, like force closing solr?


Thanks in advance for your support.

Best regards

Antonio





Localsolr with Dismax

2010-09-01 Thread Luke Tebbs
Does anyone have any experience with getting dismax to work with a 
geospatial (localsolr) search?


I have the following configuration -


 
   
 dismax
 title description^0.5
 title description^0.5
 0%
 0.1
   
 

 
   
 dismax
 title description^0.5
 title description^0.5
 0%
 0.1
   
   
 localsolr 
 facet

 mlt
 highlight
 debug
   
 


All of the location searching works fine, as does the normal search, but 
when using the "geo" handler the textual search seems to be using the 
standard search handler and only the title field is searched.


I'm a bit stumped on this one, any help would be greatly appreciated.

Luke


Units/Currency range searching

2010-04-14 Thread Luke Tebbs
Hello everyone, 

I'm trying to find a way to search a range using units with on-the-fly
conversion for currency, for instance - 

[* TO 200USD] 

obviously most metrics could just be stored in one unit for the whole
dataset and converted prior to the query but my records could have
entries in any currency and thus would change their real value in
relation to each other as the exchange rates change. 

The only other way I can think to do this is to store two fields, one in
the native currency for the record and one in a homogenised converted
currency purely for searching and periodically update the latter to
allow for the exchange rates changing - but I was hoping someone might
know a better way :)
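To make the two-field idea concrete, here is a minimal sketch. The field names (price_native, price_base), the choice of USD as the common currency, and the exchange rates are all made up for illustration; real rates would come from whatever feed drives the periodic re-index.

```python
from decimal import Decimal

# Illustrative rates only: units of the base currency (USD) per one unit.
RATES_TO_USD = {
    "USD": Decimal("1"),
    "EUR": Decimal("1.30"),
    "GBP": Decimal("1.50"),
}


def to_base(amount, currency, rates=RATES_TO_USD):
    """Convert an amount into the base currency, rounded to cents."""
    return (Decimal(str(amount)) * rates[currency]).quantize(Decimal("0.01"))


def make_doc(doc_id, amount, currency, rates=RATES_TO_USD):
    """Keep the native value for display and a converted value for searching."""
    return {
        "id": doc_id,
        "price_native": str(amount),
        "currency": currency,
        "price_base": str(to_base(amount, currency, rates)),
    }


def range_query(upper, currency, rates=RATES_TO_USD, field="price_base"):
    """Translate something like [* TO 200USD] into a range on the converted field."""
    return f"{field}:[* TO {to_base(upper, currency, rates)}]"


if __name__ == "__main__":
    print(make_doc("1", 100, "EUR"))
    print(range_query(200, "USD"))
```

Only price_base has to be rewritten when the rates change, so the periodic update can recompute that one field from price_native and currency without touching the rest of the record.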

Regards,


Luke