stopwords not working in multicore setup

2011-03-24 Thread Christopher Bottaro
Hello,

I'm running a Solr server with 5 cores.  Three are for English content and
two are for German content.  The default stopwords setup works fine for the
English cores, but the German stopwords aren't working.

The German stopwords file is stopwords-de.txt and resides in the same
directory as stopwords.txt.  The German cores use a different schema (named
schema.page.de.xml) which has the following text field definition:
http://pastie.org/1711866

The stopwords-de.txt file looks like this:  http://pastie.org/1711869

The query I'm doing is this:  q => "title:für"

And it's returning documents with für in the title.  Title is a text field
which should use the stopwords-de.txt, as seen in the aforementioned pastie.
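(For reference, the relevant part of such a field type definition is usually of this general shape -- a generic sketch, not the exact content of the paste:)

  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
    </analyzer>
  </fieldType>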

Any ideas?  Thanks for the help.


Re: [ANNOUNCEMENT] solr-packager 1.0.2 released!

2011-03-24 Thread Otis Gospodnetic
Hi Simone,

This is handy!
Any chance you'll be adding a version with Jetty 7.* ?

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Simone Tripodi 
> To: solr-user@lucene.apache.org
> Sent: Sat, March 19, 2011 8:13:36 PM
> Subject: [ANNOUNCEMENT] solr-packager 1.0.2 released!
> 
> Hi all,
> The Sourcesense Solr Packager team is pleased to announce the
> solr-packager-site-1.0.2 release!
> 
> Solr-Packager is a Maven archetype to package standalone Apache Solr
> embedded in Tomcat,
> brought to you by Sourcesense.
> 
> Changes in this version include:
> 
> 
> Fixed Bugs:
> o Custom context root.  Issue: 4.
> o Slave classifier doesn't get installed in the M2 local repo.  Issue: 5.
> 
> More information at http://sourcesense.github.com/solr-packager/
> 
> 
> Have fun!
> -  Simone Tripodi, on behalf of Sourcesense
> 
> http://people.apache.org/~simonetripodi/
> http://www.99soft.org/
> 


Re: solr on the cloud

2011-03-24 Thread Otis Gospodnetic
Hi,


> I have tried running the sharded Solr with ZooKeeper on a single machine.

> The Solr code is from current trunk. It runs nicely. Can you please point me
> to a page where I can check the status of the Solr-on-the-cloud development
> and available features, apart from http://wiki.apache.org/solr/SolrCloud ?

I'm afraid that's the most comprehensive documentation so far.

> Basically, of high interest is checking out the Map-Reduce for distributed
> faceting, is it even possible with the trunk?

Hm, MR for distributed faceting?  Maybe I missed this... can you point to a 
place that mentions this?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
I believe it's example/solr/lib where it looks for shared libs in a multicore 
setup. But each core can have its own lib dir, usually in core/lib. This is 
referenced in solrconfig.xml; see the example config for the lib directive.
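For reference, the stock example solrconfig.xml pulls in the Solr Cell / Tika 
jars with lib directives roughly like this (paths are relative to the core's 
instanceDir, so adjust them to your own layout):

  <!-- Tika and the other extraction dependencies -->
  <lib dir="../../contrib/extraction/lib" />
  <!-- the Solr Cell request handler jar itself -->
  <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />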

> Well, there lies the problem--it's not JUST the Tika jar.  If it's not one
> thing, it's another, and I'm not even sure which directory Solr actually
> looks in.  In my Solr.xml file I have it use a shared library folder for
> every core.  Since each core will be holding very homologous data, there's
> no need to have any different library modules for each.
> 
> The relevant line in my solr.xml file is the <cores ... sharedLib="lib">
> element.  That is housed in .../example/solr/.  So, does it look
> in .../example/lib or .../example/solr/lib?
> 
> ~Brandon Waterloo
> 
> From: Markus Jelsma [markus.jel...@openindex.io]
> Sent: Thursday, March 24, 2011 11:29 AM
> To: solr-user@lucene.apache.org
> Cc: Brandon Waterloo
> Subject: Re: Multiple Cores with Solr Cell for indexing documents
> 
> Sounds like the Tika jar is not on the class path. Add it to a directory
> where Solr's looking for libs.
> 
> On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
> > Hello everyone,
> > 
> > I've been trying for several hours now to set up Solr with multiple cores
> > with Solr Cell working on each core. The only items being indexed are
> > PDF, DOC, and TXT files (with the possibility of expanding this list,
> > but for now, just assume the only things in the index should be
> > documents).
> > 
> > I never had any problems with Solr Cell when I was using a single core.
> > In fact, I just ran the default installation in example/ and worked from
> > that. However, trying to migrate to multi-core has been a never ending
> > list of problems.
> > 
> > Any time I try to add a document to the index (using the same curl
> > command as I did to add to the single core, of course adding the core
> > name to the request URL-- host/solr/corename/update/extract...), I get
> > HTTP 500 errors due to classes not being found and/or lazy loading
> > errors. I've copied the exact example/lib directory into the cores, and
> > that doesn't work either.
> > 
> > Frankly the only libraries I want are those relevant to indexing files.
> > The less bloat, the better, after all. However, I cannot figure out
> > where to put what files, and why the example installation works
> > perfectly for single-core but not with multi-cores.
> > 
> > Here is an example of the errors I'm receiving:
> > 
> > command prompt> curl
> > "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F
> > "myfile=@test2.txt"
> > 
> > 
> > 
> > 
> > Error 500 
> > 
> > HTTP ERROR: 500 org/apache/tika/exception/TikaException
> > 
> > java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:247)
> > at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
> > at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
> > at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
> > at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
> > at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> > at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> > at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
> > at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> > at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> > at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> > at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> > at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
> > at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> > at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> > at org.mortbay.jetty.Server.handle(Server.java:285)
> > at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> > at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
> > at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
> > at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
> > at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> > at org.mortbay.jetty.bio.SocketConnector$Connection.run(Socke

Re: Fuzzy query using dismax query parser

2011-03-24 Thread cyang2010
OK, I will have to wait until the Solr 3 release then.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727572.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread cyang2010
iorixxx, thanks for your reply.

Another slightly off-topic question.  I looked over all the subclasses
of QParserPlugin.  It seems like most of them provide complementary parsing
to the default lucene/solr parser, except the prefix parser.  What is the
intended usage of that one?  The default lucene/solr parser is able to parse
a prefix query.  Is the intended usage with the dismax parser?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2727566.html
Sent from the Solr - User mailing list archive at Nabble.com.


Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Well, there lies the problem--it's not JUST the Tika jar.  If it's not one 
thing, it's another, and I'm not even sure which directory Solr actually looks 
in.  In my Solr.xml file I have it use a shared library folder for every core.  
Since each core will be holding very homologous data, there's no need to have 
any different library modules for each.

The relevant line in my solr.xml file is the <cores ... sharedLib="lib"> element.  
That is housed in .../example/solr/.  So, does it look in 
.../example/lib or .../example/solr/lib?

~Brandon Waterloo

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: Thursday, March 24, 2011 11:29 AM
To: solr-user@lucene.apache.org
Cc: Brandon Waterloo
Subject: Re: Multiple Cores with Solr Cell for indexing documents

Sounds like the Tika jar is not on the class path. Add it to a directory where
Solr's looking for libs.

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
> Hello everyone,
>
> I've been trying for several hours now to set up Solr with multiple cores
> with Solr Cell working on each core. The only items being indexed are PDF,
> DOC, and TXT files (with the possibility of expanding this list, but for
> now, just assume the only things in the index should be documents).
>
> I never had any problems with Solr Cell when I was using a single core. In
> fact, I just ran the default installation in example/ and worked from
> that. However, trying to migrate to multi-core has been a never ending
> list of problems.
>
> Any time I try to add a document to the index (using the same curl command
> as I did to add to the single core, of course adding the core name to the
> request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors
> due to classes not being found and/or lazy loading errors. I've copied the
> exact example/lib directory into the cores, and that doesn't work either.
>
> Frankly the only libraries I want are those relevant to indexing files. The
> less bloat, the better, after all. However, I cannot figure out where to
> put what files, and why the example installation works perfectly for
> single-core but not with multi-cores.
>
> Here is an example of the errors I'm receiving:
>
> command prompt> curl
> "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F
> "myfile=@test2.txt"
>
> 
> 
> 
> Error 500 
> 
> HTTP ERROR: 500 org/apache/tika/exception/TikaException
>
> java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:247)
> at
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:
> 359) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at
> org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedH
> andler(RequestHandlers.java:240) at
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleReque
> st(RequestHandlers.java:231) at
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java
> :338) at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
> a:241) at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandl
> er.java:1089) at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216
> ) at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCo
> llection.java:211) at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:
> 114) at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.jav
> a:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:
> 226) at
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java
> :442) Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.exception.TikaException at
> java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> at java.lang.ClassLoader.loadClass(Cla

Re: Fuzzy query using dismax query parser

2011-03-24 Thread Ahmet Arslan
> I wonder how to conduct fuzzy query using dismax query
> parser?  I am able to
> do prefix query with local params and
> prefixQueryParser.  But how to handle
> fuzzy query?  
> 
> I like the behavior of dismax except it does not support
> the prefix query
> and fuzzy query.

You may interested in https://issues.apache.org/jira/browse/SOLR-1553






Newbie wants to index XML content.

2011-03-24 Thread Marcelo Iturbe
Hello,
I've been reading up on how to index XML content but have a few questions.

How is data in element attributes handled or defined? How are nested
elements handled?

In the following XML structure, I want to index the content of what is
between the  tags.
In one XML document, there can be up to 100  tags.
So the  tag would be equivalent to the  tag...

Can I somehow index this XML "as is" or will I have to parse it, creating
the  tag and placing all the elements on the same level?

Thanks for your help.



manual

MC Anon User
mca...@mcdomain.com




John Smith

jsmit...@gmail.com




First Last
First
Last


MC S.A.
CIO

fi...@mcdomain.com
flas...@yahoo.com
+5629460600
fi...@mcdomain.com
First.Last
111 Bude St, Toronto
http://blog.mcdomain.com/



regards
Marcelo


Re: invert terms in search with exact match

2011-03-24 Thread Ahmet Arslan

Then you need to write some custom code for that. Lucene in Action (second 
edition, section 6.3.4) has an example of translating a PhraseQuery into a 
SpanNearQuery.

Just use false for the third parameter in SpanNearQuery's ctor.
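A minimal sketch of that translation, assuming the field is "title" and the 
terms are the ones from your example:

  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.spans.SpanNearQuery;
  import org.apache.lucene.search.spans.SpanQuery;
  import org.apache.lucene.search.spans.SpanTermQuery;

  public class UnorderedNearQueryExample {
    public static SpanNearQuery build() {
      SpanQuery[] clauses = new SpanQuery[] {
          new SpanTermQuery(new Term("title", "my")),
          new SpanTermQuery(new Term("title", "love")),
          new SpanTermQuery(new Term("title", "darling"))
      };
      // slop = 0: the terms must be adjacent (no extra words in between);
      // inOrder = false: they may appear in any order.
      return new SpanNearQuery(clauses, 0, false);
    }
  }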


You can plug https://issues.apache.org/jira/browse/SOLR-1604 too.


> yes sorry i made  a mistake
> 
> title(my AND love AND darling)
> 
> all three words have to match. the problem is always i
> don't want results
> with other words.
> 
> 
> 2011/3/24 Dario Rigolin 
> 
> > On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo
> wrote:
> >
> > >
> > > title1: my love darling
> > > title2: my darling love
> > > title3: darling my love
> > > title4: love my darling
> >
> > Sorry but simply search for:
> >
> >
> >  title:( my OR love OR darling)
> > If you have default operator OR you don't need to put
> OR on the query
> >
> > Best regards.
> > Dario Rigolin
> > Comperio srl (Italy)
> >
> 
> 
> 
> -- 
> Gastone Penzo
> *www.solr-italia.it*
> *The first italian blog about Apache Solr*
> 





Fuzzy query using dismax query parser

2011-03-24 Thread cyang2010
Hi,

I wonder how to conduct a fuzzy query using the dismax query parser?  I am able
to do a prefix query with local params and the prefix query parser.  But how do
I handle a fuzzy query?

I like the behavior of dismax except that it does not support the prefix query
and the fuzzy query.

Thanks.

cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Fuzzy-query-using-dismax-query-parser-tp2727075p2727075.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread Ahmet Arslan
> Thanks for your reply.  yeah, an additional query with
> the boost value will
> work.
> 
> However, I just wonder where you get the information that
> BoostQParserPlugin
> only handles function query?
> 
> I looked up the javadoc, and still can't get that. 
> This is the javadoc.
> 
> 
> Create a boosted query from the input value. The main value
> is the query to
> be boosted.
> Other parameters: b, the function query to use as the
> boost. 
> 
> 
> This just say if b value is specified it is a function
> query.   

As you and wiki said, b is the "function query" to use as the boost.

> I just don't
> understand why dismaxParser has both bf and bq, but for
> BoostQParserPlugin
> there is only bf equivalent. 

I don't know that :) However, optional clauses with LuceneQParserPlugin will have 
the same effect as dismax's bq.
 
> Another question is by specifying localParameter like that
> in query, does it
> mean to use the default LuceneQParserPlugin primarily and
> only use
> BoostQParserPlugin for the content with the {}?

Not only for BoostQParserPlugin.

http://wiki.apache.org/solr/LocalParams

http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Faceting_and_LocalParams






Re: Solr throwing exception when evicting from filterCache

2011-03-24 Thread Yonik Seeley
On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell  wrote:
> I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
> error when making a request (with fq's), right at the point where the
> eviction count goes from 0 up:

Yep, this was a bug that has since been fixed.

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco


Re: Solr throwing exception when evicting from filterCache

2011-03-24 Thread Matt Mitchell
Here's the full stack trace:

[Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry; at
org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
at
org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
at
org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131) at
org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:613)
at
org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:652)
at
org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1233)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1086)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:431)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at
org.mortbay.jetty.handler.ContextHandlerCollection.h

On Thu, Mar 24, 2011 at 1:54 PM, Matt Mitchell  wrote:

> I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
> error when making a request (with fq's), right at the point where the
> eviction count goes from 0 up:
>
> SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
> [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry
>
> If you then make another request, Solr responds with the expected result.
>
> Is this a bug? Has anyone seen this before? Any
> tips/help/feedback/questions would be much appreciated!
>
> Thanks,
> Matt
>


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread cyang2010
Hi iorixxx,

Thanks for your reply.  Yeah, an additional query with the boost value will
work.

However, I just wonder where you get the information that BoostQParserPlugin
only handles function query?

I looked up the javadoc, and still can't get that.  This is the javadoc.


Create a boosted query from the input value. The main value is the query to
be boosted.
Other parameters: b, the function query to use as the boost. 


This just says that if the b value is specified, it is a function query.  I just
don't understand why dismaxParser has both bf and bq, but for BoostQParserPlugin
there is only a bf equivalent.

Another question is by specifying localParameter like that in query, does it
mean to use the default LuceneQParserPlugin primarily and only use
BoostQParserPlugin for the content with the {}?

Thanks.  look forward to your reply,


cy

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-run-boost-query-for-non-dismax-query-parser-tp2723442p2726422.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Detecting an empty index during start-up

2011-03-24 Thread Chris Hostetter
: I am not familiar with Solr internals, so the approach I wanted to take was
: to basically check the numDocs property of the index during start-up and set
: a READABLE state in the ZooKeeper node if it's greater than 0. I also
: planned to create a commit hook for replication and updating which
: controlled the READABLE property based on numDocs also.
: 
: This just leaves the problem of finding out the number of documents during
: start-up. I planned to have something like:

Most of the ZK stuff you mentioned is over my head, but i get the general 
gist of what you want:

 * a hook on startup that checks numDocs
 * if not empty, trigger some logic

My suggestion would be to implement this as a "firstSearcher" 
SolrEventListener.  When that runs, you'll have easy access to a 
SolrIndexSearcher (and you won't even have to refcount it) and you can 
fire whatever logic you want based on what you find when looking at it.
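A rough sketch of such a listener (the exact set of interface methods can differ 
slightly between Solr versions, and the ZooKeeper call is just a placeholder for 
your own state update); you'd register it in solrconfig.xml with something like 
<listener event="firstSearcher" class="com.example.ReadableStateListener"/>:

  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.core.SolrEventListener;
  import org.apache.solr.search.SolrIndexSearcher;

  public class ReadableStateListener implements SolrEventListener {

    public void init(NamedList args) {}

    public void postCommit() {}

    // For a "firstSearcher" listener this is called once at startup with the
    // initial searcher; no refcounting is needed here.
    public void newSearcher(SolrIndexSearcher newSearcher,
                            SolrIndexSearcher currentSearcher) {
      int numDocs = newSearcher.getIndexReader().numDocs();
      if (numDocs > 0) {
        markReadableInZooKeeper();  // placeholder: set the READABLE flag in ZK
      }
    }

    private void markReadableInZooKeeper() {
      // your ZooKeeper client code goes here
    }
  }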


-Hoss


Solr throwing exception when evicting from filterCache

2011-03-24 Thread Matt Mitchell
I have a recent build of solr (4.0.0.2011.02.25.13.06.24). I am seeing this
error when making a request (with fq's), right at the point where the
eviction count goes from 0 up:

SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
[Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry

If you then make another request, Solr responds with the expected result.

Is this a bug? Has anyone seen this before? Any tips/help/feedback/questions
would be much appreciated!

Thanks,
Matt


Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

2011-03-24 Thread fr . jurain
Hello Solrists,
 
As it says in the subject line, I'm looking for a Java component that,
given an ISO 639-1 code or some equivalent,
would return a Lucene Analyzer ready to gobble documents in the corresponding 
language.
Solr looks like it has to contain one,
only I've not been able to locate it so far; 
can you point the spot?
 
I've found org.apache.solr.analysis,
and things like org.apache.lucene.analysis.bg etc. in lucene/modules,
with many classes which I'm sure are related, however the factory itself still 
eludes me;
I mean the Java class.method that'd decide on request, what to do with all 
these packages
to bring the requisite object to existence, once the language is specified.
Where should I look? Or was I mistaken & Solr has nothing of the kind, at least 
in Java?
Thanks in advance for your help.
 
Best regards,
François Jurain.








Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Yes, sorry, I made a mistake.

title:(my AND love AND darling)

All three words have to match. The problem is that I still don't want results
with other words.


2011/3/24 Dario Rigolin 

> On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote:
>
> >
> > title1: my love darling
> > title2: my darling love
> > title3: darling my love
> > title4: love my darling
>
> Sorry but simply search for:
>
>
>  title:( my OR love OR darling)
> If you have default operator OR you don't need to put OR on the query
>
> Best regards.
> Dario Rigolin
> Comperio srl (Italy)
>



-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Dario Rigolin
On Thursday, March 24, 2011 03:52:31 pm Gastone Penzo wrote:

> 
> title1: my love darling
> title2: my darling love
> title3: darling my love
> title4: love my darling

Sorry but simply search for:


 title:( my OR love OR darling) 
If you have default operator OR you don't need to put OR on the query

Best regards.
Dario Rigolin
Comperio srl (Italy)


Re: Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Markus Jelsma
Sounds like the Tika jar is not on the class path. Add it to a directory where 
Solr's looking for libs.

On Thursday 24 March 2011 16:24:17 Brandon Waterloo wrote:
> Hello everyone,
> 
> I've been trying for several hours now to set up Solr with multiple cores
> with Solr Cell working on each core. The only items being indexed are PDF,
> DOC, and TXT files (with the possibility of expanding this list, but for
> now, just assume the only things in the index should be documents).
> 
> I never had any problems with Solr Cell when I was using a single core. In
> fact, I just ran the default installation in example/ and worked from
> that. However, trying to migrate to multi-core has been a never ending
> list of problems.
> 
> Any time I try to add a document to the index (using the same curl command
> as I did to add to the single core, of course adding the core name to the
> request URL-- host/solr/corename/update/extract...), I get HTTP 500 errors
> due to classes not being found and/or lazy loading errors. I've copied the
> exact example/lib directory into the cores, and that doesn't work either.
> 
> Frankly the only libraries I want are those relevant to indexing files. The
> less bloat, the better, after all. However, I cannot figure out where to
> put what files, and why the example installation works perfectly for
> single-core but not with multi-cores.
> 
> Here is an example of the errors I'm receiving:
> 
> command prompt> curl
> "host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F
> "myfile=@test2.txt"
> 
> 
> 
> 
> Error 500 
> 
> HTTP ERROR: 500 org/apache/tika/exception/TikaException
> 
> java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:247)
> at
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:
> 359) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at
> org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedH
> andler(RequestHandlers.java:240) at
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleReque
> st(RequestHandlers.java:231) at
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java
> :338) at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.jav
> a:241) at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandl
> er.java:1089) at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
> at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216
> ) at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
> at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
> at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCo
> llection.java:211) at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:
> 114) at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
> at org.mortbay.jetty.Server.handle(Server.java:285)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
> at
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.jav
> a:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
> at
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:
> 226) at
> org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java
> :442) Caused by: java.lang.ClassNotFoundException:
> org.apache.tika.exception.TikaException at
> java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> ... 27 more
> 
> RequestURI=/solr/core0/update/extract
> Powered by Jetty://
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Any assistance you could provide or installation guides/tutorials/etc. that
> you could link me to would be greatly appreciated. Thank you all for your
> time!
> 
> ~Brandon Waterloo

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Multiple Cores with Solr Cell for indexing documents

2011-03-24 Thread Brandon Waterloo
Hello everyone,

I've been trying for several hours now to set up Solr with multiple cores with 
Solr Cell working on each core. The only items being indexed are PDF, DOC, and 
TXT files (with the possibility of expanding this list, but for now, just 
assume the only things in the index should be documents).

I never had any problems with Solr Cell when I was using a single core. In 
fact, I just ran the default installation in example/ and worked from that. 
However, trying to migrate to multi-core has been a never ending list of 
problems.

Any time I try to add a document to the index (using the same curl command as I 
did to add to the single core, of course adding the core name to the request 
URL-- host/solr/corename/update/extract...), I get HTTP 500 errors due to 
classes not being found and/or lazy loading errors. I've copied the exact 
example/lib directory into the cores, and that doesn't work either.

Frankly the only libraries I want are those relevant to indexing files. The 
less bloat, the better, after all. However, I cannot figure out where to put 
what files, and why the example installation works perfectly for single-core 
but not with multi-cores.

Here is an example of the errors I'm receiving:

command prompt> curl 
"host/solr/core0/update/extract?literal.id=2-3-1&commit=true" -F 
"myfile=@test2.txt"




Error 500 

HTTP ERROR: 500 org/apache/tika/exception/TikaException

java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.ClassNotFoundException: 
org.apache.tika.exception.TikaException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 27 more

RequestURI=/solr/core0/update/extract
Powered by Jetty://























Any assistance you could provide or installation guides/tutorials/etc. that you 
could link me to would be greatly appreciated. Thank you all for your time!

~Brandon Waterloo



Re: invert terms in search with exact match

2011-03-24 Thread Jonathan Rochkind
You can use query slop as others have said to find documents with "my" 
and "love" right next to each other, in any order. And I think query 
slop can probably work for three or more words too to do that.


But it won't find files with ONLY those words in it. For instance "my 
love"~2 will still match:


love my something else
something my love else
other love my

etc.

Solr isn't so good at doing "exact" matches in general, although there 
are some techniques to set up your index and queries to do actual 
"exact" (entire field) matches -- mostly putting fake tokens like 
"$BEGIN" and "$END" at the beginning and end of your indexed values, and 
then doing a phrase search which puts those tokens at the beginning and end too.
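A tiny illustration of the idea (the token names are arbitrary):

  indexed title:  $BEGIN my love $END
  query:          title:"$BEGIN my love $END"

Only a title consisting of exactly "my love" can match, because the phrase has 
to start at $BEGIN and end at $END.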


But I'm not sure if you can extend that technique to find exactly the 
words in _any_ order, instead of just the exact phrase. Maybe 
somehow using phrase slop?  It gets confusing to think about, I'm not sure.


On 3/24/2011 10:52 AM, Gastone Penzo wrote:

No, because I don't know the words I want to ignore, and I don't want to use
dismax.
I have to use the standard handler.

The problem is very simple. I want to receive only documents that have in the
title field ONLY the words I search,
in any order.

If I search "my love darling", I want Solr to return these possible titles:

title1: my love darling
title2: my darling love
title3: darling my love
title4: love my darling
...

all the combinations of these 3 words. Other words have to be ignored.

thanx


2011/3/24 Bill Bell


Yes create qt with dismax and qf on field that has query stopwords for the
words you want to ignore.

Bill Bell
Sent from mobile


On Mar 24, 2011, at 7:58 AM, Gastone Penzo
wrote:


Hi,
is it possible with standard query search (not dismax) to have
exact matches that allow any terms order?

for example:

if i search "my love" i would solr gives to me docs with

- my love
- love my

it's easy: q=title:(my AND love)

the problem is it returns also docs with

"my love is my dog"

i don't want this. i want only docs with title formed by these 2 terms:

my

and love.

is it possible??

thanx

--
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*





Re: dismax parser, parens, what do they do exactly

2011-03-24 Thread Jonathan Rochkind
Thanks Hoss, this is very helpful. Okay, dismax is not intended to do 
anything with parens for semantics; they're just like any other char, 
handled by analyzers.


I think you're right, I cut and pasted the wrong query before. Just for 
the record, on 1.4.1:


qf=text
pf=
q=book (dog +(cat -frog))


+((DisjunctionMaxQuery((text:book)~0.01) 
DisjunctionMaxQuery((text:dog)~0.01) 
DisjunctionMaxQuery((text:cat)~0.01) 
-DisjunctionMaxQuery((text:frog)~0.01))~3) ()




+(((text:book)~0.01 (text:dog)~0.01 (text:cat)~0.01 -(text:frog)~0.01)~3) ()





Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
No, because I don't know the words I want to ignore, and I don't want to use
dismax.
I have to use the standard handler.

The problem is very simple. I want to receive only documents that have in the
title field ONLY the words I search,
in any order.

If I search "my love darling", I want Solr to return these possible titles:

title1: my love darling
title2: my darling love
title3: darling my love
title4: love my darling
...

all the combinations of these 3 words. Other words have to be ignored.

thanx


2011/3/24 Bill Bell 

> Yes create qt with dismax and qf on field that has query stopwords for the
> words you want to ignore.
>
> Bill Bell
> Sent from mobile
>
>
> On Mar 24, 2011, at 7:58 AM, Gastone Penzo 
> wrote:
>
> > Hi,
> > is it possible with standard query search (not dismax) to have
> > exact matches that allow any terms order?
> >
> > for example:
> >
> > if i search "my love" i would solr gives to me docs with
> >
> > - my love
> > - love my
> >
> > it's easy: q=title:(my AND love)
> >
> > the problem is it returns also docs with
> >
> > "my love is my dog"
> >
> > i don't want this. i want only docs with title formed by these 2 terms:
> my
> > and love.
> >
> > is it possible??
> >
> > thanx
> >
> > --
> > Gastone Penzo
> > *www.solr-italia.it*
> > *The first italian blog about Apache Solr*
>



-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Bill Bell
Yes create qt with dismax and qf on field that has query stopwords for the 
words you want to ignore.

Bill Bell
Sent from mobile


On Mar 24, 2011, at 7:58 AM, Gastone Penzo  wrote:

> Hi,
> is it possible with standard query search (not dismax) to have
> exact matches that allow any terms order?
> 
> for example:
> 
> if i search "my love" i would solr gives to me docs with
> 
> - my love
> - love my
> 
> it's easy: q=title:(my AND love)
> 
> the problem is it returns also docs with
> 
> "my love is my dog"
> 
> i don't want this. i want only docs with title formed by these 2 terms: my
> and love.
> 
> is it possible??
> 
> thanx
> 
> -- 
> Gastone Penzo
> *www.solr-italia.it*
> *The first italian blog about Apache Solr*


Re: invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Hi Tommaso,
thank you for the answer, but the problem with your solution is that Solr also
returns docs with other words. For example:

my love is the world

I want to exclude the other words.
It must give me only docs with "my love" or "love my", nothing else.

Thank you

2011/3/24 Tommaso Teofili 

> Hi Gastone,
> I think you should use proximity search as described here in Lucene query
> syntax page [1].
> So searching for "my love"~2 should work for your use case.
> Cheers,
> Tommaso
>
> [1] : 
> http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches
>
> 2011/3/24 Gastone Penzo 
>
>> Hi,
>> is it possible with standard query search (not dismax) to have
>> exact matches that allow any terms order?
>>
>> for example:
>>
>> if i search "my love" i would solr gives to me docs with
>>
>> - my love
>> - love my
>>
>> it's easy: q=title:(my AND love)
>>
>> the problem is it returns also docs with
>>
>> "my love is my dog"
>>
>> i don't want this. i want only docs with title formed by these 2 terms: my
>> and love.
>>
>> is it possible??
>>
>> thanx
>>
>> --
>> Gastone Penzo
>> *www.solr-italia.it*
>> *The first italian blog about Apache Solr*
>>
>
>


-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Detecting an empty index during start-up

2011-03-24 Thread David McLaughlin
Hi,

In our Solr deployment we have a cluster of replicated Solr cores, with the
small change that we have dynamic master look-up using ZooKeeper. The
problem I am trying to solve is to make sure that when a new Solr core joins
the cluster it isn't made available to any search services until it has been
filled with data.

I am not familiar with Solr internals, so the approach I wanted to take was
to basically check the numDocs property of the index during start-up and set
a READABLE state in the ZooKeeper node if it's greater than 0. I also
planned to create a commit hook for replication and updating which
controlled the READABLE property based on numDocs also.

This just leaves the problem of finding out the number of documents during
start-up. I planned to have something like:

int numDocs = 0;
RefCounted<SolrIndexSearcher> searcher = core.getSearcher();
try {
  numDocs = searcher.get().getIndexReader().numDocs();
} finally {
  searcher.decref();
}

but getSearcher's documentation specifically says don't use it from the
inform method. I missed this at first and of course I got a deadlock
(although only when I had more than one core on the same Solr instance).

Is there a simpler way to do what I want? Or will I just need to have a
thread which waits until the Searcher is available before setting the
state?

Thanks,
David


Question about http://wiki.apache.org/solr/Deduplication

2011-03-24 Thread eks dev
Hi,
The use case I am trying to figure out is about preserving IDs without
re-indexing on duplicates, instead adding the new ID to a list of
document id "aliases".

Example:
Input collection:
"id":1, "text":"dummy text 1", "signature":"A"
"id":2, "text":"dummy text 1", "signature":"A"

I add the first document in empty index, text is going to be indexed,
ID is going to be "1", so far so good

Now the question: if I add the second document with id == "2", then instead of
deleting/indexing this new document, I would like to store id == 2 in the
multivalued field "id".

At the end, I would have one document less indexed, and both IDs are
going to be "searchable" (and stored as well)...

Is it possible in Solr to have a multivalued "id"? Or do I need to make my
own "mv_ID" field for this? Any ideas how to achieve this efficiently?

My target is not to add new documents if the signature matches, but to
have both IDs indexed and stored.

Thanks,
eks


Re: invert terms in search with exact match

2011-03-24 Thread Tommaso Teofili
Hi Gastone,
I think you should use proximity search as described here in Lucene query
syntax page [1].
So searching for "my love"~2 should work for your use case.
Cheers,
Tommaso

[1] : 
http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches

2011/3/24 Gastone Penzo 

> Hi,
> is it possible with standard query search (not dismax) to have
> exact matches that allow any terms order?
>
> for example:
>
> if i search "my love" i would solr gives to me docs with
>
> - my love
> - love my
>
> it's easy: q=title:(my AND love)
>
> the problem is it returns also docs with
>
> "my love is my dog"
>
> i don't want this. i want only docs with title formed by these 2 terms: my
> and love.
>
> is it possible??
>
> thanx
>
> --
> Gastone Penzo
> *www.solr-italia.it*
> *The first italian blog about Apache Solr*
>


Re: how to run boost query for non-dismax query parser

2011-03-24 Thread Ahmet Arslan

> I need to code some boosting logic when some field equal to
> some value.   I
> was able to get it work if using dismax query parser. 
> However, since the
> solr query will need to handle prefix or fuzzy query,
> therefore, dismax
> query parser is not really my choice.  
> 
> Therefore, i want to use standard query parser, but still
> have dismax's
> boosting query logic.  For example, this query return
> all the titles
> regardless what the value is, however, will boost the score
> of those which
> genres=5237:
> 
> http://localhost:8983/solr/titles/select?indent=on&start=0&rows=10&fl=*%2Cscore&wt=standard&explainOther=&hl.fl=&qt=standard&q={!boost%20b=genres:5237^2.2}*%3A*&debugQuery=on
> 
> 
> Here is the exception i get:
> HTTP ERROR: 400
> 
> org.apache.lucene.queryParser.ParseException: Expected ','
> at position 6 in
> 'genres:5237^2.2'

BoostQParserPlugin takes a FunctionQuery; in your case it is a lucene/solr 
query. If you want to boost by a solr/lucene query, you can add that clause as 
an optional clause. That's all.

q=+*:* genres:5237^2.2&q.op=OR  will do the trick. Just make sure that you are 
using OR as the default operator.





Re: invert terms in search with exact match

2011-03-24 Thread Ahmet Arslan


--- On Thu, 3/24/11, Gastone Penzo  wrote:

> From: Gastone Penzo 
> Subject: invert terms in search with exact match
> To: solr-user@lucene.apache.org
> Date: Thursday, March 24, 2011, 3:58 PM
> Hi,
> is it possible with standard query search (not dismax) to
> have
> exact matches that allow any terms order?
> 
> for example:
> 
> if i search "my love" i would solr gives to me docs with
> 
> - my love
> - love my
> 
> it's easy: q=title:(my AND love)
> 
> the problem is it returns also docs with
> 
> "my love is my dog"
> 
> i don't want this. i want only docs with title formed by
> these 2 terms: my
> and love.

PhraseQuery has an interesting property. If you don't use a slop value (meaning 
zero), it is an ordered phrase query. However, starting from 1, it is un-ordered.

"my love"~1 will somewhat satisfy you.  If you really want "my love" to be 
unordered, you can try SOLR-1604.


  


invert terms in search with exact match

2011-03-24 Thread Gastone Penzo
Hi,
is it possible with the standard query search (not dismax) to have
exact matches that allow any term order?

For example:

if I search "my love", I would like Solr to give me docs with

- my love
- love my

It's easy: q=title:(my AND love)

The problem is that it also returns docs with

"my love is my dog"

I don't want this. I want only docs whose title is formed by these 2 terms: my
and love.

Is it possible?

thanx

-- 
Gastone Penzo
*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: Why boost query not working?

2011-03-24 Thread Ahmet Arslan


--- On Thu, 3/24/11, cyang2010  wrote:

> This solr query fails:
> 1. get every title regardless what the title_name is
> 2. within the result, boost the one which genre id =
> 56.  (bq=genres:56^100)
> 
> http://localhost:8983/solr/titles/select?indent=on&version=2.2&start=0&rows=10&fl=*%2Cscore&wt=standard&defType=dismax&qf=title_name_en_US&q=*%3A*&bq=genres%3A56^100&debugQuery=on
> 
> 
> But from debug i can tell it confuse the boost query
> parameter as part of
> query string:
> 
> <lst name="debug">
> <str name="rawquerystring">*:*</str>
> <str name="querystring">*:*</str>
> <str name="parsedquery">+() () genres:56^100.0</str>
> <str name="parsedquery_toString">+() () genres:56^100.0</str>
> <lst name="explain"/>
> <str name="QParser">DisMaxQParser</str>
> <null name="altquerystring"/>
> <arr name="boost_queries">
> <str>genres:56^100</str>
> </arr>

With dismax, you cannot use semicolon or field queries. Instead of &q=*:*, you 
can try q.alt=*:* (do not use the q parameter at all).
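For example, the URL from your post would become roughly (same parameters, with 
q replaced by q.alt):

  http://localhost:8983/solr/titles/select?indent=on&version=2.2&start=0&rows=10&fl=*,score&wt=standard&defType=dismax&qf=title_name_en_US&q.alt=*:*&bq=genres:56^100&debugQuery=on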





Re: boosting with standard search handler

2011-03-24 Thread Gastone Penzo
Thank you Tommaso..
Your solution works.
I read there's another method, using the _val_ parameter.

Thank

Gastone

2011/3/24 Tommaso Teofili 

> Hi Gastone,
> I used to do that in standard search handler using the following
> parameters:
> q={!boost b=query($qq,0.7)} text:something title:other
> qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8
> that enabling custom recency based boosting.
> My 2 cents,
> Tommaso
>
>
> 2011/3/24 Gastone Penzo 
>
> > Hi,
> > is possibile to boost fields like bf parameter of dismax in standard
> > request handler?
> > with or without funcions?
> >
> > thanx
> >
> > --
> > Gastone Penzo
> >
> > *www.solr-italia.it*
> > *The first italian blog about Apache Solr*
> >
>



-- 
Gastone Penzo


Re: boosting with standard search handler

2011-03-24 Thread Tommaso Teofili
Hi Gastone,
I used to do that in standard search handler using the following parameters:
q={!boost b=query($qq,0.7)} text:something title:other
qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8
which enables custom recency-based boosting.
My 2 cents,
Tommaso


2011/3/24 Gastone Penzo 

> Hi,
> is possibile to boost fields like bf parameter of dismax in standard
> request handler?
> with or without funcions?
>
> thanx
>
> --
> Gastone Penzo
>
> *www.solr-italia.it*
> *The first italian blog about Apache Solr*
>


boosting with standard search handler

2011-03-24 Thread Gastone Penzo
Hi,
is it possible to boost fields like the bf parameter of dismax in the standard
request handler?
With or without functions?

thanx

-- 
Gastone Penzo

*www.solr-italia.it*
*The first italian blog about Apache Solr*


Re: Problem with field collapsing of patched Solr 1.4

2011-03-24 Thread Kai Schlamp-2

Afroz Ahmad wrote:
> 
> Have you enabled the collapse component in solrconfig.xml?
> 
> <searchComponent class="org.apache.solr.handler.component.CollapseComponent"
> />
> 

No, it seems that I missed that completely. Thank you, Afroz. It works fine
now.

Kai


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Problem-with-field-collapsing-of-patched-Solr-1-4-tp2678850p2724321.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: which German stemmer to use?

2011-03-24 Thread Paul Libbrecht
In our ActiveMath project, we have had positive feedback in Lucene with the 
 SnowBallAnalyzer(Version.LUCENE_29,"German") 
which is probably one of the two below.

I note that you may want to be careful to use one field with exact matching 
(e.g. whitespace analyzer and lowercase filter) and one field with stemmed 
matches. That's two fields in the index and a query-expansion mechanism such as 
dismax, e.g.

  text-de^2.0 text-de.stemmed^1.2
(add the phonetic...)
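A rough schema/config sketch of that two-field idea (the field and type names 
here are just examples, not from ActiveMath):

  <field name="text-de" type="text_de_exact" indexed="true" stored="false"/>
  <field name="text-de.stemmed" type="text_de_stemmed" indexed="true" stored="false"/>
  <copyField source="text-de" dest="text-de.stemmed"/>

  <!-- in the dismax handler defaults -->
  <str name="qf">text-de^2.0 text-de.stemmed^1.2</str>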

One of the biggest issues that our testers raised is that compound words should 
be split. I believe this issue is also very present in technology texts. So far 
only the compound-word filter can do such a split, and the compounds have to be 
input manually. Maybe that's doable?

paul


Le 24 mars 2011 à 00:14, Christopher Bottaro a écrit :

> The wiki lists 5 available, but doesn't do a good job at explaining or
> recommending one:
> 
> GermanStemFilterFactory
> SnowballPorterFilterFactory (German)
> SnowballPorterFilterFactory (German2)
> GermanLightStemFilterFactory
> GermanMinimalStemFilterFactory
> 
> Which is the best one to use in general?  Which is the best to use when the
> content being indexed is German technology articles?
> 
> Thanks for the help.