Re: question on tokenization control

2012-05-01 Thread Dan Tuffery
Hi,

"Is that an indexing setting or query setting that will tokenize 'evalu'
but not 'eval'?"

Without seeing the tokenizers you're using for the field type it's hard to
say. You can use Solr's analysis page to see the tokens that are generated
by the tokenizers in your analysis chain at both query time and index time.

http://localhost:8983/solr/admin/analysis.jsp

"how do I get 'eval' to be a match?"

You could use synonyms to map 'eval' to 'evaluation'.
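
One likely explanation, assuming the "text" field type applies a Porter-style stemmer (the thread does not show the analysis chain, so this is a guess): 'evaluate', 'evaluating' and 'evaluation' are all indexed as the stem 'evalu', so a query for 'evalu' matches, while 'eval' is not reduced to that stem and finds nothing. The toy Python sketch below is not Solr code. It uses a hard-coded stem table standing in for the real stemmer and a hypothetical synonym map to show why mapping 'eval' to 'evaluation' before stemming produces a match.

```python
# Toy model of index-time and query-time analysis. The stem table is an
# assumption standing in for a Porter-style stemmer; it is not Solr code.
ASSUMED_STEMS = {
    "evaluate": "evalu",
    "evaluating": "evalu",
    "evaluation": "evalu",
}

# Hypothetical synonym mapping, in the spirit of a synonyms.txt entry.
SYNONYMS = {"eval": "evaluation"}


def stem(token):
    """Return the assumed stem, or the token unchanged if it has none."""
    return ASSUMED_STEMS.get(token, token)


def analyze(text, use_synonyms=False):
    """Lowercase, split on whitespace, optionally map synonyms, then stem."""
    tokens = text.lower().split()
    if use_synonyms:
        tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [stem(t) for t in tokens]


def matches(query, document, use_synonyms=False):
    """True if any analyzed query token equals an indexed token."""
    indexed = set(analyze(document))
    return any(t in indexed for t in analyze(query, use_synonyms))


doc = "evaluating the evaluation"
print(matches("evalu", doc))                    # query equals the indexed stem
print(matches("eval", doc))                     # no indexed token equals 'eval'
print(matches("eval", doc, use_synonyms=True))  # synonym applied, then stemmed
```

The analysis page mentioned above is the place to confirm which tokens the real chain produces.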

Dan

On Tue, May 1, 2012 at 8:17 PM, kfdroid  wrote:

> I have a field that is defined using what I believe is a fairly standard
> "text" fieldType. I have documents with the words 'evaluate', 'evaluating',
> 'evaluation' in them. When I search on the whole word, obviously it works;
> if I search on 'eval' it finds nothing. However, for some reason if I search
> on 'evalu' it finds all the matches. Is that an indexing setting or query
> setting that will tokenize 'evalu' but not 'eval', and how do I get 'eval'
> to be a match?
>
> Thanks,
> Ken
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/question-on-tokenization-control-tp3953550.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr logo for print

2012-04-30 Thread Dan Tuffery
Try this one:

http://www.lucidimagination.com/sites/default/files/image/solr_logo_rgb.png

Dan

On Mon, Apr 30, 2012 at 8:38 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi,
>
> I'm trying to find a Solr logo in a vector or some other format suitable
> for print.  I found the Lucene logo at
> http://svn.apache.org/repos/asf/lucene/site/publish/images/logo.eps , but
> can't find one for Solr.  Does anyone know where to find it?
>
> At the bottom of http://wiki.apache.org/solr/PublicServers I found a link to
> https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/site/src/documentation/content/xdocs/images/ ,
> but that leads to a 404.
>
>
> Can't find anything in svn either:
>
> $ find . -name \*image\* | xargs ls -l | grep -i solr
> -rw-r--r-- 1 otis otis 14993 2011-01-04 00:32
> ./solr/src/site/build/site/images/solr-book-image.jpg
> ./solr/src/site/build/site/images:
> -rw-r--r-- 1 otis otis 14993 2011-01-04 00:32 solr-book-image.jpg
> -rw-r--r-- 1 otis otis 12719 2011-01-04 00:32 solr.jpg
>
>
> Thanks,
> Otis
> 
> Performance Monitoring for Solr -
> http://sematext.com/spm/solr-performance-monitoring
>


Re: Java out of memory - with fieldcache faceting

2012-04-30 Thread Dan Tuffery
There's a Lucene/Solr memory size estimator spreadsheet in the SVN:

http://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls
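
For a quick back-of-envelope figure before opening the spreadsheet, the Python sketch below estimates the heap used by a field cache entry for one single-valued string field. It is not taken from the spreadsheet, and the model is an assumption: one packed ordinal per document plus the bytes of the unique terms, ignoring JVM object overhead, per-segment duplication and everything else on the heap. Treat the result as a rough lower bound only.

```python
import math


def estimate_field_cache_bytes(max_doc, unique_terms, avg_term_bytes):
    """Rough lower bound for faceting one string field via the field cache.

    Assumed model (not Lucene's exact accounting):
      - one ordinal per document, packed into just enough bits to
        address every unique term plus a "no value" slot;
      - the raw bytes of each unique term stored once.
    """
    bits_per_ord = max(1, unique_terms.bit_length())
    ord_bytes = math.ceil(max_doc * bits_per_ord / 8)
    term_bytes = unique_terms * avg_term_bytes
    return ord_bytes + term_bytes


# Hypothetical numbers: 1M documents, 50k distinct values, 12 bytes each.
print(estimate_field_cache_bytes(1_000_000, 50_000, 12))
```

The numbers in the usage line are made up; substitute the document count and the distinct-value count of the faceted field.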

Dan

On Mon, Apr 30, 2012 at 11:39 AM, Yuval Dotan  wrote:

> Thanks for the fast answer.
> One more question:
> Is there a way to know (some formula) how much memory I need for
> these actions?
>
> Thanks
> Yuval
>
> On Mon, Apr 30, 2012 at 11:50, Dan Tuffery  wrote:
>
> > You need to add more memory to the JVM that is running Solr:
> >
> > http://wiki.apache.org/solr/SolrPerformanceFactors#OutOfMemoryErrors
> >
> > Dan
> >
> > On Mon, Apr 30, 2012 at 9:43 AM, Yuval Dotan wrote:
> >
> > > Hi Guys
> > > I have a problem and I need your assistance.
> > > I get an exception when doing field cache faceting (the enum method works
> > > perfectly):
> > >
> > > */solr/select?q=*:*&facet=true&facet.field=src_ip_str&facet.limit=10*
> > >
> > > 
> > > java.lang.OutOfMemoryError: Java heap space
> > > 
> > > java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
> > >     at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:449)
> > >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:277)
> > >     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
> > >     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
> > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
> > >     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
> > >     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
> > >     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
> > >     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
> > >     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
> > >     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
> > >     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
> > >     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
> > >     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
> > >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
> > >     at org.eclipse.jetty.server.Server.handle(Server.java:351)
> > >     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
> > >     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
> > >     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
> > >     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
> > >     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
> > >     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
> > >     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
> > >     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
> > >     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
> > >     at java.lang.Thread.run(Thread.java:679)
> > > Caused by: java.lang.OutOfMemoryError: Java heap space
> > >     at org.apache.lucene.util.packed.Direct16.<init>(Direct16.java:38)
> > >     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:267)
> > >     at org.apache.lucene.util.packed.GrowableWriter.set(GrowableWriter.java:81)
> > >     at
>

Re: Java out of memory - with fieldcache faceting

2012-04-30 Thread Dan Tuffery
You need to add more memory to the JVM that is running Solr:

http://wiki.apache.org/solr/SolrPerformanceFactors#OutOfMemoryErrors
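
While the heap is being resized, the request can be pinned to the faceting method that the original poster reports as working. The Python sketch below builds the same facet query with Solr's `facet.method` parameter set to `enum`. The base URL and the helper name `facet_url` are assumptions made for illustration, and whether `enum` performs acceptably depends on how many distinct values `src_ip_str` holds.

```python
from urllib.parse import urlencode


def facet_url(base, field, method=None, limit=10):
    """Build a match-all facet query; 'method' maps to facet.method."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(limit)),
    ]
    if method is not None:
        params.append(("facet.method", method))
    return base + "/select?" + urlencode(params)


# Assumed base URL; adjust host, port and core path to the real install.
print(facet_url("http://localhost:8983/solr", "src_ip_str", method="enum"))
```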

Dan

On Mon, Apr 30, 2012 at 9:43 AM, Yuval Dotan  wrote:

> Hi Guys
> I have a problem and I need your assistance.
> I get an exception when doing field cache faceting (the enum method works
> perfectly):
>
> */solr/select?q=*:*&facet=true&facet.field=src_ip_str&facet.limit=10*
>
> 
> java.lang.OutOfMemoryError: Java heap space
> 
> java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:449)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:277)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>     at org.eclipse.jetty.server.Server.handle(Server.java:351)
>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>     at java.lang.Thread.run(Thread.java:679)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.lucene.util.packed.Direct16.<init>(Direct16.java:38)
>     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:267)
>     at org.apache.lucene.util.packed.GrowableWriter.set(GrowableWriter.java:81)
>     at org.apache.lucene.search.FieldCacheImpl$DocTermsIndexCache.createValue(FieldCacheImpl.java:1178)
>     at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:248)
>     at org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1081)
>     at org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1077)
>     at org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:459)
>     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:310)
>     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:396)
>     at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:205)
>     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:81)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1541)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
> at

Re: can I use different tokenizer/analyzer for facet count query?

2012-04-25 Thread Dan Tuffery
If you also use the KeywordTokenizer at index time, it should do what you
want. If that is not possible, create another field.

Best practices for facet fields:

Indexed, not Tokenized (KeywordTokenizer)
Not stored
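
To see why the tokenizer choice changes the facet counts, here is a toy Python model (not Solr code). It assumes the index-time tokenizer splits on '$' and emits each prefix path, which is the behaviour the quoted message describes for 'blues$Teal/Turquoise'; the quoted schema is only partly legible, so the exact tokenizer is a guess. The function names are made up for the illustration.

```python
from collections import Counter


def path_hierarchy_tokens(value, delimiter="$"):
    """Emit every prefix path of the value, split on the delimiter."""
    parts = value.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


def keyword_tokens(value):
    """Keep the whole value as a single token."""
    return [value]


def facet_counts(values, tokenizer):
    """Count, per indexed token, how many documents contain it."""
    counts = Counter()
    for value in values:
        counts.update(set(tokenizer(value)))
    return dict(counts)


docs = ["blues$Teal/Turquoise"]
print(facet_counts(docs, path_hierarchy_tokens))  # two facet values
print(facet_counts(docs, keyword_tokens))         # only the whole keyword
```

Facet counts are computed over the indexed tokens, so a query-time tokenizer alone does not change them.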


On Wed, Apr 25, 2012 at 3:52 PM, sam ”  wrote:

> From wiki:
> http://wiki.apache.org/solr/SimpleFacetParameters
>
> If you want both Analysis (for searching) and Faceting on the full literal
> Strings, *use copyField *to create two versions of the field: one Text and
> one String. Make sure both are indexed="true"
>
> Is that the only way? Do I need to have another field of type String? I'm
> using KeywordTokenizer for query...
>
> On Wed, Apr 25, 2012 at 10:41 AM, sam ”  wrote:
>
> > I have the following in schema.xml
> >  > positionIncrementGap="100">
> > 
> >  > delimiter="$"/>
> > 
> > 
> > 
> > 
> > 
> >  > stored="true" multiValued="true"/>
> >
> >
> > And, I have the following doc:
> > 
> > 
> > blues$Teal/Turquoise
> > 
> > ...
> > 
> >
> >
> > Response of the query:
> >
> >
> http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=colors&rows=100
> > is
> >
> > 
> > 
> > 
> > 
> >   1
> >   1
> >  
> > 
> > 
> > 
> > 
> >
> >
> >
> > During index,  blues$Teal/Turquoise  is tokenized into:
> > blues
> > blues$Teal/Turquoise
> >
> > I think that's why facet count includes both blues and
> > blues$Teal/Turquoise.
> >
> > Can I have facet count only include the whole keyword,
> > blues$Teal/Turquoise,  not blues?
> >
> >
> >
>


Re: 'Error 404: missing core name in path' in Solr

2012-04-23 Thread Dan Tuffery
Looks like you need to select a core name in the admin UI before you select
search. Have a look at the solr.xml file in your Solr home directory: which
cores are defined?

Solr is expecting the core name in the URL:

http://localhost:8080/solr/<core name>/admin/
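
A small Python sketch for listing the cores and the admin URL of each one. It assumes the legacy multi-core solr.xml layout, with `<core name="...">` elements under `<cores>`; the sample content and the core names `core0` and `core1` are hypothetical, so point it at the real file instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical solr.xml content in the legacy multi-core layout.
SAMPLE_SOLR_XML = """
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
"""


def core_names(solr_xml_text):
    """Return the name attribute of every <core> element, in file order."""
    root = ET.fromstring(solr_xml_text)
    return [core.get("name") for core in root.iter("core")]


def admin_urls(base, names):
    """Build one admin URL per core name."""
    return [f"{base}/{name}/admin/" for name in names]


for url in admin_urls("http://localhost:8080/solr", core_names(SAMPLE_SOLR_XML)):
    print(url)
```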



On Mon, Apr 23, 2012 at 12:58 AM, vasuj  wrote:

> I used
>
> //server.deleteByQuery( "*:*" );// CAUTION: deletes everything!
>
> in my Solr indexing program (screenshot:
> http://lucene.472066.n3.nabble.com/file/n3931194/Screenshot_%2847%29.png).
> Since then I have been receiving an error whenever I go to
>
> http://localhost:8080/solr/admin/
>
> and press search with query string :
>
> The error is
>
> HTTP Status 400 - Missing solr core name in path
>
> type Status report
>
> message Missing solr core name in path
>
> description The request sent by the client was syntactically incorrect
> (Missing solr core name in path).
>
> Apache Tomcat/7.0.21
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Error-404-missing-core-name-in-path-in-Solr-tp3931194p3931194.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: How can I get the top term in solr?

2012-04-22 Thread Dan Tuffery
1) The TermsComponent will return the top terms:

http://wiki.apache.org/solr/TermsComponent

2) Add 'debugQuery=on' to your query and look at the 'explain' section in the
results: the tf entry shows how many times the term appears in each matching
document, and the idf entry shows how many documents contain it (docFreq).
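
For point 1, a minimal Python sketch of a TermsComponent request for the top 20 terms of one field. It assumes a `/terms` request handler is registered in solrconfig.xml and uses `text` as a hypothetical field name. The counts it returns are document frequencies, so each term is counted once per document.

```python
from urllib.parse import urlencode


def terms_url(base, field, limit=20):
    """Build a TermsComponent request sorted by descending count."""
    params = [
        ("terms", "true"),
        ("terms.fl", field),
        ("terms.limit", str(limit)),
        ("terms.sort", "count"),
    ]
    return base + "/terms?" + urlencode(params)


# Assumed base URL and field name; adjust both to the real install.
print(terms_url("http://localhost:8983/solr", "text"))
```

The same parameters can be set on a SolrJ query object if the request is issued from Java.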

On Fri, Apr 20, 2012 at 5:31 PM, neosky  wrote:

> I would like to know the top terms in two senses: at the document level
> and at the index level.
> 1. At the document level, I would like to know which terms occur in the
> most documents (each term counted only once per document).
> The Solr schema.jsp page seems to provide the top 10 terms, but it only works
> on a small index. When the index gets large, it can hardly return a result.
> Suppose I want to use SolrJ to get the top 20 terms: what should I do?
> I have reviewed schema.jsp, but I have no idea how it does this.
>
> 2. I would also like to know how many times a specific term appears in the
> index, i.e. the total number = sum over all documents of (occurrences in
> that document).
>
> Any idea will be appreciated.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-can-I-get-the-top-term-in-solr-tp3926536p3926536.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>