Thanks, Jack.
I have filed a tkt: https://issues.apache.org/jira/browse/SOLR-7154
On Tue, Feb 24, 2015 at 11:43 AM, Jack Krupansky jack.krupan...@gmail.com
wrote:
Thanks. That at least verifies that the accented e is stored in the field.
I don't see anything wrong here, so it is as if the
On 24 February 2015 at 15:50, Jack Krupansky jack.krupan...@gmail.com wrote:
It's a string field, so there shouldn't be any analysis. (read back in the
thread for the field and field type.)
It's a multi-term expansion. There is _some_ analysis one way or another :-)
Solr Analyzers,
Tang, Rebecca [rebecca.t...@ucsf.edu] wrote:
[12-15 second response time instead of 0-3]
Solr index size 183G
Documents in index 14364201
We just have a single Solr box
It has 100G memory
500G Harddrive
16 cpus
The usual culprit is memory (if you are using spinning drive as your storage).
It
It's a string field, so there shouldn't be any analysis. (read back in the
thread for the field and field type.)
-- Jack Krupansky
On Tue, Feb 24, 2015 at 3:19 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:
What happens if the query does not have wildcard expansion (*)? If the
behavior
Exact query:
/select?q=raw_name:beyonce*&wt=json&fl=raw_name
Response:
{ "responseHeader": { "status": 0, "QTime": 0, "params": {
  "fl": "raw_name", "q": "raw_name:beyonce*", "wt": "json"
} }, "response": { "numFound": 2, "start": 0, "docs": [
  { "raw_name": "beyoncé" }, {
Hi Dirk,
The RPT field type can be used for distance sorting/boosting but it’s a
memory pig when used as-such so don’t do it unless you have to. You only
have to if you have a multi-valued point field. If you have single-valued,
use LatLonType specifically for distance sorting.
Your sample
Our solr index used to perform OK on our beta production box (anywhere between
0-3 seconds to complete any query), but today I noticed that the performance is
very bad (queries take between 12 – 15 seconds).
I haven't updated the solr index configuration (schema.xml/solrconfig.xml)
lately.
I guess the place to start is the Reference Guide:
https://cwiki.apache.org/confluence/display/solr/SolrCloud
Generally speaking, when you start Solr with any sort of Zookeeper, you've
entered cloud mode, which essentially means that Solr is now capable of
organizing cores into groups that
What happens if the query does not have wildcard expansion (*)? If the
behavior is correct, then the issue is somehow with the
MultitermQueryAnalysis (a hidden automatically generated analyzer
chain): http://wiki.apache.org/solr/MultitermQueryAnalysis
Which would still make it a bug, but at least
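If the hidden multiterm chain does turn out to be the culprit, one workaround is to declare the multiterm analyzer explicitly in schema.xml instead of relying on the auto-generated one. A rough sketch — the field type name and filter choices here are illustrative, not taken from this thread:

```xml
<!-- Hypothetical field type: the multiterm analyzer is spelled out, so
     wildcard/prefix query terms get lower-cased and accent-folded in a
     controlled way at query time. Names are illustrative only. -->
<fieldType name="string_folded" class="solr.TextField" sortMissingLast="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="multiterm">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

With an explicit multiterm analyzer you can at least see (and test in the admin analysis screen) exactly what happens to a term like beyonce* before it hits the index.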
On 2/24/2015 1:09 PM, Tang, Rebecca wrote:
Our solr index used to perform OK on our beta production box (anywhere
between 0-3 seconds to complete any query), but today I noticed that the
performance is very bad (queries take between 12 – 15 seconds).
I haven't updated the solr index
On 2/24/2015 1:21 PM, Benson Margulies wrote:
On Tue, Feb 24, 2015 at 1:30 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
Benson:
Are you trying to run independent invocations of Solr for every node?
Otherwise, you'd just want to create a 8 shard collection with
Thanks. That at least verifies that the accented e is stored in the field.
I don't see anything wrong here, so it is as if the Lucene prefix query was
mapping the accented characters. It's not supposed to do that, but...
Go ahead and file a Jira bug. Include all of the details that you provided
On Tue, Feb 24, 2015 at 4:27 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: Unfortunately, this is all 5.1 and instructs me to run the 'start from
: scratch' process.
a) checkout the left nav of any ref guide page webpage which has a link to
Older Versions of this Guide (PDF)
b) i'm
: Unfortunately, this is all 5.1 and instructs me to run the 'start from
: scratch' process.
a) checkout the left nav of any ref guide page webpage which has a link to
Older Versions of this Guide (PDF)
b) i'm not entirely sure i understand what you're asking, but i'm guessing
you mean...
*
Hi,
I'm using Solr 4.10.3 As field type in my schema.xml. I'm using
location_rpt like the description in the documentation.
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees" />
Everything works fine.
On Tue, Feb 24, 2015 at 1:30 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
Benson:
Are you trying to run independent invocations of Solr for every node?
Otherwise, you'd just want to create a 8 shard collection with
maxShardsPerNode set to 8 (or more I guess).
Michael
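For reference, the Collections API call for a setup like that might look as follows — host, collection and config names are placeholders, not from this thread:

```
http://localhost:8983/solr/admin/collections?action=CREATE
    &name=mycollection
    &numShards=8
    &maxShardsPerNode=8
    &replicationFactor=1
    &collection.configName=myconfig
```

maxShardsPerNode=8 is what allows all eight shards to land on a single Solr node.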
On Tue, Feb 24, 2015 at 3:32 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
https://cwiki.apache.org/confluence/display/solr/SolrCloud
Unfortunately, this is all 5.1 and instructs me to run the 'start from
scratch' process.
I wish that I could take my existing one-core
Do you mean the snapinstaller (bash) script? Those are legacy scripts. It's
been a long time since they were tested. The ReplicationHandler is the
recommended way to set up replication. If you want to take a snapshot, then
the replication handler has an HTTP based API which lets you do that.
In any
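As a sketch, the ReplicationHandler snapshot call might look like this — core name and backup path are placeholders:

```
http://localhost:8983/solr/mycore/replication?command=backup&location=/path/to/backups
```

Progress of a running backup can be checked with command=details against the same handler.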
Hi Charlie,
Thanks a lot for your response
On Tue, Feb 24, 2015 at 5:08 PM, Charlie Hull char...@flax.co.uk wrote:
On 24/02/2015 03:03, Richard Gibbs wrote:
Hi There,
I am in the process of choosing a search technology for one of my projects
and I was looking into Solr and Elasticsearch.
Erick,
Our default operator is AND.
Both queries below parse the same:
a OR (b c) OR d
a OR (b AND c) OR d
The parsed query:
<str name="parsedquery_toString">Contents:a (+Contents:b +Contents:c) Contents:d</str>
So this part is consistent with our expectation.
I'm a bit puzzled by your
Looks like the ZooKeeper server is either not running or not accepting
connections, possibly because of a configuration issue. Can you look into
the ZooKeeper logs and see if there are any exceptions?
On Tue, Feb 24, 2015 at 11:30 AM, CKReddy Bhimavarapu chaitu...@gmail.com
wrote:
Hi,
I
Dear all,
Hi,
I was wondering is there any performance comparison available for different
solr queries?
I mean: what is the cost of different Solr queries from a memory and CPU
point of view? I am looking for a report that could help me in the case of
having different alternatives for sending a single
On 24/02/2015 03:03, Richard Gibbs wrote:
Hi There,
I am in the process of choosing a search technology for one of my projects
and I was looking into Solr and Elasticsearch.
Two features that I am more interested are geo aggregations (for map
clustering) and search alerts. Elasticsearch seem
Hi,
I noticed that not only does Solr no longer ship a WAR file, but it
also advises against providing a custom WAR file for deployment,
as future versions may depend on custom Jetty features.
Until 4.10 we were able to provide a WAR file with all the plug-ins we
need for
Solr is an IR system where spell correction is a topping, whereas Google has
a team dedicated just to spell correction. "Did you mean" (a more general
and much broader feature than basic spell correctors) and spell correctors
require a plethora of skills. I will just discuss spell correctors here and
not
Hi,
Kindly help me understand the behavior of following field.
<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
For a field like above where indexed=true and docValues=true, is it
that:
1) For sorting/faceting on *manu_exact* the docValues will be used.
2) For
Thanks for your response Mikhail.
On Tue, Feb 24, 2015 at 5:35 PM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
Both statements seem true to me.
On Tue, Feb 24, 2015 at 2:49 PM, Modassar Ather modather1...@gmail.com
wrote:
Hi,
Kindly help me understand the behavior of following
If you have an English dictionary available containing words with their
lemmas, you may use my patch:
https://issues.apache.org/jira/browse/LUCENE-6254
This lemmatizer works with Danish, German and Norwegian dictionaries
which are available for free. I'm not sure there exists a free English
Hello,
we are using solr 4.10.1. There are two cores for different use cases with
around 20 million documents (location descriptions) per core. Each document has
a geometry field which stores a point and a bbox field which stores a bounding
box. Both fields are defined with:
fieldType
yes
chaitanya@imart-desktop:~/solr/zookeeper-3.4.6/bin$ ./zkServer.sh start
JMX enabled by default
Using config: /home/chaitanya/solr/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
chaitanya@imart-desktop:~/solr/zookeeper-3.4.6/bin$ ./zkServer.sh status
JMX enabled by
Both statements seem true to me.
On Tue, Feb 24, 2015 at 2:49 PM, Modassar Ather modather1...@gmail.com
wrote:
Hi,
Kindly help me understand the behavior of following field.
<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />
For a field like above where
Hello,
I have a scenario where I want to use my own custom field instead
of freq in the suggestions for each term. The custom field will be an integer
value, different from freq, in each suggestion.
Is it possible in Solr to use a custom field instead of freq in suggestions?
Your help is
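One way this is commonly done — assuming Solr 4.7+ with the SuggestComponent; the field names below are invented for illustration — is a DocumentDictionaryFactory with a weightField, so the suggester weight comes from your own field rather than from term frequency:

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <!-- the stored field whose values are suggested -->
    <str name="field">term_text</str>
    <!-- numeric stored field supplying the weight instead of freq -->
    <str name="weightField">my_score</str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>
```

Both the suggested field and the weight field need to be stored for the dictionary to read them.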
Hi Peri,
You cannot sort on a multi-valued field. multiValued should be set to
false.
On Tue, Feb 24, 2015 at 8:07 PM, Peri Subrahmanya
peri.subrahma...@htcinc.com wrote:
All,
Is there a way sorting can work on a multi-valued field or does it always
have to be “false” for it to work.
Hi Dmitry,
Thank you for the detailed clarification!
Recently, I've created a few patches for the Pivot version (LUCENE-2562), so I'd
like to do some more work on it and keep it up to date.
If you would like to work on the Pivot version, may I suggest you fork
the github version? The ultimate goal is
fwiw,
open solr jira https://issues.apache.org/jira/browse/SOLR-2522 pls vote
however, everything seems done at lucene level
https://issues.apache.org/jira/browse/LUCENE-5454
On Tue, Feb 24, 2015 at 6:11 PM, Nitin Solanki nitinml...@gmail.com wrote:
Hi Peri,
You cannot do sort
Hello,
We cannot use replication with the current architecture, so decided to use
snapshotter with snapinstaller.
Here is the full stack trace
8937 [coreLoadExecutor-5-thread-3] INFO
org.apache.solr.core.CachingDirectoryFactory – Closing directory:
Given the limited needs, I would probably do something like this:
1) Put a language identifier in the UpdateRequestProcessor chain
during indexing and route out at least known problematic languages,
such as Chinese, Japanese, Arabic into individual fields
2) Put everything else together into one
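Step 1 above might be sketched like this in solrconfig.xml — the field names and fallback language are placeholders. With mapping enabled, the langid processor detects the language and routes the content into per-language fields such as text_zh or text_ar:

```xml
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <!-- field(s) to run language detection on -->
    <str name="langid.fl">text</str>
    <str name="langid.langField">language_s</str>
    <!-- rename the content field to text_<lang> based on detection -->
    <bool name="langid.map">true</bool>
    <str name="langid.map.fl">text</str>
    <str name="langid.fallback">en</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```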
Hi Thomas,
I just downloaded solr5.0.0 tgz and found this in the directory structure:
solr-5.0.0/server/webapps$ ls
solr.war
- How to deploy your schema.xml, stopwords, solr plug-ins etc. for
testing in an isolated environment
the cores, for example are created in the:
On 2/24/2015 1:16 AM, Thomas Scheffler wrote:
I noticed that not only SOLR does not deliver a WAR file anymore but
also advices not to try to provide a custom WAR file that can be
deployed anymore as future version may depend on custom jetty features.
Until 4.10. we were able to provide a
What specifically do you mean by stall? Very slow but comes back?
Never comes back? Throws an error?
What is your field definition for body? How big is the content in it?
Do you change the fields returned if you search body and if you search
just headers?
How many rows do you request back?
One
Hello all,
I have 49 GB of indexed data. I am doing spell checking
things. I have applied ShingleFilter on both index and query part and
taking 25 suggestions of each word in the query and not using collations.
When I search a phrase(taken 5-6 words. Ex.- barack obama is president of
Hi, Tomoko!
Thanks for being a fan of luke!
Current status of github's luke (https://github.com/DmitryKey/luke) is that
it has releases for all the major lucene versions since 4.3.0, excluding
4.4.0 (luke 4.5.0 should be able open indices of 4.4.0) and the latest --
5.0.0.
Porting the github's
All,
Is there a way sorting can work on a multi-valued field or does it always have
to be “false” for it to work.
Thanks
-Peri
Hi,
I'm a user / fan of Luke, so I deeply appreciate your work.
I've carefully read the readme and noticed one of the project's goals:
To port the thinlet UI to an ASL compliant license framework so that it
can be contributed back to Apache Lucene. Current work is done with GWT
2.5.1.
There has
Hi ,
We are looking for an option to boost a document at index time based on
the value of a certain field.
For example: let's say we have 10 documents with fields such as name, acc no,
status, age, address, etc.
Now for documents with status 'Active' we want to boost by value 1000, and
if status is
The usual strategy is to have an UpdateRequestProcessor chain that
will copy the field and keep only one value from it, specifically for
sort. There is a whole collection of URPs to help you choose which
value to keep, as well as how to provide a default.
You can see the full list at:
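A minimal sketch of such a chain — the field names are invented for illustration — clones the multi-valued field into a sort-only copy and then keeps just its first value:

```xml
<updateRequestProcessorChain name="single-value-for-sort">
  <!-- copy the multi-valued field into a dedicated sort field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">categories</str>
    <str name="dest">categories_sort</str>
  </processor>
  <!-- keep only the first value of the copy; Min/MaxFieldValue
       variants exist if you want the smallest or largest value instead -->
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">categories_sort</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

categories_sort would then be a single-valued field in the schema, safe to sort on.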
Please post the info I requested - the exact query, and the Solr response.
-- Jack Krupansky
On Tue, Feb 24, 2015 at 12:45 PM, Arun Rangarajan arunrangara...@gmail.com
wrote:
In our case, the lower-casing is happening in a custom Java indexer code,
via Java's String.toLowerCase() method.
I
BooleanQuery’s extractTerms looks like this:
public void extractTerms(Set<Term> terms) {
for (BooleanClause clause : clauses) {
if (clause.isProhibited() == false) {
clause.getQuery().extractTerms(terms);
}
}
}
that’s generally the method called by the Highlighter for what terms
Dear Alex,
Nothing comes back when I do a body search. It shows a searching
process on the client but then it just stops and no result comes up.
I am wondering if this is schema related problem.
When I search a subject on the mail client I get output as below and :-
8025 [main] INFO
Look for the line like this in your log with the search matching the
body. Maybe put a nonsense string and look for that. This should tell
you what the Solr-side search looks like.
The thing that worries me here is: rows=107178 - that's most probably
what's blowing up Solr. You should be paging,
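Paging might look like this (parameter values are illustrative); for deep result sets, cursorMark (Solr 4.7+) avoids the cost of very large start offsets:

```
/select?q=body:foo&rows=100&start=0
/select?q=body:foo&rows=100&start=100

/select?q=body:foo&rows=100&sort=id+asc&cursorMark=*
```

cursorMark requires a sort that ends on the uniqueKey field; each response carries a nextCursorMark value to pass into the following request.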
Dear Alex,
I checked the log. When searching the fields From, To, and Subject, it
records the query.
When searching Body, there is no log entry. I am assuming it is a problem
in the schema.
Will post schema.xml output in next mail.
On Wed, Feb 25, 2015 at 1:09 AM, Alexandre Rafalovitch arafa...@gmail.com
Hi Alex,
Sorry for such a noob question,
but where does the schema file go in Solr? Is the directory below correct?
/opt/solr/solr/collection1/data
Correct?
Thanks
Kevin
On Wed, Feb 25, 2015 at 1:21 AM, Kevin Laurie
superinterstel...@gmail.com wrote:
Dear Alex,
I checked the log. When
Hmmm, not quite sure what to say. Offsets and positions help,
particularly with FastVectorHighlighter, but the highlighting is
usually re-analyzed anyway so it _shouldn't_ matter. But what I don't
know about highlighting could fill volumes ;)..
Sorry I can't be more help here.
Erick
On Tue, Feb
You're probably hitting different request handlers. From the fragment
you posted, the one that returns 8 is going to the /browse handler
(see solrconfig.xml). The admin UI goes to either /select or /query.
These are configured totally differently in terms of what fields are
searched etc.
Attach
Hi Alex,
Below is where my schema is stored:-
/opt/solr/solr/collection1/conf#
File name: schema.xml
Below output for body
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="uid" type="slong" indexed="true" stored="true" required="true" />
<field name="box"
In our case, the lower-casing is happening in a custom Java indexer code,
via Java's String.toLowerCase() method.
I used the analysis tool in Solr admin (with Jetty). I believe the raw
bytes explain this.
Attached are the results for beyonce in file beyonce_no_spl_chars.JPG and
beyoncé in file
Hmmm, that's not my understanding. docValues are simply a different
layout for storing
the _indexed_ values that facilitates rapid loading of the field from
disk, essentially
putting the uninverted field value in a conveniently-loadable form.
So AFAIK, the field is stored only once and used for
Ticket filed, thanks!
https://issues.apache.org/jira/browse/SOLR-7152
On Fri, Feb 20, 2015 at 9:29 PM, Joel Bernstein joels...@gmail.com wrote:
Ryan,
This looks like a good jira ticket to me.
Joel Bernstein
Search Engineer at Heliosearch
On Fri, Feb 20, 2015 at 6:40 PM, Ryan Josal
Benson:
Are you trying to run independent invocations of Solr for every node?
Otherwise, you'd just want to create a 8 shard collection with
maxShardsPerNode set to 8 (or more I guess).
Michael Della Bitta
Senior Software Engineer
o: +1 646 532 3062
appinions inc.
“The Science of Influence
There is also PostingsHighlighter -- I recommend it, if only for the
performance improvement, which is substantial, but I'm not completely
sure how it handles this issue. The one drawback I *am* aware of is
that it is insensitive to positions (so words from phrases get
highlighted even in
With so much of the site shifted to 5.0, I'm having a bit of trouble
finding what I need, and so I'm hoping that someone can give me a push
in the right direction.
On a big multi-core machine, I want to set up a configuration with 8
(or perhaps more) nodes treated as shards. I have some very
The field definition looks fine. It's not storing any content
(stored=false) but is indexing, so you should find the records but not
see the body in them.
Not seeing a log entry is more of a worry. Are you sure the request
even made it to Solr?
Can you see anything in Dovecot's logs? Or in
How about creating two fields for the multi-valued field? First, grab the
highest and lowest values of the multi-valued field using natural sort order.
Then use the first field to store the highest value and the second field to
store the lowest value. Both of these fields are single-valued.
Why do you use 15 replicas?
More replicas are slower.
You're making it too complicated. Both a docValues field and
an indexed (not docValues) field will give you the same
functionality. For rapidly changing indexes, docValues will
load more quickly when a new searcher is opened.
Your question below is not really relevant.
Can it be *field
We use HDFS as our Solr index storage and we have a really heavy update
load. We have run into many problems with the current leader/replica solution.
There is duplicate index computation on the replica side, and data sync between
leader and replica is always a problem.
As HDFS already provides data
Thanks Erick for your detailed response.
Sorry! I forgot to mention that I was trying to understand it in the context of
Solr 5.0.0, where the FieldCache is no longer available.
Regards,
Modassar
On Wed, Feb 25, 2015 at 11:26 AM, Erick Erickson erickerick...@gmail.com
wrote:
You're making it too
Update: everything is working fine after I downloaded a new ZooKeeper and
applied the same configuration.
Thanks for helping.
On Tue, Feb 24, 2015 at 5:13 PM, CKReddy Bhimavarapu chaitu...@gmail.com
wrote:
yes
chaitanya@imart-desktop:~/solr/zookeeper-3.4.6/bin$ ./zkServer.sh start
JMX enabled by
Hello,
We are trying to add documents in Solr with a TTL defined (document expiration
feature); they are expected to expire at the specified time, but they do not.
Following are the settings we have defined in solrconfig.xml and
managed-schema.
solr version : 5.0.0
*solrconfig.xml*
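The actual solrconfig.xml fragment did not make it into the archive; a typical setup for this feature (not necessarily the poster's) uses DocExpirationUpdateProcessorFactory. A common pitfall: without autoDeletePeriodSeconds the expiration field is computed but nothing is ever deleted, and the chain must actually be applied to updates:

```xml
<updateRequestProcessorChain name="add-ttl" default="true">
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <!-- per-document TTL is read from this request/document field -->
    <str name="ttlFieldName">_ttl_</str>
    <str name="expirationFieldName">expire_at_dt</str>
    <!-- without this, docs get an expiration stamp but are never deleted -->
    <long name="autoDeletePeriodSeconds">30</long>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```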
On 2/24/2015 5:45 PM, Tang, Rebecca wrote:
We gave the machine 180G mem to see if it improves performance. However,
after we increased the memory, Solr started using only 5% of the physical
memory. It has always used 90-something%.
What could be causing solr to not grab all the physical
Awesome news. Thanks.
*Sebastián Ramírez*
Diseñador de Algoritmos
http://www.senseta.com
Tel: (+571) 795 7950 ext: 1012
Cel: (+57) 300 370 77 10
Calle 73 No 7 - 06 Piso 4
Linkedin: co.linkedin.com/in/tiangolo/
Twitter: @tiangolo https://twitter.com/tiangolo
Email:
Be careful what you think is being used by Solr since Lucene uses
MMapDirectories under the covers, and this means you might be seeing
virtual memory. See Uwe's excellent blog here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Best,
Erick
On Tue, Feb 24, 2015 at 5:02
Rebecca
You don’t want to give all the memory to the JVM. You want to give it just
enough for it to work optimally and leave the rest of the memory for the OS to
use for caching data. Giving the JVM too much memory can result in worse
performance because of GC. There is no magic formula to
rebecca,
i would suggest making sure you have some gc logging configured so you have
some visibility into the JVM, esp if you don't already have JMX for sflow agent
configured to give you external visibility of those internal metrics
the options below just print out the gc activity to a log
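The options themselves were cut off in the archive; on a HotSpot JDK 7/8, a typical set that just writes GC activity to a rotating log (paths and sizes are placeholders, not the original poster's values) would be:

```
-Xloggc:/var/log/solr/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=9
-XX:GCLogFileSize=20M
```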
meant to type JMX or sflow agent
also should have mentioned you want to be running a very recent JDK
From: Boogie Shafer boogie.sha...@proquest.com
Sent: Tuesday, February 24, 2015 18:03
To: solr-user@lucene.apache.org
Subject: Re: how to debug solr
The other memory is used by the OS as file buffers. All the important parts of
the on-disk search index are buffered in memory. When the Solr process wants a
block, it is already right there, no delays for disk access.
wunder
Walter Underwood
wun...@wunderwood.org
We gave the machine 180G mem to see if it improves performance. However,
after we increased the memory, Solr started using only 5% of the physical
memory. It has always used 90-something%.
What could be causing solr to not grab all the physical memory (grabbing
so little of the physical
So for a requirement where I have a field which is used for sorting,
faceting and searching what should be the better field definition.
Can it be *<field name="manu_exact" type="string" indexed="true" stored="false" docValues="true" />*
or
Two fields each for sorting+faceting and for searching like