Solrconfig.xml - http://apaste.info/dsbv
Schema.xml - http://apaste.info/67PI
This solrconfig.xml file has optimization enabled. I had another file which I
can't locate at the moment, in which I defined a custom merge scheduler in
order to disable optimization.
When I say 1000 segments, I
Hi,
I have some query building and result processing code, which is currently
running as normal Solr client outside of Solr. I think it would make a lot of
sense to move parts of this code into a custom SearchHandler or
SearchComponent. Because I'm not a big fan of the Java language, I would
How are you handling killer queries with solr?
While solr/lucene (currently 4.2.1) is trying to do its best, I sometimes see
stupid queries
in my logs, flagged by extremely long query times.
Example:
q=???+and+??+and+???+and++and+???+and+??
I even get hits for this
Thanks for your answer.
Can you please elaborate on
"mssql text searching is pretty primitive compared to Solr"?
(A link or anything.)
Thanks.
On Sun, Jun 2, 2013 at 4:54 PM, Erick Erickson erickerick...@gmail.com wrote:
1 Maybe, maybe not. mssql text searching is pretty primitive
compared to
On Thu, May 30, 2013 at 5:01 PM, Jack Krupansky j...@basetechnology.com wrote:
You gave an XML example, so I assumed you were working with XML!
Right, I did give the output as XML. I find XML to be a great document
markup language, but a terrible command format! Mostly, due to
(mis-)use of the
Hi,
I am constantly getting this error in my solr log:
Can't find (or read) directory to add to classloader:
/non/existent/dir/yields/warning (resolved as:
E:\Projects\apache_solr\solr-4.3.0\example\solr\genesis_experimental\non\existent\dir\yields\warning).
Anyone got any idea on how to solve
Hello!
You should remove that entry from your solrconfig.xml file. It is
something like this:
<lib dir="/non/existent/dir/yields/warning" />
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
Hi,
I am constantly getting this error in my solr
ok thanks :)
But why was it there anyway? I mean, it says in the comments:
If a 'dir' option (with or without a regex) is used and nothing
is found that matches, a warning will be logged.
So it looks like a kind of exception handling or logging for libs not
found... so shouldn't this folder actually
Hi,
I am not very sure what the hostPort attribute in the core tag of solr.xml
means. Can someone please let me know?
Thanks,
Prathik
I call the mlt handler using a query which searches for a certain document
(?q=id:some_document_id). The reference document is included in the result, and
the score is also returned. I found out that the score is fixed, independent
of the document. So for each document id I get the same score.
Hello!
That's a good question. I suppose it's there to show users how to set up
a custom path to libraries.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
ok thanks :)
But why was it there anyway? I mean it says in comments:
If a 'dir'
Benson, I think the idea is that Tokenizers are created as needed (from
the TokenizerFactory), while those other objects are singular (one
created for each corresponding stanza in solrconfig.xml). So Tokenizers
should be short-lived; they'll be cleaned up after each use, and the
assumption is
On 6/3/13 3:07 AM, Achim Domma wrote:
Hi,
I have some query building and result processing code, which is currently running as
normal Solr client outside of Solr. I think it would make a lot of sense to
move parts of this code into a custom SearchHandler or SearchComponent. Because I'm not a
On Fri, May 31, 2013 at 3:57 AM, Michael Sokolov
msoko...@safaribooksonline.com wrote:
On UNIX platforms, take a look at vmstat for basic I/O measurement, and
iostat for more detailed stats. One coarse measurement is the number of
blocked/waiting processes - usually this is due to I/O
Hi,
I'm seeing really slow query times. 7-25 seconds when I run a simple filter
query that uses my SpatialRecursivePrefixTreeFieldType field.
My index is about 30k documents. Prior to adding the spatial field, the
on-disk size was about 100 MB, so it's a really tiny index. Once I add the
spatial
Hi,
Could you please add EmrahKara to ContributorsGroup in solr wiki?
--
http://www.cntbilisim.com.tr/
Emrah Kara
Developer at CNT
Email / Gtalk: em...@cntbilisim.com.tr Skype: rockipsiz
TEL: +90 232 3481851 GSM: +90 533 3634362 FAX: +90 232
Done, looking forward to your contributions!
Erick
On Mon, Jun 3, 2013 at 7:22 AM, Emrah Kara em...@cntbilisim.com.tr wrote:
Hi,
Could you please add EmrahKara to ContributorsGroup in solr wiki?
Also, here is a sample query, and the debugQuery output
fq={!cost=200}*:* -availability_spatial:Intersects(182.6 0 199.4 1)
In case the formatting is bad, here is a raw paste of the debugQuery:
http://pastie.org/pastes/872/text?key=ksjyboect4imrha0rck8sa
<?xml version="1.0" encoding="UTF-8"?>
Here's a link to various transformations you can do
while indexing and searching in Solr:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Consider
stemming
ngrams
WordDelimiterFilterFactory
ASCIIFoldingFilterFactory
phrase queries
boosting
synonyms
blah blah blah
You can't do
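To make the list above concrete, here is a sketch of an analysis chain in schema.xml combining a few of those filters (the field type name and the particular tokenizer/filter choices are illustrative, not a recommendation):

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split on whitespace, then handle intra-word delimiters -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- fold accented characters to their ASCII equivalents -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <!-- stemming -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The same chain (or a variant) can be declared separately for `index` and `query` analysis if they need to differ.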
Hi,
but the path looks like it shows how to set up a non-existent lib warning...
:D
On Mon, Jun 3, 2013 at 2:56 PM, Rafał Kuć r@solr.pl wrote:
Hello!
That's a good question. I suppose it's there to show users how to set up
a custom path to libraries.
--
Regards,
Rafał Kuć
Sematext ::
I'm reproducing the problem with the 4.2.1 example with 2 shards.
1) started up solr shards, indexed the example data, and confirmed empty
fieldCaches
[sanniere@funlevel-dx example]$ java
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar
Hi,
I am importing multiple tables (by join) into Solr using DIH. All is set,
except for one confusion:
what to do with *uniqueKey* in the schema?
When I had only one table, I had it fine. Now how do I put 2 uniqueKeys (both
from different tables)?
For example:
<uniqueKey>table1_id</uniqueKey>
Hi,
Thanks for your answer.
I want to refer to your message, because I am trying to choose the right
tool.
1. regarding stemming:
I am running in ms-sql
SELECT * FROM sys.dm_fts_parser ('FORMSOF(INFLECTIONAL,provide)', 1033,
0, 0)
and I receive
group_id phrase_id occurrence special_term
On 6/3/2013 2:39 AM, Bernd Fehling wrote:
How are you handling killer queries with solr?
While solr/lucene (currently 4.2.1) is trying to do its best, I sometimes see
stupid queries
in my logs, flagged by extremely long query times.
Example:
On 6/3/2013 3:16 AM, Prathik Puthran wrote:
I am not very sure what the hostPort attribute in the core tag of solr.xml
means. Can someone please let me know?
This only has meaning if you are using SolrCloud. This is how each Solr
server in the cloud informs the cloud what port it is using.
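As a sketch of where the attribute lives: in the legacy-style solr.xml it appears on the <cores> element (the values below are the stock example defaults), and it is the port the node advertises to ZooKeeper:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
         host="${host:}" hostPort="${jetty.port:8983}" hostContext="${hostContext:}"
         zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
```

If you run Jetty on a different port, the `${jetty.port:8983}` substitution picks it up automatically.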
On 6/3/2013 5:58 AM, Raheel Hasan wrote:
but the path looks like it shows how to set up a non-existent lib warning...
:D
The reason for its existence is encoded in its name. A nonexistent path
results in a warning. It's a way to illustrate to a novice what happens
when you have a non-fatal
I would like to have the min-match set differently for different fields in
my dismax handler. Is this possible?
Hi Shawn,
well, the users are the world and the servers have enough capacity.
So it's nothing really to worry about.
OK, I could raise the timeout from the standard 60 to 90, 120 or even 180 seconds.
Just wanted to know how other Solr developers handle this.
The technical question, where is the difference
Hi,
Thanks for the replies. Actually, I had only a small confusion:
From table_1 I got key_1; using this I join into table_2. But table_2 also
gave another key, key_2, which is needed for joining with table_3.
So for Table1 and Table2 it's obviously just fine... but what will happen
when table3 is
ok fantastic... now I will comment it out to be sure. Thanks a lot!
Regards,
Raheel
On Mon, Jun 3, 2013 at 7:27 PM, Shawn Heisey s...@elyograg.org wrote:
On 6/3/2013 5:58 AM, Raheel Hasan wrote:
but the path looks like it shows how to set up a non-existent lib warning...
:D
The reason for
On 6/3/2013 8:43 AM, Bernd Fehling wrote:
Hi Shawn,
well, the users are the world and the servers have enough capacity.
So it's nothing really to worry about.
OK, I could raise the timeout from the standard 60 to 90, 120 or even 180 seconds.
Just wanted to know how other Solr developers handle this.
The
Same answer. Whether it is 2, 3, 10 or 1000 tables, you, the data architect,
must decide how to uniquely identify Solr documents. In general, when
joining n tables, combine the n keys into one composite key. Either do it on
the SQL query side, or with a Solr update request processor.
-- Jack
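As a sketch of the SQL-side approach (table and column names here are invented for illustration), the DIH entity query can emit the composite key directly, with `<uniqueKey>id</uniqueKey>` declared in the schema:

```xml
<entity name="item"
        query="SELECT CONCAT(t1.id, '_', t2.id) AS id, t1.name, t2.price
               FROM table1 t1 JOIN table2 t2 ON t2.table1_id = t1.id">
  <!-- the concatenated value becomes the Solr uniqueKey -->
  <field column="id" name="id"/>
</entity>
```

The same idea extends to three or more tables: concatenate every contributing key into the one `id` value so that re-imports update rather than duplicate documents.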
My first guess is that no documents match the query "provinical court".
Because you have spellcheck.maxCollationTries set to a non-zero value, it
will not return these as collations unless the correction actually returns hits.
You can test my theory by removing spellcheck.maxCollationTries from
There are two radically distinct use cases:
1. Consumers on the open Internet. They do stupid things. Give them a very
constrained search experience, enforced with query preprocessing. Maybe give
them only dismax queries.
2. Professional power users. They typically have credentials for using
ok. But do we need it? That's what I am confused about. Shouldn't 1 key from
table_1 pull all the related data as it was inserted?
On Mon, Jun 3, 2013 at 7:53 PM, Jack Krupansky j...@basetechnology.com wrote:
Same answer. Whether it is 2, 3, 10 or 1000 tables, you, the data
architect must
No, but you can with the LucidWorks Search query parser:
f1:(cat dog fox bat fish cow)~50% f2:(cat dog fox bat fish zebra)~2
See:
http://docs.lucidworks.com/display/lweug/Minimum+Match+for+Simple+Queries
-- Jack Krupansky
-Original Message-
From: Eric Wilson
Sent: Monday, June 03,
Well, there is a hack(ish) way to do it:
_query_:"{!type=edismax qf='someField' v='$q' mm='100%'}"
This is clearly not a solrconfig.xml setting, but part of your query string
using LocalParams behavior.
This is going to get really messy if you have plenty of fields you'd like to
search, where
Hi,
My cluster hangs again when running an update process; the HTTP POST request
was aborted because of a timeout error. After the hang, I couldn't do more
updates without restarting the cluster.
I could see this error in the node's log after killing it. It's as if Solr
waits for the update response forever …
For each of the 4 cases listed below, can you give your query request string
(q=...&fq=...&qt=... etc.) and also the spellchecker output?
James Dyer
Ingram Content Group
(615) 213-4311
-Original Message-
From: Raheel Hasan [mailto:raheelhasan@gmail.com]
Sent: Monday, June 03, 2013
I think you should take a look at the TimeLimitingCollector (it is also
used inside SolrIndexSearcher).
My understanding is that it will stop your server from consuming
unnecessary resources.
--roman
On Mon, Jun 3, 2013 at 4:39 AM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de wrote:
How are
Consider the following use case.
Certain words are extracted from a document and indexed. The exact sentence
containing the word cannot be stored alongside the extracted word because
of the volume at which the documents grow. How can the index and, let's call
them, the doc servers be separated?
An
You can use this tool to analyze the logs:
https://github.com/dfdeshom/solr-loganalyzer
We use solrmeter for performance / stress testing.
https://code.google.com/p/solrmeter/
--
View this message in context:
There is the timeAllowed parameter:
http://wiki.apache.org/solr/CommonQueryParameters#timeAllowed
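As a sketch, the parameter is just appended to a normal request (the 5000 ms budget here is arbitrary); note that a query cut off by timeAllowed may return partial results:

```
http://localhost:8983/solr/select?q=foo+bar&timeAllowed=5000
```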
-- Jack Krupansky
-Original Message-
From: Roman Chyla
Sent: Monday, June 03, 2013 11:53 AM
To: solr-user@lucene.apache.org
Subject: Re: how are you handling killer queries?
I think you
Hi,
Using the same schema for both Solr 3.5 and Solr 4.2.1 and posting the same
data to both these servers, the memory requirements seem to have gone up
sharply during request handling.
- Requests come in at around 200 QPS.
- Document sizes are very large, but that did not seem to be a problem
Also, just to be clear, MM/minMatch, is not an option for a field but for
a full BooleanQuery. I mean, you can't have two different MM values within
the same BooleanQuery, except with nested BooleanQuerys, where each BQ has
its own MM.
-- Jack Krupansky
-Original Message-
From:
You can also check out this link.
http://lucene.472066.n3.nabble.com/Is-there-a-way-to-remove-caches-in-SOLR-td4061216.html#a4061219
Looks interesting, but it's just for the UpdateHandler. Right? Does a similar
handler for searching already exist?
Achim
Am 03.06.2013 um 17:22 schrieb Jack Krupansky:
Check out the support for external scripting of update request processors:
Sorry about that. Unfortunately, scripting is only on the update side. But I
imagine that a lot of the logic could be repurposed for the query side.
-- Jack Krupansky
-Original Message-
From: Achim Domma
Sent: Monday, June 03, 2013 2:31 PM
To: solr-user@lucene.apache.org
Subject:
Yeah, it's currently just for the update side of things. But this issue is
open https://issues.apache.org/jira/browse/SOLR-3669 and assigned to me, for
one of these days. I set it for my 5.0 radar. Certainly anyone that wants to
make this happen sooner than I maybe will possibly hopefully
On 6/3/2013 12:35 PM, PeriS wrote:
I noticed the delta-import is creating a new indexed entry on top of the
existing one... is that normal?
Not sure what you are asking here, so I'll give an answer to the
question I think you're asking: If you have a uniqueKey defined in your
schema, then
Shawn,
You got the point; I do have the unique key defined, but for some reason,
when I run the delta-import, a new entry is created for the same record with a
new unique key. It's almost as if it doesn't detect the existing record.
On Jun 3, 2013, at 3:51 PM, Shawn Heisey
Hi Erik,
In my case I have to calculate a custom value, for each document, depending
on the retrieved candidates, so my choice will be a DocTransformer.
Let's say I need to include a Java class which does the computation; how do I
tie that in with the DocTransformer?
Solr
You can refer to this post on using DocTransformers:
http://java.dzone.com/news/solr-40-doctransformers-first
--
View this message in context:
http://lucene.472066.n3.nabble.com/Custom-Response-Handler-tp4067558p4067926.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello All,
I've been working on a 2-shard SolrCloud instance with several million
documents, and the import process has recently begun to miss documents as they
are added to the underlying Postgres database. There are no glaring failures in
the log files (all SEVERE and WARNING level errors
You have to be careful looking at the QTimes. They do not include garbage
collection. I've run into issues where QTime was short (because it was); it just
happened that the query came in during a long garbage collection where
everything was paused. So you can get into situations where once the
On 6/3/2013 3:33 PM, Greg Harris wrote:
You have to be careful looking at the QTimes. They do not include garbage
collection. I've run into issues where QTime was short (because it was); it just
happened that the query came in during a long garbage collection where
everything was paused. So
Hey guys,
I have recently looked into an issue with my Solrcloud related to very high
load when performing a full-import on DIH.
While some work could be done to improve my queries, etc. in DIH, this led
me to a new feature idea in Solr: weighted internal load balancing.
Basically, I can think
Hi Chris:
Have you read http://wiki.apache.org/solr/SpatialForTimeDurations ?
You're modeling your data sub-optimally. Full-precision rectangles
(distErrPct=0) don't scale well, and you're seeing that. You should
represent your durations as a point, and it will take up a fraction of the
space
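Following that wiki page's point-based approach, a field type along these lines is one possible sketch (the attribute values here are illustrative, chosen for a 0-365 day range; check the wiki for settings matching your own value range):

```xml
<fieldType name="availability_spatial" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="false" worldBounds="0 0 365 365"
           distErrPct="0" maxDistErr="1" units="degrees"/>
```

Each duration is then indexed as a single point "start end", and range queries become rectangle Intersects queries, as in the fq you posted.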
SOLR 4.2.1, tomcat 6.0.35, CentOS 6.2 (2.6.32-220.4.1.el6.x86_64 #1 SMP),
java 6u27 64 bit
6 nodes, 2 shards, 3 replicas each. Names changed to r1s2 (replica1 - shard
2), r2s2, and r3s2 for each replica in shard 2.
What we see:
* Under production load, we restart a leader (r1s2), and observe in
On Jun 3, 2013, at 3:33 PM, Tim Vaillancourt t...@elementspace.com wrote:
Should I JIRA this? Thoughts?
Yeah - it's always been in the back of my mind - it's come up a few times -
eventually we would like nodes to report some stats to zk to influence load
balancing.
- mark
I want to get the cluster state of my SolrCloud via SolrJ (I know that the admin
page shows it, but I want to customize it in my application).
Firstly, the wiki says:
CloudSolrServer server = new CloudSolrServer("localhost:9983");
Why does CloudSolrServer take only one ZooKeeper host:port as an argument? I
It actually accepts a comma-separated list of ZooKeeper host addresses (your quorum).
Same format as ZooKeeper describes in its docs.
To get the cluster state, get the ZkStateReader from the CloudSolrServer and
then its getClusterState or something.
- Mark
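Putting the two answers together, a minimal SolrJ 4.x sketch might look like this (the ZooKeeper host names are placeholders; this needs the SolrJ jars and a running SolrCloud cluster, so it is not runnable standalone):

```java
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.ZkStateReader;

public class ClusterStateDemo {
    public static void main(String[] args) throws Exception {
        // Comma-separated ZooKeeper quorum, same format ZooKeeper itself uses.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.connect();

        // The reader keeps a watched, cached copy of the cluster state from ZooKeeper.
        ZkStateReader reader = server.getZkStateReader();
        ClusterState state = reader.getClusterState();
        System.out.println("Collections: " + state.getCollections());

        server.shutdown();
    }
}
```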
On Jun 3, 2013, at 5:30 PM, Furkan KAMACI
Thanks - I can try and look into this perhaps next week. You might copy the
details into a JIRA issue to prevent it from getting lost though...
- Mark
On Jun 3, 2013, at 4:46 PM, John Guerrero jguerr...@tagged.com wrote:
SOLR 4.2.1, tomcat 6.0.35, CentOS 6.2 (2.6.32-220.4.1.el6.x86_64 #1