Bump?
On Mon, Jun 27, 2011 at 06:17:42PM +0100, me said:
On the SolrJetty page
http://wiki.apache.org/solr/SolrJetty
there's a link to a tar ball
http://wiki.apache.org/solr/SolrJetty?action=AttachFile&do=view&target=DEMO_multiple_webapps_jetty_6.1.3.tgz
which fails with the error
You are not allowed to do AttachFile on this page.
Can someone fix
First, a couple of assumptions.
We have boxes with a large amount (~70GB) of memory on which we're
running Solr. We've currently set -Xmx to 25GB with the GC settings
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
We're reluctant to up the -Xmx
Due to some emergency maintenance I needed to run delete on a large
number of documents in a 200Gb index.
The problem is that it's taking an inordinately long amount of time (2+
hours so far and counting) and is steadily eating up disk space -
presumably up to 2x index size which is getting
I have a field 'type' that has several values. If it's type 'foo' then
it also has a field 'restriction_id'.
What I want is a filter query which says either it's not a 'foo' or if
it is then it has the restriction '1'
I expect two matches - one of type 'bar' and one of type 'foo'
Neither
On Mon, Apr 25, 2011 at 04:34:05PM -0400, Jonathan Rochkind said:
This is what I do instead, to rewrite the query to mean the same thing but
not give the lucene query parser trouble:
fq=( (*:* AND -type:foo) OR restriction_id:1)
*:* means everything, so (*:* AND -type:foo) means the same
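That rewritten filter can be sent like any other Solr parameter once it is URL-encoded; a minimal sketch using only the Python standard library (the field names and filter come from the thread, the host and core path are hypothetical):

```python
from urllib.parse import urlencode

# The rewritten filter: match docs that are not type:foo,
# or that are type:foo with restriction_id 1.
fq = "((*:* AND -type:foo) OR restriction_id:1)"

# urlencode escapes the *, :, parentheses and spaces for us.
params = urlencode({"q": "*:*", "fq": fq})
url = "http://localhost:8983/solr/select?" + params
print(url)
```

Sending the filter as fq rather than folding it into q also lets Solr cache the filter's doc set independently of the main query.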
On Mon, Apr 25, 2011 at 05:02:12PM -0400, Yonik Seeley said:
It really shouldn't be that slow... how many documents are in your
index, and how many match -type:foo?
Total number of docs is 161,000,000
type:foo 39,000,000
-type:foo 122,200,000
type:bar 90,000,000
We're aware it's large and
On Wed, Apr 06, 2011 at 12:05:57AM +0200, Jan Høydahl said:
Just curious, was there any resolution to this?
Not really.
We tuned the GC pretty aggressively - we use these options
-server
-Xmx20G -Xms20G -Xss10M
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+CMSIncrementalMode
On Mon, Feb 07, 2011 at 02:06:00PM +0100, Markus Jelsma said:
Heap usage can spike after a commit. Existing caches are still in use and new
caches are being generated and/or auto warmed. Can you confirm this is the
case?
We see spikes after replication which I suspect is, as you say, because
On Fri, Jan 28, 2011 at 12:29:18PM -0500, Yonik Seeley said:
That's odd - there should be nothing special about negative numbers.
Here are a couple of ideas:
- if you have a really big index and querying by a negative number
is much more rare, it could just be that part of the index wasn't
If I do
qt=dismax
fq=uid:1
(or any other positive number) then queries are as quick as normal - in
the 20ms range.
However, any of
fq=uid:\-1
or
fq=uid:[* TO -1]
or
fq=uid:[-1 to -1]
or
fq=-uid:[0 TO *]
then queries are incredibly slow - in the 9
On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said:
Are you sure you need CMS incremental mode? It's only advised when running on
a machine with one or two processors. If you have more you should consider
disabling the incremental flags.
I'll test again but we added those to get
On Thu, Jan 27, 2011 at 11:32:26PM +, me said:
If I do
qt=dismax
fq=uid:1
(or any other positive number) then queries are as quick as normal - in
the 20ms range.
For what it's worth uid is a TrieIntField with precisionStep=0,
omitNorms=true, positionIncrementGap=0
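For reference, a schema.xml entry matching that description would look roughly like this (a sketch in standard Solr 1.4 syntax; the uid field name is from the thread, the type name is made up):

```
<fieldType name="tint" class="solr.TrieIntField" precisionStep="0"
           omitNorms="true" positionIncrementGap="0"/>
<field name="uid" type="tint" indexed="true" stored="true"/>
```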
We have two slaves replicating off one master every 2 minutes.
Both using the CMS + ParNew Garbage collector. Specifically
-server -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
but periodically they both get into a GC storm and just keel over.
On Mon, Jan 24, 2011 at 08:00:53PM +0100, Markus Jelsma said:
Are you using 3rd-party plugins?
No third party plugins - this is actually pretty much stock tomcat6 +
solr from Ubuntu. The only difference is that we've adapted the
directory layout to fit in with our house style
On Mon, Jan 24, 2011 at 10:55:59AM -0800, Em said:
Could it be that your slaves haven't finished replicating before
the new replication process starts?
If so, there you got the OOM :).
This was one of my thoughts as well - we're currently running a slave
which has no queries in it
I have a fairly classic master/slave set up.
Response times on the slave are generally good with blips periodically,
apparently when replication is happening.
Occasionally however the process will have one incredibly slow query and
will peg the CPU at 100%.
The weird thing is that it will
On Mon, Jan 10, 2011 at 01:56:27PM -0500, Brian Burke said:
This sounds like it could be garbage collection related, especially
with a heap that large. Depending on your jvm tuning, a FGC could
take quite a while, effectively 'pausing' the JVM.
Have you looked at something like jstat
On Mon, Jan 10, 2011 at 05:58:42PM -0500, François Schiettecatte said:
http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html (you
need to read this one)
http://java.sun.com/performance/reference/whitepapers/tuning.html (and
this one).
Yeah, I have these two pages
We've got a largish corpus (~94 million documents). We'd like to be able
to sort on one of the string fields. However this takes an incredibly
long time. A warming query for that field takes about ~20 minutes.
However most of the time the result sets are small since we use filters
heavily -
We currently have a 30G index with 73M of .tii files running on a
machine with 4 Intel 2.27GHz Xeons with 15G of memory.
About once a second a process indexes ~10-20 smallish documents using
the XML Update Handler. A commit happens after every update. However we
see this behaviour even if the
On Fri, Nov 19, 2010 at 12:01:09AM +, me said:
I'm baffled - I've had way bigger indexes than this before with no
performance problems. At first it was the frequent updates but the fact
that it happens even when the indexer isn't running seems to put paid to
that.
More information:
-
On Mon, Nov 01, 2010 at 05:42:51PM -0700, Lance Norskog said:
You should query against the indexer. I'm impressed that you got 5s
replication to work reliably.
That's our current solution - I was just wondering if there was anything
I was missing.
Thanks!
We've been trying to get a setup in which a slave replicates from a
master every few seconds (ideally every second but currently we have it
set at every 5s).
Everything seems to work fine until, periodically, the slave just stops
responding from what looks like it running out of memory:
On Mon, Oct 11, 2010 at 07:17:43PM +0100, me said:
It was just an idea though and I was hoping that there would be a
simpler more orthodox way of doing it.
In the end, for anyone who cares, we used dynamic fields.
There are a lot of them but we haven't seen performance impacted that
badly so
On Sat, Oct 09, 2010 at 06:31:19PM -0400, Erick Erickson said:
I'm confused. What do you mean that a user can set any
number of arbitrarily named fields on a document. It sounds
like you are talking about a user adding arbitrarily many entries
to a multi-valued field? Or is it some kind of
I'm trying to index a latLon field.
I have a fieldType in my schema.xml that looks like
<fieldType name="latLon" class="solr.LatLonType" subFieldSuffix="_latLon"/>
and a field that looks like
<field name="location" type="latLon" indexed="true" stored="true"/>
I'm trying upload via the JSON update handler but
On Wed, Sep 01, 2010 at 01:05:47AM +0100, me said:
I'm trying to index a latLon field.
<fieldType name="latLon" class="solr.LatLonType" subFieldSuffix="_latLon"/>
<field name="location" type="latLon" indexed="true" stored="true"/>
Turns out changing it to
<fieldType name="latLon" class="solr.LatLonType"
On Thu, Apr 22, 2010 at 02:15:08AM +0100, me said:
It looks like org.apache.lucene.search.highlight.TextFragment has the
right information to do this (i.e textStartPos)
Turns out that it doesn't seem to have the right information in that
textStartPos always seems to be 0 (and textEndPos just
Having poked around little it doesn't look like there's an query param
to turn this on but it'd be really useful if highlighted fragments could
have a character offset return somehow - maybe something like
<lst name="highlighting">
  <lst name="27314523">
    <arr name="content">
      <str offset="600">
On Wed, Feb 03, 2010 at 07:38:13PM -0800, Lance Norskog said:
The debugQuery parameter shows you how the query is parsed into a tree
of Lucene query objects.
Well, that's kind of what I'm asking - I know how the query is being
parsed:
<str name="rawquerystring">myers 8e psychology chapter 9</str>
According to my logs
org.apache.solr.handler.component.QueryComponent.process()
takes a significant amount of time (5s but I've seen up to 15s) when a
query has an odd pattern of numbers in e.g
neodymium megagauss-oersteds (MGOe) (1 MG·Oe = 7,958·10³ T·A/m = 7,958
kJ/m³
myers 8e psychology
The spellchecker in my 1.4 install started behaving increasingly
erratically and suggestions would only be returned some of the time
with the same query.
I tried to force a rebuild using
spellcheck.build=yes
The full request being
/select/?q=alexandr the great
indent=on
fl=title
I have a Master server with two Slaves populated via Solr 1.4 native
replication.
Slave1 syncs at a respectable speed, i.e. around 100MB/s, but Slave2 runs
much, much slower - the peak I've seen is 56KB/s.
Both are running off the same hardware with the same config -
compression is set to
On Mon, Nov 23, 2009 at 12:10:42PM -0800, Chris Hostetter said:
...hmm, you shouldn't have to reindex everything. are you sure you
restarted solr after making the enablePositionIncrements=true change to
the query analyzer?
Yup - definitely restarted
what do the offsets look like when you
On Tue, Nov 17, 2009 at 11:09:38AM -0800, Chris Hostetter said:
Several things about your message don't make sense...
Hmm, sorry - a byproduct of building up the mail over time I think.
The query
?q=Here there be dragons
fl=id,title,score
debugQuery=on
qt=dismax
qf=title
gets echoed as
I have a document with the title Here, there be dragons and a body.
When I search for
q = Here, there be dragons
qf = title^2.0 body^0.8
qt = dismax
Which is parsed as
+DisjunctionMaxQuery((content:here dragon^0.8 | title:here
dragon^2.0)~0.01) ()
I get the document as the first hit which
On Fri, Oct 30, 2009 at 11:20:19AM +0530, Shalin Shekhar Mangar said:
That is very strange. IndexReaders do get re-opened after commits. Do you
see a commit message in the Solr logs?
Sorry for the delay - I've been trying to puzzle over this some more.
The code looks like
We've been trying to build an indexing pipeline using SolrJ but we've
run into a couple of issues - namely that IndexReaders don't seem to get
reopened after a commit().
After an index or delete the change doesn't show up until I restart
solr.
I've tried commit() and commit(true, true) just
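As a cross-check, the same commit can be issued over plain HTTP, bypassing SolrJ entirely; a sketch using only the Python standard library (the host, port and core path are hypothetical):

```python
from urllib.request import Request

# Build (but don't send) the XML update request that is the HTTP
# equivalent of SolrJ's commit(true, true).
body = b"<commit waitFlush='true' waitSearcher='true'/>"
req = Request(
    "http://localhost:8983/solr/update",
    data=body,
    headers={"Content-Type": "text/xml; charset=utf-8"},
)
# urllib.request.urlopen(req) would perform the commit against a
# running Solr; a new searcher should then be opened and the
# deletes/adds become visible.
```

If the change shows up after an HTTP commit but not after a SolrJ commit(), that would point at the client rather than the index.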
We have an indexing script which has been running for a couple of weeks
now without problems. It indexes documents and then periodically commits
(which is a tad redundant I suppose), both via the HTTP interface.
All documents are indexed to a master and a slave rsyncs them off using
the standard
Our index has some items in it which basically contain a title and a
single word body.
If the user searches for a word in the title (especially if the title is
itself only one word) then that doc will get scored quite highly,
despite the fact that, in this case, it's not really relevant.
I've
I know that the Solr FAQ says
Users should decide for themselves which Servlet Container they
consider the easiest/best for their use cases based on their
needs/experience. For high traffic scenarios, investing time for tuning
the servlet container can often make a big difference.
but is