Hi,
I have set up SolrCloud with Solr 4.4. The cloud has two Tomcat instances with
a separate ZooKeeper.
I execute the command below in the URL:
http://localhost:8180/solr/colindexer/dataimportmssql?command=full-import&commit=true&clean=false
(The DIH status response reports: status 0, QTime 0, config
data-config-mssql.xml, status "idle", and counters 1, 0, 0, 201.)
I have a test system where I have an index of 15M documents in one shard
that I would like to split in two. I've tried it four times now. I have a
stand-alone zookeeper running on the same machine.
The end result is that I have two new shards with state "construction", and
each has one replica whic
Thanks for your replies.
I am actually doing the frange approach for now. The only downside I see there
is that it makes the function call twice, calling createWeight() twice. And so
my social connections are evaluated twice, which is quite a heavy operation. So I was
thinking if I could get away with o
On Mon, Oct 7, 2013, at 11:09 PM, user 01 wrote:
> Any way to store documents in a fixed sort order within the indexes of
> certain fields (either the arrival order or sorted by int ids, that also
> serve as my unique key), so that I could store them optimized for
> browsing
> lists of items?
>
I'd index them as separate documents.
Best,
Erick
On Mon, Oct 7, 2013 at 2:59 PM, Darniz wrote:
> Thanks Erick
>
> Ok if we go by that proposal of copying all date fields into one bag_of_dates
> field
>
> Hence now we have a field and it will look something like this.
>
> 2013-09-01T00:00:0
Tim:
Thanks! Mostly I wrote it to have something official-looking to hide
behind when I didn't have a good answer to the hardware-sizing question
:).
On Mon, Oct 7, 2013 at 2:48 PM, Tim Vaillancourt wrote:
> Fantastic article!
>
> Tim
>
>
> On 5 October 2013 18:14, Erick Erickson wrote:
>
>> Fr
bq: If so, using soft commit without calling hard commit could cause OOM
No. Aside from anything you have configured for auto (hard) commit, the
ramBufferSizeMB in solrconfig.xml will flush the in-memory structures out
to the segments when the size reaches this limit. It won't _close_ the
current
I don't think your model fits well into Solr.
What I'd do is make my uniqueKey the patient ID, and
put the image names (or links or whatever) in a multiValued
field. Then you can do what you want by a simple
q=*:* -image_name:[* TO *]
Best,
Erick
On Mon, Oct 7, 2013 at 9:20 AM, SandroZbinden wrote:
> Ok
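For what it's worth, a minimal SolrJ sketch of the model Erick describes (the
core URL and field names are hypothetical; image_name is assumed multiValued
in schema.xml):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PatientImageSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/patients");

        // One document per patient; image names go into a multiValued field.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "patient-42");          // uniqueKey = patient ID
        doc.addField("lastname", "Smith");
        doc.addField("image_name", "scan-1.png");  // repeated adds fill the
        doc.addField("image_name", "scan-2.png");  // multiValued field
        server.add(doc);
        server.commit();

        // Find patients that have no images at all.
        server.query(new SolrQuery("*:* -image_name:[* TO *]"));
    }
}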
Well, one of the attributes parsed out of, probably the
meta-information associated with one of your structured
docs is SMALLER_BIG_BLOCK_SIZE_DETAILS and
Solr Cell is faithfully sending that to your index. If you
want to throw all these in the bit bucket, try defining
a true catch-all field that ig
You're probably having a problem with the distinction between
query parsing and analysis which has been discussed many
times.
The issue is that the query parser breaks things up into individual
tokens and _then_ sends them to the analyzer chain as individual
tokens (usually).
Try escaping your spa
Any way to store documents in a fixed sort order within the indexes of
certain fields (either the arrival order or sorted by int ids, that also
serve as my unique key), so that I could store them optimized for browsing
lists of items?
The order for browsing is always fixed & there are no further f
Hi,
I'm creating replicas for my shards manually, and the solr.xml config doesn't
persist the changes (solr.xml attribute "persist" = true).
The command used is:
curl
'http://192.168.2.18:8983/solr/admin/cores?action=CREATE&name=test_shard1_replica2&collection=test&shard=shard1'
Someone else with the
@Jason: your example worked perfectly!
Hi,
We are in the process of transitioning to SolrCloud (4.4) from
Master-Slave architecture (4.2) . One of the issues I'm facing now is with
making spell check work. It only seems to work if I explicitly set
distrib=false. I'm using a custom request handler and included the spell
check option.
I notice that when a SPLITSHARD operation finishes, the solr.xml is not updated
properly.
# Parent solr.xml:
# Children solr.xml:
# Parent clusterstate:
"shard1":{
  "range":"8000-",
  "state":"inactive",
  "replicas":{"192.168.2.18:8983_solr_test_shard1_r
fq=here:there OR this:that
For the lurker: an AND should be:
fq=here:there&fq=this:that
While you can, technically, pass:
fq=here:there AND this:that
Solr will cache the separate fq= parameters and reuse them in any context. The
AND(ed) filter will be cached as a single entr
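A small SolrJ illustration of the difference (an untested sketch, reusing the
field names from the example above):

import org.apache.solr.client.solrj.SolrQuery;

public class FilterCacheSketch {
    public static void main(String[] args) {
        // Two separate fq parameters: each clause becomes its own
        // filterCache entry, reusable by any other query.
        SolrQuery q = new SolrQuery("*:*");
        q.addFilterQuery("here:there");
        q.addFilterQuery("this:that");

        // Versus a single combined fq: cached as one entry that only helps
        // queries repeating exactly this combination.
        SolrQuery q2 = new SolrQuery("*:*");
        q2.addFilterQuery("here:there AND this:that");
    }
}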
Combine the two filter queries with an explicit OR operator.
-- Jack Krupansky
-Original Message-
From: PeterKerk
Sent: Monday, October 07, 2013 1:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Adding OR operator in querystring and grouping fields?
Ok thanks.
"you must combine them
Are there any links describing best practices for interacting with SolrJ? I've
checked the wiki and it seems woefully incomplete:
(http://wiki.apache.org/solr/Solrj)
Some specific questions:
- When working with HttpSolrServer, should we keep instances around forever, or
should we create a single
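Not official best-practices documentation, but HttpSolrServer is thread-safe,
and the usual pattern is to create one instance per Solr endpoint and share
it, e.g. (endpoint URL hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SharedSolrClient {
    // One instance per endpoint, created once and shared across threads.
    private static final HttpSolrServer SERVER =
        new HttpSolrServer("http://localhost:8983/solr/collection1");

    public static QueryResponse search(String q) throws Exception {
        return SERVER.query(new SolrQuery(q));
    }
}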
Hi,
I have a SolrCloud (4.1) setup with the embedded Jetty server.
I use the commands below to start and stop the server.
start server : nohup java -DSTOP.PORT=8085 -DSTOP.KEY= -DnumShards=2
-Dbootstrap_confdir=./solr/nlp/conf -Dcollection.configName=myconf
-DzkHost=10.88.139.206:2181,10.88.139.206:2
Thanks Erick
Ok if we go by that proposal of copying all date fields into one bag_of_dates
field
Hence now we have a field and it will look something like this.
2013-09-01T00:00:00Z
2013-12-01T00:00:00Z
Sept content : Honda is releasing the car this month
Dec content : T
I don't know if there's a way to accomplish your goal directly, but as a pure
workaround, you can write a routine to fetch all the stored values and resubmit
the document without the field in question. This is what atomic updates do,
minus the overhead of the transmission.
On Oct 7, 2013, at 1
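A rough SolrJ sketch of that workaround (it assumes every field is stored; the
document id and field names are hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class DropFieldSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Fetch the stored values of the document.
        SolrDocument old = server.query(new SolrQuery("id:12345"))
                                 .getResults().get(0);

        // Rebuild it without the unwanted field and re-add it, which
        // replaces the old version of the document.
        SolrInputDocument fresh = new SolrInputDocument();
        for (String name : old.getFieldNames()) {
            if (!"field_to_delete".equals(name) && !"_version_".equals(name)) {
                fresh.addField(name, old.getFieldValue(name));
            }
        }
        server.add(fresh);
        server.commit();
    }
}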
Fantastic article!
Tim
On 5 October 2013 18:14, Erick Erickson wrote:
> From my perspective, your question is almost impossible to
> answer, there are too many variables. See:
>
> http://searchhub.org/dev/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>
> Best
Is there a way to make autoCommit only commit if there are pending changes,
i.e., if there are 0 adds pending commit, don't autoCommit (open a searcher
and wipe the caches)?
Cheers,
Tim
On 2 October 2013 00:52, Dmitry Kan wrote:
> right. We've got the autoHard commit configured only atm. The so
I am using SOLR 4.1.0 and perform atomic updates on SOLR documents.
Unfortunately there is a bug in 4.1.0
(https://issues.apache.org/jira/browse/SOLR-4297) that blocks me from using
null="true" for deleting a field through atomic update functionality. Is
there any other way to delete a field other
Ok thanks.
"you must combine them into one filter query parameter. ", how would I do
that? Can I simply change the URL structure or must I change my schema.xml
and/or data-config.xml?
The default query operator applies only within a single query parameter. If
you want to OR two filter queries, you must combine them into one filter
query parameter.
-- Jack Krupansky
-Original Message-
From: PeterKerk
Sent: Monday, October 07, 2013 1:08 PM
To: solr-user@lucene.apach
This query returns the correct results:
http://localhost:8983/solr/tt/select/?indent=on&fq={!geofilt}&pt=41.7882,-71.9498&sfield=geolocation&d=2000&q=*:*&start=0&rows=12&fl=id,title&facet.mincount=1&fl=_dist_:geodist()
However, I want to add OR select on a field city as well:
&fq=city:(brooklyn)
Dear all,
We are looking for a new member to join our team. This position requires
solid knowledge of Python, plus experience with web development, HTML5,
XSLT, JSON, CSS3, relational databases, and NoSQL, but search (and SOLR) is
the central point of everything we do here. So, if you love SOLR/Luce
Use the location_rpt field type in the example schema.xml -- it has "good
performance & less memory" (what you asked for) compared to LatLonType.
To learn how to tweak some of the settings to get better performance at
the expense of some accuracy, see
http://wiki.apache.org/solr/SolrAdaptersForLuce
I think we'd all love to see those improvements land in Solr.
I was involved in the work at AOL WebMail where the LotsOfCores idea
originated. We had many of the problems that you've had to solve yourself.
I remember that we switched to compound file format to reduce file
descriptors. Also we had
I assume that the LotsOfCores feature doesn't use ZooKeeper.
I tried simulating the cores as collections, but when the size of
clusterstate.json is bigger than 1 MB, -Djute.maxbuffer is needed to increase
the 1 MB limitation.
A naive question: why isn't clusterstate.json split by collection?
Out of Memory Exception is well known as OOM.
Guido.
On 07/10/13 14:11, adfel70 wrote:
Sorry, by "OOE" I meant Out of memory exception...
Okay, I'll try to specify my question a little bit.
I have a denormalized index of two SQL tables, patient and image.
If I add a patient with two images to the Solr index, my index contains 3
documents.
--------------------------------------------------
Pat_ID | Patient_Lastname | Image_ID | Image_Name
--------------------------------------------------
Sorry, by "OOE" I meant Out of memory exception...
I'm trying to index .doc, .docx, and .pdf files.
I'm using this URL:
curl
"http://localhost:8080/solr/document/update/extract?literal.id=12&commit=true";
-F"myfile=@complex.doc"
This is the error I get:
Oct 07, 2013 5:02:18 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException
Thanks for the great writeup! It's always interesting to see how
a feature plays out "in the real world". A couple of questions
though:
bq: We added 2 Cores options :
Do you mean you patched Solr? If so, are you willing to share the code
back? If both are "yes", please open a JIRA, attach the patch
Hi,
I have a question regarding the parsing of tokens in the edismax parser, and
subsequently a follow-up question related to the same.
- Each field has a list of analyzers and tokenizers as configured in
schema.xml (index and query time). Now, say I search for the query "red shoes".
So, is it like that
Query time is the time spent in Solr getting the search
results. It does NOT include reading the bits off disk
to assemble the response etc.
elapsed time is the time from when the query was sent
to the time it gets back. It includes qtime, reading the bits
off disk to assemble the response, trans
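In SolrJ terms, both numbers are exposed on QueryResponse (a minimal sketch;
the endpoint URL is hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TimingSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");
        QueryResponse rsp = server.query(new SolrQuery("*:*"));

        // Server-side search time only, in milliseconds.
        int qtime = rsp.getQTime();
        // Full round trip as seen by the client, including assembling and
        // transferring the response, in milliseconds.
        long elapsed = rsp.getElapsedTime();
        System.out.println("QTime=" + qtime + "ms, elapsed=" + elapsed + "ms");
    }
}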
bq: Is the NRTCachingDirectoryFactory relevant for both types of commit, or
just for hard commit
Don't know the code deeply, but NRT==Near Real Time == Soft commit I'd guess.
bq: If soft commit does not flush...
soft commit flushes the transaction log. On restart if the content of
the tlog isn
That's what the "autowarm" number for filterCache is about. It
re-executes the last N fq clauses and caches them. Similarly
for some of the other autowarm.
But don't go wild here. Measure _then_ fix. Usually autowarming
just a few (< 32) is sufficient. And remember that autowarming
is done whenev
Wait, are you saying you have fields like
2013-12-01T00:00:00Z_entryDate? So
you have some wildcard definition in your
schema like
*_entryDate type="tdate"?
If so, I think your model is just wrong and you should
have some field(s) that you store dates in.
That aside, and assuming you have wildcard
Just skimmed, but the usual reason you can't max out the server
is that the client can't go fast enough. Very quick experiment:
comment out the server.add line in your client and run it again,
does that speed up the client substantially? If not, then the time
is being spent on the client.
Or split
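A sketch of that experiment (the indexing loop is hypothetical; the point is
just to time the loop with the add disabled):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ClientBottleneckSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            SolrInputDocument doc = buildDocument(i); // your existing code
            // server.add(doc);  // comment this out and re-run: if the loop
            //                   // is barely faster, the client is the bottleneck
        }
        System.out.println("took " + (System.currentTimeMillis() - start) + "ms");
    }

    // Stand-in for however your client builds documents.
    private static SolrInputDocument buildDocument(int i) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", i);
        return doc;
    }
}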
I am using Solr 4.4 with SolrCloud on a Windows machine.
Somehow I am not able to share the schema between multiple cores.
My solr.xml file looks like:
<solr>
  <str name="shareSchema">${shareSchema:true}</str>
  <solrcloud>
    <str name="hostContext">${hostContext:SolrEngine}</str>
    <int name="hostPort">${tomcat.port:8080}</int>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
  </solrcloud>
</solr>
I have used a core.properties file for each core. On
No, the queryResultCache contains the top N for the query, _including_
the filters.
The idea is that you should be able to get the next page of results
without going
to any searching code. You couldn't do this in the scenario you describe.
If your filters are truly unique, you'll gain a little
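For example (a SolrJ sketch; whether page two is actually served from the
cache also depends on queryResultWindowSize in solrconfig.xml):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class PagingCacheSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("title:solr");
        q.addFilterQuery("type:article"); // filters are part of the cache key
        q.setRows(10);

        q.setStart(0);
        server.query(q);  // populates the queryResultCache entry

        q.setStart(10);
        server.query(q);  // same query+filters+sort: page 2 can come from cache
    }
}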
Hello,
In my company, we use Solr in production to offer full text search on
mailboxes.
We host dozens of millions of mailboxes, but only webmail users have this
feature (a few million).
We have the following use case:
- non-static indexes with more updates (indexing and deleting) than
select requests
On 10/07/2013 12:55 PM, Furkan KAMACI wrote:
One more thing, could you say which version of Solr you are using?
The stacktrace comes from 4.2.1, but I suspect that this could occur on
4.4 as well. I've not been able to reproduce this consistently: it has
happened twice (!) after indexing
One more thing, could you say which version of Solr you are using?
2013/10/7 Bram Van Dam
> On 10/07/2013 11:51 AM, Furkan KAMACI wrote:
>
> >> Could you send your error logs?
>>
>
> Whoops, forgot to paste:
>
>
> Caused by: org.apache.solr.client.solrj.SolrServerException:
> IOException oc
On 10/07/2013 11:51 AM, Furkan KAMACI wrote:
Could you send your error logs?
Whoops, forgot to paste:
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
occured when talking to server at: http://localhost:8080/solr/fooIndex
at
org.apache.solr.client.solrj.impl.H
Hi Bram;
Could you send your error logs?
2013/10/7 Bram Van Dam
> Hi folks,
>
> Long story short: I'm occasionally getting exceptions under heavy load
> (SocketException: Connection reset). I would expect HttpSolrServer to try
> again maxRetries-times, but it doesn't.
>
> For reasons I don't en
Hi folks,
Long story short: I'm occasionally getting exceptions under heavy load
(SocketException: Connection reset). I would expect HttpSolrServer to
try again maxRetries-times, but it doesn't.
For reasons I don't entirely understand, the call to
httpClient.execute(method) is not inside the
If the replica has 20G, the recovery will most probably take more than 120
seconds.
In my case I have SSDs and 120 seconds is not enough.
--
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
On Monday, October 7, 2013 at 9:19 AM, Shalin Shekhar Mangar wrote:
> I think what is h
The QueryResponse object in SolrJ has two different methods for the time
required for a given query: one is for QTime (query time) and the other one
is for elapsedTime. What are the differences between them, and what exactly
is elapsedTime for?
I think what is happening here is that the sub shard replicas are taking
time to recover. We use a core admin command to wait for the replicas to
become active before the shard states are switched. The timeout value for
that command is just 120 seconds. We should wait for more than that. I'll
open