Hi, just a general question as I was unable to find any old posts relating
to stats/percentile/facets performance/cache settings.
I have been using Solr since version 4.0 and am now on the latest version, 5.2.1.
What I have done:
- Increased heap memory to 30 GB
- Experimented with the cache settings
-
Troy Edwards tedwards415...@gmail.com wrote:
1) There are about 6000 clients
2) The number of documents from each client are about 50 (average
document size is about 400 bytes)
So roughly 3 billion documents / 1TB index size. So at least 2 shards, due to
the 2 billion limit in Lucene. If
Thanks for your answers. Currently I have one machine (6 cores, 148 GB RAM, 2.5
TB HDD) and I index around 60 million documents a day; the index size is
around 26GB. I do have customer-ID today and I use it for the queries. I don't
split the customers but I get bad performance.
If I will make
Thanks, I didn't know you could do this, I'll check this out.
On Aug 15, 2015 12:54 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:
From the teaching to fish category of advice (since I don't know the
actual answer).
Did you try Analysis screen in the Admin UI? If you check Verbose
I expect that the number of concurrent customers will be low. Today I have 1
machine so I don't have the capacity for all the data. Because of that I am
thinking about a new cluster solution. Today it is 1 billion each day for 90 days =
90 billion (around 45TB data).
I should prefer a lot of machines
yura last y_ura_2...@yahoo.com.INVALID wrote:
I expect that the amount of concurrent customers will be low.
Today I have 1 machine so I don't have the capacity for all
the data.
You aim for 90 billion documents in the first go and want to prepare for 10
times that. Your current test setup is
yura last y_ura_2...@yahoo.com.INVALID wrote:
I have one machine (6 cores, 148 GB RAM, 2.5 TB HDD) and I index
around 60 million documents for a day - the index size is around 26GB.
So 1 billion documents would be approximately 500GB.
...and 10 billion/day in 90 days would be 450TB.
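The arithmetic behind these estimates can be reproduced with a rough sketch. It assumes the 26GB index corresponds to one day of 60 million documents and that sizes are decimal gigabytes; the function name and the 500-byte rounded figure are illustrative, not from the original post.

```python
# Back-of-envelope index sizing from the figures quoted above.
# Assumption: 26 GB (decimal) is the index size for one day's 60 million documents.
DOCS_PER_DAY_NOW = 60_000_000
INDEX_BYTES_NOW = 26 * 10**9

# Measured average index footprint per document (about 433 bytes).
bytes_per_doc = INDEX_BYTES_NOW / DOCS_PER_DAY_NOW

def index_size_tb(total_docs, per_doc_bytes):
    """Estimated index size in decimal terabytes for total_docs documents."""
    return total_docs * per_doc_bytes / 10**12

RETENTION_DAYS = 90

# 1 billion documents/day, using the measured per-document size.
measured_tb = index_size_tb(1_000_000_000 * RETENTION_DAYS, bytes_per_doc)

# The same workload with the per-document size rounded up to 500 bytes,
# which is what "1 billion documents is approximately 500GB" implies.
rounded_tb = index_size_tb(1_000_000_000 * RETENTION_DAYS, 500)

# Ten times the daily volume, again with the rounded figure.
tenfold_tb = index_size_tb(10_000_000_000 * RETENTION_DAYS, 500)

print(f"bytes per document: {bytes_per_doc:.0f}")
print(f"90 days at 1 billion/day (measured): {measured_tb:.0f} TB")
print(f"90 days at 1 billion/day (rounded):  {rounded_tb:.0f} TB")
print(f"90 days at 10 billion/day (rounded): {tenfold_tb:.0f} TB")
```

Rounding up to 500 bytes per document leaves some headroom over the measured figure, which is probably sensible for capacity planning at this scale.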
I do
Erick,
After Walter's reply I started thinking along the lines you mentioned and
realized the folly of doing that!
Scott
On 8/15/2015 9:57 PM, Erick Erickson wrote:
Scott:
You better not even let them access Solr directly.
I have exactly the same requirement
On 13-Aug-2015, at 2:12 pm, Kiran Sai Veerubhotla sai.sq...@gmail.com wrote:
Does Solr support joins?
We have a use case where two collections have to be joined, and the join has
to be on the faceted results of the two collections. Is this possible?
Scott Derrick sc...@tnstaafl.net wrote:
Is there a way to get the list of terms that matched in a query response?
Add debug=query to your request:
https://wiki.apache.org/solr/CommonQueryParameters#debug
You might also want to try
http://splainer.io/
- Toke Eskildsen
with a query like
q=mar*
I tried the debugQuery=true but it just said
rawquerystring: mar*,
querystring: mar*,
parsedquery: _text_:mar*,
parsedquery_toString: _text_:mar*,
I already know that!
One document matches Mary;
another matches Mary and martyr.
I will look at splainer.io
Scott
Is there a way to get the list of terms that matched in a query response?
I realize the q parameter is returned, but I'm looking for just the list
of terms and not the operators.
Scott
--
To those leaning on the sustaining infinite, to-day is big with blessings.
Mary Baker Eddy
I have a solr cloud with 3 nodes. I've added password protection following the
steps here:
http://stackoverflow.com/questions/28043957/how-to-set-apache-solr-admin-password
Now only one node is able to load the collections. The others are getting 401
Unauthorized error when loading the
I did a dataimport with 'clean' set to false.
The DIH status upon completion was:
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">6843427</str>
<str name="Total Documents Processed">6843427</str>
str
You almost certainly have a non-unique ID field.
Yes, it is not absolutely unique, but I do not think it is at this 1 to 6 ratio.
Try it with a clean index, and then review the number of deleted documents
(updates are a delete then insert action)
I tried on a new instance - same effect. I do not
I'm using a dataimporthandler
<requestHandler name="/update/html" startup="lazy"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">html-config.xml</str>
</lst>
</requestHandler>
I'm using the xsl attribute on all the entities, but this one is
On 8/16/2015 12:09 PM, Tarala, Magesh wrote:
I have a solr cloud with 3 nodes. I've added password protection following
the steps here:
http://stackoverflow.com/questions/28043957/how-to-set-apache-solr-admin-password
Now only one node is able to load the collections. The others are
This isn't going to be easy. Why do you need to know? Especially
with wildcards this'll be challenging.
For the specific docs that are returned, highlighting will tell you _some_
of them. Why only some? Because usually only the best N snippets are
returned, say 3 (it's configurable). And it's
Is there any chance of this feature (merge the results to create a composite
document) coming out in the next release, 5.3?
On Sun, Aug 16, 2015 at 2:08 PM, Upayavira u...@odoko.co.uk wrote:
You can do what are called pseudo joins, which are equivalent to a
nested query in SQL. You get back data
Thanks Shawn!
We are on 4.10.4. Will consider 5.x upgrade shortly.
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Sunday, August 16, 2015 9:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud Security Question
On 8/16/2015 12:09 PM, Tarala, Magesh
bq: Is there any chance of this feature (merge the results to create a composite
document) coming out in the next release 5.3
In a word no. And there aren't really any long-range plans either that I'm
aware of.
You could also explore streaming aggregation, if the need here is more
batch-oriented.
splainer doesn't return anything beyond what the debug parameter does.
On 8/16/2015 11:39 AM, Toke Eskildsen wrote:
Scott Derrick sc...@tnstaafl.net wrote:
Is there a way to get the list of terms that matched in a query response?
Add debug=query to your request:
I'm searching a collection of documents.
When I build my results page I provide a link to each document. If the
user clicks the link I display the document with all the matched terms
highlighted. I need to supply my highlighter a list of words to highlight
in the doc.
I thought the
You can do what are called pseudo joins, which are equivalent to a
nested query in SQL. You get back data from one core, based upon
criteria in the other. You cannot (yet) merge the results to create a
composite document.
Upayavira
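A pseudo join of the kind described above is usually written with Solr's `{!join}` query parser. The sketch below only builds the request parameters. The collection names (`customers`, `orders`), the field names and the host are hypothetical, and cross-collection joins carry restrictions in SolrCloud that should be checked against the reference guide for the version in use.

```python
from urllib.parse import urlencode

def join_query_params(from_field, to_field, from_index, inner_query, rows=10):
    """Build select parameters for a Solr pseudo join.

    Documents matching inner_query in from_index are collected, and the
    values of from_field are used to select documents in the queried core
    whose to_field matches. Only fields from the queried core are returned.
    """
    q = "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, inner_query)
    return {"q": q, "rows": rows, "wt": "json"}

# Hypothetical example: customers that have at least one open order.
params = join_query_params("customer_id", "id", "orders", "status:open")
url = "http://localhost:8983/solr/customers/select?" + urlencode(params)
print(params["q"])
print(url)
```

As with a nested query in SQL, the result contains only `customers` documents; nothing from `orders` is merged into them.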
On Sun, Aug 16, 2015, at 06:02 PM, Nagasharath wrote:
I exactly
https://issues.apache.org/jira/browse/SOLR-7090
I see this jira open in support of joins which might solve the problem.
On Sun, Aug 16, 2015 at 2:51 PM, Erick Erickson erickerick...@gmail.com
wrote:
bq: Is there any chance of this feature (merge the results to create a
composite
document)
You almost certainly have a non-unique ID field. Some documents are
overwritten during indexing. Try it with a clean index, and then review
the number of deleted documents (updates are a delete then insert
action). Deletes are calculated with maxDocs minus numDocs.
Upayavira
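The check described above can be sketched as a small calculation. The counts in the example are made-up placeholders, not figures from this thread; in practice the two values would be read from the core's statistics in the Admin UI.

```python
def deleted_docs(max_doc, num_docs):
    """Deleted (overwritten or removed) documents still held in the index.

    max_doc counts every document slot in the index, including deleted ones;
    num_docs counts only live, searchable documents.
    """
    if num_docs > max_doc:
        raise ValueError("num_docs cannot exceed max_doc")
    return max_doc - num_docs

# Hypothetical counts for illustration only.
max_doc = 1000
num_docs = 250
print(deleted_docs(max_doc, num_docs))
```

On a clean index, a large result right after a full import suggests that many incoming rows shared an ID and overwrote one another. Segment merges purge deleted documents, so the number can shrink over time.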
On Sun, Aug 16,