Great updates. Thanks for keeping us all in the loop!
On Thu, Oct 22, 2020 at 7:43 PM Wei wrote:
Hi Shawn,
I'm circling back with some new findings on our two-NUMA issue. After a
few iterations, we do see improvement with the UseNUMA flag and other JVM
setting changes. Here are the current settings, with Java 11:
-XX:+UseNUMA
-XX:+UseG1GC
-XX:+AlwaysPreTouch
-XX:+UseTLAB
-XX:G1MaxNewSiz
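For reference, a sketch of how flags like these are typically wired into bin/solr
via the GC_TUNE variable in solr.in.sh (this just mirrors the list above as an
illustration; it is not the poster's exact file):

  GC_TUNE=" \
    -XX:+UseNUMA \
    -XX:+UseG1GC \
    -XX:+AlwaysPreTouch \
    -XX:+UseTLAB"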
On 9/28/2020 12:17 PM, Wei wrote:
Thanks Shawn. Looks like Java 11 is the way to go with -XX:+UseNUMA. Do you
see any backward compatibility issue for Solr 8 with Java 11? Can we run
Solr 8 built with JDK 8 in Java 11 JRE, or need to rebuild solr with Java
11 JDK?
I do not know of any problems
On Sat, Sep 26, 2020 at 6:44 PM Shawn Heisey wrote:
On 9/26/2020 1:39 PM, Wei wrote:
Thanks Shawn! Currently we are still using the CMS collector for Solr with
Java 8. When we last evaluated with Solr 7, CMS performed better than G1 for
our case. When using G1, is it better to upgrade from Java 8 to Java 11?
From https://lucene.apache.org/solr/guide/8_4/solr-system-requirements.html,
Thanks Dominique. I'll start with the -XX:+UseNUMA option.
Best,
Wei
On Fri, Sep 25, 2020 at 7:04 AM Dominique Bejean
wrote:
Hi,
This would be a Java VM option, not something Solr itself can know about.
Take a look at this article and its comments. Maybe it will help.
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?showComment=1347033706559#c229885263664926125
Regards
Dominique
On Thu, Sep 24,
Hi,
Recently we deployed Solr 8.4.1 on a batch of new servers with two NUMA nodes. I
noticed that query latency almost doubled compared to deployment on single-NUMA
machines. Not sure what's causing the huge difference. Is there any
tuning to boost the performance on multiple-NUMA machines? Any pointer i
On 6/1/2020 9:29 AM, Odysci wrote:
Hi,
I'm looking for some advice on improving performance of our solr setup.
Does anyone have any insights on what would be better for maximizing
throughput on multiple searches being done at the same time?
thanks!
In almost all cases, adding memory will pr
Hi,
I'm looking for some advice on improving performance of our solr setup. In
particular, about the trade-offs between applying larger machines, vs more
smaller machines. Our full index has just over 100 million docs, and we do
almost all searches using fq's (with q=*:*) and facets. We are using s
On 4/18/2020 12:20 PM, Odysci wrote:
We don't use this field for general queries (q=*:*), only for fq and
faceting.
Do you think making it indexed="true" would make a difference in fq
performance?
fq means "filter query". It's still a query. So yes, the field should
be indexed. The query you
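As an illustration only (the field and type names are placeholders, not the
poster's schema), a field used for fq and faceting would typically be declared
along these lines:

  <field name="field1_name" type="string" indexed="true" stored="false"
         docValues="true" multiValued="true"/>

indexed="true" is what makes the filter query efficient, and docValues helps the
faceting side.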
On Sat, Apr 18, 2020 at 3:06 PM Sylvain James
wrote:
Hi Reinaldo,
Involved fields should be indexed for better performance ?
Sylvain
On Sat, Apr 18, 2020 at 18:46, Odysci wrote:
Hi,
We are seeing significant performance degradation on single queries that
use fq with multiple values as in:
fq=field1_name:(V1 V2 V3 ...)
If we use only one value in the fq (say only V1) we get QTime = T ms.
As we increase the number of values, say to 5 values, QTime more than
triples, even i
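One thing worth trying (a sketch, not a confirmed fix for this exact case): for a
plain OR list of values on a single field, the terms query parser avoids scoring
and Boolean expansion, e.g.

  fq={!terms f=field1_name}V1,V2,V3

It takes a comma-separated list of values and is often cheaper than a large
Boolean OR filter.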
Hi,
It means that you are either committing too frequently or your warm-up takes
too long. If you are committing on every bulk, stop doing that and use
autoCommit.
Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - ht
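For reference, a minimal solrconfig.xml sketch of the autocommit approach Emir
describes (the interval values are illustrative assumptions; tune them to your
indexing rate and visibility needs):

  <autoCommit>
    <maxTime>60000</maxTime>           <!-- hard commit every 60s for durability -->
    <openSearcher>false</openSearcher> <!-- do not open a new searcher on hard commit -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>30000</maxTime>           <!-- soft commit controls when docs become visible -->
  </autoSoftCommit>

With openSearcher=false on the hard commit, only soft commits open new searchers,
which keeps the number of overlapping warming searchers down.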
Hi All,
I am using Solr 7.5 with a master-slave architecture. I am getting:
"PERFORMANCE WARNING: Overlapping onDeckSearchers=2"
continuously in my master logs for all cores. Please help me to resolve this.
Thanks & Regards,
Akreeti Agarwal
Hi,
Which Solr version are you using?
Also, how many collections do you have, and how many records have you
indexed in those collections?
Regards,
Edwin
On Mon, 4 Feb 2019 at 23:33, Anchal Sharma2 wrote:
Hi All,
We had recently enabled SSL on Solr, but afterwards our application
performance degraded significantly: the time for the source application to
fetch a record from Solr increased from approx. 4 ms to 200 ms (this is for a
single record). This amounts to a lot of time when mu
On 11/21/2018 8:59 AM, Marc Schöchlin wrote:
Is it possible to modify the log4j appender to also log other query attributes
like response/request size in bytes and number of resulted documents?
Changing the log4j config might not do anything useful at all. In order
for such a change to be us
Hello list,
I am using the pretty old Solr 4.7 release (*sigh*) and I am currently
investigating performance problems.
The Solr instance currently runs very expensive queries with huge results, and I
want to find the most promising queries to optimize.
I am currently using the solr logf
Sharding can be one of the options.
But what is the size of your documents? And which Solr version are you
using?
Regards,
Edwin
On Tue, 20 Nov 2018 at 01:40, Balanathagiri Ayyasamypalanivel <
bala.cit...@gmail.com> wrote:
Hi,
We are in the process of live-publishing documents in Solr, and at the same
time we have to maintain search performance.
Total existing docs: 120 million
Expected data for live publishing: 1 million
Every hour we will get 1M docs to publish live to the hot Solr collection;
can you
Srinivas:
Not an answer to your question, but when DIH starts getting this
complicated, I start to seriously think about SolrJ, see:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
In particular, it moves the heavy lifting of acquiring the data from a
Solr node (which I'm assuming also has
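For anyone who hasn't used it, the SolrJ route Erick mentions looks roughly like
this (a minimal sketch; the URL, core and field names are placeholders, and
error handling is omitted):

  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;

  public class SimpleIndexer {
    public static void main(String[] args) throws Exception {
      // one client per Solr core/collection
      HttpSolrClient client = new HttpSolrClient.Builder(
          "http://localhost:8983/solr/mycore").build();
      // build documents from whatever source you like (JDBC, files, ...)
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "1");
      doc.addField("title_s", "hello from SolrJ");
      client.add(doc);
      client.commit();   // or rely on autoCommit / autoSoftCommit
      client.close();
    }
  }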
Hi,
I have implemented 'SortedMapBackedCache' in my SqlEntityProcessor for the
child entities in data-config.xml, and I'm using the same for full-import only.
In the beginning of my implementation I had written a delta-import query to
index the modified changes. But my requirement grew and i
Hi Erick,
As suggested, I did try a non-HDFS Solr Cloud instance and its responses look
much better. On the configuration side too, I am mostly using default
configurations, with block.cache.direct.memory.allocation set to false. On
analysis of the HDFS cache, evictions seem to be on the higher sid
Hi Arun,
It is hard to measure something without affecting it, but we could use debug
results and combine them with QTime without debug: if we ignore merging results,
it seems that the majority of the time is spent retrieving docs (~500 ms). You
should consider reducing the number of rows if you want better r
Hi Emir,
Please find the response without the bq parameter and with debugQuery set to
true. Also it was noted that QTime comes down drastically without the debug
parameter, to about 700-800.
[response header excerpt: QTime 3446; query: ("hybrid electric powerplant"
"hybrid electric powerplants" "Electric" "Electrical" "Electricity" ...]
Hi Erick,
QTime comes down with rows set to 1. Also it was noted that QTime comes down
when the debug parameter is not added to the query; it comes to about 900.
Thanks,
Arun
On Tue, 2017-09-26 at 07:43 -0700, sasarun wrote:
> Allocated heap size for young generation is about 8 gb and old
> generation is about 24 gb. And gc analysis showed peak
> size utilisation is really low compared to these values.
That does not come as a surprise. Your collections would normally b
Hi Arun,
This is not the simplest query either - a dozen phrase queries on several
fields plus the same query as bq. Can you provide the debugQuery info?
I did not look much into debug times and what includes what, but one thing that
is strange to me is that QTime is 4s while the query in debug is 1.3
Well, 15 second responses are not what I'd expect either. But two
things (just looked again)
1> note that the time to assemble the debug information is a large
majority of your total time (14 of 15.3 seconds).
2> you're specifying 600 rows which is quite a lot as each one
requires that a 16K block
Hi Erick,
Thank you for the quick response. Query time is relatively fast once the data is
read from memory, but personally I always felt the response time could be far
better. As suggested, we will try to set up a non-HDFS environment and
update with the results.
Thanks,
Arun
Does the query time _stay_ low? Once the data is read from HDFS it
should pretty much stay in memory. So my question is whether, once
Solr warms up you see this kind of query response time.
Have you tried this on a non HDFS system? That would be useful to help
figure out where to look.
And given
Hi All,
I have been using Solr for some time now, but mostly in standalone mode. My
current project is using Solr 6.5.1 hosted on Hadoop. My solrconfig.xml
has the following configuration. In the prod environment the query performance
seems really slow. Can anyone help me with a few poin
Impossible to answer as Shawn says. Or even recommend. For instance,
you say "but once we launch our application all across the world it
may give performance issues."
You haven't defined at all what changes when you "launch our
application all across the world". Increasing your query traffic 10
fo
Thanks, Shawn.
As of now we don't have any performance issues; we are just planning for
the future.
So I was looking for a general architecture that is agreed upon by many
Solr experts.
Thanks,
Venkat.
On Thu, May 11, 2017 at 8:19 PM, Shawn Heisey wrote:
> On 5/11/2017 7:39 AM, Venka
Hello Guys,
In the current design we have the below configuration:
*One collection with one shard, replication factor 4, on 4 nodes.*
As of now it is working fine, but once we launch our application all across
the world it may give performance issues.
To improve the performance, below is our thoug
Already have a Jira issue for next week. I have a script to run prod logs
against a cluster. I’ll be testing a four shard by two replica cluster with 17
million docs and very long queries. We are working on getting the 95th
percentile under one second, so we should exercise the timeAllowed featu
+Walter test it
Jeff,
How much CPU does the EC2 hypervisor use? I have heard 5% but that is for a
normal workload, and is mostly consumed during system calls or context changes.
So it is quite understandable that frequent time calls would take a bigger bite
in the AWS cloud compared to bare met
It’s presumably not a small degradation - this guy very recently suggested it’s
77% slower:
https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/
The other reason that blog post is interesting to me is that his benchmark
utility showed the work of entering the kernel
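For readers wondering what kind of Linux/EC2 tweak is involved here, one commonly
cited item (an illustrative sketch, not necessarily the exact change Jeff made) is
moving off the Xen clocksource so gettimeofday/System.nanoTime calls stop trapping
into the hypervisor:

  # check the current clocksource
  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  # switch to tsc for the running system (add clocksource=tsc to the kernel
  # boot parameters to make it persistent)
  echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource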
I remember seeing some performance impact (even when not using it) and it
was attributed to the calls to System.nanoTime. See SOLR-7875 and SOLR-7876
(fixed for 5.3 and 5.4). Those two Jiras fix the impact when timeAllowed is
not used, but I don't know if there were more changes to improve the
perf
Hmm, has anyone measured the overhead of timeAllowed? We use it all the time.
If nobody has, I’ll run a benchmark with and without it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 2, 2017, at 9:52 AM, Chris Hostetter wrote:
: I specify a timeout on all queries,
Ah -- ok, yeah -- you mean using "timeAllowed" correct?
If the root issue you were seeing is in fact clocksource related,
then using timeAllowed would probably be a significant compounding
factor there since it would involve a lot of time checks in a s
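For anyone following along, timeAllowed is a per-request parameter in
milliseconds; a sketch (the values are illustrative):

  q=some+query&rows=10&timeAllowed=1000

If the limit is hit, Solr returns whatever it has collected so far and flags
partialResults=true in the response header, so clients can tell the result set
was cut short.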
Yes, that’s the Xenial I tried. Ubuntu 16.04.2 LTS.
I started with the same three-node 15-shard configuration I’d been used to, in
an RF1 cluster. (the index is almost 700G so this takes three r4.8xlarge’s if I
want to be entirely memory-resident) I eventually dropped down to a 1/3rd size
index on a single node (so 5 shards, 100M docs each) so I
Ubuntu 16.04 LTS - Xenial (HVM)
Is this your Xenial version?
Might want to measure the single CPU performance of your EC2 instance. The last
time I checked, my MacBook was twice as fast as the EC2 instance I was using.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On May 1, 2017, at 6:24 PM, Chris Hostetter w
How many total nodes in your cluster? How many of them running ZooKeeper?
Did you observe the hea
I tried a few variations of various things before we found and tried that
linux/EC2 tuning page, including:
- EC2 instance type: r4, c4, and i3
- Ubuntu version: Xenial and Trusty
- EBS vs local storage
- Stock openjdk vs Zulu openjdk (Recent java8 in both cases - I’m aware of
the issues
It's also very important to consider the type of EC2 instance you are
using...
We settled on the R4.2XL... The R series is labeled "High-Memory"
Which instance type did you end up using?
On Mon, May 1, 2017 at 8:22 AM, Shawn Heisey wrote:
How very strange. I knew virtualization would have overhe
I’d like to think I helped a little with the metrics upgrade that got released
in 6.4, so I was already watching that and I’m aware of the resulting
performance issue.
This was 5.4 though, patched with https://github.com/whitepages/SOLR-4449 - an
index we’ve been running for some time now.
Mgan
We use Solr 6.2 on an EC2 instance with CentOS 6.2 and we don't see any
difference in performance between EC2 and our local environment.
Well, 6.4.0 had a pretty severe performance issue, so if you were using
that release you might see this; 6.4.2 is the most recent 6.4 release.
But I have no clue how changing Linux settings would alter that, and I
sure can't square that issue with you having such different
performance between local a
tldr: Recently, I tried moving an existing solrcloud configuration from a local
datacenter to EC2. Performance was roughly 1/10th what I’d expected, until I
applied a bunch of linux tweaks.
This should’ve been a straight port: one datacenter server -> one EC2 node.
Solr 5.4, Solrcloud, Ubuntu
> Also we will try to decouple Tika from Solr.
+1
-Original Message-
From: tstusr [mailto:ulfrhe...@gmail.com]
Sent: Friday, March 31, 2017 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr performance issue on indexing
Hi, thanks for the feedback.
Yes, it is about OOM, ind
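Decoupling Tika from Solr usually means running text extraction in the client and
sending plain fields to Solr, instead of posting raw files to the extracting
handler inside the Solr JVM. A rough Java sketch (assumes tika-core/tika-parsers
and SolrJ on the classpath; all names are placeholders):

  import java.io.File;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.tika.Tika;

  public class ExtractAndIndex {
    public static void main(String[] args) throws Exception {
      Tika tika = new Tika();                            // Tika facade: detect + parse
      String text = tika.parseToString(new File("some.pdf"));

      HttpSolrClient client = new HttpSolrClient.Builder(
          "http://localhost:8983/solr/mycore").build();
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "some.pdf");
      doc.addField("content_txt", text);                 // extracted body text
      client.add(doc);
      client.commit();
      client.close();
    }
  }

This way an out-of-memory error in Tika only takes down the indexing client, not
the Solr node.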
> By the way, will making it available with SolrCloud improve performance, or
> will there be no perceptible improvement?
> ... with 4GB of JVM memory and ~50GB of physical memory (reported through the
> dashboard); we are using a single instance.
> I don't think it is normal behaviour that the handler crashes. So, what are
> some general tips about improving performance for this scenario?
Thanks Erick
Regards,
Prateek Jain
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 21 November 2016 04:32 PM
To: solr-user
Subject: Re: solr | performance warning
_when_ are you seeing this? I see this on startup upon occasion, and I _think_
there's a JIRA about startup opening more than one searcher on startup.
If it _is_ on startup, you can simply ignore it.
If it's after the system is up and running, then you're probably committing too
frequently. "Too f
Hi All,
I am observing the following error in the logs; any clues about this?
2016-11-06T23:15:53.066069+00:00@solr@@ org.apache.solr.core.SolrCore:1650 -
[my_custom_core] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
A quick web search suggests that it could be a case of too-frequent commits. I
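For reference, the threshold behind this warning is a solrconfig.xml setting
(shown here with what is, to my knowledge, the stock default):

  <maxWarmingSearchers>2</maxWarmingSearchers>

Raising it only hides the symptom; the real fix is committing less often, as
described above, and keeping openSearcher=false on hard commits (see the
autoCommit/autoSoftCommit sketch earlier in this digest).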
reads will perform better.
Thanks,
Avner
-Original Message-
From: Ilan Schwarts [mailto:ila...@gmail.com]
Sent: Wednesday, March 09, 2016 9:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Disable hyper-threading for better Solr performance?
What is the solr version and shard config? Standalone? Multiple cores?
Spread over RAID ?
On Mar 9, 2016 9:00 AM, "Avner Levy" wrote:
I have a machine with 16 real cores (32 with HT enabled).
I'm running a Solr server on it and trying to reach maximum performance for
indexing and queries (indexing 20k documents/sec with a number of threads).
I've read in multiple places that in some scenarios/products disabling the
hyper-thread
Hi all,
I have a problem with my Solr performance and hardware usage (RAM, CPU, ...).
I have a lot of documents indexed (about 1000 docs in Solr), and every doc
has about 8 fields on average, and each field has about 60 chars.
I set my fields as stored="false" except o
Thanks for your recommendation Toke.
Will try to ask in the carrot forum.
Regards,
Edwin
On 26 August 2015 at 18:45, Toke Eskildsen wrote:
> On Wed, 2015-08-26 at 15:47 +0800, Zheng Lin Edwin Yeo wrote:
On Wed, 2015-08-26 at 15:47 +0800, Zheng Lin Edwin Yeo wrote:
> Now I've tried to increase the carrot.fragSize to 75 and
> carrot.summarySnippets to 2, and set the carrot.produceSummary to
> true. With this setting, I'm mostly able to get the cluster results
> back within 2 to 3 seconds when I set
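For context, these are knobs of Solr's clustering (Carrot2) component; a request
using the values Edwin mentions might look roughly like this (the field names and
the rest of the URL are illustrative assumptions):

  .../select?q=*:*&rows=100&clustering=true&clustering.results=true
      &carrot.title=title&carrot.snippet=content
      &carrot.produceSummary=true&carrot.fragSize=75&carrot.summarySnippets=2

carrot.produceSummary makes the component cluster highlighter-style fragments of
the snippet field instead of whole field values, which is usually the main lever
on clustering time.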
Hi Toke,
Thank you for the link.
I'm using Solr 5.2.1, but I think the bundled Carrot2 will be a slightly older
version, as I'm using the latest carrot2-workbench-3.10.3, which was only
released recently. I've changed all the settings like fragSize and
desiredClusterCountBase to be the same on both si
On Wed, 2015-08-26 at 10:10 +0800, Zheng Lin Edwin Yeo wrote:
Hi Toke,
Thank you for your reply.
I'm currently trying out the Carrot2 Workbench and getting it to call Solr
to see how they did the clustering. Although it still takes some time to do
the clustering, the results of the clusters are much better than mine. I
think it's probably due to the differe
On Tue, 2015-08-25 at 10:40 +0800, Zheng Lin Edwin Yeo wrote:
Thank you Upayavira for your reply.
Would like to confirm: when I set rows=100, does it mean that it only builds
the cluster based on the first 100 records returned by the search, and if I
have 1000 records that match the search, all the remaining 900
records will not be considered for c
I honestly suspect your performance issue is down to the number of terms
you are passing into the clustering algorithm, not to memory usage as
such. If you have *huge* documents and cluster across them, performance
will be slower, by definition.
Clustering is usually done offline, for example on a
Hi Alexandre,
I've tried to use just index=true, and the speed is still the same, not
any faster. If I set store=false, no results come back from
the clustering. Is this because the contents are not stored, and the clustering
requires indexed fields that are also stored?
I've also increased my
Yes, I'm using store=true.
However, this field needs to be stored, as my program requires this field to
be returned during normal searching. I tried lazyLoading=true, but it's
not working.
Would you do a copyField for the content, and not set stored="true" for
that field, so that field wil
We use 8 GB to 10 GB for indexes of that size all the time.
Bill Bell
Sent from mobile
And be aware that I'm sure the more terms in your documents, the slower
clustering will be. So it isn't just the number of docs, the size of
them counts in this instance.
A simple test would be to build an index with just the first 1000 terms
of your clustering fields, and see if that makes a diff
You're confusing clustering with searching. Sure, Solr can index and search
lots of data, but clustering is essentially finding ad-hoc
similarities between arbitrary documents. It must take each of
the documents in the result size you specify from your result
set and try to find commonalities.
For perf i
Are you by any chance doing store=true on the fields you want to search?
If so, you may want to switch to just index=true. Of course, they will
then not come back in the results, but do you really want to sling
huge content fields around.
The other option is to do lazyLoading=true and not request
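For reference, lazy field loading is a solrconfig.xml switch (in the <query>
section), and the complementary trick is to restrict returned fields with fl so
large stored fields are never materialized unless explicitly requested. A sketch,
with placeholder field names:

  <enableLazyFieldLoading>true</enableLazyFieldLoading>

  and on the request side: q=*:*&fl=id,title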
Hi Shawn and Toke,
I only have 520 docs in my data, but each of the documents is quite big; in
Solr it is using 221MB. So when I set it to read from the top
1000 rows, it should just be reading all the 520 docs that are indexed?
Regards,
Edwin
On 23 August 2015 at 22:52, Shawn Heisey
Probably not, but I know nothing about your data. How many So
Zheng Lin Edwin Yeo wrote:
> However, I find that clustering is exceedingly slow after I index this 1GB of
> data. It took almost 30 seconds to return the cluster results when I set it
> to cluster the top 1000 records, and it still takes more than 3 seconds when I
> set it to cluster the top 100 record
Hi Shawn,
Yes, I've increased the heap size to 4GB already, and I'm using a machine
with 32GB RAM.
Is it recommended to further increase the heap size to like 8GB or 16GB?
Regards,
Edwin
On 23 Aug 2015 10:23, "Shawn Heisey" wrote:
On 8/22/2015 7:31 PM, Zheng Lin Edwin Yeo wrote:
> I'm using Solr 5.2.1, and I've indexed about 1GB of data into Solr.
>
> However, I find that clustering is exceeding slow after I index this 1GB of
> data. It took almost 30 seconds to return the cluster results when I set it
> to cluster the top