Mikhail
It was caused by an endless loop in the page's code that was triggered only under certain conditions.

On 5/11/2016 4:07 PM, Mikhail Khludnev wrote:
On Wed, May 11, 2016 at 10:16 AM, Derek Poh <d...@globalsources.com> wrote:

Hi Erick

Yes, we have identified and fixed the slow page loading.

Derek,
Can you elaborate more? What did you fix?


I was wondering if there are any best practices when it comes to deciding
whether to create a single collection that stores all the information or
multiple sub-collections. I understand now that it depends on the use case.
My apologies for not giving it much thought before asking the questions.
Thank you for your patience.

- Derek


On 5/10/2016 12:10 PM, Erick Erickson wrote:

Not quite sure where you are at with this. It sounds
like your slow loading is fixed and was a coding
issue on your part; that happens to us all.

bq: Is it advisable to have as few queries
to Solr as possible in a page?

Of course it is advisable to have as few Solr queries
executed to display a page as possible. Every one
costs you at least _some_ turnaround time. You can
mitigate this (assuming your Solr server isn't running
flat out) by issuing the subsequent queries in parallel
threads.
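
As a rough illustration of issuing independent queries in parallel threads, here is a minimal Python sketch. The run_query argument is a hypothetical callable that sends one query to Solr and returns the parsed response; fake_run_query below is only a stand-in so the example runs without a server.

```python
from concurrent.futures import ThreadPoolExecutor


def run_queries_in_parallel(run_query, queries, max_workers=4):
    """Run independent queries concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input queries
        return list(pool.map(run_query, queries))


def fake_run_query(params):
    """Hypothetical stand-in for a function that sends one query to Solr."""
    return {"q": params["q"], "docs": []}


results = run_queries_in_parallel(
    fake_run_query,
    [{"q": "category:led"}, {"q": "supplier_id:42"}],
)
```

This only helps when the queries do not depend on each other's results, and when the Solr servers have spare capacity to absorb the concurrent load.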

But it's not really a question to me of advisability, it's a
question of what your application needs to deliver. The
use-case drives all. You can do some tricks like display
partial pages and fill in the rest behind the scenes to
display when your user clicks something and the like.

bq: In my case, by denormalizing, does that mean putting the
product and supplier information into one collection?
The supplier information is stored but not indexed in the collection.

It Depends(tm). If all you want to do is provide supplier
information when people do product searches then stored-only
is fine.

If you want to perform queries like "show me all the products
supplied by supplier X", then you need to index at least
some values too.
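
To make the distinction concrete, here is a small Python sketch of the request parameters for a "products supplied by supplier X" search. The field names (supplier_id, product_name, supplier_name) are assumptions, not taken from the actual schema. The filter on supplier_id only works if that field is indexed, while the fields listed in fl only need to be stored.

```python
from urllib.parse import urlencode


def products_by_supplier_params(supplier_id, rows=10):
    """Build Solr select parameters; all field names here are hypothetical."""
    return {
        "q": "*:*",
        # Filtering on supplier_id requires that field to be indexed
        "fq": f"supplier_id:{supplier_id}",
        # Returned fields only need to be stored
        "fl": "id,product_name,supplier_name",
        "rows": rows,
    }


params = products_by_supplier_params("S123")
query_string = urlencode(params)
```

The sketch assumes plain alphanumeric supplier IDs; values containing query syntax characters would need escaping first.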

Best,
Erick

On Sun, May 8, 2016 at 10:36 PM, Derek Poh <d...@globalsources.com>
wrote:

Hi Erick

In my case, by denormalizing, does that mean putting the product and supplier
information into one collection?
The supplier information is stored but not indexed in the collection.

We have identified that it was a combination of a loop and bad source data
that caused an endless loop under certain scenarios.

Is it advisable to have as few queries to Solr as possible in a page?


On 5/6/2016 11:17 PM, Erick Erickson wrote:

Denormalizing the data is usually the first thing to try. That's
certainly the preferred option if it doesn't bloat the index
unacceptably.

But my real question is what have you done to try to figure out _why_
it's slow? Do you have some loop like
for (each found document)
      extract all the supplier IDs and query Solr for them

? That's a fundamental design decision that will be expensive.
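
For contrast with a per-document loop, here is a minimal Python sketch that collects the supplier IDs from one page of results and builds a single filter query for them. The field names (supplier_id on product documents, id on supplier documents) are assumptions for illustration.

```python
def supplier_filter_query(product_docs, id_field="supplier_id"):
    """Build one fq clause covering every supplier on the result page."""
    # Deduplicate while keeping first-seen order
    unique_ids = list(dict.fromkeys(doc[id_field] for doc in product_docs))
    joined = " OR ".join(unique_ids)
    return f"id:({joined})"


docs = [
    {"id": "p1", "supplier_id": "s1"},
    {"id": "p2", "supplier_id": "s2"},
    {"id": "p3", "supplier_id": "s1"},
]
fq = supplier_filter_query(docs)  # "id:(s1 OR s2)"
```

With this shape, a page costs two Solr requests regardless of how many products it shows, instead of one request per product.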

Have you examined the time each query takes to see if Solr is really
the bottleneck or whether it's "something else"? Mind you, I have no
clue what "something else" is here....
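
One way to check this is to time each call from the client and compare it with the QTime value that Solr reports in the responseHeader of its response. The Python sketch below assumes a hypothetical query function; fake_solr_query is a stand-in that returns a response shaped like Solr's JSON output.

```python
import time


def timed_call(func, *args, **kwargs):
    """Return (result, elapsed wall-clock milliseconds) for one call."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def fake_solr_query(q):
    """Hypothetical stand-in for a real Solr request."""
    time.sleep(0.01)  # simulate network and processing delay
    return {"responseHeader": {"QTime": 3}, "response": {"docs": []}}


response, wall_ms = timed_call(fake_solr_query, "category:led")
qtime_ms = response["responseHeader"]["QTime"]
# Time spent outside Solr's query execution (network, parsing, page code)
overhead_ms = wall_ms - qtime_ms
```

If the wall-clock time is much larger than QTime, the delay is probably outside Solr's query execution, for example in the network, in transferring a large response, or in the page code itself.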

Do you ever return lots of rows (i.e. thousands)?

Solr serves queries very quickly, so I'd concentrate on identifying what
is slow before jumping to a solution....

Best,
Erick

On Wed, May 4, 2016 at 10:28 PM, Derek Poh <d...@globalsources.com>
wrote:

Hi

We have a "product" collection and a "supplier" collection.
The "product" collection contains product information and the "supplier"
collection contains the products' supplier information.
We have a subsidiary page that queries the "product" collection for the
search.
The displayed results include product and supplier information.
This page queries the "product" collection to get the matching product
records.
From this query, a list of the matching products' supplier IDs is extracted
and used in a filter query against the "supplier" collection to get the
necessary supplier information.

This page loads very slowly, and at times it times out as well.
Besides tweaking the page's code, we are also looking at what tweaking can
be done on the Solr side. Reducing the number of queries generated by this
page is one of the options to try.

The main "product" collection is also used by our site's main search page
and other subsidiary pages, so the query load on it is substantial.
It has about 6.5 million documents and an index size of 38-39 GB.
It is set up as 1 shard with 5 replicas. Each replica is on its own server,
for a total of 5 servers.
Other smaller collections with a similar 1-shard, 5-replica setup reside on
these servers as well.

I am thinking of either
1. Indexing the supplier information into the "product" collection.
2. Creating another, similar "product" collection for this page to use. This
collection will have fewer product fields and will include the required
supplier fields. The number of documents in it will be the same as in the
main "product" collection, but the index size will be smaller.

With either of the 2 options we do not need to query the "supplier"
collection, so there is one less query and hopefully that will improve the
performance of this page.
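
As a rough sketch of what either option involves at indexing time, the Python below copies selected supplier fields onto each product document before it is sent to Solr. All field names and the supplier_ prefix are assumptions for illustration, not the actual schema.

```python
def denormalize(products, suppliers_by_id, supplier_fields):
    """Copy selected supplier fields onto each product document."""
    merged = []
    for product in products:
        doc = dict(product)  # copy so the input document is not modified
        supplier = suppliers_by_id.get(product["supplier_id"], {})
        for field in supplier_fields:
            if field in supplier:
                # Prefix avoids clashes with existing product field names
                doc[f"supplier_{field}"] = supplier[field]
        merged.append(doc)
    return merged


products = [
    {"id": "p1", "supplier_id": "s1"},
    {"id": "p2", "supplier_id": "s9"},
]
suppliers_by_id = {"s1": {"name": "Acme", "country": "SG"}}
merged = denormalize(products, suppliers_by_id, ["name", "country"])
```

A trade-off of this approach is that a change to one supplier's details means reindexing every product document that carries a copy of them.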

Which of the 2 options would you advise?
Any other advice or options?

Derek

----------------------
CONFIDENTIALITY NOTICE
This e-mail (including any attachments) may contain confidential and/or
privileged information. If you are not the intended recipient or have
received this e-mail in error, please inform the sender immediately and
delete this e-mail (including any attachments) from your computer, and you
must not use, disclose to anyone else or copy this e-mail (including any
attachments), whether in whole or in part.
This e-mail and any reply to it may be monitored for security, legal,
regulatory compliance and/or other appropriate reasons.

