Re: ways to check if document is in a huge search result set

2017-09-13 Thread Derek Poh

I see. Thank you.

On 9/13/2017 2:36 PM, Michael Kuhlmann wrote:

On 13.09.2017 at 04:04, Derek Poh wrote:

Hi Michael

"Then continue using binary search depending on the returned score
values."

May I know what you mean by using binary search?

An example implementation is the Java method java.util.Arrays.binarySearch.

For more detail: https://en.wikipedia.org/wiki/Binary_search_algorithm

Best,
Michael





--
CONFIDENTIALITY NOTICE 

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part. 


This e-mail and any reply to it may be monitored for security, legal, 
regulatory compliance and/or other appropriate reasons.

Re: ways to check if document is in a huge search result set

2017-09-13 Thread Michael Kuhlmann
On 13.09.2017 at 04:04, Derek Poh wrote:
> Hi Michael
>
> "Then continue using binary search depending on the returned score
> values."
>
> May I know what you mean by using binary search?

An example implementation is the Java method java.util.Arrays.binarySearch.

For more detail: https://en.wikipedia.org/wiki/Binary_search_algorithm
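
For readers who have not met the algorithm, here is a minimal sketch of a plain binary search over an ascending list. It is written in Python only for brevity; the function name and the -1 "not found" convention are choices made for this illustration and do not mirror the exact return values of the Java method mentioned above.

```python
def binary_search(sorted_values, target):
    """Return an index of target in an ascending list, or -1 if it is absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


print(binary_search([1, 3, 5, 7, 9, 11], 7))  # prints 3
```

Each step halves the remaining range, so a list of 100,000 entries needs at most about 17 comparisons.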

Best,
Michael



Re: ways to check if document is in a huge search result set

2017-09-12 Thread Derek Poh

Hi Michael

"Then continue using binary search depending on the returned score values."

May I know what you mean by using binary search?

On 9/12/2017 3:08 PM, Michael Kuhlmann wrote:

So you're looking for a solution to validate the result output.

You have two ways:
1. Assuming you're sorting by the default "score" sort option:
Find the result you're looking for by setting the fq filter clause
accordingly, and add "score" to the fl field list.
Then do the normal unfiltered search, still including "score", and start
with page, let's say, 50,000.
Then continue using binary search depending on the returned score values.

2. Set fl to return only the supplier id, then you'll probably be able
to return several tens of thousands of results at once.


But be warned, the result position of these elements can vary with every
single commit, especially when there are lots of documents with the same
score value.

-Michael


On 12.09.2017 at 03:21, Derek Poh wrote:

Some additional information.

I have a query from a user that a supplier's products are not in the
search result.
I debugged by adding an fq on the supplier id to the query to verify
that the supplier's products are in the search result. The products do
exist in the search result.
I want to tell the user on which page of the search result the
supplier's products appear. To do this I go through each page of the
search result to find the supplier's products.
That is still fine if the search result has a few hundred products, but
it is a chore if the result has thousands. In this case there
are more than 100,000 products in the result.

Any advice on easier ways to check on which page the supplier's product
or document appears in a search result?

On 9/11/2017 2:44 PM, Mikhail Khludnev wrote:

You can request a facet field, a query facet, a filter, or even explainOther.

On Mon, Sep 11, 2017 at 5:12 AM, Derek Poh wrote:


Hi

I have a collection of product documents.
Each product document has supplier information in it.

I need to check whether a supplier's products are returned in a search
result containing over 100,000 products, and on which page (assuming
pagination is 20 products per page).
It is time-consuming and "labour-intensive" to go through each page
to look for the products of the supplier.

Would like to know if you guys have any better and easier ways to do
this?

Derek


Re: ways to check if document is in a huge search result set

2017-09-12 Thread Michael Kuhlmann
So you're looking for a solution to validate the result output.

You have two ways:
1. Assuming you're sorting by the default "score" sort option:
Find the result you're looking for by setting the fq filter clause
accordingly, and add "score" to the fl field list.
Then do the normal unfiltered search, still including "score", and start
with page, let's say, 50,000.
Then continue using binary search depending on the returned score values.

2. Set fl to return only the supplier id, then you'll probably be able
to return several tens of thousands of results at once.
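
Both options can be sketched roughly as follows. This is only an illustration under stated assumptions: score_at is a hypothetical callback standing in for one Solr request per probe (for example start=offset, rows=1, fl=score), "page 50,000" is read here as a result offset rather than a page number, and the in-memory lists simulate what Solr would return. The page size of 20 comes from the original question.

```python
ROWS_PER_PAGE = 20  # page size mentioned in the thread


def find_offset_by_score(score_at, total, target_score):
    """Option 1: binary search over offsets of a score-descending result list.

    score_at(offset) is a hypothetical callback returning the score of the
    result at that 0-based offset. Returns the first offset whose score is
    less than or equal to target_score (or total if there is none).
    """
    low, high = 0, total
    while low < high:
        mid = (low + high) // 2
        if score_at(mid) > target_score:
            low = mid + 1   # still above the target score: look further down
        else:
            high = mid      # at or below the target score: look further up
    return low


def page_of(offset, rows_per_page=ROWS_PER_PAGE):
    """Convert a 0-based result offset into a 1-based page number."""
    return offset // rows_per_page + 1


def pages_with_supplier(supplier_ids, supplier, rows_per_page=ROWS_PER_PAGE):
    """Option 2: scan supplier ids in result order and collect page numbers."""
    return sorted({page_of(i, rows_per_page)
                   for i, s in enumerate(supplier_ids) if s == supplier})


# Simulated result list: 100,000 strictly descending scores.
scores = [200000 - i for i in range(100000)]
offset = find_offset_by_score(lambda i: scores[i], len(scores), scores[61234])
print(offset, page_of(offset))  # prints 61234 3062

print(pages_with_supplier(["a", "b", "a", "c", "b"], "b", rows_per_page=2))  # prints [1, 3]
```

With option 1 the target score would first be read from the filtered (fq) query; the search then needs only about 17 probes for 100,000 results.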


But be warned, the result position of these elements can vary with every
single commit, especially when there are lots of documents with the same
score value.
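
To illustrate the caveat about equal scores: when many documents share the target score, a search by score can only narrow the position down to a range of offsets, and therefore a range of pages. A small sketch, assuming the scores are available as an in-memory descending list (in practice they would come from Solr) and using a helper name invented for this example:

```python
from bisect import bisect_left, bisect_right


def tie_page_range(scores_desc, target_score, rows_per_page=20):
    """Return (first_page, last_page) of the run of results with target_score.

    scores_desc must be sorted in descending order. Returns None if no
    result has exactly target_score.
    """
    negated = [-s for s in scores_desc]  # ascending order, as bisect requires
    first = bisect_left(negated, -target_score)
    last = bisect_right(negated, -target_score) - 1
    if first > last:
        return None
    return first // rows_per_page + 1, last // rows_per_page + 1


scores = [9.0] * 30 + [5.0] * 50 + [1.0] * 20
print(tie_page_range(scores, 5.0))  # prints (2, 4)
```

Within such a range the order of documents is not fixed by the score alone, so the exact page can still shift between commits.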

-Michael


On 12.09.2017 at 03:21, Derek Poh wrote:
> Some additional information.
>
> I have a query from a user that a supplier's products are not in the
> search result.
> I debugged by adding an fq on the supplier id to the query to verify
> that the supplier's products are in the search result. The products do
> exist in the search result.
> I want to tell the user on which page of the search result the
> supplier's products appear. To do this I go through each page of the
> search result to find the supplier's products.
> That is still fine if the search result has a few hundred products, but
> it is a chore if the result has thousands. In this case there
> are more than 100,000 products in the result.
>
> Any advice on easier ways to check on which page the supplier's product
> or document appears in a search result?
>
> On 9/11/2017 2:44 PM, Mikhail Khludnev wrote:
>> You can request a facet field, a query facet, a filter, or even explainOther.
>>
>> On Mon, Sep 11, 2017 at 5:12 AM, Derek Poh wrote:
>>
>>> Hi
>>>
>>> I have a collection of product documents.
>>> Each product document has supplier information in it.
>>>
>>> I need to check whether a supplier's products are returned in a search
>>> result containing over 100,000 products, and on which page (assuming
>>> pagination is 20 products per page).
>>> It is time-consuming and "labour-intensive" to go through each page
>>> to look for the products of the supplier.
>>>
>>> Would like to know if you guys have any better and easier ways to do
>>> this?
>>>
>>> Derek




Re: ways to check if document is in a huge search result set

2017-09-11 Thread Derek Poh

Some additional information.

I have a query from a user that a supplier's products are not in the
search result.
I debugged by adding an fq on the supplier id to the query to verify
that the supplier's products are in the search result. The products do
exist in the search result.
I want to tell the user on which page of the search result the
supplier's products appear. To do this I go through each page of the
search result to find the supplier's products.
That is still fine if the search result has a few hundred products, but
it is a chore if the result has thousands. In this case there are
more than 100,000 products in the result.


Any advice on easier ways to check on which page the supplier's product
or document appears in a search result?


On 9/11/2017 2:44 PM, Mikhail Khludnev wrote:

You can request a facet field, a query facet, a filter, or even explainOther.

On Mon, Sep 11, 2017 at 5:12 AM, Derek Poh  wrote:


Hi

I have a collection of product documents.
Each product document has supplier information in it.

I need to check whether a supplier's products are returned in a search
result containing over 100,000 products, and on which page (assuming
pagination is 20 products per page).
It is time-consuming and "labour-intensive" to go through each page to look
for the products of the supplier.

Would like to know if you guys have any better and easier ways to do this?

Derek


Re: ways to check if document is in a huge search result set

2017-09-11 Thread Mikhail Khludnev
You can request a facet field, a query facet, a filter, or even explainOther.
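
As a rough illustration of what such a request might look like, the sketch below only builds a query string; it does not contact a server. The parameter names (facet, facet.field, facet.query, debugQuery, explainOther) are standard Solr request parameters, but the field name supplier_id, the example query and the id value are assumptions made up for this example.

```python
from urllib.parse import urlencode


def debug_params(user_query, supplier_field, supplier_id):
    """Build Solr request parameters for checking one supplier in a result set.

    supplier_field and supplier_id are placeholders for the real schema.
    """
    supplier_clause = f"{supplier_field}:{supplier_id}"
    return {
        "q": user_query,
        "rows": 0,                      # only counts and debug output are needed
        "facet": "true",
        "facet.field": supplier_field,  # hit count per supplier within the result
        "facet.query": supplier_clause, # hit count for this one supplier
        "debugQuery": "true",
        "explainOther": supplier_clause,  # score explanation for the supplier's documents
    }


params = debug_params("led lamp", "supplier_id", "12345")
print(urlencode(params))
```

A non-zero facet count confirms that the supplier's products are part of the result set; it does not by itself give the page on which they appear.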

On Mon, Sep 11, 2017 at 5:12 AM, Derek Poh  wrote:

> Hi
>
> I have a collection of product documents.
> Each product document has supplier information in it.
>
> I need to check whether a supplier's products are returned in a search
> result containing over 100,000 products, and on which page (assuming
> pagination is 20 products per page).
> It is time-consuming and "labour-intensive" to go through each page to look
> for the products of the supplier.
>
> Would like to know if you guys have any better and easier ways to do this?
>
> Derek




-- 
Sincerely yours
Mikhail Khludnev


Re: ways to check if document is in a huge search result set

2017-09-11 Thread Michael Kuhlmann
Maybe I don't understand your problem, but why don't you just filter by
"supplier information"?

-Michael

On 11.09.2017 at 04:12, Derek Poh wrote:
> Hi
>
> I have a collection of product documents.
> Each product document has supplier information in it.
>
> I need to check whether a supplier's products are returned in a search
> result containing over 100,000 products, and on which page (assuming
> pagination is 20 products per page).
> It is time-consuming and "labour-intensive" to go through each page to
> look for the products of the supplier.
>
> Would like to know if you guys have any better and easier ways to do this?
>
> Derek