I am just wondering how to increase the size of the crawled index and segments. It seems that we need to crawl a larger data set again. Is this right?
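If re-crawling is indeed the way, and assuming the benchmark's crawler is Nutch (I am guessing here, since the setup is not spelled out in this thread), I suppose it would mean running the crawl again with a larger seed list and a deeper crawl, roughly along these lines (the seed directory, output directory, depth, and topN values below are just placeholders, not the settings from the paper):

    bin/nutch crawl urls -dir crawl -depth 4 -topN 100000   # larger depth/topN should yield a larger index and segments

Is that the intended procedure, or is there a way to grow an existing index without starting the crawl over?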
In addition, I would like to reproduce the experimental results that appeared in the paper, *Clearing the Clouds*. The paper used a 2GB index and 23GB of data segments crawled from the public web. Could you explain which public sites you crawled? Next, I have a question about configuring clients. How many clients were used in the experiments, and which terms_en.out file was used?

- Jeongseob

2013-06-09 16:16 GMT+09:00 Hailong Yang <[email protected]>:

> Hi Zacharias,
>
> Have you tried increasing the size of your crawled index and segments?
> For example, the Clearing the Clouds paper says they used a 2GB index and
> 23GB of segments.
>
> Best
>
> Hailong
>
>
> On Fri, May 31, 2013 at 10:24 PM, zhadji01 <[email protected]> wrote:
>
>> Hi,
>>
>> I have a web-search benchmark setup with 4 machines: 1 client, 1
>> front-end, 1 search server, and 1 segment server for fetching the summaries.
>>
>> All machines are two-socket Xeon E5620 @ 2.4GHz with 32GB RAM, and they
>> are connected with 1Gb Ethernet. My crawled data is a 400MB index and
>> 4GB of segments.
>>
>> My problem is that the servers' CPU utilization is very low. The maximum
>> throughput I managed to get using the Faban client or Apache Bench was
>> ~400-450 queries/sec, with user CPU utilizations of: front-end ~5%,
>> search server ~10%, segment server ~35-39%.
>>
>> I'm sure the network is not the bottleneck, because I'm not even close
>> to filling the bandwidth.
>>
>> Can you give any suggestions on how to utilize the servers well, or any
>> thoughts on what the problem might be?
>>
>> Thanks,
>> Zacharias Hadjilambrou
>>
>
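P.S. On the low utilization Zacharias reported above: one quick sanity check I would try is driving the front end with many more concurrent connections, to rule out the load generator itself as the bottleneck. For example, with Apache Bench (the front-end host, port, and query string below are placeholders for whatever your deployment actually exposes):

    ab -n 100000 -c 256 -k "http://frontend:8080/search?query=test"

If throughput and server CPU utilization keep rising as -c (the concurrency level) is increased, then the limit was on the client side rather than on the search or segment servers.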
