Apologies for resurrecting this thread... One problem with Lucene is OS buffer-cache pollution during segment merges, as documented here:
http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html

This problem could occur in Blur when short-circuit reads are enabled...

My take on this: it may be possible to overcome the problem by simply
redirecting merge-read requests to a node other than the local node,
instead of fancy stuff like O_DIRECT, FADVISE etc...

In a mixed setup, this means merge requests need to be diverted to low-end
Rack2 machines {running only data-nodes}, while short-circuit read requests
will continue to be served from high-end Rack1 machines {running both
shard-server and data-nodes}.

Hadoop 2.x provides a cool read API, "seekToNewSource". The API
documentation says "Seek to given position on a node other than the
current node".

From the Blur code, it's enough if we open a new FSDataInputStream for
merge reads and issue a seekToNewSource call. Once merges are done, it can
be closed & discarded...

Please let me know your viewpoints on this...

--
Ravi

On Mon, Mar 9, 2015 at 5:45 PM, Ravikumar Govindarajan <
[email protected]> wrote:

>
> On Sat, Mar 7, 2015 at 11:00 AM, Aaron McCurry <[email protected]> wrote:
>
>>
>> I thought the normal HDFS replica rules were: one local, one remote
>> rack, one same rack.
>>
>
> Yes. One copy is local & the other two copies are on the same remote
> rack.
>
>> How did you land on your current configuration?
>
>
> When I was evaluating disk budget, we were looking at 6 expensive drives
> per machine. It led me to think about what those 6 drives would do & how
> we could reduce the cost. Then I stumbled on this two-rack setup, and now
> we need only 2 such drives...
>
> Apart from reduced disk budget & write overhead on the cluster, it also
> helps with greater availability, as a rack failure would be recoverable...
>
> --
> Ravi
>
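P.S. A minimal sketch of the seekToNewSource idea above, assuming a Hadoop
2.x client on the classpath and a running cluster. The class name, the path
argument, and the buffer size are hypothetical, not actual Blur code; the
only API fact relied on is that FSDataInputStream.seekToNewSource(long)
asks HDFS to serve the given position from a replica on a different node
and returns false when no other source exists:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical illustration class, not part of Blur.
public class MergeReadSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path segmentFile = new Path(args[0]);
        long mergeStartOffset = 0L;

        // Open a dedicated stream for merge reads, separate from the
        // short-circuit streams used to serve queries.
        FSDataInputStream in = fs.open(segmentFile);
        try {
            // Ask HDFS to read this position from a node other than the
            // current one; returns false if no other replica is available
            // (e.g. replication factor 1).
            boolean movedToRemote = in.seekToNewSource(mergeStartOffset);
            if (!movedToRemote) {
                in.seek(mergeStartOffset); // fall back to the local replica
            }

            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                // ... feed bytes to the merge ...
            }
        } finally {
            in.close(); // discard the stream once the merge completes
        }
    }
}
```

The point being that the merge stream never touches the local replica (and
so never pollutes the local OS buffer cache) as long as a remote source
exists, while query-time streams stay on the short-circuit path.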
