> Is this the code for the legacy short circuit reads or the newer version
> that uses named pipes?
The legacy short-circuit reads use domain sockets. They have numerous
perf issues, as documented here:
https://issues.apache.org/jira/browse/HDFS-347

The Mmap APIs are the latest. They are referred to as "zero-copy reads"
and don't suffer from any of the problems associated with the legacy
short-circuit reads. The only thing I find missing is that "unmap"
control of blocks is vested with the hadoop-client...
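For reference, this is roughly what the new read path looks like from
the client side. An untested sketch against the hadoop-2.3.0 APIs; the
file path and read size are made up:

import java.nio.ByteBuffer;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    ElasticByteBufferPool pool = new ElasticByteBufferPool();
    // Made-up path, purely for illustration
    FSDataInputStream in = fs.open(new Path("/blur/tables/t1/shard-0/_0.fdt"));
    try {
      // Returns an mmapped ByteBuffer when the block is local and
      // checksums can be skipped; falls back to a normal copying read
      // otherwise. Returns null at EOF.
      ByteBuffer buf = in.read(pool, 128 * 1024,
          EnumSet.of(ReadOption.SKIP_CHECKSUMS));
      if (buf != null) {
        try {
          // ... consume buf ...
        } finally {
          // The only "unmap" control the client gets: dropping its
          // ref-count on the block. The actual munmap happens later,
          // inside hadoop.
          in.releaseBuffer(buf);
        }
      }
    } finally {
      in.close();
    }
  }
}

Note that releaseBuffer() only decrements the ref-count; once it hits
zero, the block merely becomes eligible for unmapping by the reaper
thread. Also, if I read the code right, the mmap path only kicks in
when short-circuit reads are enabled (dfs.client.read.shortcircuit
plus dfs.domain.socket.path).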
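On the blur side, the control we'd want looks roughly like this: the
original IndexInput and all its clones share one ref-count, and the
mmapped buffer is handed back only when the last of them is closed.
This is just a sketch of the idea, not the current HdfsDirectory code;
the class is made up, it assumes a Lucene 4.x-era IndexInput, and it
pretends the whole file is a single local block served by one
zero-copy read:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.EnumSet;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ElasticByteBufferPool;
import org.apache.lucene.store.IndexInput;

class ZeroCopyIndexInput extends IndexInput {
  private final FSDataInputStream stream;   // owned by the directory
  private final ByteBuffer block;           // the shared mmapped block
  private final AtomicInteger refCount;     // shared by original + clones
  private final ByteBuffer slice;           // per-instance position
  private boolean closed;

  ZeroCopyIndexInput(String name, FSDataInputStream stream,
      ElasticByteBufferPool pool, int blockSize) throws IOException {
    super(name);
    this.stream = stream;
    // One zero-copy read pins the local block's mmap in the hdfs client
    this.block = stream.read(pool, blockSize,
        EnumSet.of(ReadOption.SKIP_CHECKSUMS));
    this.refCount = new AtomicInteger(1);
    this.slice = block.duplicate();
  }

  private ZeroCopyIndexInput(ZeroCopyIndexInput other) {
    super(other.toString());
    this.stream = other.stream;
    this.block = other.block;
    this.refCount = other.refCount;
    refCount.incrementAndGet();             // each clone holds a reference
    this.slice = other.block.duplicate();
    this.slice.position(other.slice.position());
  }

  @Override public byte readByte() { return slice.get(); }
  @Override public void readBytes(byte[] b, int offset, int len) {
    slice.get(b, offset, len);
  }
  @Override public long getFilePointer() { return slice.position(); }
  @Override public void seek(long pos) { slice.position((int) pos); }
  @Override public long length() { return block.remaining(); }
  @Override public IndexInput clone() { return new ZeroCopyIndexInput(this); }

  @Override
  public void close() throws IOException {
    if (closed) {
      return;
    }
    closed = true;
    // Unmap control lives here: only the last close, whether of the
    // original or of a clone, hands the buffer back to hadoop
    if (refCount.decrementAndGet() == 0) {
      stream.releaseBuffer(block);
    }
  }
}

A real implementation would of course have to span block boundaries
and fall back to copying reads for remote blocks.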
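Also, re the HDFS cache feature discussed further down the thread:
pinning a shard's files into datanode memory is only a couple of calls
against the 2.3.0 client. Untested, and the pool/path names are made
up; the datanodes would additionally need dfs.datanode.max.locked.memory
set before they can mlock anything:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CacheShardSketch {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());

    // One-time setup: a pool to account for blur's cached bytes
    dfs.addCachePool(new CachePoolInfo("blur"));

    // Pin one replica of every block under the shard directory
    long id = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPool("blur")
        .setPath(new Path("/blur/tables/t1/shard-0"))
        .setReplication((short) 1)
        .build());

    // Drop the pin when the shard migrates away
    dfs.removeCacheDirective(id);
  }
}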
--
Ravi

On Tue, Jun 2, 2015 at 12:50 AM, Aaron McCurry <[email protected]> wrote:

> On Wed, May 27, 2015 at 7:51 AM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
> > I was thinking on how blur can effectively use Mmap short-circuit
> > reads from hadoop. It's kind of long, but please bear with me...
> >
> > Checked out the hadoop-2.3.0 source. I am summarizing the logic found
> > in the DFSInputStream, ClientMmap & ClientMmapManager source files...
> >
> > 1. A new method read(ByteBufferPool bufferPool, int maxLength,
> > EnumSet<ReadOption> opts) is exposed for short-circuit Mmap reads
> >
> > 2. Local blocks are Mmapped and added to an LRU
> >
> > 3. A ref-count is maintained for every Mmapped block during reads
> >
> > 4. When the ref-count for a block drops to zero, it is unmapped. This
> > happens when the incoming read-offset jumps to a block other than the
> > current block.
> >
> > 5. Unmapping actually happens via a separate reaper thread...
> >
> > Step 4 is problematic, because we don't want hadoop to control
> > "unmapping" blocks. Ideally, blocks should be unmapped when the
> > original IndexInput and all its clones are closed from the blur side…
> >
> > If someone from the hadoop community can tell us whether such a
> > control is possible, I feel we can close any perceived perf-gaps
> > between regular lucene *MmapDirectory* and blur's *HdfsDirectory*
> >
> > It should be very trivial to change HdfsDirectory to use the Mmap
> > read apis..
>
> Is this the code for the legacy short circuit reads or the newer version
> that uses named pipes?
>
> > --
> > Ravi
> >
> > On Wed, May 27, 2015 at 12:55 PM, Ravikumar Govindarajan <
> > [email protected]> wrote:
> >
> > > My guess is
> > >> that SSDs are only going to help when the blocks for the shard are
> > >> local and short circuit reads are enabled.
> > >
> > > Yes, it's a good fit for such a use-case alone…
> > >
> > > I would not recommend disabling the block cache. However you could
> > >> likely lower the size of the cache and reduce the overall memory
> > >> footprint of Blur.
> > >
> > > Fine. Can we also scale down the machine RAM itself? [Ex: Instead
> > > of 128GB RAM, we can opt for a 64GB or 32GB RAM slot]
> > >
> > > One interesting thought would be to
> > >> try using the HDFS cache feature that is present in the most recent
> > >> versions of HDFS. I haven't tried it yet but it would be
> > >> interesting to try.
> > >
> > > I did try reading the HDFS cache code. I think it was written for
> > > the Map-Reduce use-case, where blocks are loaded in memory
> > > [basically "mmap" followed by "mlock" on data-nodes] just before
> > > computation begins and unloaded once it is done.
> > >
> > > On the short-circuit reads, I found that the HDFS client offers two
> > > options for block-reads:
> > > 1. Domain Socket
> > > 2. Mmap
> > >
> > > I think Mmap is superior and should have the same performance as
> > > lucene's MmapDirectory…
> > >
> > > --
> > > Ravi
> > >
> > > On Tue, May 26, 2015 at 8:00 PM, Aaron McCurry <[email protected]>
> > > wrote:
> > >
> > >> On Fri, May 22, 2015 at 3:33 AM, Ravikumar Govindarajan <
> > >> [email protected]> wrote:
> > >>
> > >> > Recently I am trying to consider deploying SSDs on search
> > >> > machines
> > >> >
> > >> > Each machine runs data-nodes + shard-server, and local reads of
> > >> > hadoop are leveraged….
> > >> >
> > >> > SSDs are a great fit for general lucene/solr kinds of setups.
> > >> > But for blur, I need some help…
> > >> >
> > >> > 1. Is it a good idea to consider SSDs, especially when the
> > >> > block-cache is present?
> > >>
> > >> Possibly, I don't have any hard numbers for this type of setup. My
> > >> guess is that SSDs are only going to help when the blocks for the
> > >> shard are local and short circuit reads are enabled.
> > >>
> > >> > 2. Are there any grids running blur on SSDs, and how do they
> > >> > compare to normal HDDs?
> > >>
> > >> I haven't run any at scale yet.
> > >>
> > >> > 3. Can we disable the block-cache on SSDs, especially when
> > >> > local-reads are enabled?
> > >>
> > >> I would not recommend disabling the block cache. However you could
> > >> likely lower the size of the cache and reduce the overall memory
> > >> footprint of Blur.
> > >>
> > >> > 4. Using SSDs, blur/lucene will surely be CPU bound. But I don't
> > >> > know what overheads hadoop local-reads bring to the table…
> > >>
> > >> If you are using short circuit reads I have seen performance of
> > >> local accesses nearing that of native IO. However, if Blur is
> > >> making remote HDFS calls, every call is like a cache miss. One
> > >> interesting thought would be to try using the HDFS cache feature
> > >> that is present in the most recent versions of HDFS. I haven't
> > >> tried it yet but it would be interesting to try.
> > >>
> > >> > Any help is much appreciated, because I cannot find any info on
> > >> > the web about this topic
> > >> >
> > >> > --
> > >> > Ravi
