On Wed, Oct 12, 2016 at 01:19:37PM -0400, Zygo Blaxell wrote:
> On Wed, Oct 12, 2016 at 01:48:58PM +0800, Qu Wenruo wrote:
> > In fact, the _concept_ to solve such RMW behavior is quite simple:
> > 
> > Make sector size equal to stripe length. (Or vice versa if you like)
> > 
> > Although the implementation will be more complex, people like Chandan are
> > already working on sub page size sector size support.
> 
> So...metadata blocks would be 256K on the 5-disk RAID5 example above,
> and any file smaller than 256K would be stored inline?  Ouch.  That would
> also imply the compressed extent size limit (currently 128K) has to become
> much larger.
> 
> I had been thinking that we could inject "plug" extents to fill up
> RAID5 stripes.  This lets us keep the 4K block size for allocations,
> but at commit (or delalloc) time we would fill up any gaps in new RAID
> stripes to prevent them from being modified.  As the real data is deleted
> from the RAID stripes, it would be replaced by "plug" extents to keep any
> new data from being allocated in the stripe.  When the stripe consists
> entirely of "plug" extents, the plug extent would be deleted, allowing
> the stripe to be allocated again.  The "plug" data would be zero for
> the purposes of parity reconstruction, regardless of what's on the disk.
> Balance would just throw the plug extents away (no need to relocate them).

Your idea sounds good, but there's one problem: most real users don't
balance.  Ever.  Contrary to the tribal wisdom here, this actually works
fine, unless you had a pathological load skewed to either data or metadata
on the first write and then filled the disk to near-capacity with a load
skewed the other way.

Most usage patterns produce a mix of transient and persistent data (and at
write time you don't know which file is which), meaning that with time every
stripe will contain a smidge of cold data plus a fill of plug extents.

Thus, while the plug extents idea doesn't suffer from the problems of big
sectors you just mentioned, we'd need some kind of auto-balance.
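
To put a rough number on the cold data problem, here is a back-of-envelope
sketch in Python.  It is not btrfs code and every name in it is made up for
illustration.  It assumes a 64K per-disk stripe element (which is what gives
256K of data per stripe on 5 disks), 4K blocks, and that each block
independently turns out to be cold with some fixed probability, which real
workloads won't match exactly.

```python
# Back-of-envelope model, not btrfs code: every name here is hypothetical.

STRIP_LEN = 64 * 1024   # per-disk stripe element; assumed to be 64K
BLOCK_SIZE = 4 * 1024   # allocation block size discussed in the thread


def full_stripe_bytes(num_disks: int, strip_len: int = STRIP_LEN) -> int:
    """Data bytes in one RAID5 full stripe (one element per stripe is parity)."""
    return (num_disks - 1) * strip_len


def reusable_fraction(p_cold: float, blocks_per_stripe: int) -> float:
    """Chance that a stripe holds no cold block, so it can be freed whole.

    Assumes each block is independently cold with probability p_cold.
    """
    return (1.0 - p_cold) ** blocks_per_stripe


if __name__ == "__main__":
    stripe = full_stripe_bytes(5)
    blocks = stripe // BLOCK_SIZE
    print(f"full stripe: {stripe // 1024}K of data, {blocks} blocks")
    for p in (0.001, 0.01, 0.05):
        frac = reusable_fraction(p, blocks)
        print(f"cold blocks: {p:.1%}  stripes that can be reused: {frac:.1%}")
```

Under these assumptions even a small share of cold blocks pins a large
share of stripes, since a single live block is enough to keep the whole
stripe from being reallocated.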

-- 
A MAP07 (Dead Simple) raspberry tincture recipe: 0.5l 95% alcohol, 1kg
raspberries, 0.4kg sugar; put into a big jar for 1 month.  Filter out and
throw away the fruits (can dump them into a cake, etc), let the drink age
at least 3-6 months.