On 2015-10-05 10:04, Duncan wrote:
On Mon, 5 Oct 2015 13:16:03 +0200
Lionel Bouton <lionel-subscript...@bouton.name> wrote:

To better illustrate my point.

According to Phoronix tests, BTRFS RAID-1 is even faster than md RAID1
most of the time.

http://www.phoronix.com/scan.php?page=article&item=btrfs_raid_mdadm&num=1

The only case where md RAID1 was noticeably faster is sequential reads
with FIO libaio.

So if you base your analysis on Phoronix tests

[...snip...]

Hmm... I think I've begun to see the kernel folks' point about people
quoting Phoronix in support of their arguments when it's really not
apropos at all.  Yes, I do still consider Phoronix reports, in context,
to contain useful information at some level.  However, one really must
be aware of what was actually tested in order to understand what the
results actually mean, and unfortunately it seems most people quoting
them, including here, can't properly do so, and thus end up using them
to support points that simply aren't backed by the evidence in the
Phoronix articles they're citing.

Even aside from the obvious past issues with Phoronix reports, people forget that they are a news organization (regardless of what they claim, they _are_ a news organization), and as such their employees are not paid to verify existing results; they're paid to write attention-grabbing articles. (I'd be willing to bet this story started in response to the people who correctly pointed out that XFS or ext4 on top of mdraid beats the pants off of BTRFS performance-wise, and who then incorrectly assumed that this meant mdraid was better than BTRFS raid.) That, combined with the frequent absence of any actual statistical analysis, really hurts their credibility (at least, for me it does).

The other issue is that so many people tout benchmarks as the pinnacle of testing, when they really aren't. Benchmarks are by definition synthetic workloads, and only the _really_ good ones (of which there aren't many) give you more than a very basic idea of what performance differences to expect with a given workload. On top of that, people just accept results without trying to reproduce them themselves (kernel folks tend to be much better about this than many other people, though).

A truly sane person, looking to determine the best configuration for a given workload, will:

1. Look at a wide variety of sources to decide which configurations are even worth testing. (The author of the linked article obviously didn't do this, or just didn't care; the btrfs defaults are unsuitable for a significant number of cases, including use on top of mdraid.)
2. Using that information, run established benchmarks similar to his use-case to further narrow down the test candidates.
3. Write his own benchmark that simulates, as closely as possible, the actual workload he expects to run, and use that to test the final candidates.
4. Gather a reasonable number of samples with the above-mentioned benchmark, and use _real_ statistical analysis to determine what he should be using (a rough sketch of what I mean follows below).
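To illustrate step 4, here's a minimal sketch in Python (stdlib only); the configuration names and throughput numbers are made up purely for illustration, not anything I've actually measured:

# Rough sketch: compare two candidate configurations from repeated
# benchmark runs.  All sample values below are invented for illustration.
import statistics
from statistics import NormalDist

def summarize(name, samples):
    """Print mean throughput and an approximate 95% confidence interval.

    Uses a normal approximation for brevity; with only a handful of
    samples you would really want a t-distribution (e.g. scipy.stats.t).
    """
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    half_width = NormalDist().inv_cdf(0.975) * stdev / len(samples) ** 0.5
    print(f"{name}: {mean:.1f} +/- {half_width:.1f} MB/s "
          f"(n={len(samples)}, stdev={stdev:.1f})")
    return mean, half_width

# Hypothetical throughput samples (MB/s) from repeated runs of the same
# workload-specific benchmark on two candidate configurations.
btrfs_raid1 = [412.0, 405.3, 418.7, 409.9, 414.2, 407.8, 411.5, 416.0]
ext4_mdraid = [398.1, 421.4, 389.6, 430.2, 402.7, 395.5, 425.9, 399.3]

m1, h1 = summarize("btrfs raid1", btrfs_raid1)
m2, h2 = summarize("ext4 on mdraid", ext4_mdraid)

# If the confidence intervals overlap, the honest conclusion is "no
# measurable difference for this workload", not "X is faster than Y".
if abs(m1 - m2) < h1 + h2:
    print("No significant difference at this sample size.")

The exact statistics don't matter nearly as much as the discipline of taking multiple samples and reporting the spread instead of a single number.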

To put this in further perspective: most people just do step one, assume that other people know what they're talking about, and do no further testing; others stop after step two and then claim their results are infallible.
