On Saturday, 27 December 2014, 16:06:13, Robert White wrote:
> >
> >> I also don't know what kind of tool you are using, but it might be
> >> repeatedly trying and failing to fallocate the file as a single
> >> extent or something equally dumb.
> >
> >     Userspace doesn't, as far as I know, get to make that decision. I've
> > just read the fallocate(2) man page, and it says nothing at all about
> > the contiguity of the extent(s) storage allocated by the call.
> 
> Yep, my bad. But as soon as I saw that "fio" was starting two threads, 
> one doing random read/write and another doing sequential read/write, 
> both on the same file, it set off my "not just creating a file" mindset. 
> Given the delayed write into/through the cache normally done by casual 
> file I/O, it seemed likely that fio would be doing something more 
> aggressive (like using O_DIRECT or repeated fdatasync() which could get 
> very tit-for-tat).

Robert, please get to know fio or *ask* before jumping to conclusions.

I used this:

[global]
bs=4k
#ioengine=libaio
#iodepth=4
size=4g
#direct=1
runtime=120
filename=ssd.test.file

#[seq-write]
#rw=write
#stonewall

[rand-write]
rw=randwrite
stonewall


In the first test I still ran seq-write as well, but did you notice the
"stonewall" parameter? It *separates* the two jobs from one another. That is,
fio may start two threads, since I think it prepares all threads in advance,
yet it executed only *one* at a time.

From the fio manpage:

       stonewall , wait_for_previous
              Wait  for  preceding  jobs  in the job file to exit before
              starting this one.  stonewall implies new_group.

(That said, the first stonewall isn't even needed, but I removed the read
jobs from the ssd-test.fio example job file I used for this test and didn't
remember to remove the statement.)
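
To illustrate the manpage wording, here is a minimal Python sketch of how
"stonewall" splits a job file into phases that run one after another. It is
only a model of the documented behaviour, not how fio itself schedules jobs.
The function name phases() is made up for this example, and the sketch
ignores other options such as numjobs or new_group.

```python
import configparser

# The job file from this mail, unchanged (seq-write is commented out).
JOB_FILE = """\
[global]
bs=4k
#ioengine=libaio
#iodepth=4
size=4g
#direct=1
runtime=120
filename=ssd.test.file

#[seq-write]
#rw=write
#stonewall

[rand-write]
rw=randwrite
stonewall
"""


def phases(job_text):
    """Group job names into phases; jobs within one phase run concurrently.

    Hypothetical helper: a job carrying "stonewall" (or its alias
    "wait_for_previous") waits for all preceding jobs, so it opens a new phase.
    """
    # allow_no_value lets bare keys such as "stonewall" parse without "=".
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(job_text)

    result = []
    for name in parser.sections():
        if name == "global":
            continue  # [global] only carries defaults, it is not a job
        job = parser[name]
        if "stonewall" in job or "wait_for_previous" in job or not result:
            result.append([name])
        else:
            result[-1].append(name)
    return result


print(phases(JOB_FILE))
```

With the job file as posted, only rand-write is active, so there is a single
phase with a single job. With the seq-write section uncommented, the sketch
yields two separate phases, i.e. one job at a time.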


Thank you a lot for your input. I learned some things from it, for example
that the trees for the data handling are in the metadata section. And now
it is very clear to me that btrfs fi df does not display any trees, only the
chunk reservation and usage. I think I knew this before, but I somehow thought
that was combined with the trees. It isn't, at least not in place; the trees
are stored in the metadata chunks. I'd still not call these extents though,
because that's a file-based thing as far as I know.

I'll skip theorizing about algorithms here. I prefer to let measurements
speak and try to understand them. The best approach to understanding the ones
I made, I think, is what Hugo suggested: a developer looks at the sysrq-t
outputs. So I personally won't speculate any further about given or not
given algorithmic limitations of BTRFS.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7