Hi Jaegeuk,

19.08.2016, 05:41, "Jaegeuk Kim" <[email protected]>:
>  Hello,
>
>  On Thu, Aug 18, 2016 at 02:04:55PM +0300, Alexander Gordeev wrote:
>
>  ...
>
>>   >>>>>>>   Here is also /sys/kernel/debug/f2fs/status for reference:
>>   >>>>>>>   =====[ partition info(sda). #0 ]=====
>>   >>>>>>>   [SB: 1] [CP: 2] [SIT: 4] [NAT: 118] [SSA: 60] [MAIN: 
>> 29646(OverProv:1529
>>   >>>>>>>  Resv:50)]
>>   >>>>>>>
>>   >>>>>>>   Utilization: 94% (13597314 valid blocks)
>>   >>>>>>>     - Node: 16395 (Inode: 2913, Other: 13482)
>>   >>>>>>>     - Data: 13580919
>>   >>>>>>>
>>   >>>>>>>   Main area: 29646 segs, 14823 secs 14823 zones
>>   >>>>>>>     - COLD data: 3468, 1734, 1734
>>   >>>>>>>     - WARM data: 12954, 6477, 6477
>>   >>>>>>>     - HOT data: 28105, 14052, 14052
>>   >>>>>>>     - Dir dnode: 29204, 14602, 14602
>>   >>>>>>>     - File dnode: 19960, 9980, 9980
>>   >>>>>>>     - Indir nodes: 29623, 14811, 14811
>>   >>>>>>>
>>   >>>>>>>     - Valid: 13615
>>   >>>>>>>     - Dirty: 13309
>>   >>>>>>>     - Prefree: 0
>>   >>>>>>>     - Free: 2722 (763)
>>   >>>>>>>
>>   >>>>>>>   GC calls: 8622 (BG: 4311)
>>   >>>>>>>     - data segments : 8560
>>   >>>>>>>     - node segments : 62
>>   >>>>>>>   Try to move 3552161 blocks
>>   >>>>>>>     - data blocks : 3540278
>>   >>>>>>>     - node blocks : 11883
>>   >>>>>>>
>>   >>>>>>>   Extent Hit Ratio: 49 / 4171
>>   >>>>>>>
>>   >>>>>>>   Balancing F2FS Async:
>>   >>>>>>>     - nodes 6 in 141
>>   >>>>>>>     - dents 0 in dirs: 0
>>   >>>>>>>     - meta 13 in 346
>>   >>>>>>>     - NATs 16983 > 29120
>>   >>>>>>>     - SITs: 17
>>   >>>>>>>     - free_nids: 1861
>>   >>>>>>>
>>   >>>>>>>   Distribution of User Blocks: [ valid | invalid | free ]
>>   >>>>>>>     [-----------------------------------------------|-|--]
>>   >>>>>>>
>>   >>>>>>>   SSR: 1230719 blocks in 14834 segments
>>   >>>>>>>   LFS: 15150190 blocks in 29589 segments
>>   >>>>>>>
>>   >>>>>>>   BDF: 89, avg. vblocks: 949
>>   >>>>>>>
>>   >>>>>>>   Memory: 6754 KB = static: 4763 + cached: 1990
>
>  ...
>
>>   >>  Per my understanding of f2fs internals, it should write these "cold" files
>>   >>  and usual "hot" files to different sections (which should map internally
>>   >>  to different allocation units). So the sections used by "cold" data should
>>   >>  almost never get "dirty", because most of the time all their blocks become
>>   >>  free at the same time. Of course, the files are not exactly 4MB in size,
>>   >>  so the last section of a deleted file will become dirty. If it is moved by
>>   >>  the garbage collector and becomes mixed with fresh "cold" data, then indeed
>>   >>  it might cause some problems, I think. What is your opinion?
>>   >
>>   > If your fs is not fragmented, it may be as you said; otherwise, SSR will
>>   > still try to reuse invalid blocks of segments of other temperatures, and
>>   > then your cold data will be mixed with warm data too.
>>   >
>>   > I guess what you are facing is the latter case:
>>   > SSR: 1230719 blocks in 14834 segments
>>
>>   I guess I need to somehow disable any cleaning or SSR for my archive and
>>   index files, but keep the cleaning for other data and nodes.
>
>  Could you test the mount option "mode=lfs" to disable SSR?
>  (I guess sqlite may suffer from longer latency due to GC, though.)
>
>  It seems the problem is caused by SSR starting to make things worse before
>  the 95% point, as you described below.

Thanks, I'll run a test with a couple of SD cards over the weekend.
So if I understand correctly, GC will not cause the problems described below,
right? I.e. it will not mix the new data with old data from dirty sections?
Longer SQLite latencies should not be a problem, because the database is
written infrequently and is usually only about 200-250KB in size. Maybe
forcing IPU, as suggested by Chao, would help sqlite, no?
However, it looks like setting ipu_policy to 1 has no effect when mode=lfs:
the IPU counter is still zero on my test system.
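For the record, here is roughly what I am testing. The device name and mount
point below are placeholders for my setup, and the exact meaning of the
ipu_policy value may differ between kernel versions:

```shell
# Remount the f2fs volume with SSR disabled (mode=lfs).
umount /mnt/f2fs
mount -t f2fs -o mode=lfs /dev/sda1 /mnt/f2fs

# Try to force in-place updates, as Chao suggested.
echo 1 > /sys/fs/f2fs/sda1/ipu_policy

# Watch the in-place-update counter; with mode=lfs it stays at zero here.
grep -i ipu /sys/kernel/debug/f2fs/status
```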

>>   I think the FS can get fragmented quite easily otherwise. The status above
>>   was captured when the FS already had problems. I think it can become
>>   fragmented this way:
>>   1. The archive is written until utilization reaches 95%. It is written
>>   separately from other data and nodes thanks to the "cold" data feature.
>>   2. After hitting 95%, my program starts to rotate the archive. The rotation
>>   routine checks the free space reported by statfs() once a minute. If it is
>>   below 5% of the total, it deletes several of the oldest records in the
>>   archive.
>>   3. The last deleted record leaves a dirty section. This section holds
>>   several blocks from the record that now becomes the oldest one.
>>   4. This section is merged with fresh "cold" or even warmer data by either
>>   GC or SSR in one or more newly used sections.
>>   5. Then, very soon, the new oldest record is again deleted. And now we have
>>   one or even several dirty sections filled with blocks from a not-so-old
>>   record, which are again merged with other records.
>>   6. All the records get fragmented after one full rotation. The
>>   fragmentation gets worse and worse.
>>
>>   So I think the best thing to do is to keep the sections with "cold" data
>>   completely out of all the cleaning schemes. The archive will clean itself
>>   by rotating. Other data and nodes might still need some cleaning schemes.
>>   Please correct me if I don't get it right.
>>
>>   > Maybe we can try to alter the updating policy from OPU to IPU for your
>>   > case, to avoid the performance regression of SSR and more frequent FG-GC:
>>   >
>>   > echo 1 > /sys/fs/f2fs/"yourdevicename"/ipu_policy
>>
>>   Thanks, I'll try it!

-- 
 Alexander

------------------------------------------------------------------------------
_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel