On Wed, Nov 16, 2016 at 11:24:33PM +0100, Niccolò Belli wrote:
> On Tuesday, 15 November 2016 18:52:01 CET, Zygo Blaxell wrote:
> >Like I said, millions of extents per week...
> >
> >64K is an enormous dedup block size, especially if it comes with a 64K
> >alignment constraint as well.
> >
> >These are the top ten duplicate block sizes from a sample of 95251
> >dedup ops on a medium-sized production server with 4TB of filesystem
> >(about one machine-day of data):
>
> Which software do you use to dedupe your data? I tried duperemove but it
> gets killed by the OOM killer because it triggers some kind of memory leak:
> https://github.com/markfasheh/duperemove/issues/163

Duperemove does use a lot of memory, but the logs at that URL only show
2G of RAM in duperemove, which is not nearly enough to trigger OOM under
normal conditions on an 8G machine. There's another process with 6G of
virtual address space (although much less than that resident) that looks
more interesting (i.e. duperemove might just be the victim of some
interaction between baloo_file and the OOM killer).

On the other hand, the logs also show kernel 4.8. On 4.7.x kernels, 100%
of my test machines were cut down by OOM before they finished booting.
The same problem occurs on early kernels in the 4.8.x series. I am having
good results with 4.8.6 and later, but you should be aware that
significant changes have been made to the way OOM works in these kernel
versions, and maybe you're hitting a regression for your use case.

> Niccolò Belli
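
To illustrate the difference between virtual address space and resident memory mentioned above, here is a minimal Python sketch that reads the `Vm*` fields from the text of `/proc/<pid>/status` on Linux. The function names `parse_proc_status` and `resident_fraction` are made up for this example, and this is only one rough way to inspect a process, not how the logs in the linked issue were produced.

```python
def parse_proc_status(text):
    """Return {field: size_in_kB} for the Vm* lines of a /proc/<pid>/status dump."""
    sizes = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.startswith("Vm"):
            continue  # skip non-memory fields such as Name or Threads
        parts = value.split()
        # Memory fields look like "VmRSS:     524288 kB"
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            sizes[key] = int(parts[0])
    return sizes


def resident_fraction(sizes):
    """Fraction of the virtual address space (VmSize) that is resident (VmRSS)."""
    vm_size = sizes.get("VmSize", 0)
    if vm_size == 0:
        return 0.0
    return sizes.get("VmRSS", 0) / vm_size
```

On a live system the input would come from something like `open("/proc/1234/status").read()`, where 1234 is a placeholder PID. A process with a large `VmSize` but a small `VmRSS` has reserved a lot of address space without actually occupying much RAM.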