[Let's add some more people and linux-mm to the CC list]

On Wed 20-09-17 11:23:50, XaviLi wrote:
> PageONE (Page Object Non-duplicate Engine) is a multithreaded kernel page
> deduplication engine. It is based on a lock-less tree algorithm that we
> currently call the SD (Static and Dynamic) Tree. Normal operations on this
> tree, such as insert/query/delete, are non-blocking. Adding more CPU cores
> boosts speed linearly as far as we have tested. Multithreading not only
> offers the opportunity to work faster; it also allows any CPU to donate
> spare time to the job, which reveals a way to use CPUs more efficiently.
> PPR is from an open source solution named Dynamic VM:
> https://github.com/baibantech/dynamic_vm.git 
> 
> The patch can be found here:
> https://github.com/baibantech/dynamic_vm/tree/master/dynamic_vm_0.5
> 
> One worker thread of PageONE can match the speed of the KSM daemon. Adding
> more CPUs increases speed linearly. Here is a brief test:
> 
> Test environment
> DELL R730
> Intel® Xeon® E5-2650 v4 (2.20 GHz, 12 cores, 24 threads)
> 256GB RAM
> Host OS: Ubuntu server 14.04 Host kernel: 4.4.1
> Qemu: 2.9.0
> Guest OS: Ubuntu server 16.04 Guest kernel: 4.4.76
> 
> We ran 12 VMs together. Each creates 16GB of data in memory. After all data
> is ready, we start the dedup engine and watch how the amount of used memory
> on the host changes.
> 
> KSM:
> Configuration: sleep_millisecs = 0, pages_to_scan = 1000000
> Starting used memory: 216.8GB
> Result: KSM starts merging pages immediately after being turned on. The KSM
> daemon took 100% of one CPU for 13:16 until used memory was reduced to
> 79.0GB.
> 
> PageONE:
> Configuration: merge_period(secs) = 20, work threads = 12
> Starting used memory: 207.3GB
> (The configuration means PageONE scans all of physical memory once every
> 20 seconds. Pages are merged if they have not changed for 2 merge_periods.)
> Result: In the first two periods PageONE only observes and identifies
> unchanged pages. Little CPU was used during this time. When the third
> period begins, all 12 threads start using 100% CPU to do the real merge job.
> 00:58 later, used memory was reduced to 70.5GB.
> 
> We ran the above test using data that is quite easy for the red-black tree
> of KSM: every difference can be detected by comparing the first 8 bytes.
> For comparison, we then ran another test in which each piece of data began
> with a random-length run of zero bytes. The average size of the zero data
> was 128 bytes. The result is shown below:
> 
> KSM:
> Configuration: sleep_millisecs = 0, pages_to_scan = 1000000
> Starting used memory: 216.8GB
> Result: It took 19:49 until used memory was reduced to 78.7GB.
> 
> PageONE:
> Configuration: merge_period(secs) = 20, work threads = 12
> Starting used memory: 210.3GB
> Result: The first 2 periods were the same as above. 1:09 after the merge
> job started, used memory was reduced to 72GB.
> 
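
For anyone who wants to compare the four runs on a common scale, here is a
minimal sketch that turns the figures quoted above into merge rates. It
assumes the "G" figures are GB, that KSM counts as one thread (it used 100%
of one CPU), and that the PageONE times cover only the merge phase, not the
two 20-second observation periods. The names RUNS, seconds and
rate_gb_per_s are made up for this illustration.

```python
# Figures copied from the quoted results; elapsed times are mm:ss.
# Tuple layout: (start GB, end GB, elapsed, threads doing the merge work).
RUNS = {
    "KSM, test 1": (216.8, 79.0, "13:16", 1),
    "PageONE, test 1": (207.3, 70.5, "00:58", 12),
    "KSM, test 2": (216.8, 78.7, "19:49", 1),
    "PageONE, test 2": (210.3, 72.0, "1:09", 12),
}


def seconds(mmss):
    """Convert an mm:ss string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def rate_gb_per_s(start_gb, end_gb, mmss):
    """Memory released per second of merge time."""
    return (start_gb - end_gb) / seconds(mmss)


for name, (start, end, elapsed, threads) in RUNS.items():
    rate = rate_gb_per_s(start, end, elapsed)
    print(
        f"{name}: {start - end:.1f} GB in {seconds(elapsed)} s, "
        f"{rate:.3f} GB/s total, {rate / threads:.3f} GB/s per thread"
    )
```

The per-thread column is the one to compare against the claim that a single
PageONE worker matches the KSM daemon.
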
> PageONE shows little difference in the two tests because SD tree search 
> compare each key bit just once in most cases.
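
To make the difference between the two data sets concrete, here is a rough
sketch of how many bytes a memcmp-style, left-to-right comparison has to
read before it can tell two pages apart. It only models a single comparison
of the kind a comparison-ordered tree repeats at each node it visits; it is
not the SD tree and not KSM code. The page layout (a run of zero bytes, an
8-byte tag, zero padding) and the names bytes_examined and make_page are
assumptions made up for this illustration.

```python
PAGE_SIZE = 4096


def bytes_examined(a, b):
    """Bytes a left-to-right comparison reads before it can decide."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i + 1
    return min(len(a), len(b))


def make_page(zero_prefix, tag):
    """Hypothetical test page: zero_prefix zero bytes, an 8-byte tag, padding."""
    body = bytes(zero_prefix) + tag.to_bytes(8, "big")
    return body + bytes(PAGE_SIZE - len(body))


# First test: pages differ within the first 8 bytes.
easy = bytes_examined(make_page(0, 1), make_page(0, 2))

# Second test: the same difference sits behind 128 zero bytes.
hard = bytes_examined(make_page(128, 1), make_page(128, 2))

print(f"bytes read per comparison: easy={easy}, hard={hard}")
```

With a shared prefix, every comparison on the way down the tree pays for
that prefix again, which is consistent with KSM slowing down in the second
test while PageONE barely changes.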

-- 
Michal Hocko
SUSE Labs
