Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 07/28/2013 03:02 PM, Marco Stornelli wrote:
> On 28/07/2013 12:05, Vladimir Davydov wrote:
>> On 07/27/2013 09:37 PM, Marco Stornelli wrote:
>>> On 27/07/2013 19:35, Vladimir Davydov wrote:
>>>> On 07/27/2013 07:41 PM, Marco Stornelli wrote:
>>>>> On 26/07/2013 14:29, Vladimir Davydov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We want to propose a way to upgrade a kernel on a machine
>>>>>> without restarting all the user-space services. This is to be
>>>>>> done with the CRIU project, but we need help from the kernel to
>>>>>> preserve some data in memory while doing kexec.
>>>>>>
>>>>>> The key point of our implementation is leaving process memory
>>>>>> in place during reboot. This should eliminate most I/O
>>>>>> operations the services would produce during initialization. To
>>>>>> achieve this, we have implemented a pseudo file system that
>>>>>> preserves its content during kexec. We propose saving CRIU dump
>>>>>> files to this file system, kexec'ing and then restoring the
>>>>>> processes in the newly booted kernel.
>>>>>
>>>>> http://pramfs.sourceforge.net/
>>>>
>>>> AFAIU it's a slightly different thing: PRAMFS, as well as pstore,
>>>> which has already been merged, requires hardware support for
>>>> over-reboot persistency, so-called non-volatile RAM, i.e. RAM
>>>> which is not directly accessible and so is not used by the kernel.
>>>> On the contrary, what we'd like to have is preserving usual RAM on
>>>> kexec. It is possible, because RAM is not reset during kexec. This
>>>> would allow leaving the applications' working set as well as
>>>> filesystem caches in place, speeding up the reboot process as a
>>>> whole and reducing the downtime significantly.
>>>>
>>>> Thanks.
>>>
>>> Actually not. You can use normal system RAM reserved at boot with
>>> the mem parameter without any kernel change. Until a hard reset
>>> happens, that area will be "persistent".
>>
>> Thank you, we'll look at PRAMFS more closely, but right now, after
>> trying it, I have a couple of concerns I'd appreciate if you could
>> clarify:
>>
>> 1) As you advised, I tried to reserve a range of memory (passing
>> memmap=4G$4G at boot) and mounted PRAMFS using the following options:
>>
>>     # mount -t pramfs -o physaddr=0x1,init=4G,bs=4096 none /mnt/pramfs
>>
>> And it turned out that PRAMFS is very slow compared to ramfs:
>>
>>     # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024]
>>     102400+0 records in
>>     102400+0 records out
>>     419430400 bytes (419 MB) copied, 9.23498 s, 45.4 MB/s
>>     # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024] conv=notrunc
>>     102400+0 records in
>>     102400+0 records out
>>     419430400 bytes (419 MB) copied, 3.04692 s, 138 MB/s
>>
>> We need it to be as fast as usual RAM, because otherwise the benefit
>> of it over an HDD disappears. So before diving into the code, I'd
>> like to ask you if it's intrinsic to PRAMFS, or can it be fixed? Or,
>> perhaps, I used wrong mount/boot/config options (btw, I enabled only
>> CONFIG_PRAMFS)?
>
> On x86 you most likely have write protection enabled. Turn it off or
> mount it with the noprotect option.

I tried. This helps, but the write rate is still too low:

with write protect:

    # mount -t pramfs -o physaddr=0x1,init=4G,bs=4096 none /mnt/pramfs
    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024]
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 17.6007 s, 23.8 MB/s
    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024] conv=notrunc
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 4.32923 s, 96.9 MB/s

w/o write protect:

    # mount -t pramfs -o physaddr=0x1,init=4G,bs=4096,noprotect none /mnt/pramfs
    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024]
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 9.07748 s, 46.2 MB/s
    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024] conv=notrunc
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 3.04596 s, 138 MB/s

I also tried turning off CONFIG_PRAMFS_WRITE_PROTECT; the result is the
same: the rate does not exceed 150 MB/s, which is too slow compared to
ramfs:

    # mount -t ramfs none /mnt/ramfs
    # dd if=/dev/zero of=/mnt/ramfs/dummy bs=4096 count=$[100*1024]
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 0.200809 s, 2.1 GB/s

>> 2) To enable saving application dump files in memory using PRAMFS,
>> one should reserve half of RAM for it. That's too expensive. With
>> ramfs, by contrast, once the SPLICE_F_MOVE flag is implemented, one
>> could move anonymous memory pages to the ramfs page cache and move
>> them back after kexec, so that almost no extra memory would be
>> required. Of course, SPLICE_F_MOVE is yet to be implemented, but
>> with PRAMFS significant memory costs are inevitable... or am I wrong?
>>
>> Thanks.
>
> From this point of view you are right. Pramfs (or any similar
> solution) is outside the page cache, so you can't do any memory
> transfer. It's like having a disk, but it's actually a separate piece
Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 28/07/2013 12:05, Vladimir Davydov wrote:
> On 07/27/2013 09:37 PM, Marco Stornelli wrote:
>> On 27/07/2013 19:35, Vladimir Davydov wrote:
>>> On 07/27/2013 07:41 PM, Marco Stornelli wrote:
>>>> On 26/07/2013 14:29, Vladimir Davydov wrote:
>>>>> Hi,
>>>>>
>>>>> We want to propose a way to upgrade a kernel on a machine without
>>>>> restarting all the user-space services. This is to be done with
>>>>> the CRIU project, but we need help from the kernel to preserve
>>>>> some data in memory while doing kexec.
>>>>>
>>>>> The key point of our implementation is leaving process memory in
>>>>> place during reboot. This should eliminate most I/O operations
>>>>> the services would produce during initialization. To achieve
>>>>> this, we have implemented a pseudo file system that preserves its
>>>>> content during kexec. We propose saving CRIU dump files to this
>>>>> file system, kexec'ing and then restoring the processes in the
>>>>> newly booted kernel.
>>>>
>>>> http://pramfs.sourceforge.net/
>>>
>>> AFAIU it's a slightly different thing: PRAMFS, as well as pstore,
>>> which has already been merged, requires hardware support for
>>> over-reboot persistency, so-called non-volatile RAM, i.e. RAM which
>>> is not directly accessible and so is not used by the kernel. On the
>>> contrary, what we'd like to have is preserving usual RAM on kexec.
>>> It is possible, because RAM is not reset during kexec. This would
>>> allow leaving the applications' working set as well as filesystem
>>> caches in place, speeding up the reboot process as a whole and
>>> reducing the downtime significantly.
>>>
>>> Thanks.
>>
>> Actually not. You can use normal system RAM reserved at boot with
>> the mem parameter without any kernel change. Until a hard reset
>> happens, that area will be "persistent".
>
> Thank you, we'll look at PRAMFS more closely, but right now, after
> trying it, I have a couple of concerns I'd appreciate if you could
> clarify:
>
> 1) As you advised, I tried to reserve a range of memory (passing
> memmap=4G$4G at boot) and mounted PRAMFS using the following options:
>
>     # mount -t pramfs -o physaddr=0x1,init=4G,bs=4096 none /mnt/pramfs
>
> And it turned out that PRAMFS is very slow compared to ramfs:
>
>     # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024]
>     102400+0 records in
>     102400+0 records out
>     419430400 bytes (419 MB) copied, 9.23498 s, 45.4 MB/s
>     # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024] conv=notrunc
>     102400+0 records in
>     102400+0 records out
>     419430400 bytes (419 MB) copied, 3.04692 s, 138 MB/s
>
> We need it to be as fast as usual RAM, because otherwise the benefit
> of it over an HDD disappears. So before diving into the code, I'd like
> to ask you if it's intrinsic to PRAMFS, or can it be fixed? Or,
> perhaps, I used wrong mount/boot/config options (btw, I enabled only
> CONFIG_PRAMFS)?

On x86 you most likely have write protection enabled. Turn it off or
mount it with the noprotect option.

> 2) To enable saving application dump files in memory using PRAMFS,
> one should reserve half of RAM for it. That's too expensive. With
> ramfs, by contrast, once the SPLICE_F_MOVE flag is implemented, one
> could move anonymous memory pages to the ramfs page cache and move
> them back after kexec, so that almost no extra memory would be
> required. Of course, SPLICE_F_MOVE is yet to be implemented, but with
> PRAMFS significant memory costs are inevitable... or am I wrong?
>
> Thanks.

From this point of view you are right. Pramfs (or any similar solution)
is outside the page cache, so you can't do any memory transfer. It's
like having a disk, but it's actually a separate piece of RAM. We could
talk about it again once this kind of implementation is done.

Marco
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 07/27/2013 09:37 PM, Marco Stornelli wrote:
> On 27/07/2013 19:35, Vladimir Davydov wrote:
>> On 07/27/2013 07:41 PM, Marco Stornelli wrote:
>>> On 26/07/2013 14:29, Vladimir Davydov wrote:
>>>> Hi,
>>>>
>>>> We want to propose a way to upgrade a kernel on a machine without
>>>> restarting all the user-space services. This is to be done with
>>>> the CRIU project, but we need help from the kernel to preserve
>>>> some data in memory while doing kexec.
>>>>
>>>> The key point of our implementation is leaving process memory in
>>>> place during reboot. This should eliminate most I/O operations the
>>>> services would produce during initialization. To achieve this, we
>>>> have implemented a pseudo file system that preserves its content
>>>> during kexec. We propose saving CRIU dump files to this file
>>>> system, kexec'ing and then restoring the processes in the newly
>>>> booted kernel.
>>>
>>> http://pramfs.sourceforge.net/
>>
>> AFAIU it's a slightly different thing: PRAMFS, as well as pstore,
>> which has already been merged, requires hardware support for
>> over-reboot persistency, so-called non-volatile RAM, i.e. RAM which
>> is not directly accessible and so is not used by the kernel. On the
>> contrary, what we'd like to have is preserving usual RAM on kexec.
>> It is possible, because RAM is not reset during kexec. This would
>> allow leaving the applications' working set as well as filesystem
>> caches in place, speeding up the reboot process as a whole and
>> reducing the downtime significantly.
>>
>> Thanks.
>
> Actually not. You can use normal system RAM reserved at boot with the
> mem parameter without any kernel change. Until a hard reset happens,
> that area will be "persistent".

Thank you, we'll look at PRAMFS more closely, but right now, after
trying it, I have a couple of concerns I'd appreciate if you could
clarify:

1) As you advised, I tried to reserve a range of memory (passing
memmap=4G$4G at boot) and mounted PRAMFS using the following options:

    # mount -t pramfs -o physaddr=0x1,init=4G,bs=4096 none /mnt/pramfs

And it turned out that PRAMFS is very slow compared to ramfs:

    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024]
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 9.23498 s, 45.4 MB/s
    # dd if=/dev/zero of=/mnt/pramfs/dummy bs=4096 count=$[100*1024] conv=notrunc
    102400+0 records in
    102400+0 records out
    419430400 bytes (419 MB) copied, 3.04692 s, 138 MB/s

We need it to be as fast as usual RAM, because otherwise the benefit of
it over an HDD disappears. So before diving into the code, I'd like to
ask you if it's intrinsic to PRAMFS, or can it be fixed? Or, perhaps, I
used wrong mount/boot/config options (btw, I enabled only
CONFIG_PRAMFS)?

2) To enable saving application dump files in memory using PRAMFS, one
should reserve half of RAM for it. That's too expensive. With ramfs, by
contrast, once the SPLICE_F_MOVE flag is implemented, one could move
anonymous memory pages to the ramfs page cache and move them back after
kexec, so that almost no extra memory would be required. Of course,
SPLICE_F_MOVE is yet to be implemented, but with PRAMFS significant
memory costs are inevitable... or am I wrong?

Thanks.
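The rates dd prints in the message above follow directly from the byte count and the elapsed time. A minimal sketch of that arithmetic in C, assuming dd's decimal megabytes (1 MB = 10^6 bytes); the function names are illustrative and not part of any tool:

```c
#include <assert.h>
#include <stdio.h>

/* Throughput in decimal megabytes per second (1 MB = 10^6 bytes). */
double throughput_mb_per_s(unsigned long long bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / seconds / 1e6;
}

/* Recompute the two pramfs rates from the timings quoted above. */
void print_pramfs_rates(void)
{
    /* bs=4096, count=100*1024 blocks */
    const unsigned long long bytes = 4096ULL * 100 * 1024;

    printf("first write: %.1f MB/s\n", throughput_mb_per_s(bytes, 9.23498));
    printf("rewrite:     %.1f MB/s\n", throughput_mb_per_s(bytes, 3.04692));
}
```

The same function applied to any other timing in the thread gives the corresponding rate, which makes it easy to compare runs by a single number.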
Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 27/07/2013 19:35, Vladimir Davydov wrote:
> On 07/27/2013 07:41 PM, Marco Stornelli wrote:
>> On 26/07/2013 14:29, Vladimir Davydov wrote:
>>> Hi,
>>>
>>> We want to propose a way to upgrade a kernel on a machine without
>>> restarting all the user-space services. This is to be done with the
>>> CRIU project, but we need help from the kernel to preserve some
>>> data in memory while doing kexec.
>>>
>>> The key point of our implementation is leaving process memory in
>>> place during reboot. This should eliminate most I/O operations the
>>> services would produce during initialization. To achieve this, we
>>> have implemented a pseudo file system that preserves its content
>>> during kexec. We propose saving CRIU dump files to this file
>>> system, kexec'ing and then restoring the processes in the newly
>>> booted kernel.
>>
>> http://pramfs.sourceforge.net/
>
> AFAIU it's a slightly different thing: PRAMFS, as well as pstore,
> which has already been merged, requires hardware support for
> over-reboot persistency, so-called non-volatile RAM, i.e. RAM which
> is not directly accessible and so is not used by the kernel. On the
> contrary, what we'd like to have is preserving usual RAM on kexec. It
> is possible, because RAM is not reset during kexec. This would allow
> leaving the applications' working set as well as filesystem caches in
> place, speeding up the reboot process as a whole and reducing the
> downtime significantly.
>
> Thanks.

Actually not. You can use normal system RAM reserved at boot with the
mem parameter without any kernel change. Until a hard reset happens,
that area will be "persistent".

Regards,
Marco
Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
> On 26/07/2013 14:29, Vladimir Davydov wrote:
>> Hi,
>>
>> We want to propose a way to upgrade a kernel on a machine without
>> restarting all the user-space services. This is to be done with the
>> CRIU project, but we need help from the kernel to preserve some data
>> in memory while doing kexec.
>>
>> The key point of our implementation is leaving process memory in
>> place during reboot. This should eliminate most I/O operations the
>> services would produce during initialization. To achieve this, we
>> have implemented a pseudo file system that preserves its content
>> during kexec. We propose saving CRIU dump files to this file system,
>> kexec'ing and then restoring the processes in the newly booted
>> kernel.
>
> http://pramfs.sourceforge.net/

AFAIU it's a slightly different thing: PRAMFS, as well as pstore, which
has already been merged, requires hardware support for over-reboot
persistency, so-called non-volatile RAM, i.e. RAM which is not directly
accessible and so is not used by the kernel. On the contrary, what we'd
like to have is preserving usual RAM on kexec. It is possible, because
RAM is not reset during kexec. This would allow leaving the
applications' working set as well as filesystem caches in place,
speeding up the reboot process as a whole and reducing the downtime
significantly.

Thanks.
Re: [PATCH RFC] pram: persistent over-kexec memory file system
On 26/07/2013 14:29, Vladimir Davydov wrote:
> Hi,
>
> We want to propose a way to upgrade a kernel on a machine without
> restarting all the user-space services. This is to be done with the
> CRIU project, but we need help from the kernel to preserve some data
> in memory while doing kexec.
>
> The key point of our implementation is leaving process memory in
> place during reboot. This should eliminate most I/O operations the
> services would produce during initialization. To achieve this, we
> have implemented a pseudo file system that preserves its content
> during kexec. We propose saving CRIU dump files to this file system,
> kexec'ing and then restoring the processes in the newly booted
> kernel.

http://pramfs.sourceforge.net/

Marco
[PATCH RFC] pram: persistent over-kexec memory file system
Hi,

We want to propose a way to upgrade a kernel on a machine without
restarting all the user-space services. This is to be done with CRIU
project, but we need help from the kernel to preserve some data in
memory while doing kexec.

The key point of our implementation is leaving process memory in-place
during reboot. This should eliminate most I/O operations the services
would produce during initialization. To achieve this, we have
implemented a pseudo file system that preserves its content during
kexec. We propose saving CRIU dump files to this file system, kexec'ing
and then restoring the processes in the newly booted kernel.

A typical usage scenario would look like this:

 1) Boot kernel with 'pram_banned=MEMRANGE' boot option
    (MEMRANGE=MEMMIN-MEMMAX).

    This is to prevent kexec from overwriting persistent data while
    loading the new kernel. Later on kexec will be forced to load kernel
    to the range specified. MEMRANGE=0-128M should be enough.

 2) Mount pram file system and save dump files there:

    # mount -t pram none /mnt
    # criu dump -D /mnt -t $PID

 3) Run kexec passing pram location to the new kernel and forcing it to
    load the kernel image to MEMRANGE:

    # kexec --load /vmlinuz --initrd=initrd.img \
            --append="$(cat /proc/cmdline | sed -e 's/pram=[^ ]*//g') pram=$(cat /sys/kernel/pram)" \
            --mem-min=$MEMMIN --mem-max=$MEMMAX
    # reboot

 4) After reboot mount pram, restore processes, and cleanup:

    # mount -t pram none /mnt
    # criu restore -d -D /mnt
    # rm -f /mnt/*
    # umount /mnt

In this patch I introduce the pram pseudo file system that keeps its
memory in place during kexec. pram is based on ramfs, but it serializes
and leaves in memory its content on unmount, and restores it on the next
mount. To survive over kexec, pram finds the serialized content, whose
location should be specified by 'pram' boot param (exported via
/sys/kernel/pram), and reserves it at early boot.
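
For illustration, the command-line handling from step 3 can be pulled
into a small shell helper. This is only a sketch: build_append is a
hypothetical name (it is not part of the patch or of kexec-tools), and
the two extra sed expressions that tidy up leftover spaces are an
addition to the original one-liner. /sys/kernel/pram exists only on a
kernel carrying this patch.

```shell
#!/bin/sh
# build_append is a hypothetical helper, not part of the patch.
# It builds the --append string for the next kernel: any stale pram=
# token is dropped from the current command line and the fresh pram
# location is added at the end.
#   $1 = current kernel command line
#   $2 = new pram location (as read from /sys/kernel/pram)
build_append() {
    # The first expression is the one used in step 3. The remaining
    # ones only squeeze the spaces it leaves behind and trim both ends.
    stripped=$(printf '%s\n' "$1" | sed -e 's/pram=[^ ]*//g' \
        -e 's/  */ /g' -e 's/^ //' -e 's/ $//')
    printf '%s pram=%s\n' "$stripped" "$2"
}

# Intended use on a pram-enabled kernel (commented out: it reboots):
# kexec --load /vmlinuz --initrd=initrd.img \
#       --append="$(build_append "$(cat /proc/cmdline)" "$(cat /sys/kernel/pram)")" \
#       --mem-min="$MEMMIN" --mem-max="$MEMMAX"
# reboot
```

Note that the pattern 'pram=[^ ]*' does not touch a pram_banned=... token,
since that token contains "pram_" rather than "pram=", so the banned
range from step 1 is carried over to the new kernel unchanged.
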

To avoid conflicts with other parts of the kernel that make early
reservation too, pram tracks all memory regions that have ever been
reserved and avoids using them for storing its data. Plus, it adds
'pram_banned' boot param, which can be used to explicitly disallow pram
to use specified memory regions. This may be useful for avoiding
conflicts with kexec loading the new kernel image (as it is done in the
usage scenario).

This implementation serves as a proof of concept and so has a number of
limitations:

 * pram only supports regular files; directories, symlinks, etc. are
   ignored

 * pram is implemented only for x86

 * pram does not support swapping out

 * pram checksums serialized content and drops it in case it is
   corrupted with no possibility of restore

What do you think about it, does it make sense to go on with this
approach or should we reconsider it as a whole?

Thanks.
---
 arch/x86/kernel/setup.c |    2 +
 fs/Kconfig              |   13 +
 fs/ramfs/Makefile       |    1 +
 fs/ramfs/inode.c        |    2 +-
 fs/ramfs/persistent.c   |  699 +++
 include/linux/pram.h    |   18 ++
 include/linux/ramfs.h   |    1 +
 kernel/ksysfs.c         |   13 +
 mm/bootmem.c            |    5 +
 mm/memblock.c           |    7 +-
 mm/nobootmem.c          |    2 +
 mm/page_alloc.c         |    3 +
 12 files changed, 764 insertions(+), 2 deletions(-)
 create mode 100644 fs/ramfs/persistent.c
 create mode 100644 include/linux/pram.h

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f8ec578..7d22ad0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -50,6 +50,7 @@
 #include <linux/init_ohci1394_dma.h>
 #include <linux/kvm_para.h>
 #include <linux/dma-contiguous.h>
+#include <linux/pram.h>

 #include <linux/errno.h>
 #include <linux/kernel.h>
@@ -1137,6 +1138,7 @@ void __init setup_arch(char **cmdline_p)
 	early_acpi_boot_init();

 	initmem_init();
+	pram_reserve();
 	memblock_find_dma_reserve();

 #ifdef CONFIG_KVM_GUEST
diff --git a/fs/Kconfig b/fs/Kconfig
index c229f82..8d6943f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -168,6 +168,19 @@ config HUGETLB_PAGE

 source "fs/configfs/Kconfig"

+config PRAM
+	bool "Persistent over-kexec memory storage"
+	depends on X86
+	select CRC32
+	select LIBCRC32C
+	select CRYPTO_CRC32C
+	select CRYPTO_CRC32C_INTEL
+	default n
+	help
+	  pram is a filesystem that saves its content on unmount to be restored
+	  on the next mount after kexec. It can be used for speeding up system
+	  reboot by saving application memory images there.
+
 endmenu

 menuconfig MISC_FILESYSTEMS
diff --git a/fs/ramfs/Makefile b/fs/ramfs/Makefile
index c71e65d..e6953d4 100644
--- a/fs/ramfs/Makefile
+++ b/fs/ramfs/Makefile
@@ -7,3 +7,4 @@ obj-y += ramfs.o
 file-mmu-y := file-nommu.o
 file-mmu-$(CONFIG_MMU) := file-mmu.o
 ramfs-objs += inode.o $(file-mmu-y)
+ramfs-$(CONFIG_PRAM) += persistent.o
diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index c24f1e1..86f9f9a 100644
--- a/fs/ramfs/inode.c