Re: [Users] flashcache
Let's bump this thread again - has anyone tried dm-cache on OpenVZ/CentOS 6 kernels? It looks like some support is included:

root@mu2:~# fgrep CONFIG_DM_CACHE /boot/config-2.6.32-042stab112.15-el6-openvz
CONFIG_DM_CACHE=m
CONFIG_DM_CACHE_MQ=m
CONFIG_DM_CACHE_CLEANER=m

From the userspace utils point of view, I'm using Debian 8 as the host, so there should be some support, but I'm not sure whether dm-cache itself is considered stable at all.

On Mon, Nov 16, 2015 at 5:21 PM, Nick Knutov wrote:
> I've heard this from a large VPS hosting provider.
>
> Anyway, even our internal projects require more than 100 MB/s at peak and more
> than 100 GB of storage (while only 100 GB are free). So local SSDs are
> cheaper for us than a 10G network and the commercial version of pstorage.
>
> 16.11.2015 15:22, Corrado Fiore wrote:
>
>> Hi Nick,
>>
>> could you elaborate more on the second point? As far as I understood,
>> pstorage is in fact targeted towards clusters with hundreds of containers,
>> so I am a bit curious to understand where you got that information.
>>
>> If there's anyone on the list that has used pstorage in clusters > 7 - 9
>> nodes and wishes to share his or her experience, that's more than welcome.
>>
>> Thanks,
>> Corrado
>>
>> On 16/11/2015, at 4:44 AM, Nick Knutov wrote:
>>
>>> Unfortunately, pstorage has two major disadvantages:
>>>
>>> 1) it's not free
>>> 2) it's not usable for more than 1-4 CTs over a 1 gigabit network in real
>>> world cases (as far as I know)
>>
>> ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
>
> --
> Best Regards,
> Nick Knutov
> http://knutov.com
> ICQ: 272873706
> Voice: +7-904-84-23-130
>
> ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Best regards,
[COOLCOLD-RIPN]

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
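For anyone who wants to experiment with the dm-cache modules listed above, a minimal, untested setup sketch via LVM's cache support follows. It assumes the Debian 8 lvm2 build has cache segment support, and the device, VG and LV names (/dev/sdb, /dev/sdc, vzvg, vz) are placeholders; whether the 2.6.32-042stab backport of dm-cache actually behaves with current lvm2 cache metadata is exactly the open question raised here.

# untested sketch: /dev/sdb = slow disk, /dev/sdc = SSD; all names are placeholders
pvcreate /dev/sdb /dev/sdc
vgcreate vzvg /dev/sdb /dev/sdc
lvcreate -n vz -l 90%FREE vzvg /dev/sdb            # data LV on the HDD
lvcreate -n vz_cache -L 100G vzvg /dev/sdc         # cache data LV on the SSD
lvcreate -n vz_cache_meta -L 1G vzvg /dev/sdc      # cache metadata LV on the SSD
lvconvert --type cache-pool --poolmetadata vzvg/vz_cache_meta vzvg/vz_cache
lvconvert --type cache --cachepool vzvg/vz_cache vzvg/vz
mkfs.ext4 /dev/vzvg/vz && mount /dev/vzvg/vz /vz   # /vz would then hold the CT private areas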
Re: [Users] flashcache
Hello,

A CT's I/O requirements depend on the user application. A 1 Gbit link is enough to supply up to 100 MB/s of I/O in PStorage. Moreover, according to our statistics, HSPs usually see 10-20 MB/s of I/O per node while running tens of containers, because the I/O is almost entirely random. So a 1G network should be enough for the usual scenarios. We have many customers with a 1G storage backend in production.

BTW, with PStorage you get an additional benefit: according to our statistics, 20% of nodes process 80% of the I/O in DCs. When you unite the disks into one cluster you get better I/O balance.

Thanks,
Alexander Kirov
Odin Virtuozzo Storage, PM
Odin

-Original Message-
From: users-boun...@openvz.org [mailto:users-boun...@openvz.org] On Behalf Of Corrado Fiore
Sent: Monday, November 16, 2015 1:22 PM
To: OpenVZ users <users@openvz.org>
Subject: Re: [Users] flashcache

Hi Nick, could you elaborate more on the second point? As far as I understood, pstorage is in fact targeted towards clusters with hundreds of containers, so I am a bit curious to understand where you got that information. If there's anyone on the list that has used pstorage in clusters > 7 - 9 nodes and wishes to share his or her experience, that's more than welcome.

Thanks, Corrado

On 16/11/2015, at 4:44 AM, Nick Knutov wrote:
> Unfortunately, pstorage has two major disadvantages:
>
> 1) it's not free
> 2) it's not usable for more than 1-4 CTs over a 1 gigabit network in real world cases (as far as I know)

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
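For a rough sanity check of the figures above (my own back-of-the-envelope arithmetic, not Odin's numbers): 1 Gbit/s is 125 MB/s of raw bandwidth, and after Ethernet/IP/TCP framing overhead roughly 110-117 MB/s of payload remains, so "up to 100 MB/s of storage I/O per 1G link" is about the practical ceiling for a single link. Against the quoted 10-20 MB/s of typical per-node load, that still leaves a comfortable margin.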
Re: [Users] flashcache
I've heard this from a large VPS hosting provider.

Anyway, even our internal projects require more than 100 MB/s at peak and more than 100 GB of storage (while only 100 GB are free). So local SSDs are cheaper for us than a 10G network and the commercial version of pstorage.

16.11.2015 15:22, Corrado Fiore wrote:

Hi Nick, could you elaborate more on the second point? As far as I understood, pstorage is in fact targeted towards clusters with hundreds of containers, so I am a bit curious to understand where you got that information. If there's anyone on the list that has used pstorage in clusters > 7 - 9 nodes and wishes to share his or her experience, that's more than welcome. Thanks, Corrado

On 16/11/2015, at 4:44 AM, Nick Knutov wrote:

Unfortunately, pstorage has two major disadvantages: 1) it's not free 2) it's not usable for more than 1-4 CTs over a 1 gigabit network in real world cases (as far as I know)

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Hi guys,

I'm not sure about flashcache 3.x (whether anybody used it and whether it was ever compilable against OpenVZ kernels), but for flashcache 2.x I know for sure that it compiled fine several months ago => if something is broken now it is most probably some simple issue.

So I suggest: 1) file issues at bugs.openvz.org, 2) try to fix it. :) At least for flashcache 2.x it should not be a big deal. Anyway, once 1) is done, there may be someone who can check it / try to get it working. With no issues in Jira the chances are much lower.

Hope that helps.

--
Best regards,
Konstantin Khorenko, Virtuozzo Linux Kernel Team

On 11/13/2015 03:37 PM, Nick Knutov wrote:

No. Even 2.x flashcache cannot be compiled against recent OpenVZ RHEL6 kernels.

13.11.2015 15:57, CoolCold wrote:

Bumping up - anyone still on flashcache & openvz kernels? Tried to compile flashcache 3.1.3 dkms against 2.6.32-042stab112.15, getting errors:

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Hi Nick,

could you elaborate more on the second point? As far as I understood, pstorage is in fact targeted towards clusters with hundreds of containers, so I am a bit curious to understand where you got that information.

If there's anyone on the list that has used pstorage in clusters > 7 - 9 nodes and wishes to share his or her experience, that's more than welcome.

Thanks,
Corrado

On 16/11/2015, at 4:44 AM, Nick Knutov wrote:
> Unfortunately, pstorage has two major disadvantages:
>
> 1) it's not free
> 2) it's not usable for more than 1-4 CTs over a 1 gigabit network in real world cases (as far as I know)

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Unfortunately, pstorage has two major disadvantages:

1) it's not free
2) it's not usable for more than 1-4 CTs over a 1 gigabit network in real world cases (as far as I know)

14.11.2015 16:12, Corrado Fiore wrote:

You might want to use Odin Cloud Storage (pstorage) instead, as it goes beyond SSD acceleration, i.e. it is distributed and it offers file system corruption prevention (background scrubbing).

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Hi,

even if FlashCache compiled correctly, I would suggest not using it, as the performance will most likely be sub-optimal (at least in my experience).

You might want to use Odin Cloud Storage (pstorage) instead, as it goes beyond SSD acceleration, i.e. it is distributed and it offers file system corruption prevention (background scrubbing).

Another alternative would be to use Btier (www.lessfs.com). It's been extremely stable and very fast in our experience.

Best,
Corrado Fiore

On 13/11/2015, at 8:37 PM, Nick Knutov wrote:
>
> No. Even 2.x flashcache cannot be compiled against recent OpenVZ RHEL6 kernels.
>
> 13.11.2015 15:57, CoolCold wrote:
>> Bumping up - anyone still on flashcache & openvz kernels? Tried to
>> compile flashcache 3.1.3 dkms against 2.6.32-042stab112.15, getting
>> errors:
>
> --
> Best Regards,
> Nick Knutov
> http://knutov.com
> ICQ: 272873706
> Voice: +7-904-84-23-130
>
> ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Bumping up - anyone still on flashcache & openvz kernels? Tried to compile flashcache 3.1.3 dkms against 2.6.32-042stab112.15, getting errors:

DKMS make.log for flashcache-1.0-227-gc0eeb3d1e539 for kernel 2.6.32-042stab112.15-el6-openvz (x86_64)
Fri Nov 13 13:56:24 MSK 2015
make[1]: Entering directory '/var/lib/dkms/flashcache/1.0-227-gc0eeb3d1e539/build'
grep: /etc/redhat-release: No such file or directory
make -C /lib/modules/2.6.32-042stab112.15-el6-openvz/build M=/var/lib/dkms/flashcache/1.0-227-gc0eeb3d1e539/build modules V=0
make[2]: Entering directory '/usr/src/linux-headers-2.6.32-042stab112.15-el6-openvz'
grep: /etc/redhat-release: No such file or directory
  CC [M]  /var/lib/dkms/flashcache/1.0-227-gc0eeb3d1e539/build/flashcache_conf.o
In file included from /usr/src/linux-headers-2.6.32-042stab112.15-el6-openvz/arch/x86/include/asm/timex.h:5:0,
                 from include/linux/timex.h:171,
                 from include/linux/jiffies.h:8,
                 from include/linux/ktime.h:25,
                 from include/linux/timer.h:5,
                 from include/linux/workqueue.h:8,
                 from include/linux/mmzone.h:19,
                 from include/linux/gfp.h:4,
                 from include/linux/kmod.h:22,
                 from include/linux/module.h:13,
                 from /var/lib/dkms/flashcache/1.0-227-gc0eeb3d1e539/build/flashcache_conf.c:26:
/usr/src/linux-headers-2.6.32-042stab112.15-el6-openvz/arch/x86/include/asm/tsc.h: In function ‘vget_cycles’:
/usr/src/linux-headers-2.6.32-042stab112.15-el6-openvz/arch/x86/include/asm/tsc.h:45:2: error: implicit declaration of function ‘__native_read_tsc’ [-Werror=implicit-function-declaration]
  return (cycles_t)__native_read_tsc();
  ^
In file included from include/linux/sched.h:72:0,
                 from include/linux/kmod.h:28,
                 from include/linux/module.h:13,
                 from /var/lib/dkms/flashcache/1.0-227-gc0eeb3d1e539/build/flashcache_conf.c:26:
include/linux/signal.h: In function ‘sigaddset’:
include/linux/signal.h:41:6: error: ‘_NSIG_WORDS’ undeclared (first use in this function)
  if (_NSIG_WORDS == 1)
...

On Fri, Jul 11, 2014 at 4:34 AM, Nick Knutov wrote:
> I think you are speaking here about different cases.
>
> One is making HA backup node. When we are backing up full node to
> another node (1:1) - zfs send/receive is much better (and the goal is to
> save data, not running processes). Without zfs - ploop snapshotting and
> vzmigrate is good enough (over SSD), and rsync with ext4 (simfs inside
> CT) is really pain.
>
> The other case is migrating large amount of CTs over large amount of
> nodes for resource usage balancing [with zero downtime]. There is no
> alternatives to vzmigrate here although zfs send/receive with
> per-container ZVOL can speed up this process [if it's important to
> transfer between nodes faster with less network usage]
>
> 10.07.2014 15:35, Pavel Odintsov wrote:
>>> Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS,
>>> I thought the point of migration is to don't have the CT notice any
>>> change, I don't see why the inode numbers should change.
>> Do you have really working zero downtime vzmigrate on ZFS?
>
> --
> Best Regards,
> Nick Knutov
> http://knutov.com
> ICQ: 272873706
> Voice: +7-904-84-23-130
> ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Best regards,
[COOLCOLD-RIPN]

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
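One hedged debugging step (not a fix): build the module by hand against the same headers to take DKMS out of the picture. The header path is taken from the log above; the flashcache source path is a placeholder.

# hypothetical debugging step, not a fix - the source path below is an example
cd /usr/src/flashcache-3.1.3/src    # wherever the flashcache sources were unpacked
make -C /usr/src/linux-headers-2.6.32-042stab112.15-el6-openvz M=$(pwd) modules
# The "grep: /etc/redhat-release" noise is only the Makefile's RHEL-version probe
# failing on a Debian host; the real blockers are the tsc.h/_NSIG_WORDS errors,
# which point at the flashcache source needing patches for the 042stab112.x headers.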
Re: [Users] flashcache
No. Even 2.x flashcache cannot be compiled against recent OpenVZ RHEL6 kernels.

13.11.2015 15:57, CoolCold wrote:

Bumping up - anyone still on flashcache & openvz kernels? Tried to compile flashcache 3.1.3 dkms against 2.6.32-042stab112.15, getting errors:

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
On 07/09/2014 06:58 PM, Kir Kolyshkin wrote:

On 07/08/2014 11:54 PM, Pavel Snajdr wrote:

On 07/08/2014 07:52 PM, Scott Dowdle wrote:

Greetings,

- Original Message -
(offtopic) We cannot use ZFS. Unfortunately, a NAS with something like Nexenta is too expensive for us.

From what I've gathered from a few presentations, ZFS on Linux (http://zfsonlinux.org/) is as stable but more performant than it is on the OpenSolaris forks... so you can build your own if you can spare the people to learn the best practices. I don't have a use for ZFS myself so I'm not really advocating it. TYL,

Hi all, we run tens of OpenVZ nodes (bigger boxes, 256G RAM, 12+ cores, 90 CTs at least). We used to run ext4+flashcache, but ext4 has proven to be a bottleneck. That was the primary motivation behind ploop, as far as I know. We switched to ZFS on Linux around the time ploop was announced and I haven't had second thoughts since. ZFS really *is*, in my experience, the best filesystem there is at the moment for this kind of deployment - especially if you use dedicated SSDs for ZIL and L2ARC, although the latter is less important. You will know what I'm talking about when you try this on boxes with lots of CTs doing LAMP load - databases and their synchronous writes are the real problem, which ZFS with a dedicated ZIL device solves. Also there is the ARC caching, which is smarter than the Linux VFS cache - we're able to achieve about a 99% hit rate about 99% of the time, even under high loads. Having said all that, I recommend everyone give ZFS a chance, but I'm aware this is yet another piece of out-of-mainline code and that doesn't suit everyone that well.

Are you using per-container ZVOL or something else?

That would mean I'd need to do another filesystem on top of ZFS, which would in turn mean I'd add another unnecessary layer of indirection. ZFS is pooled storage, like BTRFS; we're giving one dataset to each container. vzctl tries to move the VE_PRIVATE folder around, so we had to add one more directory to put the VE_PRIVATE data into (see the first ls). Example from production:

[r...@node2.prg.vpsfree.cz] ~ # zpool status vz
  pool: vz
 state: ONLINE
  scan: scrub repaired 0 in 1h24m with 0 errors on Tue Jul 8 16:22:17 2014
config:

        NAME        STATE     READ WRITE CKSUM
        vz          ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
          mirror-2  ONLINE       0     0     0
            sdg     ONLINE       0     0     0
            sdh     ONLINE       0     0     0
        logs
          mirror-3  ONLINE       0     0     0
            sdc3    ONLINE       0     0     0
            sdd3    ONLINE       0     0     0
        cache
          sdc5      ONLINE       0     0     0
          sdd5      ONLINE       0     0     0

errors: No known data errors

[r...@node2.prg.vpsfree.cz] ~ # zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
vz               432G  2.25T    36K  /vz
vz/private       427G  2.25T   111K  /vz/private
vz/private/101  17.7G  42.3G  17.7G  /vz/private/101
<snip>
vz/root          104K  2.25T   104K  /vz/root
vz/template     5.38G  2.25T  5.38G  /vz/template

[r...@node2.prg.vpsfree.cz] ~ # zfs get compressratio vz/private/101
NAME            PROPERTY       VALUE  SOURCE
vz/private/101  compressratio  1.38x  -

[r...@node2.prg.vpsfree.cz] ~ # ls /vz/private/101
private

[r...@node2.prg.vpsfree.cz] ~ # ls /vz/private/101/private/
aquota.group aquota.user b bin boot dev etc git home lib <snip>

[r...@node2.prg.vpsfree.cz] ~ # cat /etc/vz/conf/101.conf | grep -P 'PRIVATE|ROOT'
VE_ROOT=/vz/root/101
VE_PRIVATE=/vz/private/101/private

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
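As a generic sketch of how a new container would be added to a layout like the one above (the vpsFree setup drives this from vpsAdmin rather than plain vzctl, so the CTID, template and quota values below are just placeholders):

zfs create vz/private/102
zfs set quota=60G vz/private/102          # per-CT space limit handled by ZFS instead of vzquota
vzctl create 102 --ostemplate centos-6-x86_64 \
    --private /vz/private/102/private --root /vz/root/102
# vzquota is assumed to be disabled globally (DISK_QUOTA=no in /etc/vz/vz.conf), as mentioned
# later in the thread; the 1.38x compressratio above suggests compression is enabled on the
# pool or inherited by the datasets.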
Re: [Users] flashcache
Pavel Odintsov pavel.odint...@gmail.com writes: Hello! Yep, Read cache is nice and safe solution but not write cache :) No, we do not use ZFS in production yet. We done only very specific tests like this: https://github.com/zfsonlinux/zfs/issues/2458 But you can do some performance tests and share :) Why is everyone insisting on ext4 and even ext4 in individual zvols? I have done some testing with root and private directly on a zfs file system and so far everything seems to work just fine. What am I to expect down the road? [...] ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. L2ARC cache really smart Yep, fine, I knew. But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 But one customer can eat almost all L2ARC cache and displace another customers data. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. But ext4 has about few thouasands developers and the fix this issues asap but ZFS on Linux has only 3-5 developers which VERY slow. Because of this I recommends using ext4 with ploop because this solution is rock stable or ZFS with ZVOL's with ext4 because this solution if more reliable and more predictable then placing ZFS containers on ZFS volumes. On Thu, Jul 10, 2014 at 1:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 10:34 AM, Pavel Odintsov wrote: Hello! You scheme is fine but you can't divide I/O load with cgroup blkio (ioprio/iolimit/iopslimit) between different folders but between different ZVOL you do. Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. I could imagine following problems for per folder scheme: 1) Can't limit number of inodes in different folders (but there are not an inode limit for ZFS like ext4 but bug amount of files in container could broke node; How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) http://serverfault.com/questions/503658/can-you-set-inode-quotas-in-zfs) 2) Problems with system cache which used by all containers in HWN together This exactly isn't a problem, but a *HUGE* benefit, you'd need to see it in practice :) Linux VFS cache is really dumb in comparison to ARC. ARC's hitrates just can't be done with what linux currently offers. 3) Problems with live migration because you _should_ change inode numbers on diffferent nodes Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. 4) ZFS behaviour with linux software in some cases is very STRANGE (DIRECT_IO) How exactly? 
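For readers following along, the per-container I/O limits being argued about are the standard vzctl knobs; the values below are examples only:

vzctl set 101 --ioprio 4 --save        # relative I/O priority, 0-7 (default 4)
vzctl set 101 --iolimit 10M --save     # throughput cap, roughly 10 MB/s
vzctl set 101 --iopslimit 500 --save   # IOPS cap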
I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). 5) ext4 has good support from vzctl (fsck, resize2fs) Yeah, but ext4 sucks big time. At least in my use-case. We've implemented most of vzctl create/destroy/etc. functionality in our vpsAdmin software instead. Guys, can I ask you to keep your mind open instead of fighting with pointless arguments? :) Give ZFS a try and then decide for yourselves. I think the community would benefit greatly if ZFS woudn't be fought as something alien in the Linux world, which I in my experience is what every Linux zealot I talk to about ZFS is doing. This is just not fair. It's primarily about technology, primarily about the best tool for the job. If we can implement something like this in Linux but without having ties to CDDL and possibly Oracle patents, that would be awesome, yet nobody has done such a thing yet. BTRFS is nowhere near ZFS when it comes to running larger scale deployments and in some regards I don't think it will ever match ZFS, just looking at the way it's been designed. I'm not trying to flame here, I'm trying to open you guys to the fact, that there really is a better alternative than you're currently seeing. And if it has some technological drawbacks like these that you're trying to point out, instead of pointing at them as something, which can't be
Re: [Users] flashcache
On 07/10/2014 11:35 AM, Pavel Odintsov wrote: Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases If you can give me concrete tests to run, sure, I'm curious to see if you're right - then we'd have something concrete to fix :) How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. zfs snapshot dataset@snapname zfs send dataset@snapname your-file or | ssh backuper zfs recv backupdataset That's done on block level. No need to run rsync anymore, it's a lot faster this way. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? Nope, vzmigrate isn't zero downtime. Due to vzctl/vzmigrate not supporting ZFS, we're implementing this our own way in vpsAdmin, which in it's 2.0 re-implementation will go opensource under GPL. How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. As I said, this is under development and will improve. L2ARC cache really smart Yep, fine, I knew. But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 I can't account for caches per CT, but I didn't have any need to do so. L2ARC != ARC, ARC is in system RAM, L2ARC is intended to be on SSD for the content of ARC that is the least significant in case of low memory - it gets pushed from ARC to L2ARC. ARC has two primary lists of cached data - most frequently used and most recently used and these two lists are divided by a boundary marking which data can be pushed away in low mem situation. It doesn't happen like with Linux VFS cache that you're copying one big file and it pushes out all of the other useful data there. Thanks to this distinction of MRU and MFU ARC achieves far better hitrates. But one customer can eat almost all L2ARC cache and displace another customers data. Yes, but ZFS keeps track on what's being used, so useful data can't be pushed away that easily, things naturally balance themselves due to the way how ARC mechanism works. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. I haven't encountered any problems especially with vzquota disabled (no need for it, ZFS has its own quotas, which never need to be recalculated as with vzquota). But ext4 has about few thouasands developers and the fix this issues asap but ZFS on Linux has only 3-5 developers which VERY slow. 
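The redirection in the zfs send examples above appears to have been eaten by the archive; written out in full, and with the incremental variant that is usually wanted for repeated backups (dataset, file and host names are examples):

zfs snapshot vz/private/101@backup1
zfs send vz/private/101@backup1 > /backup/ct101-backup1.zfs            # dump to a file
zfs send vz/private/101@backup1 | ssh backuper zfs recv backup/ct101   # or stream to another node
# later: send only the delta between two snapshots
zfs snapshot vz/private/101@backup2
zfs send -i @backup1 vz/private/101@backup2 | ssh backuper zfs recv backup/ct101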
Because of this I recommends using ext4 with ploop because this solution is rock stable or ZFS with ZVOL's with ext4 because this solution if more reliable and more predictable then placing ZFS containers on ZFS volumes. ZFS itself is a stable and mature filesystem, it first shipped as production with Solaris in 2006. And it's still being developed upstream as OpenZFS, that code is shared between the primary version - Illumos and the ports - FreeBSD, OS X, Linux. So what really needs and still is being developed is the way how ZFS is run under Linux kernel, but with recent release of 0.6.3, things have gotten mature enough to be used in production without any fears. Of course, no software is without bugs, but I can say with absolute certainty that ZFS will never eat your data, the only problem you can encounter is with the memory management, which is done really differently in Linux than in ZFS's original habitat - Solaris. /snajpa On Thu, Jul 10, 2014 at 1:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 10:34 AM, Pavel Odintsov wrote: Hello! You scheme is fine but you can't divide I/O load with cgroup blkio (ioprio/iolimit/iopslimit) between different folders but between different ZVOL you do. Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. I could imagine following problems for per folder scheme: 1) Can't limit number of inodes in different folders (but there are not an inode limit for ZFS like
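On the memory-management point: the usual way to keep ARC from competing with container memory on a ZFS on Linux node is to cap it with the zfs_arc_max module parameter; the 8 GiB figure below is only an example.

# sketch: cap ARC at 8 GiB (8589934592 bytes) so CT workloads keep enough free RAM
echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf   # applied at module load
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max               # runtime change (may not shrink an already-large ARC immediately)
grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats                  # check current ARC size and the cap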
Re: [Users] flashcache
Thank you for your answers! It's really useful information. On Thu, Jul 10, 2014 at 2:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 11:35 AM, Pavel Odintsov wrote: Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases If you can give me concrete tests to run, sure, I'm curious to see if you're right - then we'd have something concrete to fix :) How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. zfs snapshot dataset@snapname zfs send dataset@snapname your-file or | ssh backuper zfs recv backupdataset That's done on block level. No need to run rsync anymore, it's a lot faster this way. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? Nope, vzmigrate isn't zero downtime. Due to vzctl/vzmigrate not supporting ZFS, we're implementing this our own way in vpsAdmin, which in it's 2.0 re-implementation will go opensource under GPL. How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. As I said, this is under development and will improve. L2ARC cache really smart Yep, fine, I knew. But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 I can't account for caches per CT, but I didn't have any need to do so. L2ARC != ARC, ARC is in system RAM, L2ARC is intended to be on SSD for the content of ARC that is the least significant in case of low memory - it gets pushed from ARC to L2ARC. ARC has two primary lists of cached data - most frequently used and most recently used and these two lists are divided by a boundary marking which data can be pushed away in low mem situation. It doesn't happen like with Linux VFS cache that you're copying one big file and it pushes out all of the other useful data there. Thanks to this distinction of MRU and MFU ARC achieves far better hitrates. But one customer can eat almost all L2ARC cache and displace another customers data. Yes, but ZFS keeps track on what's being used, so useful data can't be pushed away that easily, things naturally balance themselves due to the way how ARC mechanism works. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. I haven't encountered any problems especially with vzquota disabled (no need for it, ZFS has its own quotas, which never need to be recalculated as with vzquota). But ext4 has about few thouasands developers and the fix this issues asap but ZFS on Linux has only 3-5 developers which VERY slow. 
Because of this I recommends using ext4 with ploop because this solution is rock stable or ZFS with ZVOL's with ext4 because this solution if more reliable and more predictable then placing ZFS containers on ZFS volumes. ZFS itself is a stable and mature filesystem, it first shipped as production with Solaris in 2006. And it's still being developed upstream as OpenZFS, that code is shared between the primary version - Illumos and the ports - FreeBSD, OS X, Linux. So what really needs and still is being developed is the way how ZFS is run under Linux kernel, but with recent release of 0.6.3, things have gotten mature enough to be used in production without any fears. Of course, no software is without bugs, but I can say with absolute certainty that ZFS will never eat your data, the only problem you can encounter is with the memory management, which is done really differently in Linux than in ZFS's original habitat - Solaris. /snajpa On Thu, Jul 10, 2014 at 1:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 10:34 AM, Pavel Odintsov wrote: Hello! You scheme is fine but you can't divide I/O load with cgroup blkio (ioprio/iolimit/iopslimit) between different folders but between different ZVOL you do. Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. I
Re: [Users] flashcache
Could you share your patches to vzmigrate and vzctl? On Thu, Jul 10, 2014 at 2:25 PM, Pavel Odintsov pavel.odint...@gmail.com wrote: Thank you for your answers! It's really useful information. On Thu, Jul 10, 2014 at 2:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 11:35 AM, Pavel Odintsov wrote: Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases If you can give me concrete tests to run, sure, I'm curious to see if you're right - then we'd have something concrete to fix :) How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. zfs snapshot dataset@snapname zfs send dataset@snapname your-file or | ssh backuper zfs recv backupdataset That's done on block level. No need to run rsync anymore, it's a lot faster this way. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? Nope, vzmigrate isn't zero downtime. Due to vzctl/vzmigrate not supporting ZFS, we're implementing this our own way in vpsAdmin, which in it's 2.0 re-implementation will go opensource under GPL. How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. As I said, this is under development and will improve. L2ARC cache really smart Yep, fine, I knew. But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 I can't account for caches per CT, but I didn't have any need to do so. L2ARC != ARC, ARC is in system RAM, L2ARC is intended to be on SSD for the content of ARC that is the least significant in case of low memory - it gets pushed from ARC to L2ARC. ARC has two primary lists of cached data - most frequently used and most recently used and these two lists are divided by a boundary marking which data can be pushed away in low mem situation. It doesn't happen like with Linux VFS cache that you're copying one big file and it pushes out all of the other useful data there. Thanks to this distinction of MRU and MFU ARC achieves far better hitrates. But one customer can eat almost all L2ARC cache and displace another customers data. Yes, but ZFS keeps track on what's being used, so useful data can't be pushed away that easily, things naturally balance themselves due to the way how ARC mechanism works. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. I haven't encountered any problems especially with vzquota disabled (no need for it, ZFS has its own quotas, which never need to be recalculated as with vzquota). 
But ext4 has about few thouasands developers and the fix this issues asap but ZFS on Linux has only 3-5 developers which VERY slow. Because of this I recommends using ext4 with ploop because this solution is rock stable or ZFS with ZVOL's with ext4 because this solution if more reliable and more predictable then placing ZFS containers on ZFS volumes. ZFS itself is a stable and mature filesystem, it first shipped as production with Solaris in 2006. And it's still being developed upstream as OpenZFS, that code is shared between the primary version - Illumos and the ports - FreeBSD, OS X, Linux. So what really needs and still is being developed is the way how ZFS is run under Linux kernel, but with recent release of 0.6.3, things have gotten mature enough to be used in production without any fears. Of course, no software is without bugs, but I can say with absolute certainty that ZFS will never eat your data, the only problem you can encounter is with the memory management, which is done really differently in Linux than in ZFS's original habitat - Solaris. /snajpa On Thu, Jul 10, 2014 at 1:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 10:34 AM, Pavel Odintsov wrote: Hello! You scheme is fine but you can't divide I/O load with cgroup blkio (ioprio/iolimit/iopslimit) between different folders but between different ZVOL you do. Not true, IO limits are working as they should
Re: [Users] flashcache
On 07/10/2014 12:32 PM, Pavel Odintsov wrote: Could you share your patches to vzmigrate and vzctl? We don't have any, where vzctl/vzmigrate didn't satisfy our needs, we've went the way around these utilities and let vpsAdmin on the hwnode manage things. You can take a look here: https://github.com/vpsfreecz/vpsadmind I wouldn't recommend anyone outside of our organization to use vpsAdmin yet, as the 2.0 transition to self-describing RESTful API is still underway. As soon as it's finished and well documented, I'll post a note here as well. The 2.0 version will be primarily controled via a CLI tool, which autogenerates itself from the API description. A running version of the API can be seen here: https://api.vpsfree.cz/v1/ Github repos: https://github.com/vpsfreecz/vpsadminapi (the API) https://github.com/vpsfreecz/vpsadminctl (the CLI tool) https://github.com/vpsfreecz/vpsadmind (deamon run on hwnode) https://github.com/vpsfreecz/vpsadmindctl (CLI tool to control the daemon) https://github.com/vpsfreecz/vpsadmin The last repo is the vpsAdmin 1.x, which all 2.0 things still require to run, it's a pain to get this running yourself, but stay tuned, once we get rid of 1.x and document 2.0 properly, it's going to be a great thing. /snajpa On Thu, Jul 10, 2014 at 2:25 PM, Pavel Odintsov pavel.odint...@gmail.com wrote: Thank you for your answers! It's really useful information. On Thu, Jul 10, 2014 at 2:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 11:35 AM, Pavel Odintsov wrote: Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases If you can give me concrete tests to run, sure, I'm curious to see if you're right - then we'd have something concrete to fix :) How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. zfs snapshot dataset@snapname zfs send dataset@snapname your-file or | ssh backuper zfs recv backupdataset That's done on block level. No need to run rsync anymore, it's a lot faster this way. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? Nope, vzmigrate isn't zero downtime. Due to vzctl/vzmigrate not supporting ZFS, we're implementing this our own way in vpsAdmin, which in it's 2.0 re-implementation will go opensource under GPL. How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. As I said, this is under development and will improve. L2ARC cache really smart Yep, fine, I knew. But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 I can't account for caches per CT, but I didn't have any need to do so. 
L2ARC != ARC, ARC is in system RAM, L2ARC is intended to be on SSD for the content of ARC that is the least significant in case of low memory - it gets pushed from ARC to L2ARC. ARC has two primary lists of cached data - most frequently used and most recently used and these two lists are divided by a boundary marking which data can be pushed away in low mem situation. It doesn't happen like with Linux VFS cache that you're copying one big file and it pushes out all of the other useful data there. Thanks to this distinction of MRU and MFU ARC achieves far better hitrates. But one customer can eat almost all L2ARC cache and displace another customers data. Yes, but ZFS keeps track on what's being used, so useful data can't be pushed away that easily, things naturally balance themselves due to the way how ARC mechanism works. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. I haven't encountered any problems especially with vzquota disabled (no need for it, ZFS has its own quotas, which never need to be recalculated as with vzquota). But ext4 has about few thouasands developers and the fix this issues asap but ZFS on Linux has only 3-5 developers which VERY slow. Because of this I recommends using ext4 with ploop because this solution is rock
Re: [Users] flashcache
On 07/10/2014 12:50 PM, Pavel Snajdr wrote: On 07/10/2014 12:32 PM, Pavel Odintsov wrote: Could you share your patches to vzmigrate and vzctl? We don't have any, where vzctl/vzmigrate didn't satisfy our needs, we've went the way around these utilities and let vpsAdmin on the hwnode manage things. You can take a look here: https://github.com/vpsfreecz/vpsadmind I wouldn't recommend anyone outside of our organization to use vpsAdmin yet, as the 2.0 transition to self-describing RESTful API is still underway. As soon as it's finished and well documented, I'll post a note here as well. The 2.0 version will be primarily controled via a CLI tool, which autogenerates itself from the API description. A running version of the API can be seen here: https://api.vpsfree.cz/v1/ Github repos: https://github.com/vpsfreecz/vpsadminapi (the API) https://github.com/vpsfreecz/vpsadminctl (the CLI tool) https://github.com/vpsfreecz/vpsadmind (deamon run on hwnode) https://github.com/vpsfreecz/vpsadmindctl (CLI tool to control the daemon) https://github.com/vpsfreecz/vpsadmin The last repo is the vpsAdmin 1.x, which all 2.0 things still require to run, it's a pain to get this running yourself, but stay tuned, once we get rid of 1.x and document 2.0 properly, it's going to be a great thing. /snajpa Though, if you don't mind managing things via a web interface, vpsAdmin 1.x can be installed through these scripts: https://github.com/vpsfreecz/vpsadmininstall /snajpa On Thu, Jul 10, 2014 at 2:25 PM, Pavel Odintsov pavel.odint...@gmail.com wrote: Thank you for your answers! It's really useful information. On Thu, Jul 10, 2014 at 2:08 PM, Pavel Snajdr li...@snajpa.net wrote: On 07/10/2014 11:35 AM, Pavel Odintsov wrote: Not true, IO limits are working as they should (if we're talking vzctl set --iolimit/--iopslimit). I've kicked the ZoL guys around to add IO accounting support, so it is there. You can share tests with us? For standard folders like simfs this limits works bad in big number of cases If you can give me concrete tests to run, sure, I'm curious to see if you're right - then we'd have something concrete to fix :) How? ZFS doesn't have a limit on number of files (2^48 isn't a limit really) It's ok when your customer create 1 billion of small files on 10GB VPS and you will try to archive it for backup? On slow disk system it's really nightmare because a lot of disk operations which kills your I/O. zfs snapshot dataset@snapname zfs send dataset@snapname your-file or | ssh backuper zfs recv backupdataset That's done on block level. No need to run rsync anymore, it's a lot faster this way. Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS? Nope, vzmigrate isn't zero downtime. Due to vzctl/vzmigrate not supporting ZFS, we're implementing this our own way in vpsAdmin, which in it's 2.0 re-implementation will go opensource under GPL. How exactly? I haven't seen a problem with any userspace software, other than MySQL default setting to AIO (it fallbacks to older method), which ZFS doesn't support (*yet*, they have it in their plans). I speaks about MySQL primarily. I have thousands of containers and I can tune MySQL for another mode for all customers, it's impossible. As I said, this is under development and will improve. L2ARC cache really smart Yep, fine, I knew. 
But can you account L2ARC cache usage per customer? OpenVZ can it via flag: sysctl -a|grep pagecache_isola ubc.pagecache_isolation = 0 I can't account for caches per CT, but I didn't have any need to do so. L2ARC != ARC, ARC is in system RAM, L2ARC is intended to be on SSD for the content of ARC that is the least significant in case of low memory - it gets pushed from ARC to L2ARC. ARC has two primary lists of cached data - most frequently used and most recently used and these two lists are divided by a boundary marking which data can be pushed away in low mem situation. It doesn't happen like with Linux VFS cache that you're copying one big file and it pushes out all of the other useful data there. Thanks to this distinction of MRU and MFU ARC achieves far better hitrates. But one customer can eat almost all L2ARC cache and displace another customers data. Yes, but ZFS keeps track on what's being used, so useful data can't be pushed away that easily, things naturally balance themselves due to the way how ARC mechanism works. I'm not agains ZFS but I'm against of usage ZFS as underlying system for containers. We caught ~100 kernel bugs with simfs on EXT4 when customers do some strange thinks. I haven't encountered any problems especially with vzquota disabled (no need for it, ZFS has its own quotas, which never need to
Re: [Users] flashcache
There are two important points here:

1) As Pavel wrote, I/O can't be separated easily within one fs (for now - I think this can change with cgroups in the future)
2) per-user quota inside a CT is currently supported for ext4 only

These two points may or may not be important for you. In our real life we never had any issues with I/O in production (and we are migrating to SSD, so I/O is always sufficient now), but most of our/our customers' CTs are shared hosting in some way, so having per-user quota is critical.

10.07.2014 14:42, Aleksandar Ivanisevic wrote:

Why is everyone insisting on ext4 and even ext4 in individual zvols? I have done some testing with root and private directly on a zfs file system and so far everything seems to work just fine. What am I to expect down the road?

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
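For reference, point 2) refers to the second-level (per-UID/GID) quota that vzctl switches on with --quotaugidlimit, which indeed assumes an ext4 private area (ploop or simfs); the values below are examples:

vzctl set 101 --quotaugidlimit 1000 --save   # allow up to 1000 uid/gid quota entries inside the CT
vzctl restart 101                            # second-level quota is activated on container start
vzctl exec 101 repquota -a                   # the usual quota tools then work inside the CT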
Re: [Users] flashcache
I think you are speaking here about different cases.

One is making an HA backup node. When we are backing up a full node to another node (1:1), zfs send/receive is much better (and the goal is to save the data, not the running processes). Without ZFS, ploop snapshotting and vzmigrate are good enough (over SSD), and rsync with ext4 (simfs inside the CT) is a real pain.

The other case is migrating a large number of CTs across a large number of nodes for resource usage balancing [with zero downtime]. There are no alternatives to vzmigrate here, although zfs send/receive with a per-container ZVOL can speed up the process [if it's important to transfer between nodes faster with less network usage].

10.07.2014 15:35, Pavel Odintsov wrote:

Why? ZFS send/receive is able to do bit-by-bit identical copy of the FS, I thought the point of migration is to don't have the CT notice any change, I don't see why the inode numbers should change. Do you have really working zero downtime vzmigrate on ZFS?

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
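For the second case, the usual invocation on ploop/ext4 is plain vzmigrate; the host name and CTID below are placeholders:

vzmigrate --online -v dest-node.example.com 101
# --online keeps the CT running and freezes it only for the final memory/state transfer;
# without it the container is stopped, its private area copied, and then restarted on the destination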
Re: [Users] flashcache
On 07/08/2014 07:52 PM, Scott Dowdle wrote:

Greetings,

- Original Message -
(offtopic) We cannot use ZFS. Unfortunately, a NAS with something like Nexenta is too expensive for us.

From what I've gathered from a few presentations, ZFS on Linux (http://zfsonlinux.org/) is as stable but more performant than it is on the OpenSolaris forks... so you can build your own if you can spare the people to learn the best practices. I don't have a use for ZFS myself so I'm not really advocating it.

TYL,

Hi all,

we run tens of OpenVZ nodes (bigger boxes, 256G RAM, 12+ cores, 90 CTs at least). We used to run ext4+flashcache, but ext4 has proven to be a bottleneck. That was the primary motivation behind ploop, as far as I know. We switched to ZFS on Linux around the time ploop was announced and I haven't had second thoughts since.

ZFS really *is*, in my experience, the best filesystem there is at the moment for this kind of deployment - especially if you use dedicated SSDs for ZIL and L2ARC, although the latter is less important. You will know what I'm talking about when you try this on boxes with lots of CTs doing LAMP load - databases and their synchronous writes are the real problem, which ZFS with a dedicated ZIL device solves. Also there is the ARC caching, which is smarter than the Linux VFS cache - we're able to achieve about a 99% hit rate about 99% of the time, even under high loads.

Having said all that, I recommend everyone give ZFS a chance, but I'm aware this is yet another piece of out-of-mainline code and that doesn't suit everyone that well.

snajpa

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
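For anyone wanting to reproduce the SSD layout described here, attaching a mirrored ZIL (SLOG) and L2ARC devices to an existing pool looks roughly like this; the pool name and partitions mirror the production listing shown elsewhere in this thread and are otherwise placeholders:

zpool add vz log mirror sdc3 sdd3   # mirrored SLOG to absorb synchronous writes (databases etc.)
zpool add vz cache sdc5 sdd5        # L2ARC devices; these are striped, not mirrored, and safe to lose
zpool status vz                     # the new "logs" and "cache" sections should now show up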
Re: [Users] flashcache
On 07/08/2014 11:54 PM, Pavel Snajdr wrote: On 07/08/2014 07:52 PM, Scott Dowdle wrote: Greetings, - Original Message - (offtopic) We can not use ZFS. Unfortunately, NAS with something like Nexenta is to expensive for us. From what I've gathered from a few presentations, ZFS on Linux (http://zfsonlinux.org/) is as stable but more performant than it is on the OpenSolaris forks... so you can build your own if you can spare the people to learn the best practices. I don't have a use for ZFS myself so I'm not really advocating it. TYL, Hi all, we run tens of OpenVZ nodes (bigger boxes, 256G RAM, 12cores+, 90 CTs at least). We've used to run ext4+flashcache, but ext4 has proven to be a bottleneck. That was the primary motivation behind ploop as far as I know. We've switched to ZFS on Linux around the time Ploop was announced and I didn't have second thoughts since. ZFS really *is* in my experience the best filesystem there is at the moment for this kind of deployment - especially if you use dedicated SSDs for ZIL and L2ARC, although the latter is less important. You will know what I'm talking about when you try this on boxes with lots of CTs doing LAMP load - databases and their synchronous writes are the real problem, which ZFS with dedicated ZIL device solves. Also there is the ARC caching, which is smarter then linux VFS cache - we're able to achieve about 99% of hitrate at about 99% of the time, even under high loads. Having said all that, I recommend everyone to give ZFS a chance, but I'm aware this is yet another out-of-mainline code and that doesn't suit everyone that well. Are you using per-container ZVOL or something else? ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
Hi all!

I think it's really not a good idea, because technology like SSD caching should be tested _thoroughly_ before production use. You could try it with simfs, but beware of ploop, because it's really not a standard ext4 - it has custom caches and unexpected behaviour in some cases.

On Tue, Jul 8, 2014 at 1:59 PM, Aleksandar Ivanisevic aleksan...@ivanisevic.de wrote:

Hi, is anyone using flashcache with openvz? If so, which version and with which kernel? Versions lower than 3 do not compile against the latest el6 kernel and version 3.11 and the latest git oopses in flashcache_md_write_kickoff with a null pointer. I see provisions to detect ovz kernel source in the flashcache makefile, so someone must be compiling and using it. Any other SSD caching software that works with openvz? regards,

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Sincerely yours, Pavel Odintsov

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
I am actually planning on using it only on test systems where i have commodity SATA disks that are getting a bit overwhelmed. I hope to get better value from a SATA+SSD combination that I would with SAS disks and the appropriate controllers and fancy RAID levels that cost 3 times more at least. Anyway, looks like that bug also got fixed in 092.1, at least it doesn't oops immediately any more. Pavel Odintsov pavel.odint...@gmail.com writes: Hi all! I thought it's really not good idea because technology like ssd caching should be tested _thoroughly_ before production use. But you could try it with simfs but beware of ploop because it's really not an standard ext4 with custom caches and unexpected behaviour in some cases. On Tue, Jul 8, 2014 at 1:59 PM, Aleksandar Ivanisevic aleksan...@ivanisevic.de wrote: Hi, is anyone using flashcache vith openvz? If so, which version and with which kernel? Versions lower than 3 do not compile against the latest el6 kernel and version 3.11 and the latest git oopses in flashcache_md_write_kickoff with a null pointer. I see provisions to detect ovz kernel source in flashcache makefile, so someone must be compiling and using it. Any other SSD caching software that works with openvz? regards, ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
I know of a few incidents of ___FULL___ data loss among flashcache users. Beware of it in production. If you want speed you can try ZFS with an L2ARC/ZVOL cache, because it's a native solution.

On Tue, Jul 8, 2014 at 8:05 PM, Nick Knutov m...@knutov.com wrote:

We have been using the latest flashcache 2.* with 2.6.32-042stab083.2 in production for a long time. We are planning to migrate to 3.0 with the latest 090.5 but have not tried it yet.

08.07.2014 15:59, Aleksandar Ivanisevic wrote:

Hi, is anyone using flashcache with openvz? If so, which version and with which kernel? Versions lower than 3 do not compile against the latest el6 kernel and version 3.11 and the latest git oopses in flashcache_md_write_kickoff with a null pointer. I see provisions to detect ovz kernel source in the flashcache makefile, so someone must be compiling and using it. Any other SSD caching software that works with openvz?

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Sincerely yours, Pavel Odintsov

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
We are using this only for caching reads (mode thru), not writes.

(offtopic) We cannot use ZFS. Unfortunately, a NAS with something like Nexenta is too expensive for us. Anyway, we are doing a complete migration to SSD. It's just cheaper.

08.07.2014 22:23, Pavel Odintsov wrote:

I know of a few incidents of ___FULL___ data loss among flashcache users. Beware of it in production. If you want speed you can try ZFS with an L2ARC/ZVOL cache, because it's a native solution.

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
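For reference, a read-caching (writethrough) flashcache device of the kind described here is created roughly like this with the 2.x tools; the device and cache names are placeholders, so double-check against the flashcache-sa-guide:

flashcache_create -p thru vz_cache /dev/ssd /dev/slowdisk   # -p thru: reads get cached on the SSD, writes go straight to disk
mount /dev/mapper/vz_cache /vz                              # the existing filesystem on /dev/slowdisk is now accessed via the cache
# in thru (and around) mode the cache is not persistent across reloads and all data
# always lives on the backing disk, so a dying SSD cannot take the data with it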
Re: [Users] flashcache
Greetings, - Original Message - (offtopic) We can not use ZFS. Unfortunately, NAS with something like Nexenta is to expensive for us. From what I've gathered from a few presentations, ZFS on Linux (http://zfsonlinux.org/) is as stable but more performant than it is on the OpenSolaris forks... so you can build your own if you can spare the people to learn the best practices. I don't have a use for ZFS myself so I'm not really advocating it. TYL, -- Scott Dowdle 704 Church Street Belgrade, MT 59714 (406)388-0827 [home] (406)994-3931 [work] ___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
Re: [Users] flashcache
I know about this project, but what about the stability/compatibility of ZFS on Linux with the OpenVZ kernel? Has anyone ever tested it?

Also, with ext4 I can always boot to rescue mode at any [of our] datacenters and, for example, move the data to another/new server. I have no idea how to get at ZFS data if something goes wrong with the hardware or with a recently installed kernel, using the usual rescue tools.

On the other hand, we have been using flashcache in production for about two years, with zero problems during all this time. It is not as fast as Bcache (which is not compatible with OpenVZ, I think), but it solves the problem well.

08.07.2014 23:52, Scott Dowdle wrote:

Greetings,

- Original Message -
(offtopic) We cannot use ZFS. Unfortunately, a NAS with something like Nexenta is too expensive for us.

From what I've gathered from a few presentations, ZFS on Linux (http://zfsonlinux.org/) is as stable but more performant than it is on the OpenSolaris forks... so you can build your own if you can spare the people to learn the best practices. I don't have a use for ZFS myself so I'm not really advocating it.

TYL,

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users
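On the rescue-mode worry: as long as the rescue image can load the ZFS modules (or zfs-dkms can be installed into it), the data is reachable by importing the pool under an alternate root; the pool and dataset names below follow the earlier example, the rest is generic:

modprobe zfs
zpool import                      # lists pools found on the attached disks
zpool import -f -R /mnt vz        # -f: pool was last used on another host; -R: mount everything under /mnt
ls /mnt/vz/private/101/private    # container data is now reachable for copying off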
Re: [Users] flashcache
Hello!

Yep, a read cache is a nice and safe solution, but not a write cache :)

No, we do not use ZFS in production yet. We have done only very specific tests like this one: https://github.com/zfsonlinux/zfs/issues/2458

But you can do some performance tests and share :)

On Wed, Jul 9, 2014 at 12:55 AM, Nick Knutov m...@knutov.com wrote:

I read http://www.stableit.ru/2014/07/using-zfs-with-openvz-openvzfs.html . Do you use it in production? Can you share speed tests or some other experience with ZFS and OpenVZ?

08.07.2014 22:23, Pavel Odintsov wrote:

I know of a few incidents of ___FULL___ data loss among flashcache users. Beware of it in production. If you want speed you can try ZFS with an L2ARC/ZVOL cache, because it's a native solution.

--
Best Regards,
Nick Knutov
http://knutov.com
ICQ: 272873706
Voice: +7-904-84-23-130

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users

--
Sincerely yours, Pavel Odintsov

___ Users mailing list Users@openvz.org https://lists.openvz.org/mailman/listinfo/users