RE: [OpenAFS] openafs fileservers in VMware ESX
Here I just move my RAID card and four drives. Linux 2.6 doesn't have support for the older IDE RAID cards. Newer systems have software RAID, which allows a drive-only move. It's particularly easy with SATA drives.

tedc

-----Original Message-----
From: Rodney M Dyer
Sent: Thursday, April 21, 2005 8:50 AM
To: Derek Atkins; Matthew Cocker
Cc: openafs-info@openafs.org
Subject: Re: [OpenAFS] openafs fileservers in VMware ESX

At 11:51 PM 4/20/2005, Derek Atkins wrote:

> I've never seen any reason to virtualize an AFS server. Ever. The key is IO
> bandwidth, which isn't increased by virtualization. You really want separate
> PHYSICAL servers for AFS servers. Virtualization does not give you any
> benefits in the face of hardware failure, power failure, or any other
> failure. It just adds overhead.

I agree. I've never understood the big honking box syndrome. It has always seemed to me that this situation is caused by Pointy-Haired Bosses being marketed to. The point of AFS is speed, distribution, and scalability, but you also get redundancy. Putting an AFS cell on a virtualized server is, IMHO, just silly. I can't see how in the long run you save money with specialized boxes when server PCs from Dell, which can be used for AFS file servers, cost less than $500 a pop. Of course if you go rack-mount, things get more pricey, but still, compared to some specialized box from a proprietary vendor?

Rodney
Re: [OpenAFS] openafs fileservers in VMware ESX
On Thursday, April 21, 2005 02:35:01 PM +1200, Matthew Cocker [EMAIL PROTECTED] wrote:

> Hi
>
> We have just invested in a Fibre Channel SAN and several FC-attached ESX
> servers (brilliant product, just love VMotion and VirtualCenter) and are
> playing with virtualised OpenAFS fileservers. All is working very well,
> except that if we put too many volumes on a server, vos listvol takes a very
> long time to return. If we have, say, 5000-7000 volumes (about 50 GB) on a
> vice partition, performance is equivalent to a hardware server. At 10k to
> 40k volumes (100-300 GB) we have problems with vos listvol.
>
> This is not a huge problem for us, as we wanted to do more, smaller machines
> anyway to take advantage of the VM environment, but it does make me wonder
> why this occurs. What exactly does vos listvol do? Does it scan the vice
> partitions and return all the volumes it finds (du -sh /vicepa takes a huge
> amount of time too, so maybe this is a VM issue)? Is any network traffic
> exchanged with the DB servers?
>
> When we start vos listvol on the virtualised server with lots of volumes, it
> just seems to stop working, with the CPU usage for the AFS process not
> rising above 1-2%. An strace (available if anyone is interested) shows that
> vos listvol is doing something (although very slowly). If the virtualised
> server has fewer volumes, CPU usage jumps up to 30-50% and everything works.
> The only thing affected seems to be vos listvol, as accessing a volume
> stored on the server is quick (from the user's point of view). The vos
> backup runs all seem to work. A hardware server with the same number of
> volumes works OK. SAN monitoring suggests there is not a data access issue
> on that side.
>
> Not sure this is an AFS issue, but any suggestions to help me understand why
> vos listvol is affected so badly would be appreciated.

You haven't told us what kernel version or architecture are involved, or what OpenAFS versions your servers or vos client are running. That makes it hard to tell which known-and-fixed bugs you might be running into.

Note that 'vos' doesn't do anything other than talk to servers. So if you run strace on vos, what you're going to see is that it just sits there waiting for a response from the server it's talking to.

The RPC that 'vos listvol' makes returns an array of volintInfo structures, one per volume. Each structure is about 115 bytes on the wire (slightly more in memory), and they _all_ need to be allocated and filled in before the server will start returning any data. For 40K volumes, that's about 5 MB of memory, allocated 128 bytes at a time. That's not too excessive, but it's worth noting that that much data will take some time to allocate, marshall, transfer, and unmarshall.

Perhaps more importantly, that same RPC needs to attach each volume in order to read its header, and depending on the version of OpenAFS you're running, that operation involves a buffer sync on every attach. Which means that running listvol against a partition with 4 volumes is equivalent to running 'sync' 4 times. Running a new enough version will fix that, but at the expense of the details reported by 'vos examine' and 'vos listvol -long' being out of date by as much as 25 minutes.

--
Jeffrey T. Hutzelman (N3NHS) [EMAIL PROTECTED]
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA
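[Editorial aside, not from the original thread: a rough illustration of Jeff's arithmetic, plus a lighter-weight command when all you need is the list of volumes on a server. The 128-bytes-per-volume figure is taken from his message, and vos listvldb only queries the VLDB on the database servers, so it does not attach volume headers on the fileserver. The host name is a placeholder.]

    # back-of-envelope size of the ListVolumes reply for 40,000 volumes,
    # at ~128 bytes allocated per volume as described above
    echo $((40000 * 128)) bytes    # 5,120,000 bytes, roughly 5 MB

    # if you only need to know which volumes live on the server (no
    # per-volume usage/status), ask the VLDB instead of the volserver
    vos listvldb -server fs1.example.com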
Re: [OpenAFS] openafs fileservers in VMware ESX
Hi

> You haven't told us what kernel version or architecture are involved, or
> what OpenAFS versions your servers or vos client are running. That makes it
> hard to tell which known-and-fixed bugs you might be running into.

The kernel is 2.4.29 from kernel.org, the AFS version is 1.2.13, and the OS is Debian stable. vos listvol is being run on the server itself, as:

    vos listvol localhost -local

Cheers

Matt
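[Editorial aside, not from the original thread: for anyone gathering the same details Jeff asked for on their own servers, something along these lines collects the kernel, architecture, and server binary versions in one go. It assumes rxdebug is installed and that the fileserver and volserver listen on their standard ports, 7000 and 7005.]

    uname -r && uname -m              # kernel version and architecture
    rxdebug localhost 7000 -version   # version string of the fileserver
    rxdebug localhost 7005 -version   # version string of the volserver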
Re: [OpenAFS] openafs fileservers in VMware ESX
Matthew Cocker wrote:

> Please don't mention Dell servers around me. For the last three years we
> have had Dell servers with attached storage. It has been a nightmare from
> day one. First we had to have all the SCSI disks (100 of them) replaced
> because they were incompatible with the Dell backplanes (the disks were
> supplied by Dell), then we have had major issues with the Dell RAID cards
> not detecting dead RAID disks, not rebuilding from a single dead disk, the
> Dell PowerVaults just turning themselves off, etc.

Wow, your experience is the opposite of mine. I swear by Dell servers, and have used nothing but them for the past 7 years or so. No major problems at all, and over that time we have only lost one drive in a RAID system, and it rebuilt just fine. We cycle out the critical servers every 3-4 years, but we have some of the older servers doing non-critical things, and they are still going strong. Strange that we are having such opposite experiences.

-Dj

--
Dj Merrill [EMAIL PROTECTED]
TSA: Totally Screwing Aviation
Re: [OpenAFS] openafs fileservers in VMware ESX
ted creedon wrote:

> Here I just move my RAID card and four drives. Linux 2.6 doesn't have
> support for the older IDE RAID cards. Newer systems have software RAID,
> which allows a drive-only move. It's particularly easy with SATA drives.
>
> tedc

We have tried SATA disks, but they have a very low mean time between failures and are rated for only ~35% spin time, which is not a profile that matches our requirements. They also perform very badly with non-sequential reads and are terrible for write performance. The SAN/ESX solution we have in place is for more than just AFS, but within a VM we are getting read and write performance on a par with our HP DL380 Smart5 RAID systems when we compare disk performance with something like IOmeter. In fact the SAN-based ESX VM outperforms local disk for reads.

As a CS department we are not the central IT provider for the university, but we are one of its major innovation test beds. We started testing virtualisation for the central provider late last year, and while we were very doubtful at the time, we have been proven wrong and have been very happy with the results. You need to look very carefully at each service's CPU/memory/network/disk-I/O profile to choose suitable systems to virtualise, but if the candidates are chosen well it works very well, and we have shown that it reduces costs. In the end only time will tell whether we have made a bad decision.

Cheers

Matt
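[Editorial aside, not from the original thread: for readers who want to reproduce this kind of VM-versus-bare-metal comparison without IOmeter, a very crude sequential check from inside the guest looks something like the sketch below. It is not a substitute for a proper benchmark; the /vicepa path and the 1 GB test size are placeholders, and the read pass is only honest if the test file is larger than the guest's RAM (or the cache is otherwise cleared).]

    # rough sequential write throughput; wrap in `time` if your dd
    # doesn't print a transfer rate when it finishes
    dd if=/dev/zero of=/vicepa/ddtest bs=1M count=1024 && sync

    # rough sequential read throughput of the same file, then clean up
    dd if=/vicepa/ddtest of=/dev/null bs=1M
    rm /vicepa/ddtest

    # per-device utilisation and service times while a real workload runs
    iostat -x 5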
Re: [OpenAFS] openafs fileservers in VMware ESX
Do you use the Dell PowerVaults and any of the SCSI PERC RAID cards? We do, and they have been a huge problem, not just for us but all over the university. They seem to work better for Windows, but one of the Windows shops also lost a server last year to a single failed drive in a RAID 5 set which then failed to rebuild. We had something like 80 Dell servers, and the plain old servers work fine; it was just the add-in RAID cards and PowerVaults that caused the issues.

Cheers

Matt

> Wow, your experience is the opposite of mine. I swear by Dell servers, and
> have used nothing but them for the past 7 years or so. No major problems at
> all, and over that time we have only lost one drive in a RAID system, and it
> rebuilt just fine. We cycle out the critical servers every 3-4 years, but we
> have some of the older servers doing non-critical things, and they are still
> going strong. Strange that we are having such opposite experiences.
>
> -Dj
Re: [OpenAFS] openafs fileservers in VMware ESX
Matthew Cocker wrote:

> Do you use the Dell PowerVaults and any of the SCSI PERC RAID cards? We do,
> and they have been a huge problem, not just for us but all over the
> university. They seem to work better for Windows, but one of the Windows
> shops also lost a server last year to a single failed drive in a RAID 5 set
> which then failed to rebuild. We had something like 80 Dell servers, and the
> plain old servers work fine; it was just the add-in RAID cards and
> PowerVaults that caused the issues.

Our current Dell count is 81 servers, but not all are file servers. The file servers are mostly PowerEdge 2450, 2550, and 2650 machines; the rest are 1650, 1750, and 1850 machines. We use the internal SCSI PERC RAID cards with internal drives, not external storage.

Since we are using AFS extensively, we went with the approach of spreading our data across multiple servers to reduce the impact of a failure of any single box. We only use the internal drives, fill the boxes with the largest hard drives that Dell offers at the time, and when we need more space, we add another server. We do not use any of their external disk devices.

-Dj

--
Dj Merrill [EMAIL PROTECTED]
TSA: Totally Screwing Aviation
Re: [OpenAFS] openafs fileservers in VMware ESX
> Current Dell count is 81 servers, but not all are file servers. The file
> servers are mostly PowerEdge 2450, 2550, and 2650 machines; the rest are
> 1650, 1750, and 1850 machines. We use the internal SCSI PERC RAID cards with
> internal drives, not external.

That is a very good plan. Funny: just as I emailed saying the Dell PowerVaults were behaving, another one died. We were on site, so hopefully we can recover without going to tape again.

Cheers

Matt
Re: [OpenAFS] openafs fileservers in VMware ESX
At 04:06 PM 4/21/2005, you wrote:

> Please don't mention Dell servers around me.

Oh no, please don't accuse me of pushing a vendor, especially Dell. I'm just suggesting that the big-box approach of emulating smaller boxes isn't worth the time or money. The big boxes end up costing far more in purchase price, servicing, and maintenance than the performance you expect them to deliver. Big boxes only make sense in very specialized applications, and I don't see file serving as one of them.

Rodney
Re: [OpenAFS] openafs fileservers in VMware ESX
Here is some of the strace. It seems that the gettimeofday function is having issues. Would this cause vos listvol to slow down? If that is the case, would I be safe to say it is an OS-level issue rather than an AFS issue? Of course, now I have to move all the volumes onto a Red Hat server (we use Debian) before I can file a bug with VMware.

Cheers

Matt

gettimeofday({1114053992, 397630}, NULL) = 0
select(4, [3], NULL, NULL, {0, 939534}) = 0 (Timeout)
gettimeofday({1114053993, 336018}, NULL) = 0
select(4, [3], NULL, NULL, {0, 1146}) = 0 (Timeout)
gettimeofday({1114053993, 346133}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474834, 33}}) = 0
gettimeofday({1114053993, 346467}, NULL) = 0
select(4, [3], NULL, NULL, {12, 989666}) = 0 (Timeout)
gettimeofday({1114054006, 341534}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474821, 34}}) = 0
select(4, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474821, 34}}) = 0
gettimeofday({1114054006, 342237}, NULL) = 0
select(4, [3], NULL, NULL, {14, 999297}) = 0 (Timeout)
gettimeofday({1114054021, 352019}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474806, 34}}) = 0
sendmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(7005), sin_addr=inet_addr(130.216.35.4)}}, msg_iov(2)=[{Bg\35g\t\220\311t\0\0\0\1\0\0\0\0\0\0\0\2\2#\0\0\0\0\0..., 28}, {\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\6\0\0\0\0\0\0\26\0\0..., 37}], msg_controllen=0, msg_flags=0}, 0) = 65
select(4, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474806, 34}}) = 0
gettimeofday({1114054021, 353190}, NULL) = 0
select(4, [3], NULL, NULL, {14, 998829}) = 1 (in [3], left {14, 96})
recvmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(7005), sin_addr=inet_addr(130.216.35.4)}}, msg_iov(7)=[{Bg\35g\t\220\311t\0\0\0\1\0\0\0\0\0\0\0\2\2 \0\0\307\37..., 28}, {\0\0\0\0\0\0\0\2\0\0\0\1\0\0\0\2\7\0\266I\6\0\0\26\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1420}], msg_controllen=0, msg_flags=0}, 0) = 65
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474806, 29}}) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474806, 29}}) = 0
gettimeofday({1114054021, 398538}, NULL) = 0
select(4, [3], NULL, NULL, {14, 904652}) = 0 (Timeout)
gettimeofday({1114054036, 315442}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474791, 38}}) = 0
select(4, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474791, 38}}) = 0
gettimeofday({1114054036, 316288}, NULL) = 0
select(4, [3], NULL, NULL, {0, 39154}) = 0 (Timeout)
gettimeofday({1114054036, 355875}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474791, 34}}) = 0
gettimeofday({1114054036, 356213}, NULL) = 0
select(4, [3], NULL, NULL, {14, 999662}) = 0 (Timeout)
gettimeofday({1114054051, 367262}, NULL) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474776, 34}}) = 0
sendmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(7005), sin_addr=inet_addr(130.216.35.4)}}, msg_iov(2)=[{Bg\35g\t\220\311t\0\0\0\1\0\0\0\0\0\0\0\3\2#\0\0\0\0\0..., 28}, {\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\6\0\0\0\0\0\0\26\0\0..., 37}], msg_controllen=0, msg_flags=0}, 0) = 65
select(4, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474776, 34}}) = 0
gettimeofday({1114054051, 368232}, NULL) = 0
select(4, [3], NULL, NULL, {14, 999030}) = 1 (in [3], left {14, 98})
recvmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(7005), sin_addr=inet_addr(130.216.35.4)}}, msg_iov(7)=[{Bg\35g\t\220\311t\0\0\0\1\0\0\0\0\0\0\0\3\2 \0\0\31\354..., 28}, {\0\0\0\0\0\0\0\2\0\0\0\1\0\0\0\3\7\0\1\1\1\0\0\26\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1416}, {\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0..., 1420}], msg_controllen=0, msg_flags=0}, 0) = 65
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474776, 32}}) = 0
getitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={21474776, 32}}) = 0
gettimeofday({1114054051, 389715}, NULL) = 0
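[Editorial aside, not from the original thread: the trace above mostly shows vos sleeping in select() for 13-15 seconds at a stretch and exchanging the occasional 65-byte packet with port 7005 (the volume server), which is consistent with Jeff's point that vos itself is just waiting on the server rather than having trouble with gettimeofday. If you want to see where the time actually goes, tracing with timestamps, and tracing the volserver rather than vos, is more informative. This is only a sketch; it assumes strace and pidof are available and that the volume server process is named volserver.]

    # per-syscall wall-clock timestamps (-tt) and time spent in each call (-T)
    strace -tt -T -o vos.trace vos listvol localhost -local

    # attach to the server side instead, following any children (-f)
    strace -f -tt -T -p `pidof volserver` -o volserver.trace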
Re: [OpenAFS] openafs fileservers in VMware ESX
I've never seen any reason to virtualize an AFS server. Ever. The key is IO bandwidth, which isn't increased by virtualization. You really want separate PHYSICAL servers for AFS servers. Virtualization does not give you any benefits in the face of hardware failure, power failure, or any other failure. It just adds overhead.

-derek

Quoting Matthew Cocker [EMAIL PROTECTED]:

> Hi
>
> We have just invested in a Fibre Channel SAN and several FC-attached ESX
> servers (brilliant product, just love VMotion and VirtualCenter) and are
> playing with virtualised OpenAFS fileservers. All is working very well,
> except that if we put too many volumes on a server, vos listvol takes a very
> long time to return. If we have, say, 5000-7000 volumes (about 50 GB) on a
> vice partition, performance is equivalent to a hardware server. At 10k to
> 40k volumes (100-300 GB) we have problems with vos listvol.
>
> [...]
>
> Not sure this is an AFS issue, but any suggestions to help me understand why
> vos listvol is affected so badly would be appreciated.
>
> Cheers
>
> Matt

--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/    PP-ASEL-IA    N1NWH
[EMAIL PROTECTED]    PGP key available
Re: [OpenAFS] openafs fileservers in VMware ESX
Matthew Cocker wrote:

> The question is how much does the overhead of virtualisation (which with afs
> is not much) actually matter with an AFS fileserver and the client-side
> caching.

That should read: "The question is how much does the overhead of virtualisation (which with ESX is not much) actually matter with an AFS fileserver and the client-side caching."