Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
On Fri, Oct 9, 2009 at 9:25 PM, Derek Anderson de...@rockymtndata.net wrote:

> GigE wasn't giving me the performance I had hoped for, so I sprang for some 10GbE cards. So what am I doing wrong? My setup is a Dell 2950 without a RAID controller, just a SAS6 card. The setup is as such:
>
>   mirror rpool (boot): 10K SAS
>   raidz SSD, 467 GB on 3 Samsung 256GB MLC SSDs (220MB/s each)
>
> To create the raidz I did a simple "zpool create SSD raidz c1x c1xx c1x". I have a single 10GbE card with a single IP on it. I created an NFS filesystem for VMware using "zfs create SSD/vmware". I had to set permissions for VMware (anon=0), but that's it. Below is what zpool iostat reads:
>
>   File copy 10GbE to SSD - 40M max
>   File copy 1GbE to SSD - 5.4M max
>   File copy SAS to SSD internal - 90M
>   File copy SSD to SAS internal - 55M
>
> top shows that no matter what, I always have 2.5G free, and every other test says the same thing. Can anyone tell me why this seems to be slow? Does 90M mean megabytes or megabits?
>
> Thanks, Derek

I think you made a bad choice with the Samsung disks. I'd recommend the Intel 160GB drives if it's not too late to return the Samsungs. The Intel drives currently offer the best compromise between different workloads. There are plenty of SSD reviews and the Samsungs always come out poorly in comparison testing.

Regards,
-- Al Hopper, Logical Approach Inc, Plano, TX, a...@logical-approach.com
Voice: 972.379.2133 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
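For reference, a minimal sketch of the setup Derek describes (device names hypothetical; sharing with anon=0 via the zfs property rather than editing /etc/dfs/dfstab):

  # three-disk raidz pool named SSD
  zpool create SSD raidz c1t1d0 c1t2d0 c1t3d0

  # NFS filesystem for VMware; anon=0 maps unauthenticated/root
  # clients to uid 0, which ESX-style NFS mounts need
  zfs create SSD/vmware
  zfs set sharenfs='rw,anon=0' SSD/vmware

  # watch pool throughput while copying
  zpool iostat SSD 5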
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
Thank you for your input, folks. The MTU 9000 idea worked like a charm.

I have the Intel X25 also, but the capacity was not what I am after for a 6-device array. I have looked and looked at review after review, and that's why I started with the Intel path, albeit that firmware upgrade in May was a pain to pull off. I have seen glowing things about the Samsungs and Intels both. What tipped me over the edge is a YouTube video (surely paid for by Samsung). Check it out: http://www.youtube.com/watch?v=96dWOEa4Djs

Figuring out how to do jumbo frames on the ixgbe was fun given my newness to Sun's platform.

Thanks, Derek
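For anyone else hunting for the jumbo-frame recipe on ixgbe, a sketch of the usual steps on OpenSolaris builds of this era (interface name and address hypothetical; the switch and the far end must accept 9000-byte frames too, and older releases may require setting the MTU in /kernel/drv/ixgbe.conf instead):

  # take the interface down, raise the link MTU, bring it back up
  ifconfig ixgbe0 unplumb
  dladm set-linkprop -p mtu=9000 ixgbe0
  ifconfig ixgbe0 plumb 192.168.10.2/24 up

  # verify the new link MTU
  dladm show-linkprop -p mtu ixgbe0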
Re: [zfs-discuss] NFS sgid directory interoperability with Linux
> I only have ZFS filesystems exported right now, but I assume it would behave the same for ufs. The underlying issue seems to be that the Sun NFS server expects the NFS client to apply the sgid bit itself and create the new directory with the parent directory's group, while the Linux NFS client expects the server to enforce the sgid bit.

If you look at the code in ufs and zfs, you'll see that they both create the mode correctly, and the same code is used through NFS. There's another scenario: the Linux client updates the attributes after creating the file/directory.

Casper
[zfs-discuss] How many errors are too many?
I'm wondering how to interpret what ZFS is telling me in regard to the errors being reported. One of my disks (in a 5-disk raidz array) reports about 4-5 write/read errors every few days. All 5 are directly connected to the motherboard SATA ports, no RAID controller card in between.

How bad is it? Should I think about replacing the drive? (I imagine it will be difficult to get it RMAed when most OSes won't even realize it's screwing up.) Or are these errors small enough not to bother with, and I should just keep running zpool clear and ignoring them until something major happens? (As you might be able to tell, I'm new to OpenSolaris/ZFS.)

An example of my output:

  pool: storage
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Oct 13 18:34:39 2009
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c8d1    ONLINE       0     4     0  2.60M resilvered
            c9d0    ONLINE       0     0     0
            c9d1    ONLINE       0     0     0
            c10d0   ONLINE       0     0     0
Re: [zfs-discuss] NFS sgid directory interoperability with Linux
Paul B. Henson hen...@acm.org wrote:

> We're running Solaris 10 with ZFS to provide home and group directory file space over NFSv4. We've run into an interoperability issue between the Solaris NFS server and the Linux NFS client regarding the sgid bit on directories and assigning appropriate group ownership on newly created subdirectories.

The correct behavior would be to assign the group ownership of the parent directory to a new directory (instead of using the current process credentials) if the sgid bit is set on the parent directory. Is this your problem?

Jörg

-- EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] Terrible ZFS performance on a Dell 1850 w/ PERC 4e/Si (Sol10U6)
> Is this minutes:seconds.millisecs? If so, you're looking at 3-4MB/s .. I would say something is wrong.

Ack, you're right. I was concentrating so much on the WTFOMG problem that I completely missed the WTF problem.

In other news, with the PowerEdge put into SCSI mode instead of RAID mode (which is in the system setup, NOT the MegaRAID setup, which is why I missed it before) and addressing the disks directly, I can do that same 512M write in 9 seconds. I'm going to rebuild the box with ZFS boot/root and see how it behaves. I'm also going to ask the Linux admins and see if they're seeing problems. It's possible that nobody's noticed. (I tend to run the high-I/O machines; the rest are typically CPU-bound...)
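For anyone repeating the measurement, the quick sequential-write check discussed here is typically just a timed dd (path hypothetical; note that with ZFS compression enabled a stream of zeros compresses away and inflates the number):

  # write 512MB and time it; 512MB / elapsed seconds = approx. MB/s
  time dd if=/dev/zero of=/tank/testfile bs=1024k count=512
  # e.g. 9 seconds is roughly 57MB/s; 15 minutes is well under 1MB/s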
Re: [zfs-discuss] Does ZFS work with SAN-attached devices?
> In life there are many things that we should do (but often don't). There are always trade-offs. If you need your pool to be able to operate with a device missing, then the pool needs to have sufficient redundancy to keep working. If you want your pool to survive if a disk gets crushed by a wayward fork lift, then you need to have redundant storage so that the data continues to be available. If the devices are on a SAN and you want to be able to continue operating while there is a SAN failure, then you need to have redundant SAN switches, redundant paths, and redundant storage devices, preferably in a different chassis.

Yes, of course. This is part of normal SAN design. The ZFS file system is what is different here. If an HBA, fibre cable, or redundant controller fails, or firmware issues occur on an array's redundant controller, then SSTM (MPxIO) will see the issue and try to fail things over to the other controller. Of course this reaction at the SSTM level takes time. UFS simply allows this to happen. It is my understanding that ZFS can have issues with this, hence the reason why a zfs mirror or raidz device is required.

Still not clear how the above-mentioned BUGS change the behavior of zfs, and whether they change the recommendations of the zpool man page.

Bob
-- Bob Friesenhahn bfriesen at simple dot dallas dot tx dot us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
[zfs-discuss] How to resize ZFS partion or add a new one?
Hi, I have the following partitions on my laptop, an Inspiron 6000, from fdisk:

  1          Other OS       0    11     12    0
  2          EXT LBA       12  2561   2550   26
  3  Active  Solaris2    2562  9728   7167   74

The first one is for the Dell utilities. The second one is NTFS and the third is ZFS. I am currently using OpenSolaris 2009.06 installed on partition #3. When I installed OpenSolaris I kept #2 just in case I wanted to also install Windows, but I have realized I don't need it anymore! So, I would like to merge #2 and #3 to get more disk space in OpenSolaris.

Is it possible to eliminate the NTFS partition and add it to the ZFS partition?

Thanks in advance and regards, Julio
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
I think after some time we're gonna see Derek screaming about f... ZFS that toasted the data on his SSD array :) Hopefully this setup was not for production.

-- Roman Naumenko
[zfs-discuss] zfs disk encryption
Does anyone know when this will be available? Project says Q4 2009 but does not give a build.
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
Before you all start taking bets, I am having a difficult time understanding why you would. If you think I am nuts because SSDs have a limited lifespan, I would agree with you; however, we all know that SSDs are going to get cheaper and cheaper as the days go by. The Intels I bought in April are now half the price they were then. So are the Samsungs. I suspect that by next spring I will replace them all with new ones, and they will be half the cost they are now. Why would anyone spend $3K on disks and just toss it in the river?

Simple answer: man-hour math. I have 150 virtual machines on these disks for shared storage. They hold no actual data, so who really cares if they get lost. However, the 150 users of these virtual machines will save 5 minutes or so every day of work, which translates to $250. So $3,000 in SSDs, which are easily replaced one by one with zfs, saves the company $250,000 in labor. So when I replace these drives in 6 months for somewhere around $1500, it's a fantastic deal.

The only bad part is I cannot estimate how much life the old disks have left, because in a few months I am going to have a handful of the fastest SSDs around and am not sure if I would trust them for much of anything.

Am I really that wrong?

Derek
Re: [zfs-discuss] zfs disk encryption
Mike DeMarco wrote:

> Does anyone know when this will be available? Project says Q4 2009 but does not give a build.

Yes. Not giving a build is deliberate, because builds are very narrow windows and there has been much flux in the build schedule for what may or may not be restricted-content builds recently. We also depend on the ZFS Fast System Attributes project and can't integrate until that has done so.

When I can commit to more detailed dates I will do.

-- Darren J Moffat
[zfs-discuss] SSD value [was SSD over 10gbe not any faster than 10K SAS over GigE]
On 13 Oct 2009, at 15:24, Derek Anderson wrote:

> Simple answer: man-hour math. I have 150 virtual machines on these disks for shared storage. They hold no actual data, so who really cares if they get lost. However, the 150 users of these virtual machines will save 5 minutes or so every day of work, which translates to $250. So $3,000 in SSDs, which are easily replaced one by one with zfs, saves the company $250,000 in labor. So when I replace these drives in 6 months for somewhere around $1500, it's a fantastic deal.

Overall, I think this is a reasonable model for the medium-sized enterprise to work with. As in most cases, the mythical 5 minutes saved will be invisible to the overall operations, and difficult to justify to management, but if you can squeeze it into an annual operating budget rather than a capital expense that requires separate justification, you should be good.

> The only bad part is I cannot estimate how much life the old disks have left, because in a few months I am going to have a handful of the fastest SSDs around and am not sure if I would trust them for much of anything.

As for what to do with the SSDs - you can resell them or give them to employees (being clear on their usage and provenance), since they represent a risk in a high-volume enterprise environment but could probably supply several years' worth of service in single-user mode. I'd be very happy to get a top-of-the-line SSD at half price for my laptop for a year's projected use... knowing of course that I back up daily as a matter of religious observance :-)

Cheers, Erik
[zfs-discuss] where did ztest go
According to this page: http://www.opensolaris.org/os/community/zfs/ztest/ it's supposed to be in /usr/bin. I run snv_124.

Thanks, Dirk
Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed
All's gone quiet on this issue, and the bug is closed, but I'm having exactly the same problem; pulling a disk on this card, under OpenSolaris 111, is pausing all IO (including, weirdly, network IO), and using the ZFS utilities (zfs list, zpool list, zpool status) causes a hang until I replace the disk.
Re: [zfs-discuss] How many errors are too many?
On Tue, 13 Oct 2009, Ren Pillay wrote:

> I'm wondering how to interpret what ZFS is telling me in regard to the errors being reported. One of my disks (in a 5-disk raidz array) reports about 4-5 write/read errors every few days. All 5 are directly connected to the motherboard SATA ports, no RAID controller card in between. How bad is it? Should I think about replacing the drive? (I imagine it will be difficult to get it RMAed when most OSes won't even realize it's screwing up.)

Recurring problems usually indicate failing hardware, and since you are only using raidz1 you should be concerned (but not alarmed) about it. It is wise to obtain a replacement drive.

You didn't mention whether you periodically scrub your pool, but if you haven't been, you may find that many more issues are turned up by 'zpool scrub'. The failing drive may be riddled with errors. It is wise to do a full 'zpool scrub' before voluntarily replacing the suspect drive, in case there is some undetected data error on one of the other drives which can still be corrected.

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
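A sketch of the sequence Bob describes, using the pool and devices from the posted status output (the replacement device name is hypothetical):

  # scrub the whole pool and see what it turns up
  zpool scrub storage
  zpool status -v storage

  # if the errors were transient, reset the counters and keep watching
  zpool clear storage

  # if c8d1 keeps accumulating errors, swap it for a fresh disk
  zpool replace storage c8d1 c11d0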
[zfs-discuss] dedup video
Someone posted this link: https://slx.sun.com/1179275620 for a video on ZFS deduplication. But the site isn't responding (which is typical of Sun, since I've been dealing with them for the last 12 years). Does anyone know of a mirror site, or if the video is on YouTube?

Paul
Re: [zfs-discuss] How to resize ZFS partion or add a new one?
Hi--

Unfortunately, you cannot change the partitioning underneath your pool. I don't see any way of resizing this partition except for backing up your data, repartitioning the disk, and reinstalling OpenSolaris 2009.06.

Maybe someone else has a better idea...

Cindy

On 10/13/09 06:32, Julio wrote:
> Hi, I have the following partitions on my laptop, an Inspiron 6000, from fdisk: [...] Is it possible to eliminate the NTFS partition and add it to the ZFS partition? Thanks in advance and regards, Julio
Re: [zfs-discuss] SSD value [was SSD over 10gbe not any faster than 10K SAS over GigE]
I did bad math; I meant $25,000 in labor dollars saved over 6 months.

There is one application called FRx, a reporting engine for their accounting. Even if their executives save 10 minutes a day running just that bloated application, then this plan has paid for itself in just a few weeks.

ZFS is pretty cool. I have spent just over $6K on a Dell server with 1 TB of SSD storage and 10GbE. It houses 150 virtual machines (WinXP) that are connected to by $35 thin clients we picked up at an auction. Eat that, NetApp!

Derek
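For the curious, the corrected math comes out consistent at something like the following (the hourly labor rate is an assumption; it is not stated in the thread):

  150 users x 5 minutes/day   = 750 minutes/day = 12.5 hours/day
  12.5 hours/day x ~$20/hour  = ~$250/day
  ~$250/day x ~100 working days in 6 months = ~$25,000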
Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed
On Tue, Oct 13, 2009 at 8:54 AM, Aaron Brady bra...@gmail.com wrote:

> All's gone quiet on this issue, and the bug is closed, but I'm having exactly the same problem; pulling a disk on this card, under OpenSolaris 111, is pausing all IO (including, weirdly, network IO), and using the ZFS utilities (zfs list, zpool list, zpool status) causes a hang until I replace the disk.

Did you set your failmode to continue?

--Tim
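For reference, the property Tim means (pool name hypothetical); failmode=continue returns EIO on new writes instead of blocking all pool I/O while a device is gone:

  zpool get failmode tank
  zpool set failmode=continue tank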
Re: [zfs-discuss] Does ZFS work with SAN-attached devices?
On Tue, 13 Oct 2009, Shawn Joy wrote:

> The ZFS file system is what is different here. If an HBA, fibre cable, or redundant controller fails, or firmware issues occur on an array's redundant controller, then SSTM (MPxIO) will see the issue and try to fail things over to the other controller. Of course this reaction at the SSTM level takes time. UFS simply allows this to happen. It is my understanding that ZFS can have issues with this, hence the reason why a zfs mirror or raidz device is required.

ZFS does not seem so different from UFS when it comes to a SAN. ZFS depends on the underlying device drivers to detect and report problems. UFS does the same. MPxIO's response will also depend on the underlying device drivers.

My own reliability concerns regarding a SAN are due to the big LUN that SAN hardware usually emulates, and not due to communications in the SAN. A big LUN is comprised of multiple disk drives. If the SAN storage array has an error, then it is possible that the data on one of these disk drives will be incorrect, and it will be hidden somewhere in that big LUN. The data could be old data rather than just being corrupted. Without redundancy, ZFS will detect this corruption but will be unable to repair it. The difference from UFS is that UFS might not even notice the corruption, or fsck will just paper it over. UFS filesystems are usually much smaller than ZFS pools.

There are also performance concerns when using a big LUN, because ZFS won't be able to intelligently schedule I/O for multiple drives, so performance is reduced.

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed
Hi Tim, that doesn't help in this case - it's a complete lockup apparently caused by driver issues.

However, the good news for insom is that the bug is closed because the problem now appears fixed. I tested it and found that it's no longer occurring in OpenSolaris 2008.11 or 2009.06. If you move to a newer build of OpenSolaris you should be fine.
Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed
I did, but as tcook suggests running a later build, I'll try an image-update (though, 111 > 2008.11, right?)
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
On Tue, Oct 13, 2009 at 8:24 AM, Derek Anderson de...@rockymtndata.net wrote:

> [...] The only bad part is I cannot estimate how much life the old disks have left, because in a few months I am going to have a handful of the fastest SSDs around and am not sure if I would trust them for much of anything. Am I really that wrong? Derek

I'll take them when you're done :)

--Tim
[zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
Hi,

I am trying to find out some definite answers on what needs to be done on an STK 2540 to set the Ignore Cache Sync option. The best I could find is Bob's "Sun StorageTek 2540 / ZFS Performance Summary" (dated Feb 28, 2008 - thank you, Bob), in which he quotes a posting of Joel Miller:

To set new values:

  service -d arrayname -c set -q nvsram region=0xf2 offset=0x17 value=0x01 host=0x00
  service -d arrayname -c set -q nvsram region=0xf2 offset=0x18 value=0x01 host=0x00
  service -d arrayname -c set -q nvsram region=0xf2 offset=0x21 value=0x01 host=0x00

Host region 00 is Solaris (w/Traffic Manager).

Is this information still current for F/W 07.35.44.10? I have an LSI/Sun presentation stating that it should be sufficient to set byte 0x21 - what is correct?

Bonus question: Is there a way to determine the setting which is currently active, if I don't know whether the controller has been booted since the nvsram potentially got modified?

Thank you, Nils
Re: [zfs-discuss] FW: Supermicro AOC-SAT2-MV8 hang when drive removed
On Tue, Oct 13, 2009 at 9:42 AM, Aaron Brady bra...@gmail.com wrote:

> I did, but as tcook suggests running a later build, I'll try an image-update (though, 111 > 2008.11, right?)

It should be, yes. b111 was released in April of 2009.

--Tim
Re: [zfs-discuss] Does ZFS work with SAN-attached devices?
Bob Friesenhahn wrote:

> My own reliability concerns regarding a SAN are due to the big LUN that SAN hardware usually emulates, and not due to communications in the SAN. [...] There are performance concerns when using a big LUN, because ZFS won't be able to intelligently schedule I/O for multiple drives, so performance is reduced.

Also, ZFS does things like putting the ZIL data (when not on a dedicated device) at the outer edge of disks, that being faster. When you have a LUN which doesn't map onto the standard performance profile of a disk, this optimisation is lost.

I give talks on ZFS to enterprise customers, and this area is something I cover. Where possible, give ZFS visibility of redundancy, and as many LUNs as you can. However, we have to recognise that this isn't always possible. In many enterprises, storage is managed by teams separate from the servers (this is a legal requirement in some industry sectors in some countries, typically finance), often with very little cooperation between teams, indeed even rivalry. If we said ZFS _had_ to handle lots of LUNs and the data redundancy, it would never get through many data centre doors, so we do have to work in this environment.

Even where customers can't make use of some of the features such as self-healing of data corruption, I/O scheduling, etc., because of their company storage infrastructure limitations, there's still a ton of other goodness in there too, with ease of creating filesystems, snapshots, etc., and we will at least let them know when their multi-million dollar storage system silently drops a bit, which they tend to do far more often than most customers realise.

-- Andrew
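As a concrete illustration of "give ZFS visibility of redundancy" (device names hypothetical): present two LUNs from separate arrays and let ZFS mirror them, rather than one big array-side LUN.

  # mirror across LUNs from two different arrays/chassis; ZFS can then
  # self-heal a silently dropped bit instead of only reporting it
  zpool create tank mirror c4t0d0 c5t0d0

  # stuck with a single big LUN? copies=2 at least duplicates user
  # data within it, at twice the space cost
  zfs set copies=2 tank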
[zfs-discuss] corrupt metadata on upgrade from 2008.11 to 2009.06
I just upgraded my machine from 2008.11 to 2009.06 with pkg image-update, and that all seemed to go fine. Now, however, my 5-disk raidz is complaining about corrupted metadata. However, if I reboot back into 2008.11, it still works fine. I can even do things which I think might check consistency, like export/import, resilver, even scrub. After each try I reboot back into 2009.06, but still no love.

Is this the uberblock issue I've read about? Is there a good link somebody can send me on how to fix it? Thanks!

-Oliver

Here's the message:

  pool: tank
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-72
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        FAULTED      0     0     1  corrupted data
          raidz1    ONLINE       0     0     6
            c9t1d0  ONLINE       0     0     1
            c6d0    ONLINE       0     0     0
            c9t0d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
On Tue, 13 Oct 2009, Nils Goroll wrote:

> I am trying to find out some definite answers on what needs to be done on an STK 2540 to set the Ignore Cache Sync option. The best I could find is Bob's "Sun StorageTek 2540 / ZFS Performance Summary" (dated Feb 28, 2008 - thank you, Bob), in which he quotes a posting of Joel Miller:

I should update this paper since the performance is now radically different and the StorageTek 2540 CAM configurables have changed.

> Is this information still current for F/W 07.35.44.10?

I suspect that the settings don't work the same as before, but don't know how to prove it.

> Bonus question: Is there a way to determine the setting which is currently active, if I don't know whether the controller has been booted since the nvsram potentially got modified?

From what I can tell, the controller does not forget these settings due to a reboot or firmware update. However, new firmware may not provide the same interpretation of the values.

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] NFS sgid directory interoperability with Linux
On Tue, 13 Oct 2009, casper@sun.com wrote:

> If you look at the code in ufs and zfs, you'll see that they both create the mode correctly and the same code is used through NFS. There's another scenario: the Linux client updates the attributes after creating the file/directory.

I don't think that is the case. My colleague Brian captured the network traffic and analyzed it, and if I understood him correctly, the Linux client issues the mkdir op with no group specified, which per the RFC indicates the server should set the appropriate group. On the Solaris client, the NFS mkdir op explicitly specifies the group. Brian is going to follow up shortly with more technical detail.

Thanks...

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | hen...@csupomona.edu
California State Polytechnic University | Pomona CA 91768
Re: [zfs-discuss] NFS sgid directory interoperability with Linux
On Tue, 13 Oct 2009, Joerg Schilling wrote:

> The correct behavior would be to assign the group ownership of the parent directory to a new directory (instead of using the current process credentials) if the sgid bit is set on the parent directory. Is this your problem?

Yes, that is exactly our problem -- when a Linux NFSv4 client creates a directory on a Solaris NFSv4 server, and the parent directory has the sgid bit set and a different group owner than the user's primary group, the new directory is incorrectly created with the primary group as group owner rather than the parent directory's group.

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | hen...@csupomona.edu
California State Polytechnic University | Pomona CA 91768
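For anyone who wants to reproduce it, a sketch of the test (paths and group name hypothetical), run as a user whose primary group is not 'proj':

  # on the Solaris server: an exported directory with sgid and group 'proj'
  mkdir /export/group/proj
  chgrp proj /export/group/proj
  chmod 2775 /export/group/proj

  # on the Linux NFSv4 client
  mkdir /mnt/group/proj/subdir
  ls -ld /mnt/group/proj/subdir
  # correct sgid behavior: group 'proj'; against a Solaris server the
  # directory instead ends up with the user's primary group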
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
Hi Bob and all,

> I should update this paper since the performance is now radically different and the StorageTek 2540 CAM configurables have changed.

That would be great; I think you'd do the community (and Sun, probably) a big favor.

>> Is this information still current for F/W 07.35.44.10?
>
> I suspect that the settings don't work the same as before, but don't know how to prove it.

So this sounds like we need to wait for someone to come with a definite answer.
[zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
We're currently using the Sun-bundled Samba to provide CIFS access to our ZFS user/group directories. I found a bug in active directory integration mode: if a user is in more than 32 active directory groups, samba calls setgroups with a group list of greater than 32, which fails, resulting in the user having absolutely no group privileges beyond their primary group.

I opened a Sun service request, #71547904, to try and get this resolved. When I initially opened it, I did not know what the underlying problem was. However, I wasn't making any progress through Sun tech support, so I ended up installing the Sun samba source code package and diagnosing the problem myself. In addition, I provided Sun technical support with a simple two-line patch that fixes the problem.

Unfortunately, I am getting the complete run-around on this issue and after almost 2 months have been unable to get the problem fixed. They keep telling me that support for more than 32 groups in Solaris is not a bug, but rather an RFE. I completely agree -- I'm not asking for Solaris to support more than 32 groups (although, as an aside, it sure would be nice if it did -- 32 is pretty small nowadays; I doubt this will get fixed in Solaris 10, but does anyone have any idea about possible progress on that in OpenSolaris?); all I'm asking is that samba be fixed so the user at least gets the first 32 groups they are in rather than none at all. That is the behavior of a local login or over NFS: the effective group privileges are those of the first 32 groups.

Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE. I think based on the information I provided it should be blindingly obvious that this is a bug, with a fairly trivial fix. I'm pretty sure if they had just fixed it rather than spent all this time arguing about it, it would have taken less time and resources than they've already wasted 8-/.

While not directly a ZFS problem, I was hoping one of the many intelligent and skilled Sun engineers that hang out on this mailing list :) might do me a big favor, look at SR #71547904, confirm that it is actually a bug, and use their internal contacts to somehow convince the samba sustaining engineering group to fix it? Please?

Thanks much...

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | hen...@csupomona.edu
California State Polytechnic University | Pomona CA 91768
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
> We're currently using the Sun bundled Samba to provide CIFS access to our ZFS user/group directories.

So why not the built-in CIFS support in OpenSolaris? It probably has a similar issue, but still.

> I found a bug in active directory integration mode, where if a user is in more than 32 active directory groups, samba calls setgroups with a group list of greater than 32, which fails, resulting in the user having absolutely no group privileges beyond their primary group.

That's not nice, and that should be fixed even when the OS doesn't support more than 32 groups. How many groups do you want?

> They keep telling me that support for more than 32 groups in Solaris is not a bug, but rather an RFE. [...] all I'm asking is that samba be fixed so the user at least gets the first 32 groups they are in rather than none at all. That is the behavior of a local login or over NFS: the effective group privileges are those of the first 32 groups.

I'm actually working on fixing this in OpenSolaris and we may even backport this to S10.

> Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE.

What's the bug number?

Casper
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
On Tue, October 13, 2009 08:24, Derek Anderson wrote:

> The only bad part is I cannot estimate how much life the old disks have left, because in a few months I am going to have a handful of the fastest SSDs around and am not sure if I would trust them for much of anything.

In the long run, this information should be exposed in a standard way through SMART and probably direct query commands. With that in place, you could run the drives in production longer, since you'd get warning when they were reaching their write-life limit. It would also mean you could precisely characterize the remaining life for resale.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Re: [zfs-discuss] How to resize ZFS partion or add a new one?
I think 'zpool add' should work for you. Google it:

  zpool add rpool <yourNTFSpartition>
Re: [zfs-discuss] How to resize ZFS partion or add a new one?
Except that you can't add a disk or partition to a root pool:

  # zpool add rpool c1t1d0s0
  cannot add to 'rpool': root pool can not have multiple vdevs or separate logs

He could try to attach the partition to his existing pool (I'm not sure how), and this would only create a mirrored root pool; it would not expand the root pool partition space.

cs

On 10/13/09 10:34, dirk schelfhout wrote:
> I think 'zpool add' should work for you. Google it: zpool add rpool <yourNTFSpartition>
Re: [zfs-discuss] NFS sgid directory interoperability with Linux
On 10/12/2009 04:38 PM, Paul B. Henson wrote:

> I only have ZFS filesystems exported right now, but I assume it would behave the same for ufs. The underlying issue seems to be that the Sun NFS server expects the NFS client to apply the sgid bit itself and create the new directory with the parent directory's group, while the Linux NFS client expects the server to enforce the sgid bit.

When the clients send the opcode CREATE, the Solaris client specifies the parent directory's group in attr_vals, whereas the Linux client doesn't specify a group. There appears to be a disparity between the servers in what to do in an sgid directory when attr_vals does not specify a group. On Solaris, this leads the server to use the process's group, but on Linux, the sgid bit is enforced and the new directory takes the group of the parent directory. The problem arises because the Linux client expects the Linux server's behavior, leading it to not send the group to a Solaris server, leading the Solaris server to assume the client wanted to ignore the sgid bit.

This issue has been frustrating because there didn't appear to be any official word on which client was right. However, I did find this in the RFC, which may indicate that the Solaris server is at fault. In section 14.2.4, for the opcode CREATE, it says this about situations where the group isn't specified:

  Similarly, if createattrs includes neither the group attribute nor a
  group ACE, and if the server's filesystem both supports and requires
  the notion of a group attribute (or group ACE), the server MUST
  derive the group attribute (or the corresponding owner ACE) for the
  file. This could be from the RPC call's credentials, such as the
  group principal if the credentials include it (such as with
  AUTH_SYS), from the group identifier associated with the principal
  in the credentials (for e.g., POSIX systems have a passwd database
  that has the group identifier for every user identifier), inherited
  from the directory the object is created in, or whatever else the
  server's operating environment or filesystem semantics dictate.
  This applies to the OPEN operation too.

The important phrase being "inherited from the directory the object is created in", which says to me that the server should enforce the sgid bit if no group is specified. However, reading this closer makes me wonder if this sentence is too open-ended. It appears that the Solaris server uses a group principal or group identifier, and the Linux server inherits from the parent directory. Both of these are valid choices from the list... they just happen to make incompatible implementations.
[zfs-discuss] iscsi/comstar performance
After a recent upgrade to b124, decided to switch to COMSTAR for iscsi targets for VirtualBox hosted on AMD64 Fedora C10. Both target and initiator are running zfs under b124. This combination seems unbelievably slow compared to the old iscsi subsystem. A scrub of a local 20GB disk on the target took 16 minutes. A scrub of a 20GB iscsi disk took 106 minutes! It seems to take much longer to boot from iscsi, so it seems to be reading more slowly too. There are a lot of variables - switching to Comstar, snv124, VBox 3.08, etc., but such a dramatic loss of performance probably has a single cause. Is anyone willing to speculate?

Thanks -- Frank
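For anyone comparing notes, the COMSTAR path being described is roughly the following sketch (zvol name and size hypothetical); the legacy target was a one-line "zfs set shareiscsi=on" by contrast:

  # enable the COMSTAR framework and the iSCSI target service
  svcadm enable stmf
  svcadm enable -r svc:/network/iscsi/target:default

  # back the LUN with a zvol and register it as a logical unit
  zfs create -V 20g tank/vbox01
  sbdadm create-lu /dev/zvol/rdsk/tank/vbox01

  # expose the LU to all initiators and create an iSCSI target
  stmfadm add-view 600144F0...    # use the GUID printed by sbdadm
  itadm create-target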
Re: [zfs-discuss] How to resize ZFS partion or add a new one?
My answer is incomplete. You can use the 'zpool attach' command to attach another disk slice to a root pool's disk slice, to expand the pool size after the smaller disk is detached. On Julio's laptop, I don't think he can attach another fdisk partition to his root pool. I think he needs to back up his pool, reconfigure and expand the Solaris2 partition, and then reinstall OpenSolaris.

Cindy

On 10/13/09 10:47, Cindy Swearingen wrote:
> Except that you can't add a disk or partition to a root pool:
>
>   # zpool add rpool c1t1d0s0
>   cannot add to 'rpool': root pool can not have multiple vdevs or separate logs
>
> He could try to attach the partition to his existing pool (I'm not sure how), and this would only create a mirrored root pool; it would not expand the root pool partition space.
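For completeness, the attach/detach sequence Cindy mentions looks roughly like this (device names hypothetical; installgrub makes the new x86 disk bootable):

  # mirror the root pool onto the larger slice and wait for resilver
  zpool attach rpool c0t0d0s0 c0t1d0s0
  zpool status rpool

  # make the new disk bootable
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

  # once resilvering completes, drop the small slice; the pool can
  # then grow into the larger one
  zpool detach rpool c0t0d0s0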
Re: [zfs-discuss] Does ZFS work with SAN-attached devices?
> Also, ZFS does things like putting the ZIL data (when not on a dedicated device) at the outer edge of disks, that being faster.

No, ZFS does not do that. It will chain the intent log from blocks allocated from the same metaslabs that the pool is allocating from. This actually works out well because there isn't a large seek back to the beginning of the device. When the pool gets near full, there will be a noticeable slowness - but then all file systems' performance suffers when searching for space.

When the log is on a separate device it uses the same allocation scheme, but those blocks will tend to be allocated at the outer edge of the disk. They only exist for a short time before getting freed, so the same blocks get re-used.

Neil.
Re: [zfs-discuss] How to resize ZFS partion or add a new one?
On Tue, Oct 13, 2009 at 05:32:35AM -0700, Julio wrote:

> Hi, I have the following partitions on my laptop, an Inspiron 6000, from fdisk: [...] Is it possible to eliminate the NTFS partition and add it to the ZFS partition?

Possible, but not easy. The Solaris partition must be contiguous, with the data at the beginning of the partition. You'd need a tool that could move the data in partition 3 to start at cylinder 12. Since the move distance is less than the size to be moved, you have to do it carefully. I'm not sure what tools are out there that would make that move easy (something like Partition Magic?). You could do it naively with 'dd' (while booted from another disk), but if you crashed in the middle, you'd have problems knowing how to pick up. Better would be to do it in chunks and keep track, as sketched below.

Assuming you did that, you could then go through all the hoops to expand the fdisk partition, then expand the VTOC label inside, then make use of the added space. Not something I'd want to attempt without a backup and some testing.

-- Darren
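Purely as a sketch of that "in chunks and keep track" idea (all numbers hypothetical and must be recomputed from your own fdisk/format geometry; do not run this without a verified backup): because the destination start is below the source start, a plain forward copy never overwrites data it has not yet read, and a progress file lets an interrupted run resume.

  #!/usr/bin/ksh
  # DANGER: illustrative only; a typo here destroys the disk.
  DISK=/dev/rdsk/c0d0p0   # whole-disk device
  CYL=8225280             # bytes/cylinder: 255 heads * 63 sectors * 512
  SRC=2562                # current start cylinder of partition 3
  DST=12                  # desired start cylinder
  LEN=7167                # length in cylinders
  i=$(cat /var/tmp/moved 2>/dev/null || echo 0)
  while [ "$i" -lt "$LEN" ]; do
          n=$((LEN - i)); [ "$n" -gt 128 ] && n=128
          # the write position stays 2550 cylinders behind the read
          # position, so unread data is never overwritten
          dd if=$DISK of=$DISK bs=$CYL iseek=$((SRC + i)) \
              oseek=$((DST + i)) count=$n || exit 1
          i=$((i + n))
          echo "$i" > /var/tmp/moved   # progress marker for resuming
  done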
Re: [zfs-discuss] Terrible ZFS performance on a Dell 1850 w/ PERC 4e/Si (Sol10U6)
> Before you do a dd test, try first to do:
>
>   echo zfs_vdev_max_pending/W0t1 | mdb -kw

I did actually try this about a month ago when I first made an attempt at figuring this out. Changing the pending values did make some small difference, but even the best was far, far short of acceptable performance.

In other news, switching to ZFS boot+root, in a ZFS-mirrored configuration, in passthrough mode, has made a huge difference. Writing the 512M file took 15 minutes before... I'm down to twelve seconds.
Re: [zfs-discuss] dedup video
It seems to be around 33:32 when they start talking about dedup.

On Tue, Oct 13, 2009 at 10:05 AM, Paul Archer p...@paularcher.org wrote:
> Someone posted this link: https://slx.sun.com/1179275620 for a video on ZFS deduplication. But the site isn't responding [...]. Does anyone know of a mirror site, or if the video is on YouTube? Paul
[zfs-discuss] zfs on FDE
Any reason why ZFS would not work on an FDE (full disk encryption) hard drive?
Re: [zfs-discuss] memory use
> Second question: would it make much difference to have 12 or 22 ZFS filesystems? What's the memory footprint of a ZFS filesystem?

I remember a figure of 64KB of kernel memory per file system.

-mg
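One rough way to sanity-check that figure yourself (run under ksh; the pool name 'tank' is hypothetical): snapshot kernel memory before and after mounting a batch of filesystems and compare.

  # kernel memory summary before (note the Kernel line)
  echo ::memstat | mdb -k

  # create a few hundred filesystems so the delta stands out
  i=0
  while [ $i -lt 500 ]; do
          zfs create tank/fs$i
          i=$((i + 1))
  done

  # and again after; at ~64KB each, 500 filesystems would account
  # for roughly 32MB of kernel growth
  echo ::memstat | mdb -k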
[zfs-discuss] memory use
Every ZFS filesystem uses system memory, but is this also true for -NOT- mounted filesystems (with the canmount=noauto option set)?

Second question: would it make much difference to have 12 or 22 ZFS filesystems? What's the memory footprint of a ZFS filesystem?

-- Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS 10u8 10/09 | OpenSolaris 2010.02 b123
+ All that's really worth doing is what we do for others (Lewis Carroll)
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, 13 Oct 2009, casper@sun.com wrote:

> So why not the built-in CIFS support in OpenSolaris? Probably has a similar issue, but still.

I wouldn't think it has this same issue; presumably it won't support more than the kernel limit of 32 groups, but I can't imagine that in the case when a user is in more than 32 active directory groups it would simply discard all group membership :(. I haven't tested it, but I would guess it would behave like the underlying operating system and simply truncate the group list at 32, with the user losing any additional privileges granted by the rest of the groups.

I definitely have my eye on transitioning to OpenSolaris, hopefully sometime in mid to late next year. Unfortunately, OpenSolaris wasn't quite enterprise-ready when we went into production with this system, and while I think by now it's pretty close if not there, it's going to take some time to put together a prototype, sell management on it, and migrate production services.

> That's not nice and that should be fixed even when the OS doesn't support more than 32 groups. How many groups do you want?

All of them :). I think currently the most groups any single user is in is about 100. 64 would probably cover everyone except a handful of users. Linux currently supports a maximum of 65536 groups per user; while I won't make the mistake of saying no one would ever need more than that ;), I don't think we would exceed that any time soon.

> I'm actually working on fixing this in OpenSolaris and we may even backport this to S10.

Really? Cool. Any timeline on getting it into a development build? What's the current maximum number of groups you're working towards? Better group support would be another bullet point for transitioning to OpenSolaris.

Regarding Solaris 10, my understanding was that the current 32-group limit could only be changed by modifying internal kernel structures that would break backwards compatibility, which wouldn't happen because Solaris guarantees backwards binary compatibility. I could most definitely be mistaken though.

> What's the bug number?

There is no bug number :(, as they refuse to classify it as a bug -- they keep insisting it is an RFE, and pointing towards the existing RFE #'s for increasing the number of groups supported by Solaris. The service request is #71547904, although now that I think about it they haven't been keeping the ticket updated. I'll send you a copy of the thread I've had with the support engineers directly.

Here's the patch I submitted. It adds three lines, one of which is blank 8-/. I'm just really confused why they'd rather spend months arguing it isn't a bug rather than just spending five minutes applying this simple patch, sigh. I'd just run the version I compiled locally, but it's fairly clear that the source code provided is not the same as the source code used to generate the production binary, so I'd really prefer an official fix.

r...@niblet /usr/sfw/src/samba/source/auth # diff -u auth_util.c.orig auth_util.c
--- auth_util.c.orig    Fri Sep 11 16:18:46 2009
+++ auth_util.c Fri Sep 11 16:25:56 2009
@@ -1042,6 +1042,7 @@
        TALLOC_CTX *mem_ctx;
        NTSTATUS status;
        size_t i;
+       int ngroups_max = groups_max();

        mem_ctx = talloc_new(NULL);
@@ -1099,6 +1100,8 @@
                }
                add_gid_to_array_unique(server_info, gid,
                        &server_info->groups, &server_info->n_groups);
+
+               if (server_info->n_groups == ngroups_max) break;
        }

        debug_nt_user_token(DBGC_AUTH, 10, server_info->ptok);

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | hen...@csupomona.edu
California State Polytechnic University | Pomona CA 91768
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
Hi Bob and all,

> So this sounds like we need to wait for someone to come with a definite answer.

I've received some helpful information on this:

  Byte 0x17 is for Ignore Force Unit Access.
  Byte 0x18 is for Ignore Disable Write Cache.
  Byte 0x21 is for Ignore Cache Sync.

  Change ALL settings to 1 to make sure all bad commands are ignored.
  Byte 0x21 is the most important one; the other two settings are for safety.

Note: personally, I think that talking about "safety" in this context can be a little misleading; my understanding of what is meant here is making sure that the cache is always being used - which can mean the contrary of (data) safety. (I've just learned from Wikipedia that Force Unit Access means to bypass any read cache.)

  Newer Solaris (05/08 and higher) should automatically detect a Sun
  Storage array and should handle the ICS correctly, without any
  modification, by reading the Sync-NV bit.

Can anyone make a definite statement on this? My understanding is that it does NOT yet work as it should; see also: http://www.opensolaris.org/jive/thread.jspa?messageID=245256

In other words, my understanding is that we DO still need the hacks on the 61xx/25xx, or zfs:zfs_nocacheflush=1, for optimal performance.

Regarding my bonus question: I haven't yet found a definite answer on whether there is a way to read the currently active controller setting. I still assume that the nvsram settings, which can be read with

  service -d arrayname -c read -q nvsram region=0xf2 host=0x00

do not necessarily reflect the current configuration, and that the only way to make sure the controller is running with that configuration is to reset it.

Nils
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
> Regarding Solaris 10, my understanding was that the current 32-group limit could only be changed by modifying internal kernel structures that would break backwards compatibility, which wouldn't happen because Solaris guarantees backwards binary compatibility. I could most definitely be mistaken though.

That's not entirely true; the issue is similar to having more than 16 groups, as it breaks AUTH_SYS over-the-wire authentication, but we already have that now. But see:

  http://opensolaris.org/jive/thread.jspa?threadID=114685

For now, we're aiming for 1024 groups, but also making sure that the userland will work without any dependencies.

>> What's the bug number?
>
> There is no bug number :(, as they refuse to classify it as a bug -- they keep insisting it is an RFE, and pointing towards the existing RFE #'s for increasing the number of groups supported by Solaris.

The change request, then. It must have a bug id.

> The service request is #71547904, although now that I think about it they haven't been keeping the ticket updated. I'll send you a copy of the thread I've had with the support engineers directly. Here's the patch I submitted. It adds three lines, one of which is blank 8-/. I'm just really confused why they'd rather spend months arguing it isn't a bug rather than just spending five minutes applying this simple patch, sigh. I'd just run the version I compiled locally, but it's fairly clear that the source code provided is not the same as the source code used to generate the production binary, so I'd really prefer an official fix.

Well, I can understand the sense of that. (Not for OpenSolaris, but for S10.) A backport costs a bit, so perhaps that's what they want to avoid.

Casper
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
Paul B. Henson wrote:

>> So why not the built-in CIFS support in OpenSolaris? Probably has a similar issue, but still.
>
> I wouldn't think it has this same issue; presumably it won't support more than the kernel limit of 32 groups, but I can't imagine that in the case when a user is in more than 32 active directory groups it would simply discard all group membership :(. I haven't tested it, but I would guess it would behave like the underlying operating system and simply truncate the group list at 32, with the user losing any additional privileges granted by the rest of the groups.

Ah. No. If you're using idmap and are mapping to an AD server, the Windows SIDs (which are both users and groups) are stored in a cred struct (in cr_ksid) which allows more than 32 groups, up to 64K IIRC. Playing around with idmap to map UIDs/GIDs to SIDs and vice versa can be done locally without an AD or LDAP server too.

-Drew
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, 13 Oct 2009 casper@sun.com wrote:

That's not entirely true; the issue is similar to having more than 16 groups, which breaks AUTH_SYS over-the-wire authentication - but we already have that now. [...] For now, we're aiming for 1024 groups, but will also make sure that the userland will work without any dependencies.

Good to know; I'm definitely looking forward to this. 1024 will hopefully suffice for at least a while :).

The change request, then. It must have a bug id.

The only number I have unique to my request is the SR #. There has been no bug opened, and as I mentioned they are referring to an existing RFE regarding increasing the maximum number of groups supported by the operating system (these references are in the thread I forwarded you directly), which is simply not relevant. In fact, it appears my service request has been marked as canceled without my knowledge, leaving pretty much no official trail of my request :(.

Well, I can understand the sense of that. (Not for OpenSolaris, but for S10.) A backport costs a bit, so perhaps that's what they want to avoid.

I can't see the cost of applying a three-line patch as being particularly high, but I guess there is some inherent cost in quality control, testing, and packaging a patch. But upstream just released some security fixes for the 3.0.x branch, which hopefully they're going to incorporate and release in a patch, and the incremental cost of adding in my simple fix must be negligible.

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] STK 2540 and Ignore Cache Sync (ICS)
On Tue, 13 Oct 2009, Nils Goroll wrote:

Regarding my bonus question: I haven't yet found a definite answer on whether there is a way to read the currently active controller setting. I still assume that the nvsram settings which can be read with service -d arrayname -c read -q nvsram region=0xf2 host=0x00 do not necessarily reflect the current configuration and that the only way to make sure the controller is running with that configuration is to reset it.

I believe that in the STK 2540, the controllers operate Active/Active, except that each controller is Active for half the drives and Standby for the others. Each controller has a copy of the configuration information; whichever one you communicate with is likely required to mirror the changes to the other.

In my setup I load-share the Fibre Channel traffic by assigning six drives as active on one controller and six drives as active on the other controller, and the drives are individually exported with a LUN per drive. I used CAM to do that. MPxIO sees the changes and maps half of the paths down each FC link, for more performance than a single FC link offers.

Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
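For anyone wanting to verify that kind of load-sharing, the stock Solaris multipathing tools can show it; a quick sketch, with a made-up device name:

    # List all multipathed logical units and their operational path counts
    mpathadm list lu

    # Show the target port groups for one LUN: each should report one
    # active and one standby group, alternating between the two controllers
    mpathadm show lu /dev/rdsk/c4t600A0B80002FAE5C...d0s2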
Re: [zfs-discuss] zfs on FDE
Mike DeMarco wrote: Any reason why ZFS would not work on an FDE (Full Disk Encryption) hard drive?

None, provided the drive is available to the OS by normal means.

-- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD over 10gbe not any faster than 10K SAS over GigE
Well, your planned storage usage puts you in the 1% who don't need reliability and a roomy media back-end. So it can work out well - but unfortunately this is not a silver bullet. -- Roman -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, 13 Oct 2009, Drew Balfour wrote:

Ah. No. If you're using idmap and are mapping to an AD server, the Windows SIDs (which are both users and groups) are stored in a cred struct (in cr_ksid) which allows more than 32 groups, up to 64k iirc.

Ah, yes, I neglected to consider that since the CIFS server in OpenSolaris runs in-kernel, it's not subject to the same OS limitations as a user-level process. Once Casper finishes his work and access via NFS is no longer limited to 32 groups, that will be quite sweet...

-- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] iscsi/comstar performance
On Tue, Oct 13, 2009 at 01:00:35PM -0400, Frank Middleton wrote:

After a recent upgrade to b124, I decided to switch to COMSTAR for iscsi targets for VirtualBox hosted on AMD64 Fedora C10. Both target and initiator are running zfs under b124. This combination seems unbelievably slow compared to the old iscsi subsystem. A scrub of a local 20GB disk on the target took 16 minutes; a scrub of a 20GB iscsi disk took 106 minutes! It seems to take much longer to boot from iscsi, so it seems to be reading more slowly too. There are a lot of variables - switching to COMSTAR, snv124, VBox 3.08, etc. - but such a dramatic loss of performance probably has a single cause. Is anyone willing to speculate?

Maybe this will help: http://mail.opensolaris.org/pipermail/storage-discuss/2009-September/007118.html

-- albert chin (ch...@thewrittenword.com) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
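If that thread is the one I think it is (COMSTAR honouring cache flushes on the backing store where the old iscsitgt did not), the usual experiment is to look at the per-LU write-cache setting; a hypothetical sketch with stmfadm, using a truncated example GUID:

    # Inspect each logical unit's properties, including write-cache state
    stmfadm list-lu -v

    # Re-enable the write cache on one LU (wcd = 'write cache disabled')
    stmfadm modify-lu -p wcd=false 600144F0C73ABF00...

As with the cache-flush discussions elsewhere on this list, this trades integrity guarantees for speed, so it only makes sense when the backing store is non-volatile or the data is expendable.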
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, Oct 13, 2009 at 09:20:23AM -0700, Paul B. Henson wrote:

We're currently using the Sun bundled Samba to provide CIFS access to our ZFS user/group directories. ... Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE.

Haven't tested the bundled samba stuff for a long time, since I don't trust it: the bundled stuff didn't work when tested; the packages are IMHO awfully assembled; problems are not understood by the involved engineers (or they are not willing to understand them); and the team seems to follow the dogma of fixing the symptoms and not the root cause. So if the bundled stuff is modified according to their RFEs on bugzilla, don't be surprised if your environment gets screwed up - especially when you have a mixed user group, i.e. Windows and *ix based users, who are using workgroup directories for sharing their stuff.

So we still use the original samba and it causes no headaches. Once we had a problem when switching some desktops to Vista and MS Office 2007, due to the new Windows strategy of saving changes to a tmp file, then renaming it over the original file - wrong ACLs. However, this was fixed within ONE DAY: Jeremy Allison did some code scanning after I talked to him via the smb IRC channel, and voila, he came up with a fix pretty fast. So I didn't need to waste my time explaining the problem again and again to SUN support, creating explorer archives (which usually hang the NFS services in a way that can't be fixed without a reboot!), and waiting several months to get it fixed (BTW: IIRC, I opened a case for this via sun support, so if it hasn't been silently closed, it's probably still open ...).

Since we guess that CIFS gets screwed up by the same team, we don't use it either (well, and can't because we've no ADS ;-)). My 10¢.

Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On 14/10/2009, at 2:27 AM, casper@sun.com wrote:

So why not the built-in CIFS support in OpenSolaris? Probably has a similar issue, but still.

In my case, there are at least two reasons (see the sketch below):

* Crossing mountpoints requires separate shares - Samba can share an entire hierarchy regardless of the ZFS filesystems beneath the sharepoint.
* LDAP integration - the in-kernel CIFS only supports real AD (LDAP+krb5) for directory binding; otherwise all users must have separately managed local system accounts.

Until these features are available via the in-kernel CIFS implementation, I'm forced to stick with Samba for our CIFS needs.

cheers, James ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
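To illustrate the first point, a rough sketch with made-up pool and share names; the zfs sharesmb property is the in-kernel mechanism, and the smb.conf stanza is shown as comments:

    # In-kernel CIFS: each ZFS filesystem surfaces as its own share,
    # and a client cannot traverse from \\server\home into child filesystems
    zfs set sharesmb=name=home tank/home
    zfs set sharesmb=name=home_alice tank/home/alice

    # Samba: one share spans every filesystem below the sharepoint
    # (smb.conf excerpt)
    # [home]
    #    path = /tank/home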
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
Jens Elkner wrote:

On Tue, Oct 13, 2009 at 09:20:23AM -0700, Paul B. Henson wrote: We're currently using the Sun bundled Samba to provide CIFS access to our ZFS user/group directories. ... Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE.

Haven't tested the bundled samba stuff for a long time, since I don't trust it: the bundled stuff didn't work when tested; the packages are IMHO awfully assembled; problems are not understood by the involved engineers (or they are not willing to understand them); the team seems to follow the dogma of fixing the symptoms and not the root cause.

For OpenSolaris, Solaris CIFS != samba. Solaris now has a native in-kernel CIFS server which has nothing to do with samba - apart from having its commands start with smb, which can be confusing. http://www.opensolaris.org/os/project/cifs-server/

-Drew ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to use ZFS on x4270
ag == Andrew Gabriel agabr...@opensolaris.org writes:

ag I can't speak for qmail which I've never used, but MTAs
ag should sync data to disk before acknowledging receipt,

yeah, I saw a talk by one of the Postfix developers. They've taken pains to limit the amount of sync'ing so it's only one or two calls to fsync (on files in the queue subdirectories) per incoming mail (I forget whether it's one or two). One piece of their performance advice was to be careful of syslog, because some implementations call fsync on every line logged, which adds a couple more sync's per message received and halves performance. I'm sure qmail also sync's at least once per message received. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
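One way to check this kind of per-message sync behaviour on Solaris is to count system calls with the stock tracing tool; the process choice below is hypothetical, since which daemon actually writes and syncs the queue files depends on the MTA:

    # Attach to a running queue-writing process and tally its syscalls;
    # after a few messages, interrupt and read the fsync/fdsync counts.
    truss -c -p $(pgrep -f qmgr)

The same counting trick on the syslogd pid shows whether every logged line is followed by a sync.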