Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-11 Thread Heinrich van Riel
I don't think they just throw a vanilla copy of the OS on vanilla hardware
for that. Like all storage providers, they will have a specific set of
qualified drivers, OS bits, and firmware, down to the disk level/model in
most cases. We don't have access to that tested interoperability matrix,
and I am sure there are tons of other custom bits. Last I checked, EMC VNX
still runs Windows, but what does that really mean?

Looking at the COMSTAR documentation for FC, it says to put the adapter in
target mode and allocate LUNs. We are running EMC VNX/CX, Data Domain, and
Cisco UCS (combo modules), plus tons of VMware and Windows systems connected
to the same fabric with no problems, using the same adapters. I can't
believe it is the fabric.
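
For reference, the target-mode plus LUN part of the COMSTAR docs boils down
to roughly the following; pool/volume names and the GUID are placeholders,
and the emlxs target-mode step is from memory, so verify it against the docs:

# Emulex ports go into target mode via target-mode=1; in /kernel/drv/emlxs.conf
# (plus a reboot); QLogic ports bind to the qlt driver instead of qlc.
svcadm enable stmf
zfs create -V 200g tank/fclun0                 # backing zvol (example name)
stmfadm create-lu /dev/zvol/rdsk/tank/fclun0   # prints the LU GUID
stmfadm add-view 600144f0...                   # GUID from create-lu
stmfadm list-lu -v
fcinfo hba-port                                # should show Port Mode: Target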

When the slowdown happens and I stop all IO from the initiator side, the
Solaris system is unable to reboot. It will say it is stopping system
services, but it stays stuck in that state with no mention of any problems
during the shutdown or before the reboot.

I took one last stab, since one person mentioned that they are using FC with
OmniOS, and it does work for me, but only with the QLogic card. With Emulex
it also drops the link.
I will just have to accept that I can only connect to a single switch for
now, since we are an Emulex shop and I have only this one qlt card.



On Tue, Jun 11, 2013 at 8:51 AM, Michael Stapleton <
michael.staple...@techsologic.com> wrote:

> I have no idea what the problem is, but it is worth noting that last
> time I checked, Oracles storage arrays were running Solaris and Comstar.
>
> Mike
>
> On Mon, 2013-06-10 at 20:36 -0400, Heinrich van Riel wrote:
>
> > spoke to soon died again.
> > Give up. Just posting the result in case someone else run into issues
> with
> > fc target and find this. Solaris is not even the answer. When it slows
> down
> > I kill the copies and wait until there is no more IO and can see that
> from
> > VMware side and pool io. When I try to reboot it is not able to the same
> as
> > OI. clearly a problem with comstar's ability to deal with fc. after a
> hard
> > reset it will work for again a short bit
> > Last post
> > Cheers
> >
> >
> >
> > On Mon, Jun 10, 2013 at 7:36 PM, Heinrich van Riel <
> > heinrich.vanr...@gmail.com> wrote:
> >
> > > switch to the qlogic adpater using solaris 11.1. Problem resolved
> well
> > > for now. Not as fast as OI with the emulex adapter, perhaps it is the
> older
> > > pool/fs version since I want to keep my options open for now. I am
> getting
> > > around 200MB/s when cloning. At least backups can run for now. Getting
> a
> > > license for 11.1 for one year. I will worry about it again after that.
> > > Never had problems with any device connected fc like this, that is
> usually
> > > the beauty of it but expensive. Downside right now is  qlt card I have
> only
> > > has a single port.
> > > thanks,
> > >
> > >
> > >
> > > On Mon, Jun 10, 2013 at 2:46 PM, Heinrich van Riel <
> > > heinrich.vanr...@gmail.com> wrote:
> > >
> > >> Just want to provide an update here.
> > >>
> > >> Installed Solaris 11.1 reconfigured everything. Went back to Emulex
> card
> > >> since it is a dual port for connect to both switches. Same problem,
> well
> > >> the link does not fail, but it is writing at 20k/s.
> > >>
> > >>
> > >> I am really not sure what to do anymore other that to accept fc
> target is
> > >> no longer an option, but I will post in the ora solaris forum. Either
> this
> > >> has been an issue for some time or it is a hardware combination or
> perhaps
> > >> I am doing something seriously wrong.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Sat, Jun 8, 2013 at 6:57 PM, Heinrich van Riel <
> > >> heinrich.vanr...@gmail.com> wrote:
> > >>
> > >>> I took a look at every server that I knew I could power down or that
> is
> > >>> slated for removal in the future and I found a qlogic adapter not in
> use.
> > >>>
> > >>> HBA Port WWN: 211b3280b
> > >>> Port Mode: Target
> > >>> Port ID: 12000
> > >>> OS Device Name: Not Applicable
> > >>> Manufacturer: QLogic Corp.
> > >>> Model: QLE2460
> > >>> Firmware Version: 5.2.1
> > >>> FCode/BIOS Version: N/A
> > >>> Serial Number: not available
> > >>> Drive

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-10 Thread Heinrich van Riel
Spoke too soon; it died again.
I give up. Just posting the result in case someone else runs into issues
with an FC target and finds this. Solaris is not even the answer. When it
slows down, I kill the copies and wait until there is no more IO; I can see
that from the VMware side and from pool IO. When I try to reboot it is not
able to, the same as OI. Clearly a problem with COMSTAR's ability to deal
with FC. After a hard reset it will work again for a short while.
Last post.
Cheers



On Mon, Jun 10, 2013 at 7:36 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> switch to the qlogic adpater using solaris 11.1. Problem resolved well
> for now. Not as fast as OI with the emulex adapter, perhaps it is the older
> pool/fs version since I want to keep my options open for now. I am getting
> around 200MB/s when cloning. At least backups can run for now. Getting a
> license for 11.1 for one year. I will worry about it again after that.
> Never had problems with any device connected fc like this, that is usually
> the beauty of it but expensive. Downside right now is  qlt card I have only
> has a single port.
> thanks,
>
>
>
> On Mon, Jun 10, 2013 at 2:46 PM, Heinrich van Riel <
> heinrich.vanr...@gmail.com> wrote:
>
>> Just want to provide an update here.
>>
>> Installed Solaris 11.1 reconfigured everything. Went back to Emulex card
>> since it is a dual port for connect to both switches. Same problem, well
>> the link does not fail, but it is writing at 20k/s.
>>
>>
>> I am really not sure what to do anymore other that to accept fc target is
>> no longer an option, but I will post in the ora solaris forum. Either this
>> has been an issue for some time or it is a hardware combination or perhaps
>> I am doing something seriously wrong.
>>
>>
>>
>>
>>
>> On Sat, Jun 8, 2013 at 6:57 PM, Heinrich van Riel <
>> heinrich.vanr...@gmail.com> wrote:
>>
>>> I took a look at every server that I knew I could power down or that is
>>> slated for removal in the future and I found a qlogic adapter not in use.
>>>
>>> HBA Port WWN: 211b3280b
>>> Port Mode: Target
>>> Port ID: 12000
>>> OS Device Name: Not Applicable
>>> Manufacturer: QLogic Corp.
>>> Model: QLE2460
>>> Firmware Version: 5.2.1
>>> FCode/BIOS Version: N/A
>>> Serial Number: not available
>>> Driver Name: COMSTAR QLT
>>> Driver Version: 20100505-1.05
>>> Type: F-port
>>> State: online
>>> Supported Speeds: 1Gb 2Gb 4Gb
>>> Current Speed: 4Gb
>>> Node WWN: 201b3280b
>>>
>>>
>>> Link does not go down but useless, right from the start it is as slow as
>>> the emulex after I made the xfer change.
>>> So it is not a driver issue.
>>>
>>> alloc free read write read write
>>> - - - - - -
>>> 681G 53.8T 5 12 29.9K 51.3K
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 88 0 221K
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 163 0 812K
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 198 0 1.13M
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 88 0 221K
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 187 0 1.02M
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>> 681G 53.8T 0 0 0 0
>>>
>>> This is a clean install of a7 with nothing done other than nic config in
>>> lacp. I did not attempt a reinstall of a5 yet and prob won't either.
>>> I dont know what to do anymore I was going to try OmniOS but there is no
>>> way of knowing if it would work.
>>>
>>>
>>> I will see if I can get approved for a solaris license for one year, if
>>> not I am switching back to windows storage spaces. Cant backup the current
>>> lab on the EMC array to this node in any event since there is no ip
>>> connectivity and fc is a dream.
>>>
>>> Guess I am the only one trying to use it as an fc target and these
>>> problems are not n

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-10 Thread Heinrich van Riel
Switched to the QLogic adapter using Solaris 11.1. Problem resolved, well,
for now. Not as fast as OI with the Emulex adapter; perhaps it is the older
pool/fs version, since I want to keep my options open for now. I am getting
around 200MB/s when cloning. At least backups can run for now. Getting a
license for 11.1 for one year; I will worry about it again after that.
Never had problems with any device connected over FC like this, that is
usually the beauty of it, but expensive. The downside right now is that the
qlt card I have only has a single port.
thanks,



On Mon, Jun 10, 2013 at 2:46 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> Just want to provide an update here.
>
> Installed Solaris 11.1 reconfigured everything. Went back to Emulex card
> since it is a dual port for connect to both switches. Same problem, well
> the link does not fail, but it is writing at 20k/s.
>
>
> I am really not sure what to do anymore other that to accept fc target is
> no longer an option, but I will post in the ora solaris forum. Either this
> has been an issue for some time or it is a hardware combination or perhaps
> I am doing something seriously wrong.
>
>
>
>
>
> On Sat, Jun 8, 2013 at 6:57 PM, Heinrich van Riel <
> heinrich.vanr...@gmail.com> wrote:
>
>> I took a look at every server that I knew I could power down or that is
>> slated for removal in the future and I found a qlogic adapter not in use.
>>
>> HBA Port WWN: 211b3280b
>> Port Mode: Target
>> Port ID: 12000
>> OS Device Name: Not Applicable
>> Manufacturer: QLogic Corp.
>> Model: QLE2460
>> Firmware Version: 5.2.1
>> FCode/BIOS Version: N/A
>> Serial Number: not available
>> Driver Name: COMSTAR QLT
>> Driver Version: 20100505-1.05
>> Type: F-port
>> State: online
>> Supported Speeds: 1Gb 2Gb 4Gb
>> Current Speed: 4Gb
>> Node WWN: 201b3280b
>>
>>
>> Link does not go down but useless, right from the start it is as slow as
>> the emulex after I made the xfer change.
>> So it is not a driver issue.
>>
>> alloc free read write read write
>> - - - - - -
>> 681G 53.8T 5 12 29.9K 51.3K
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 88 0 221K
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 163 0 812K
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 198 0 1.13M
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 88 0 221K
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 187 0 1.02M
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>> 681G 53.8T 0 0 0 0
>>
>> This is a clean install of a7 with nothing done other than nic config in
>> lacp. I did not attempt a reinstall of a5 yet and prob won't either.
>> I dont know what to do anymore I was going to try OmniOS but there is no
>> way of knowing if it would work.
>>
>>
>> I will see if I can get approved for a solaris license for one year, if
>> not I am switching back to windows storage spaces. Cant backup the current
>> lab on the EMC array to this node in any event since there is no ip
>> connectivity and fc is a dream.
>>
>> Guess I am the only one trying to use it as an fc target and these
>> problems are not noticed.
>>
>>
>>
>> On Sat, Jun 8, 2013 at 4:55 PM, Heinrich van Riel <
>> heinrich.vanr...@gmail.com> wrote:
>>
>>> changing max-xfer-size causes the link to stay up and no problem are
>>> reported from stmf.
>>>
>>> #   Memory_model   max-xfer-size
>>> # 
>>> #   Small  131072 - 339968
>>> #   Medium 339969 - 688128
>>> #   Large  688129 - 1388544
>>> #
>>> # Range:  Min:131072   Max:1388544   Default:339968
>>> #
>>> max-xfer-size=339968;
>>>
>>> as soon as I changed it to 339969 the there is no link loss, but I would
>>> be so lucky that is solves my problem. after a few min it would grind to a
>>> crawl, so much so that in vmware it will take well over a min to just
>>> b

Re: [OpenIndiana-discuss] Zpool version compatibility

2013-06-10 Thread Heinrich van Riel
Thanks, I missed that. I am good now.



On Mon, Jun 10, 2013 at 12:59 PM, Jan Owoc  wrote:

> On Mon, Jun 10, 2013 at 10:50 AM, Jan Owoc  wrote:
> >
> > On Mon, Jun 10, 2013 at 10:44 AM, Heinrich van Riel
> >  wrote:
> >> Since there is a split pool version is version 28 the highest that
> would be
> >> compatible between the releases or can I import a version 29 created in
> >> Oracle into Illumos?
> >
> > No. Zpool version 29 is closed-source, so it is unlikely that anything
> > other than Solaris 11+ will be able to read it.
> >
> > The incompatibility goes both ways, so a zpool created with the
> > current OpenIndiana (or anything Illumos-based that uses feature
> > flags) is unlikely to be readable by Solaris, unless you explicitly
> > create the zpool as version 28.
>
> One additional thing to keep in mind is that recently the zfs
> filesystem (i.e the version of the filesystem, as opposed to the
> zpool) was also incremented in Solaris 11.1. You may need to specify
> zpool version 28 AND zfs version 5 when creating the pool to ensure
> Illumos (or anything other than Solaris 11.1+) can read it in the
> future.
>
> Cheers,
> Jan
>
> ___
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss@openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-10 Thread Heinrich van Riel
Just want to provide an update here.

Installed Solaris 11.1 and reconfigured everything. Went back to the Emulex
card since it is dual port, to connect to both switches. Same problem; well,
the link does not fail, but it is writing at 20k/s.


I am really not sure what to do anymore, other than to accept that FC target
is no longer an option, but I will post in the Oracle Solaris forum. Either
this has been an issue for some time, or it is a hardware combination, or
perhaps I am doing something seriously wrong.





On Sat, Jun 8, 2013 at 6:57 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> I took a look at every server that I knew I could power down or that is
> slated for removal in the future and I found a qlogic adapter not in use.
>
> HBA Port WWN: 211b3280b
> Port Mode: Target
> Port ID: 12000
> OS Device Name: Not Applicable
> Manufacturer: QLogic Corp.
> Model: QLE2460
> Firmware Version: 5.2.1
> FCode/BIOS Version: N/A
> Serial Number: not available
> Driver Name: COMSTAR QLT
> Driver Version: 20100505-1.05
> Type: F-port
> State: online
> Supported Speeds: 1Gb 2Gb 4Gb
> Current Speed: 4Gb
> Node WWN: 201b3280b
>
>
> Link does not go down but useless, right from the start it is as slow as
> the emulex after I made the xfer change.
> So it is not a driver issue.
>
> alloc free read write read write
> - - - - - -
> 681G 53.8T 5 12 29.9K 51.3K
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 88 0 221K
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 163 0 812K
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 198 0 1.13M
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 88 0 221K
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 187 0 1.02M
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
> 681G 53.8T 0 0 0 0
>
> This is a clean install of a7 with nothing done other than nic config in
> lacp. I did not attempt a reinstall of a5 yet and prob won't either.
> I dont know what to do anymore I was going to try OmniOS but there is no
> way of knowing if it would work.
>
>
> I will see if I can get approved for a solaris license for one year, if
> not I am switching back to windows storage spaces. Cant backup the current
> lab on the EMC array to this node in any event since there is no ip
> connectivity and fc is a dream.
>
> Guess I am the only one trying to use it as an fc target and these
> problems are not noticed.
>
>
>
> On Sat, Jun 8, 2013 at 4:55 PM, Heinrich van Riel <
> heinrich.vanr...@gmail.com> wrote:
>
>> changing max-xfer-size causes the link to stay up and no problem are
>> reported from stmf.
>>
>> #   Memory_model   max-xfer-size
>> # 
>> #   Small  131072 - 339968
>> #   Medium 339969 - 688128
>> #   Large  688129 - 1388544
>> #
>> # Range:  Min:131072   Max:1388544   Default:339968
>> #
>> max-xfer-size=339968;
>>
>> as soon as I changed it to 339969 the there is no link loss, but I would
>> be so lucky that is solves my problem. after a few min it would grind to a
>> crawl, so much so that in vmware it will take well over a min to just
>> browse a folder, we talking are a few k/s.
>>
>> Setting it to the max causes the the link to go down again and smtf
>> reports the following again:
>> FROM STMF:0062568: abort_task_offline called for LPORT: lport abort timed
>> out
>>
>> I also played around with the buffer settings.
>>
>> Any ideas?
>> Thanks,
>>
>>
>>
>>  On Fri, Jun 7, 2013 at 8:38 PM, Heinrich van Riel <
>> heinrich.vanr...@gmail.com> wrote:
>>
>>> New card, different PCI-E slot (removed the other one) different FC
>>> switch (same model with same code) older hba firmware (2.72a2)  = same
>>> result.
>>>
>>> On the setting changes when it boots it complains about this option,
>>> does not exist: szfs_txg_synctime
>>> The changes still allowed for a constant write, but at a max of 100Mb/s
>>> so not much better than iscsi over 1Gbe. I guess I would need to increase
>>> write_limit_override. if i disable the settings again it shows 240MB/s
>>> wi

Re: [OpenIndiana-discuss] Zpool version compatibility

2013-06-10 Thread Heinrich van Riel
Thank you for clearing it up.
I created a version 28 pool while management decides whether they will
license Oracle Solaris, so that I can go back to OI.
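
For anyone else doing the same, I believe the creation syntax is along these
lines (pool name and disks are placeholders):

# pool version 28 plus filesystem version 5, readable by both Illumos
# and Solaris 11.x
zpool create -o version=28 -O version=5 tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0
zpool get version tank
zfs get version tank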
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


[OpenIndiana-discuss] Zpool version compatibility

2013-06-10 Thread Heinrich van Riel
Since there is a split in pool versions, is version 28 the highest that
would be compatible between the releases, or can I import a version 29 pool
created in Oracle into Illumos?

Thanks,
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-08 Thread Heinrich van Riel
I took a look at every server that I knew I could power down or that is
slated for removal in the future, and I found a QLogic adapter not in use.

HBA Port WWN: 211b3280b
Port Mode: Target
Port ID: 12000
OS Device Name: Not Applicable
Manufacturer: QLogic Corp.
Model: QLE2460
Firmware Version: 5.2.1
FCode/BIOS Version: N/A
Serial Number: not available
Driver Name: COMSTAR QLT
Driver Version: 20100505-1.05
Type: F-port
State: online
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: 4Gb
Node WWN: 201b3280b


The link does not go down, but it is useless; right from the start it is as
slow as the Emulex was after I made the xfer change.
So it is not a driver issue.
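
(For context, the numbers below are one-second zpool iostat samples, i.e.
something like the following; the pool name is a placeholder:)

zpool iostat tank 1     # columns: capacity alloc/free, ops r/w, bandwidth r/w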

alloc free read write read write
- - - - - -
681G 53.8T 5 12 29.9K 51.3K
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 88 0 221K
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 163 0 812K
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 198 0 1.13M
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 88 0 221K
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 187 0 1.02M
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0
681G 53.8T 0 0 0 0

This is a clean install of a7 with nothing done other than NIC config in
LACP. I did not attempt a reinstall of a5 yet and probably won't either.
I don't know what to do anymore; I was going to try OmniOS, but there is no
way of knowing if it would work.


I will see if I can get approved for a Solaris license for one year; if not,
I am switching back to Windows Storage Spaces. Can't back up the current lab
on the EMC array to this node in any event, since there is no IP
connectivity and FC is a dream.

Guess I am the only one trying to use it as an FC target, so these problems
are not noticed.



On Sat, Jun 8, 2013 at 4:55 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> changing max-xfer-size causes the link to stay up and no problem are
> reported from stmf.
>
> #   Memory_model   max-xfer-size
> # 
> #   Small  131072 - 339968
> #   Medium 339969 - 688128
> #   Large  688129 - 1388544
> #
> # Range:  Min:131072   Max:1388544   Default:339968
> #
> max-xfer-size=339968;
>
> as soon as I changed it to 339969 the there is no link loss, but I would
> be so lucky that is solves my problem. after a few min it would grind to a
> crawl, so much so that in vmware it will take well over a min to just
> browse a folder, we talking are a few k/s.
>
> Setting it to the max causes the the link to go down again and smtf
> reports the following again:
> FROM STMF:0062568: abort_task_offline called for LPORT: lport abort timed
> out
>
> I also played around with the buffer settings.
>
> Any ideas?
> Thanks,
>
>
>
> On Fri, Jun 7, 2013 at 8:38 PM, Heinrich van Riel <
> heinrich.vanr...@gmail.com> wrote:
>
>> New card, different PCI-E slot (removed the other one) different FC
>> switch (same model with same code) older hba firmware (2.72a2)  = same
>> result.
>>
>> On the setting changes when it boots it complains about this option, does
>> not exist: szfs_txg_synctime
>> The changes still allowed for a constant write, but at a max of 100Mb/s
>> so not much better than iscsi over 1Gbe. I guess I would need to increase
>> write_limit_override. if i disable the settings again it shows 240MB/s
>> with bursts up to 300, both stats are from VMware's disk perf monitoring
>> while cloning the same VM.
>>
>> All iSCSI luns remain active with no impact.
>> So I will conclude, I guess, it seems to be the problem that was there in
>> 2009 from build ~100 to 128. When I search the error messages all posts
>> date back to 2009.
>>
>> I will try one more thing to reinstall with 151a5 since a server that was
>> removed from the env was running this with no issues, but was using an
>> older emulex HBA, LP10000 PCIX
>> Looking at the notable changes in the release notes past a5 I do see
>> anything that changed that I would think would cause the behavior. Would
>> this just be a waste of time?
>>
>>
>>
>> On Fri, Jun 7, 2013 at 6:36 PM, Heinrich van Riel <
>> heinrich.vanr...@gmail.com> wrote:
>>
>>> In the debug info I see 1000's of the following events:
>>>
>>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort
>>> timed out
>>> FROM STMF:

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-08 Thread Heinrich van Riel
Changing max-xfer-size makes the link stay up, and no problems are reported
from stmf.

#   Memory_model   max-xfer-size
# 
#   Small  131072 - 339968
#   Medium 339969 - 688128
#   Large  688129 - 1388544
#
# Range:  Min:131072   Max:1388544   Default:339968
#
max-xfer-size=339968;

As soon as I changed it to 339969 there is no link loss, but I should be so
lucky that it solves my problem: after a few minutes it would grind to a
crawl, so much so that in VMware it will take well over a minute just to
browse a folder; we are talking a few k/s.
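
For completeness, the change itself is just this one line in
/kernel/drv/emlxs.conf; as far as I know the conf is only read when emlxs
attaches, so a reboot is the simple way to apply it:

# /kernel/drv/emlxs.conf: nudge the port into the Medium memory model
max-xfer-size=339969;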

Setting it to the max causes the link to go down again, and stmf reports the
following again:
FROM STMF:0062568: abort_task_offline called for LPORT: lport abort timed
out

I also played around with the buffer settings.

Any ideas?
Thanks,



On Fri, Jun 7, 2013 at 8:38 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> New card, different PCI-E slot (removed the other one) different FC switch
> (same model with same code) older hba firmware (2.72a2)  = same result.
>
> On the setting changes when it boots it complains about this option, does
> not exist: szfs_txg_synctime
> The changes still allowed for a constant write, but at a max of 100Mb/s so
> not much better than iscsi over 1Gbe. I guess I would need to increase
> write_limit_override. if i disable the settings again it shows 240MB/s
> with bursts up to 300, both stats are from VMware's disk perf monitoring
> while cloning the same VM.
>
> All iSCSI luns remain active with no impact.
> So I will conclude, I guess, it seems to be the problem that was there in
> 2009 from build ~100 to 128. When I search the error messages all posts
> date back to 2009.
>
> I will try one more thing to reinstall with 151a5 since a server that was
> removed from the env was running this with no issues, but was using an
> older emulex HBA, LP1 PCIX
> Looking at the notable changes in the release notes past a5 I do see
> anything that changed that I would think would cause the behavior. Would
> this just be a waste of time?
>
>
>
> On Fri, Jun 7, 2013 at 6:36 PM, Heinrich van Riel <
> heinrich.vanr...@gmail.com> wrote:
>
>> In the debug info I see 1000's of the following events:
>>
>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
>> out
>> emlxs1:0149228: port state change from 11 to 11
>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
>> out
>> :0149228: fct_port_shutdown: port-ff1157ff1278, fct_process_logo:
>> unable to
>> clean up I/O. iport-ff1157ff1378, icmd-ff1195463110
>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
>> out
>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
>> out
>>
>>
>> And then the following as the port recovers.
>>
>> emlxs1:0150128: port state change from 11 to 11
>> emlxs1:0150128: port state change from 11 to 0
>> emlxs1:0150128: port state change from 0 to 11
>> emlxs1:0150128: port state change from 11 to 0
>> :0150850: fct_port_initialize: port-ff1157ff1278, emlxs initialize
>> emlxs1:0150950: port state change from 0 to e
>> emlxs1:0150953: Posting sol ELS 3 (PLOGI) rp_id=fd lp_id=22000
>> emlxs1:0150953: Processing sol ELS 3 (PLOGI) rp_id=fd
>> emlxs1:0150953: Sol ELS 3 (PLOGI) completed with status 0, did/fd
>> emlxs1:0150953: Posting sol ELS 62 (SCR) rp_id=fd lp_id=22000
>> emlxs1:0150953: Processing sol ELS 62 (SCR) rp_id=fd
>> emlxs1:0150953: Sol ELS 62 (SCR) completed with status 0, did/fd
>> e

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-07 Thread Heinrich van Riel
New card, different PCI-E slot (removed the other one), different FC switch
(same model with the same code), older HBA firmware (2.72a2) = same result.

Regarding the setting changes: when it boots, it complains that this option
does not exist: szfs_txg_synctime.
The changes still allowed for a constant write, but at a max of 100MB/s, so
not much better than iSCSI over 1GbE. I guess I would need to increase
write_limit_override. If I disable the settings again it shows 240MB/s with
bursts up to 300; both stats are from VMware's disk perf monitoring while
cloning the same VM.

All iSCSI LUNs remain active with no impact.
So I will conclude, I guess, that it is the problem that was there in 2009
from build ~100 to 128. When I search for the error messages, all posts date
back to 2009.

I will try one more thing: reinstall with 151a5, since a server that was
removed from the environment was running this with no issues, but it was
using an older Emulex HBA, an LP10000 PCI-X.
Looking at the notable changes in the release notes past a5, I do not see
anything that changed that I would think would cause this behavior. Would
this just be a waste of time?
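
For anyone retracing this, the runtime checks look roughly like the
following. I am not certain of the exact tunable names on a7 (I believe
newer builds renamed zfs_txg_synctime to zfs_txg_synctime_ms, which would
also explain the boot complaint), so treat it as a sketch:

# check whether a tunable exists in the running kernel
echo zfs_txg_synctime/D | mdb -k
echo zfs_txg_synctime_ms/D | mdb -k
# /etc/system entries need the zfs: module prefix, e.g.
#   set zfs:zfs_txg_synctime_ms = 5000
#   set zfs:zfs_write_limit_override = 0x18000000
# or change the write limit live, as in Jim's earlier example:
echo zfs_write_limit_override/W0t402653184 | mdb -kw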



On Fri, Jun 7, 2013 at 6:36 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> In the debug info I see 1000's of the following events:
>
> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
> out
> emlxs1:0149228: port state change from 11 to 11
> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
> out
> :0149228: fct_port_shutdown: port-ff1157ff1278, fct_process_logo:
> unable to
> clean up I/O. iport-ff1157ff1378, icmd-ff1195463110
> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
> out
> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
> out
>
>
> And then the following as the port recovers.
>
> emlxs1:0150128: port state change from 11 to 11
> emlxs1:0150128: port state change from 11 to 0
> emlxs1:0150128: port state change from 0 to 11
> emlxs1:0150128: port state change from 11 to 0
> :0150850: fct_port_initialize: port-ff1157ff1278, emlxs initialize
> emlxs1:0150950: port state change from 0 to e
> emlxs1:0150953: Posting sol ELS 3 (PLOGI) rp_id=fd lp_id=22000
> emlxs1:0150953: Processing sol ELS 3 (PLOGI) rp_id=fd
> emlxs1:0150953: Sol ELS 3 (PLOGI) completed with status 0, did/fd
> emlxs1:0150953: Posting sol ELS 62 (SCR) rp_id=fd lp_id=22000
> emlxs1:0150953: Processing sol ELS 62 (SCR) rp_id=fd
> emlxs1:0150953: Sol ELS 62 (SCR) completed with status 0, did/fd
> emlxs1:0151053: Posting sol ELS 3 (PLOGI) rp_id=fc lp_id=22000
> emlxs1:0151053: Processing sol ELS 3 (PLOGI) rp_id=fc
> emlxs1:0151053: Sol ELS 3 (PLOGI) completed with status 0, did/fc
> emlxs1:0151054: Posting unsol ELS 3 (PLOGI) rp_id=fffc02 lp_id=22000
> emlxs1:0151054: Processing unsol ELS 3 (PLOGI) rp_id=fffc02
> emlxs1:0151054: Posting unsol ELS 20 (PRLI) rp_id=fffc02 lp_id=22000
> emlxs1:0151054: Processing unsol ELS 20 (PRLI) rp_id=fffc02
> emlxs1:0151055: Posting unsol ELS 5 (LOGO) rp_id=fffc02 lp_id=22000
> emlxs1:0151055: Processing unsol ELS 5 (LOGO) rp_id=fffc02
> emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
> emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21500
> emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
>  emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21500
> emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
> emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21600
> emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
> emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21600
> emlxs1:0151338: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
> emlxs1:0151338: Processing unsol ELS 3 (PLOGI) rp_id=21500
> emlxs1:0151338: Posting unso

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-07 Thread Heinrich van Riel
In the debug info I see 1000's of the following events:

FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed
out
emlxs1:0149228: port state change from 11 to 11
FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed
out
:0149228: fct_port_shutdown: port-ff1157ff1278, fct_process_logo:
unable to
clean up I/O. iport-ff1157ff1378, icmd-ff1195463110
FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
out
FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed
out


And then the following as the port recovers.

emlxs1:0150128: port state change from 11 to 11
emlxs1:0150128: port state change from 11 to 0
emlxs1:0150128: port state change from 0 to 11
emlxs1:0150128: port state change from 11 to 0
:0150850: fct_port_initialize: port-ff1157ff1278, emlxs initialize
emlxs1:0150950: port state change from 0 to e
emlxs1:0150953: Posting sol ELS 3 (PLOGI) rp_id=fd lp_id=22000
emlxs1:0150953: Processing sol ELS 3 (PLOGI) rp_id=fd
emlxs1:0150953: Sol ELS 3 (PLOGI) completed with status 0, did/fd
emlxs1:0150953: Posting sol ELS 62 (SCR) rp_id=fd lp_id=22000
emlxs1:0150953: Processing sol ELS 62 (SCR) rp_id=fd
emlxs1:0150953: Sol ELS 62 (SCR) completed with status 0, did/fd
emlxs1:0151053: Posting sol ELS 3 (PLOGI) rp_id=fc lp_id=22000
emlxs1:0151053: Processing sol ELS 3 (PLOGI) rp_id=fc
emlxs1:0151053: Sol ELS 3 (PLOGI) completed with status 0, did/fc
emlxs1:0151054: Posting unsol ELS 3 (PLOGI) rp_id=fffc02 lp_id=22000
emlxs1:0151054: Processing unsol ELS 3 (PLOGI) rp_id=fffc02
emlxs1:0151054: Posting unsol ELS 20 (PRLI) rp_id=fffc02 lp_id=22000
emlxs1:0151054: Processing unsol ELS 20 (PRLI) rp_id=fffc02
emlxs1:0151055: Posting unsol ELS 5 (LOGO) rp_id=fffc02 lp_id=22000
emlxs1:0151055: Processing unsol ELS 5 (LOGO) rp_id=fffc02
emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21500
emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21500
emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21600
emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21600
emlxs1:0151338: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
emlxs1:0151338: Processing unsol ELS 3 (PLOGI) rp_id=21500
emlxs1:0151338: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
emlxs1:0151338: Processing unsol ELS 20 (PRLI) rp_id=21500
emlxs1:0151338: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
emlxs1:0151338: Processing unsol ELS 3 (PLOGI) rp_id=21600
emlxs1:0151338: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
emlxs1:0151338: Processing unsol ELS 20 (PRLI) rp_id=21600
emlxs1:0151428: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
emlxs1:0151428: Processing unsol ELS 3 (PLOGI) rp_id=21500
emlxs1:0151428: port state change from e to 4
emlxs1:0151428: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
emlxs1:0151428: Processing unsol ELS 20 (PRLI) rp_id=21500
emlxs1:0151428: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
emlxs1:0151428: Processing unsol ELS 3 (PLOGI) rp_id=21600
emlxs1:0151428: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
emlxs1:0151428: Processing unsol ELS 20 (PRLI) rp_id=21600

To be honest it does not really tell me much, since I do not understand
COMSTAR to these depths. It would appear that the link fails, so either a
driver problem or a hardware issue? I will replace the LPe11002 with a
brand-new unopened one and then give up on FC on OI.
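
Before swapping the card, it may be worth pulling the per-port link error
counters and any FMA error telemetry from around the resets; a couple of
commands that should show that (output omitted here):

fcinfo hba-port -l      # link error statistics per HBA port
fmdump -eV | tail -60   # recent error events, if any were logged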




On Fri, Jun 7, 2013 at 4:54 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> I did find this in my inbox from 2009, I have been using FC with ZFS for
> quite sometime and only recently retired an install with OI a5 that was
> upgraded from opensolaris. It did not do real heavy duty stuff, but I had a
> similar problem where we were stuck on build 99 for quite some time.
>
> To

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-07 Thread Heinrich van Riel
I did find this in my inbox from 2009. I have been using FC with ZFS for
quite some time and only recently retired an OI a5 install that was upgraded
from OpenSolaris. It did not do really heavy-duty stuff, but I had a similar
problem where we were stuck on build 99 for quite some time.

To: Jean-Yves Chevallier @ Emulex
Any comments on the future of Emulex with regard to the COMSTAR project?
It seems I am not the only one who has problems using Emulex in later
builds. For now I am stuck with build 99.
As always, any feedback would be greatly appreciated, since we have to
decide between sticking with OpenSolaris & COMSTAR or starting to migrate to
another solution, since we cannot stay on build 99 forever.
What I am really trying to find out is whether there is a roadmap/decision
to ultimately only support QLogic HBAs in target mode.

Response:


Sorry for the delay in answering you. I do have news for you.
First off, the interface used by COMSTAR has changed in recent Nevada
releases (NV120 and up I believe). Since it is not a public interface we
had no prior indication on this.
We know of a number of issues, some on our driver, some on the COMSTAR
stack. Based on the information we have from you and other community
members, we have addressed all these issues in our next driver version – we
will know for sure after we run our DVT (driver verification testing) next
week. Depending on progress, this driver will be part of NV 128 or else NV
130.
I believe it is worth taking another look based on these upcoming builds,
which I imagine might also include fixes to the rest of the COMSTAR stack.

Best regards.


I can confirm that this was fixed in 128; all I did was update from 99 to
128 and there were no problems.
It seems the same problem has now returned, and Emulex does not appear to be
a good fit since Sun mostly used QLogic.

Guess it is back to iSCSI only for now.



On Fri, Jun 7, 2013 at 4:40 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> I changed the settings. I do see it writing all the time now, but the link
> still dies after a a few min
>
> Jun  7 16:30:57  emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE:
> 730: Link reset. (Disabling link...)
> Jun  7 16:30:57 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE:
> 710: Link down.
> Jun  7 16:33:16 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE:
> 720: Link up. (4Gb, fabric, target)
> Jun  7 16:33:16 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP,
> portid 22000, topology Fabric Pt-to-Pt,speed 4G
>
>
>
>
> On Fri, Jun 7, 2013 at 3:06 PM, Jim Klimov  wrote:
>
>> Comment below
>>
>>
>> On 2013-06-07 20:42, Heinrich van Riel wrote:
>>
>>> One sec apart cloning 150GB vm from a datastore on EMC to OI.
>>>
>>> alloc free read write read write
>>> - - - - - -
>>> 309G 54.2T 81 48 452K 1.34M
>>> 309G 54.2T 0 8.17K 0 258M
>>> 310G 54.2T 0 16.3K 0 510M
>>> 310G 54.2T 0 0 0 0
>>> 310G 54.2T 0 0 0 0
>>> 310G 54.2T 0 0 0 0
>>> 310G 54.2T 0 10.1K 0 320M
>>> 311G 54.2T 0 26.1K 0 820M
>>> 311G 54.2T 0 0 0 0
>>> 311G 54.2T 0 0 0 0
>>> 311G 54.2T 0 0 0 0
>>> 311G 54.2T 0 10.6K 0 333M
>>> 313G 54.2T 0 27.4K 0 860M
>>> 313G 54.2T 0 0 0 0
>>> 313G 54.2T 0 0 0 0
>>> 313G 54.2T 0 0 0 0
>>> 313G 54.2T 0 9.69K 0 305M
>>> 314G 54.2T 0 10.8K 0 337M
>>>
>> ...
>> Were it not for your complaints about link resets and "unusable"
>> connections, I'd say this looks like a normal behavior for async
>> writes: they get cached up, and every 5 sec you have a transaction
>> group (TXG) sync which flushes the writes from cache to disks.
>>
>> In fact, the picture still looks like that, and possibly is the
>> reason for hiccups.
>>
>> The TXG sync may be an IO intensive process, which may block or
>> delay many other system tasks; previously when the interval
>> defaulted to 30 sec we got unusable SSH connections and temporarily
>> "hung" disk requests on the storage server every half a minute when
>> it was really busy (i.e. initial filling up with data from older
>> boxes). It cached up about 10 seconds worth of writes, then spewed
>> them out and could do nothing else. I don't think I ever saw network
>> connections timing out or NICs reporting resets due to this, but I
>> wouldn't be surprised if this were the cause for your case, though
>> (i.e. disk IO threads preempting HBA/NIC threads for too long somehow,
>> making the driver very puzzled about staleness state of its card).
>>
>> At the very least, TXG syncs can be tuned by two knobs: the time
>> lim

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-07 Thread Heinrich van Riel
I changed the settings. I do see it writing all the time now, but the link
still dies after a few minutes.

Jun  7 16:30:57  emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE: 730:
Link reset. (Disabling link...)
Jun  7 16:30:57 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE: 710:
Link down.
Jun  7 16:33:16 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE: 720:
Link up. (4Gb, fabric, target)
Jun  7 16:33:16 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP, portid
22000, topology Fabric Pt-to-Pt,speed 4G




On Fri, Jun 7, 2013 at 3:06 PM, Jim Klimov  wrote:

> Comment below
>
>
> On 2013-06-07 20:42, Heinrich van Riel wrote:
>
>> One sec apart cloning 150GB vm from a datastore on EMC to OI.
>>
>> alloc free read write read write
>> - - - - - -
>> 309G 54.2T 81 48 452K 1.34M
>> 309G 54.2T 0 8.17K 0 258M
>> 310G 54.2T 0 16.3K 0 510M
>> 310G 54.2T 0 0 0 0
>> 310G 54.2T 0 0 0 0
>> 310G 54.2T 0 0 0 0
>> 310G 54.2T 0 10.1K 0 320M
>> 311G 54.2T 0 26.1K 0 820M
>> 311G 54.2T 0 0 0 0
>> 311G 54.2T 0 0 0 0
>> 311G 54.2T 0 0 0 0
>> 311G 54.2T 0 10.6K 0 333M
>> 313G 54.2T 0 27.4K 0 860M
>> 313G 54.2T 0 0 0 0
>> 313G 54.2T 0 0 0 0
>> 313G 54.2T 0 0 0 0
>> 313G 54.2T 0 9.69K 0 305M
>> 314G 54.2T 0 10.8K 0 337M
>>
> ...
> Were it not for your complaints about link resets and "unusable"
> connections, I'd say this looks like a normal behavior for async
> writes: they get cached up, and every 5 sec you have a transaction
> group (TXG) sync which flushes the writes from cache to disks.
>
> In fact, the picture still looks like that, and possibly is the
> reason for hiccups.
>
> The TXG sync may be an IO intensive process, which may block or
> delay many other system tasks; previously when the interval
> defaulted to 30 sec we got unusable SSH connections and temporarily
> "hung" disk requests on the storage server every half a minute when
> it was really busy (i.e. initial filling up with data from older
> boxes). It cached up about 10 seconds worth of writes, then spewed
> them out and could do nothing else. I don't think I ever saw network
> connections timing out or NICs reporting resets due to this, but I
> wouldn't be surprised if this were the cause for your case, though
> (i.e. disk IO threads preempting HBA/NIC threads for too long somehow,
> making the driver very puzzled about staleness state of its card).
>
> At the very least, TXG syncs can be tuned by two knobs: the time
> limit (5 sec default) and the size limit (when the cache is "this"
> full, begin the sync to disk). The latter is a realistic figure that
> can allow you to sync in shorter bursts - with less interruptions
> to smooth IO and process work.
>
> A somewhat related tunable is the number of requests that ZFS would
> queue up for a disk. Depending on its NCQ/TCQ abilities and random
> IO abilities (HDD vs. SSD), long or short queues may be preferable.
> See also:
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
>
> These tunables can be set at runtime with "mdb -K", as well as in
> the /etc/system file to survive reboots. One of our storage boxes
> has these example values in /etc/system:
>
> *# default: flush txg every 5sec (may be max 30sec, optimize
> *# for 5 sec writing)
> set zfs:zfs_txg_synctime = 5
>
> *# Spool to disk when the ZFS cache is 0x18000000 (384MB) full
> set zfs:zfs_write_limit_override = 0x18000000
> *# ...for realtime changes use mdb.
> *# Example sets 0x18000000 (384MB, 402653184 b):
> *# echo zfs_write_limit_override/W0t402653184 | mdb -kw
>
> *# ZFS queue depth per disk
> set zfs:zfs_vdev_max_pending = 3
>
> HTH,
> //Jim Klimov
>
>
> ___
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss@openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-07 Thread Heinrich van Riel
Thank you for all the information. I ordered the SAS SSD.
I somewhat got tired of iSCSI and the networking stuff around it and went to
good ol' FC. Some hypervisors will still use iSCSI.

Speed is OK.

One sec apart, cloning a 150GB VM from a datastore on EMC to OI.

alloc free read write read write
- - - - - -
309G 54.2T 81 48 452K 1.34M
309G 54.2T 0 8.17K 0 258M
310G 54.2T 0 16.3K 0 510M
310G 54.2T 0 0 0 0
310G 54.2T 0 0 0 0
310G 54.2T 0 0 0 0
310G 54.2T 0 10.1K 0 320M
311G 54.2T 0 26.1K 0 820M
311G 54.2T 0 0 0 0
311G 54.2T 0 0 0 0
311G 54.2T 0 0 0 0
311G 54.2T 0 10.6K 0 333M
313G 54.2T 0 27.4K 0 860M
313G 54.2T 0 0 0 0
313G 54.2T 0 0 0 0
313G 54.2T 0 0 0 0
313G 54.2T 0 9.69K 0 305M
314G 54.2T 0 10.8K 0 337M
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 8.32K 0 261M
314G 54.2T 0 175 0 1.06M
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 6.29K 0 196M
314G 54.2T 0 17 0 33.5K
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 0 0 0
314G 54.2T 0 9.27K 0 292M
315G 54.2T 0 11.1K 0 347M
315G 54.2T 0 0 0 0
315G 54.2T 0 0 0 0
315G 54.2T 0 0 0 0
315G 54.2T 0 9.41K 0 296M
317G 54.2T 0 29.2K 0 918M
317G 54.2T 0 0 0 0
317G 54.2T 0 0 0 0
317G 54.2T 0 0 0 0
317G 54.2T 0 11.6K 0 365M
318G 54.2T 0 25.0K 0 785M
snip... and so on.

I can't seem to catch a break. BTW, I am using VMware for this. It is
connected to a Brocade 5100B with many other nodes, and the EMC array is
also connected to it. No other system indicates connection drops.
I will change the cable, but I highly doubt it is that. I will also connect
the other port on the HBA.

Only under load.
Jun  7 14:02:18 emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE: 730:
Link reset. (Disabling link...)
Jun  7 14:02:18 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE: 710:
Link down.
Jun  7 14:04:41 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE: 720:
Link up. (4Gb, fabric, target)
Jun  7 14:04:41 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP, portid
22000, topology Fabric Pt-to-Pt,speed 4G
Jun  7 14:10:19 emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE: 730:
Link reset. (Disabling link...)
Jun  7 14:10:19 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE: 710:
Link down.
Jun  7 14:12:40 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE: 720:
Link up. (4Gb, fabric, target)
Jun  7 14:12:40 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP, portid
22000, topology Fabric Pt-to-Pt,speed 4G
Jun  7 14:15:24 emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE: 730:
Link reset. (Disabling link...)
Jun  7 14:15:24 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE: 710:
Link down.
Jun  7 14:17:44 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE: 720:
Link up. (4Gb, fabric, target)
Jun  7 14:17:44 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP, portid
22000, topology Fabric Pt-to-Pt,speed 4G


HBA Port WWN: 1000c
Port Mode: Target
Port ID: 22000
OS Device Name: Not Applicable
Manufacturer: Emulex
Model: LPe11002-E
Firmware Version: 2.80a4 (Z3F2.80A4)
FCode/BIOS Version: none
Serial Number: VM929238
Driver Name: emlxs
Driver Version: 2.60k (2011.03.24.16.45)
Type: F-port
State: online
Supported Speeds: 1Gb 2Gb 4Gb
Current Speed: 4Gb
Node WWN: 2000c


It does recover and will continue, but it is not really usable in this
manner.

Any ideas? Is this perhaps a known issue with the Emulex driver in OI? Most
use QLogic it seems, but we are 100% Emulex, so this is all I have.

Thanks


On Fri, Jun 7, 2013 at 11:29 AM, Edward Ned Harvey (openindiana) <
openindi...@nedharvey.com> wrote:

> > From: Jim Klimov [mailto:jimkli...@cos.ru]
> >
> > > With 90 VM's on 8 servers, being served ZFS iscsi storage by 4x 1Gb
> > > ethernet in LACP, you're really not going to care about any one VM
> being
> > > able to go above 1Gbit.  Because it's going to be so busy all the
> time, that the
> > > 4 LACP bonded ports will actually be saturated.  I think your machines
> are
> > > going to be slow.  I normally plan for 1Gbit per VM, in order to be
> comparable
> > > with a simple laptop.
> > >
> > > You're going to have a lot of random IO.  I'll strongly suggest you
> switch to
> > > mirrors instead of raidz.
> >
> > I'll leave your practical knowledge in higher regard than my theoretical
> > hunches, but I believe typical PCs (including VDI desktops) don't do
> > much disk IO after they've loaded the OS or a requested application.
>
> Agreed, disk is mostly idle except when booting or launching apps.  Some
> apps write to disk, such as internet browsing caching stuff, and MS Office
> constantly hitting the PST or OST file, and Word/Excel autosave, etc.
>
> But there are 90 of them.  So even "idle" time multiplied by 90 is no
> longer idle time.  And most likely, when they *do* get used, a whole bunch
> of them will get used at the same time.  (20 students all browsing the
> internet in 

Re: [OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-06 Thread Heinrich van Riel
If only the network guys here had told me this. I do have VMware with two
NICs, but that does not help when going to the same target, as pointed out.
It does load balance, but the total is only 80-100MB/s.

So I guess I will put two of the interfaces in LACP on one VLAN and the
other two in another on the storage server side, and on VMware/Hyper-V bind
to the two different targets with no LACP.
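
A sketch of what that could look like on the OI side; link names, subnets,
and the target layout are all placeholders:

# two 2-port LACP aggregations, one per storage VLAN/subnet
dladm create-aggr -L active -l igb0 -l igb1 aggr1
dladm create-aggr -L active -l igb2 -l igb3 aggr2
ifconfig aggr1 plumb 10.0.10.5 netmask 255.255.255.0 up
ifconfig aggr2 plumb 10.0.20.5 netmask 255.255.255.0 up
# one iSCSI target portal group per subnet so initiators can multipath
itadm create-tpg tpg1 10.0.10.5
itadm create-tpg tpg2 10.0.20.5
itadm create-target -t tpg1,tpg2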

So from all the replies I will be doing the following (a rough sketch of the
storage-side commands follows the list):

   * net changes as above
   * create block volumes with a 32k block size
   * enable compression
   * add the SSD cache disk (doing some limited testing; students will clone
from the same templates and use the same install media, so it seems like
putting those on the SSD would help; tested on a system where I could add a
SATA SSD)
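
A rough command sketch for the storage-side items; pool, volume, and disk
names are placeholders:

# 32k-block zvol with compression, exported as a COMSTAR LU
zfs create -V 500g -o volblocksize=32k -o compression=on tank/vmlab01
stmfadm create-lu /dev/zvol/rdsk/tank/vmlab01
stmfadm add-view 600144f0...            # GUID printed by create-lu
# add the SAS SSD as an L2ARC cache device
zpool add tank cache c4t5d0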

I will post my findings, but it might take some time to fix the network;
until then they will have to deal with 1Gbps for the storage. The request is
to run ~90 VMs on the 8 connected servers.

Thank you all for all the responses.










On Thu, Jun 6, 2013 at 9:24 AM, Saso Kiselkov wrote:

> On 05/06/2013 23:52, Heinrich van Riel wrote:
> > Any pointers around iSCSI performance focused on read speed? Did not find
> > much.
> >
> > I have 2 x rz2 of 10x 3TB NL-SAS each in the pool. The OI server has 4
> > interfaces configured to the switch in LACP, mtu=9000. The switch (jumbo
> > enabled) shows all interfaces are active in the port channel. How can I
> can
> > verify it on the OI side? dladm shows that it is active mode
> >
> > [..snip..]
>
> Hi Heinrich,
>
> Your limitation is LACP. Even in a link bundle, no single connection can
> exceed the speed of a single physical link - this is necessary to
> maintain correct packet ordering and queuing. There's no way around this
> other than to put fatter pipes in or not use LACP at all.
>
> You should definitely have a look at iSCSI multipath. It's supported by
> VMware, COMSTAR and a host of other products. All you need to do is
> configure multiple separate subnets, put them on separate VLANs and tell
> VMware to create multiple vmkernel interfaces in separate vSwitches.
> Then you can scan your iSCSI targets over one interface, VMware will
> auto-discover all paths to it and initiate multiple connections with
> load-balancing across all available paths (with fail-over in case a path
> dies). This approach also enables you to divide your storage
> infrastructure into two fully independent SANs, so that even if one side
> of the network experiences some horrible mess (looped cables, crappy
> switch firmware, etc.), the other side will continue to function without
> a hitch.
>
> Cheers,
> --
> Saso
>
> ___
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss@openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


[OpenIndiana-discuss] ZFS read speed(iSCSI)

2013-06-05 Thread Heinrich van Riel
Any pointers around iSCSI performance focused on read speed? Did not find
much.

I have 2 x raidz2 vdevs of 10x 3TB NL-SAS each in the pool. The OI server
has 4 interfaces configured to the switch in LACP, mtu=9000. The switch
(jumbo enabled) shows all interfaces are active in the port channel. How can
I verify it on the OI side? dladm shows that it is in active mode.
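
(Something like the following should show whether LACP actually negotiated
on each port from the OI side; the exact columns vary a bit by build:)

dladm show-aggr -L      # LACP mode/timer and per-port LACP state
dladm show-aggr -x      # per-port speed/duplex and attached/standby state
dladm show-link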

The Hyper-V systems have 2 interfaces in LACP; all show as active and
Windows indicates 2Gbps, never going over 54% utilization.

When I copy to an iSCSI disk from a local disk, it copies at around 200MB/s
and that's fine. When I copy from the iSCSI disk to the local disk I get no
more than 80-90MB/s, and that is after messing around with the TCP/IP
settings on Windows. Before the changes it was 47MB/s max. (Copying from the
local disk to the local disk I get 107MB/s, so that is not the issue.)
VMware 5.0 will not get more than that either.

Even when I do a zfs send/recv, it seems that reads are slower. I assume
this is the expected behavior.

It will only run VMs for a lab, and I have the following questions:

   * With this type of workload, would there be a noticeable improvement
from adding a cache disk?
  Looking at the OCZ Talos 2 SAS. Any feedback would be
appreciated before spending the $900 on the disk.

   * The system has 2x 6-core E5 @ 2.4GHz and 64GB of memory; would
compression help?

   * Would it make more sense to create the pool with mirror sets?
(Wanted to use the extra space for backups.)

Thanks
___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] using LSI 2308-IT onboard supermicro X9DRD-7LN4F-JBOD, openindiana not loading drivers? (OpenIndiana-discuss Digest, Vol 29, Issue 43)

2013-03-01 Thread Heinrich van Riel
I tried both 4.2.6 & 4.2.8, but I did not create the file in /etc.
So that would explain the problem.

On Fri, Mar 1, 2013 at 1:27 AM, Jim Klimov  wrote:

> On 2013-03-01 07:07, Jerry Kemp wrote:
>
>> You tried the latest version - 4.2.6, right?

>>> I'm sure you meant the latest version - 4.2.8 .   :)
>>
>
> Uh, those people... You don't track them for a few weeks - and here
> they pop out with a new release! ;)
>
> But, really, at the time of OPs question, 4.2.8 was not yet published,
> so my question remained valid in this respect ;)
>
>
>


Re: [OpenIndiana-discuss] using LSI 2308-IT onboard supermicro X9DRD-7LN4F-JBOD, openindiana not loading drivers? (OpenIndiana-discuss Digest, Vol 29, Issue 43)

2013-02-28 Thread Heinrich van Riel
After close to 1TB of data added and 20+ virtual machine installs over
iSCSI, it is still running and performs better than expected at this point.
(The server was running Windows Storage Server for a few months before;
stable, but disappointing performance-wise, so I know it is not hardware
related.)
I am going to assume it was related to the vbox install failures and move
on; I do not have the time to dig into it.
Thanks,


On Thu, Feb 28, 2013 at 2:41 PM, Heinrich van Riel <
heinrich.vanr...@gmail.com> wrote:

> It had the same behavior with 151a5 and 151a7. It is not really doing
> anything. Come to think of it, the problem only started after attempting to
> install VirtualBox. I only ran the server for 1 or 2 hours before trying
> to install it; it locks up when loading kernel modules. I tried a few
> different releases.
> I re-installed it last night and did not attempt to install vbox. It is
> configured as an iSCSI target and has been hit by VMware quite a bit since
> last night (installing test machines), and it has not had the problem so
> far. The strange thing is that I did go back and uninstall vbox each time
> after the failed attempt (for a5 & a7) and it locked up every few hours.
> iSCSI was not configured; these were plain installs with failed attempts of vbox.
> Currently it appears to be all good. Load will increase quite a bit in
> the next few days.
> I will respond with the results. I am also still waiting for the SAS disks
> for the rpool, so a reinstall will happen at that point.
>
>
>
> On Wed, Feb 27, 2013 at 9:33 PM, Bob Friesenhahn <
> bfrie...@simple.dallas.tx.us> wrote:
>
>> On Wed, 27 Feb 2013, Heinrich van Riel wrote:
>>
>>> I am using the same board with 20 disks. I seem to have some stability
>>> issues. All are SAS disks except for the
>>> 2x rpool disks (SATA), which I have connected to the backplane in the
>>> back of the chassis, since according to the Supermicro documentation
>>> SATA/SAS should not be mixed in the same backplane. The board is in
>>> the 6047R-E1R36L storage server from Supermicro.
>>> The system would lock up after a few hours and also as soon as it tries to
>>> load the kernel modules for VirtualBox during install.
>>>
>>
>> What version of OpenIndiana are you using (uname -v)?
>>
>> Is the system doing anything significant (other than starting VirtualBox)
>> when it locks up?
>>
>> Bob
>> --
>> Bob Friesenhahn
>> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
>> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>>
>>


Re: [OpenIndiana-discuss] using LSI 2308-IT onboard supermicro X9DRD-7LN4F-JBOD, openindiana not loading drivers? (OpenIndiana-discuss Digest, Vol 29, Issue 43)

2013-02-28 Thread Heinrich van Riel
It had the same behavior with 151a5 and 151a7. It is not really doing
anything. Come to think of it, the problem only started after attempting to
install VirtualBox. I only ran the server for 1 or 2 hours before trying
to install it; it locks up when loading kernel modules. I tried a few
different releases.
I re-installed it last night and did not attempt to install vbox. It is
configured as an iSCSI target and has been hit by VMware quite a bit since
last night (installing test machines), and it has not had the problem so
far. The strange thing is that I did go back and uninstall vbox each time
after the failed attempt (for a5 & a7) and it locked up every few hours.
iSCSI was not configured; these were plain installs with failed attempts of vbox.
Currently it appears to be all good. Load will increase quite a bit in
the next few days.
I will respond with the results. I am also still waiting for the SAS disks
for the rpool, so a reinstall will happen at that point.



On Wed, Feb 27, 2013 at 9:33 PM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Wed, 27 Feb 2013, Heinrich van Riel wrote:
>
>> I am using the same board with 20 disks. I seem to have some stability
>> issues. All are SAS disks except for the
>> 2x rpool disks (SATA), which I have connected to the backplane in the
>> back of the chassis, since according to the Supermicro documentation
>> SATA/SAS should not be mixed in the same backplane. The board is in
>> the 6047R-E1R36L storage server from Supermicro.
>> The system would lock up after a few hours and also as soon as it tries to
>> load the kernel modules for VirtualBox during install.
>>
>
> What version of OpenIndiana are you using (uname -v)?
>
> Is the system doing anything significant (other than starting VirtualBox)
> when it locks up?
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
>
>


Re: [OpenIndiana-discuss] using LSI 2308-IT onboard supermicro X9DRD-7LN4F-JBOD, openindiana not loading drivers? (OpenIndiana-discuss Digest, Vol 29, Issue 43)

2013-02-27 Thread Heinrich van Riel
Hi,

I am using the same board with 20 disks. I seem to have some stability
issues. All are SAS disks except for the
2x rpool disks (SATA), which I have connected to the backplane in the
back of the chassis, since according to the Supermicro documentation
SATA/SAS should not be mixed in the same backplane. The board is in
the 6047R-E1R36L storage server from Supermicro.
The system would lock up after a few hours and also as soon as it tries to
load the kernel modules for VirtualBox during install.
Not that I need vbox on it; I was just testing.
Any input from others around stability would be great. I did order 2x
SAS disks for the OS, since the SATA disks are very old and it seems the
controller is not happy with them. I am fairly sure that is the issue at this point.

Thanks,

On Wed, Jan 2, 2013 at 7:38 AM, Jim Klimov  wrote:

> On 2013-01-02 09:16, Ong Yu-Phing wrote:
>
>> great, that worked, albeit slightly differently, so FYI to contribute to
>> the list and knowledge:
>>
>> Once I had the hint about /etc/driver_aliases, I man'd to find out more,
>> then checked using prtconf, noticed the card was referenced as
>> pci15d9,69l (rather than pciex1000,86), so used update_drv -a -i
>> "pci15d9,69l" mpt_sas, and the controller+disks were found.
>>
>
>
> Question to the list: in this case, shouldn't "pciex" work as well as
> (or better than - more native) "pci"? I.e. what is the difference:
> # update_drv -a -i "pci15d9,691" mpt_sas
> # update_drv -a -i "pciex15d9,691" mpt_sas
>
> Thanks for insights,
> //Jim
>
>
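A rough sketch of the procedure described above, for anyone landing here
later (the 15d9,691 ID is the one quoted in this thread; check how your own
card identifies itself with prtconf before adding an alias):

    prtconf -pv | grep -i 15d9                 # see how the HBA identifies itself
    update_drv -a -i "pci15d9,691" mpt_sas     # bind that ID to the mpt_sas driver
    grep mpt_sas /etc/driver_aliases           # the alias should now be recorded here
    devfsadm -Cv                               # rebuild /dev links if needed

Whether the pciex form would also work (Jim's question) is left open here;
the pci alias is the one reported to work in this thread.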


Re: [OpenIndiana-discuss] ZVOL (et al) /device node access rights

2012-10-15 Thread Heinrich van Riel
I have been looking at changing from file to zvol, not for performance
reasons.

A few questions on using ZVOLs:

1. Is it as stable as using a file for the disk?
2. Is it possible that snapshot sizes could be smaller vs. a file? (This is
my main reason: snapshot replication.)
3. Can this be used with KVM? I am thinking about dropping VBox OSE once I
have all AMD systems replaced. (See the sketch below.)
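A rough sketch of what the switch looks like on the VirtualBox side, assuming
a pool named tank and the raw-vmdk syntax (a KVM guest would point at the
same /dev/zvol path directly):

    zfs create -V 40g tank/vms/win2008-disk0
    # if the VM runs as a non-root user, the device node also needs
    # chown or an ACL; see the discussion further down this thread
    VBoxManage internalcommands createrawvmdk \
        -filename /vms/win2008-disk0.vmdk \
        -rawdisk /dev/zvol/rdsk/tank/vms/win2008-disk0

Snapshot size depends on how much of the volume the guest rewrites, not on
whether the backing store is a zvol or a file, so gains there are not
guaranteed.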

Thanks

On Mon, Oct 15, 2012 at 11:49 AM, Andrej Javoršek wrote:

> Hello,
> I'm running VB guests as a normal (unprivileged) user and I have the
> impression that ownership (I'm using chown on zvols) is not always lost
> during reboot.
> The other part of my comment was a joke at my own expense, since restarts
> of the host OS are rare and I often forget to check if guests came up
> properly.
> But I completely agree with you that the behaviour is not favorable.
>
> Regards Andrej
>
> On Mon, Oct 15, 2012 at 4:39 PM, Jim Klimov  wrote:
>
> > I am not sure I understood your comment?..
> >
> >
> > 2012-10-15 11:14, Andrej Javoršek wrote:
> >
> >> Hello,
> >> I have impression that it is not always necessary to chown raw zvol's.
> >>
> >
> > It seems necessary when we need to allow a non-root user to use the
> > zvol directly, such as a backing store for his VM's virtual disk.
> >
> >
> >  It happens occasionally on some zvols (and only when I initiate reboot
> >> and forget about it)  :)
> >>
> >
> > What happens? The need to chown?
> >
> > Sorry for misunderstandings if any,
> > //Jim
> >
> >
> >
> >
> >> Regards Andrej
> >>
> >> On Sun, Oct 14, 2012 at 3:08 PM, Jim Klimov  wrote:
> >>
> >>  While updating the Wiki page on virtualization, Edward Ned Harvey
> >>> wrote of, and brought to my attention, this peculiar situation:
> >>>
> >>> A VirtualBox VM can use delegated zvols as "dsk" or "rdsk" devices
> >>> on the host, just like it can use delegated raw disks or partitions,
> >>> likely iSCSI volumes and other block devices. According to Edward,
> >>> block devices yield better performance than VDI files for VM disks.
> >>> A VM can be executed by an unprivileged user, and thus the device
> >>> node needs to be RW accessible to that non-root user (whom and why
> >>> to trust - that's the admin's problem, OS should not limit that).
> >>>
> >>> So, the problem detected with ZVOLs (and I expect it can have a
> >>> wider range on other devices) is that the ownership of the device
> >>> node for a zvol is forgotten upon reboot or other pool reimport.
> >>> That is, the node used by a VM should be chown'ed upon every VM
> >>> startup. That's inconvenient, so to say.
> >>>
> >>> I played more with this and found that I can also set ACLs with
> >>> /bin/chmod on device nodes, and that is even remembered across
> >>> reboots, however with /dev/zvol/*dsk/pool/vol being a dynamically
> >>> assigned symlink like /devices/pseudo/zfs@0:4(,raw) there is a
> >>> problem: the symlink and device node is created when I look at
> >>> it (i.e. upon first "ls" or another access to the /dev/zvol/...
> >>> object), and the device node occupies the first available number.
> >>> The /devices filesystem seems to remember ACL entries (but not
> >>> ownerships) across reboots only in conjunction with its object
> >>> names, so upon each reboot (reimport) of the pool, the same
> >>> device node name can get assigned to different zvols.
> >>>
> >>> This is not only "useless" in terms of stably providing access
> >>> to certain devices for certain users, but also harmful as after
> >>> a reboot an unexpected user (among those earlier trusted) can
> >>> gain access to incorrect devices (and might even enforce that
> >>> somehow, by being first to access the device at the correct
> >>> moment) and cause DoS or intentional illicit access to other
> >>> users' data.
> >>>
> >>> So here is the picture "as is". I am not sure what exactly to ask,
> >>> so I guess it's a call for opinions on how the situation can be
> >>> improved, in terms of remembering correct ownerships and ACLs for
> >>> those devices (not nodes) that the rights were set for, in order
> >>> to both increase usability and security of non-root device access.
> >>>
> >>> In the particular case of ZVOL devices, I guess attributes can
> >>> be added to the ZVOLs that would hold the POSIX and ACL access
> >>> rights and owner:group info (do people agree that is a worthy RFE?).
> >>>
> >>> For non-zfs devices like local disk or iscsi or USB - I am not sure
> >>> if the problem exists the same way (not tested) or how it can be
> >>> addressed if it exists (some config file for devfs?)
> >>>
> >>> Thanks,
> >>> //Jim Klimov
> >>>
> >>
> >
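A minimal sketch of the two approaches discussed above (the pool, volume and
user names are examples; the NFSv4-style chmod ACL syntax is assumed to apply
to the device node as Jim describes):

    chown vboxuser /dev/zvol/rdsk/tank/vms/win2008-disk0     # forgotten after reboot/reimport
    /bin/chmod A+user:vboxuser:read_data/write_data:allow \
        /dev/zvol/rdsk/tank/vms/win2008-disk0                # remembered, but tied to the
                                                             # node name, not the zvol

As Jim points out, the ACL sticking to the dynamically assigned node name
rather than the zvol itself is what makes this fragile across reboots.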


Re: [OpenIndiana-discuss] Zfs stability "Scrubs"

2012-10-15 Thread Heinrich van Riel
On Mon, Oct 15, 2012 at 6:21 PM, Jason Matthews  wrote:

>
>
> From: heinrich.vanr...@gmail.com [mailto:heinrich.vanr...@gmail.com]
>
>
> > My point is most high-end storage units have some form of data
> > verification process that is active all the time.
>
> As does ZFS. The blocks are checksummed on each read. Assuming you have
> mirrors or parity redundancy, the misbehaving block is corrected,
> reallocated, etc.
>
Right, I understand ZFS checks data on each read; my point is checking the
disk or data periodically.


> > In my opinion scrubs should be considered depending on the importance
> > of data and the frequency based on what type of raidz, change rates
> > and disk type used.
>
> One point of scrubs is to verify the data that you don't normally read.
> Otherwise, the errors would be found in real time upon the next read.
>

Understood; if full backups are executed weekly/monthly, all of the data gets
read (and therefore checked) anyway, so no separate scrub is required.


> > Perhaps in the future ZFS will have the ability to limit resource
> > allocation when scrubbing, like with BV where it can be set. Rebuild
> > priority can also be set.
>
> There are tunables for this.
>
Thanks, did not know; I will research. It had a fairly heavy impact the other
day replacing a disk.
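For reference, a rough sketch of the kind of throttling tunables being
referred to (illumos tunable names of this era are assumed; the values are
examples only, not recommendations):

    echo "zfs_scrub_delay/W0t10"    | mdb -kw   # more idle ticks between scrub I/Os
    echo "zfs_resilver_delay/W0t4"  | mdb -kw   # same idea for resilver
    echo "zfs_top_maxinflight/W0t8" | mdb -kw   # cap scan I/Os queued per top-level vdev

    # or persistently, in /etc/system:
    #   set zfs:zfs_scrub_delay = 10

Raising the delays trades scrub/resilver completion time for lower impact on
foreground I/O.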

>
> > Also some high end controllers have "port" verify for each
> > disk (media read) when using their integrated raid that runs
> > periodically. Since in the world of ZFS it is recommended to use
> > JBOD I see it as more than just the filesystem. I have never deployed
> > a system containing mission critical data using filesystem raid
> > protection other than with ZFS since there is no protection in them and
> > I would much rather bank on the controller.
>
>
> Unfortunately my parser was unable to grok this. Seems like you would
> prefer
> a raid controller.
>


Sorry; it boils down to this: if ZFS is not an option and the data is
important, I use a RAID controller.
In fact I do not like to be tied to a specific controller; ZFS gives me the
freedom to change at any point.

>
> j.
>


Re: [OpenIndiana-discuss] Memory usage concern

2012-10-15 Thread Heinrich van Riel
Thanks for the responses.

Spot on, in this post:
http://openindiana.org/pipermail/openindiana-discuss/2012-September/009788.html

arc_meta_used =   423 MB

2 days later

arc_meta_used =   827 MB
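A quick way to keep watching this counter over time (standard illumos
kstat/mdb names assumed):

    kstat -p zfs:0:arcstats:arc_meta_used
    kstat -p zfs:0:arcstats:arc_meta_limit
    echo "::arc" | mdb -k | grep meta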

thanks,

On Fri, Oct 12, 2012 at 12:32 PM, Roel_D  wrote:

> Hmz, I'm running multiple Windows 2003 servers within VirtualBox running
> on Solaris 10 hosts. Some Windows servers are online for almost a year now.
>
> The VHDs are stored on a ZFS mirror and I never had any out-of-memory
> errors.
>
> The Solaris servers also serve GlassFish, MySQL Cluster and many, many more
> services from zones.
>
> Running out of memory can IMHO only be caused by a programming error in the
> VirtualBox version.
>
> ZFS will not cause an out-of-memory condition.
>
> Kind regards,
>
> The out-side
>
> On 12 Oct. 2012 at 18:13, "Udo Grabowski (IMK)" <
> udo.grabow...@kit.edu> wrote:
>
> > On 12/10/2012 16:10, Heinrich van Riel wrote:
> >>
> >> My concern is as follows: when I 1st start the system and VMs there is
> >> 8GB of free memory. This number keeps decreasing by about 300MB every
> >> 10 hours.
> >>
> >> OI - 10GB max allocated to ZFS ARC (only setting change from default
> >> install)
> >> 
> >> At the current rate there will be no memory free in 10 days.
> >>
> >> Am I concerned for no reason?
> >
> > This can impact your other work to do on that machine severely.
> > However, if that is only a fileserver, ZFS will manage to keep
> > some memory available so that the machine will not swap if
> > nothing else interferes. Insofar, Damian's comments on unused
> > RAM are correct. But if you know you need space for something
> > else often, see my remedies for this problem in this thread:
> >
> > <http://openindiana.org/pipermail/openindiana-discuss/2012-September/009788.html>
> >
> > --
> > Dr. Udo Grabowski    Inst. f. Meteorology a. Climate Research IMK-ASF-SAT
> > www-imk.fzk.de/asf/sat/grabowski/  www.imk-asf.kit.edu/english/sat.php
> > KIT - Karlsruhe Institute of Technology    http://www.kit.edu
> > Postfach 3640, 76021 Karlsruhe, Germany   T:(+49)721 608-26026 F:-926026
> >


[OpenIndiana-discuss] Memory usage concern

2012-10-12 Thread Heinrich van Riel
Hello everyone,

I have a question around memory utilization.
Recently we had a system that started to lock up daily and we had to swap
it out with a lab system running 151a5 and Virtualbox.

lab system: 2x AMD 6 core, 32GB Mem. (no KVM support for AMD)
The lab system used to run an Exchange 2010 lab environment, all Win2008R2
(1x domain controller, 2x DB nodes in a DAG, 2x CAS/HUB nodes in an array,
1x Exchange 2007 DB and 1x Exchange 2007 CAS).
The system ran without issues, granted it was not doing much.
Now the system has 2 VMs: 1x Win2008 server running SQL & SharePoint with
about 10 users connecting, and a Windows 2008R2 terminal server that is not
doing much.

I have 8GB memory allocated to the SQL server and 4GB to the terminal
server in virtualbox.
Since we cannot afford any more lockups after the two weeks of issues on
the prod system, I need to keep an eye on things.

My concern is as follows: when I 1st start the system and VMs there is 8GB
of free memory. This number keeps decreasing by about 300MB every 10 hours.

OI - 10GB max allocated to ZFS ARC (only setting change from default
install)
VBOX - No page fusion or ballooning configured.
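For reference, on illumos a cap like this is typically set in /etc/system; a
minimal sketch assuming a 10GB limit (takes effect at the next boot):

    set zfs:zfs_arc_max = 0x280000000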

It seems the memory is going to the kernel.
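The page summaries below come from the kernel debugger; a minimal sketch of
how to reproduce them, assuming mdb is installed:

    echo "::memstat" | mdb -k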

Page Summary              Pages        MB   %Tot
------------          ---------  --------   ----
Kernel                    623041      2433     7%
ZFS File Data            2587928     10109    31%
Anon                       95090       371     1%
Exec and libs               4349        16     0%
Page cache               3184468     12439    38%
Free (cachelist)            9890        38     0%
Free (freelist)          1881628      7350    22%


A few hours later:

Page Summary              Pages        MB   %Tot
------------          ---------  --------   ----
Kernel                    632236      2469     8%
ZFS File Data            2587980     10109    31%
Anon                       94909       370     1%
Exec and libs               4326        16     0%
Page cache               3184462     12439    38%
Free (cachelist)           10378        40     0%
Free (freelist)          1872103      7312    22%


At the current rate there will be no memory free in 10 days.

Am I concerned for no reason?

Any feedback on how stable OI is with VirtualBox would also be appreciated.
I truly do not want to reinstall the Windows system for quite some time if
possible, since we do not make use of USB.


Thanks,