Re: Any success stories for HAST + ZFS?
> Everything is detected correctly, everything comes up correctly. See
> a new option (reload) in the RC script for hast.

Same here - have patched the master database machines, all came up fine, everything running perfectly, have flip-flopped between the two machines with no ill effects whatsoever, and all looking very good.

cheers,

-pete.
Re: Any success stories for HAST + ZFS?
On Mon, 11 Apr 2011 11:26:15 -0700 Freddie Cash wrote:

FC> On Sun, Apr 10, 2011 at 12:36 PM, Mikolaj Golub wrote:
>> On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:
>>
>> FC> Once the deadlock patches above are MFC'd to -STABLE, I can do an
>> FC> upgrade cycle and test them.
>>
>> Committed to STABLE.

FC> Updated src tree to r220537. Recompiled world, kernel, etc.
FC> Installed world, kernel, etc. ZFSv28 patch was not affected.
FC> Everything is detected correctly, everything comes up correctly. See
FC> a new option (reload) in the RC script for hast.
FC> Can create/change role for 24 hast devices simultaneously.
FC> Can switch between master/slave modes.
FC> Have 5 rsyncs running in parallel without any issues, transferring
FC> 80-120 Mbps over the network (just under 100 Mbps seems to be the
FC> average right now).
FC> Switching roles while the rsyncs are running succeeds without
FC> deadlocking (obviously, rsync complains a whole bunch while the switch
FC> happens as the pool disappears out from underneath it, but it picks up
FC> again when the pool is back in place).
FC> Hitting the reset switch on the box while the rsyncs are running
FC> doesn't affect the hast devices or the pool, beyond losing the last 5
FC> seconds of writes.
FC> It's only been a couple of hours of testing and hammering, but so far
FC> things are much more stable/performant than before.

Cool! Thanks for reporting!

FC> Anything else I should test?

Nothing particular, but any tests and reports are appreciated. E.g. some of the recent features Pawel has added are checksum and compression. You could try different options and compare :-)

--
Mikolaj Golub
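For reference, such a test configuration might look like the fragment below. This is a sketch only: the checksum and compression keywords (with values such as crc32/sha256 and hole/lzf) are as documented in hast.conf(5) of this vintage, and the resource name reuses disk-a1 from elsewhere in this thread - verify against the man page before use:

    resource disk-a1 {
        # Assumed option values; see hast.conf(5).
        checksum sha256
        compression lzf
        local /dev/label/disk-a1
        on omegadrive {
            remote tcp4://10.20.0.102
        }
        on alphadrive {
            remote tcp4://10.20.0.101
        }
    }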
Re: Any success stories for HAST + ZFS?
On Sun, Apr 10, 2011 at 12:36 PM, Mikolaj Golub wrote:
> On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:
>
> FC> Once the deadlock patches above are MFC'd to -STABLE, I can do an
> FC> upgrade cycle and test them.
>
> Committed to STABLE.

Updated src tree to r220537. Recompiled world, kernel, etc. Installed world, kernel, etc. ZFSv28 patch was not affected.

Everything is detected correctly, everything comes up correctly. See a new option (reload) in the RC script for hast.

Can create/change role for 24 hast devices simultaneously. Can switch between master/slave modes.

Have 5 rsyncs running in parallel without any issues, transferring 80-120 Mbps over the network (just under 100 Mbps seems to be the average right now). Switching roles while the rsyncs are running succeeds without deadlocking (obviously, rsync complains a whole bunch while the switch happens as the pool disappears out from underneath it, but it picks up again when the pool is back in place).

Hitting the reset switch on the box while the rsyncs are running doesn't affect the hast devices or the pool, beyond losing the last 5 seconds of writes.

It's only been a couple of hours of testing and hammering, but so far things are much more stable/performant than before.

Anything else I should test?

--
Freddie Cash
fjwc...@gmail.com
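For reference, creating all 24 providers and switching them to primary, with the disk-a1 through disk-d6 naming used elsewhere in this thread, could be scripted roughly as follows (a sketch, not the actual procedure used above):

    #!/bin/sh
    # Create each of the 24 hast providers, then make this node primary.
    for set in a b c d; do
        for n in 1 2 3 4 5 6; do
            hastctl create disk-${set}${n}
        done
    done
    hastctl role primary all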
Re: Any success stories for HAST + ZFS?
On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

FC> Once the deadlock patches above are MFC'd to -STABLE, I can do an
FC> upgrade cycle and test them.

Committed to STABLE.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Tue, Apr 5, 2011 at 5:05 AM, Mikolaj Golub wrote:
> On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:
>
> FC> On Sat, Apr 2, 2011 at 1:44 AM, Pawel Jakub Dawidek wrote:
> >>
> >> I just committed a fix for a problem that might look like a deadlock.
> >> With trociny@ patch and my last fix (to GEOM GATE and hastd) do you
> >> still have any issues?
>
> FC> Just to confirm, this is commit r220264, 220265, 220266 to -CURRENT?
>
> Yes, r220264 and 220266. As it is stated in the commit log MFC is planned
> after 1 week.

Okay. I'll keep an eye out next week for the MFC of those patches to hit -STABLE, and do an upgrade/test cycle after that point.

--
Freddie Cash
fjwc...@gmail.com
Re: Any success stories for HAST + ZFS?
On Mon, 4 Apr 2011 11:08:16 -0700 Freddie Cash wrote:

FC> On Sat, Apr 2, 2011 at 1:44 AM, Pawel Jakub Dawidek wrote:
>>
>> I just committed a fix for a problem that might look like a deadlock.
>> With trociny@ patch and my last fix (to GEOM GATE and hastd) do you
>> still have any issues?

FC> Just to confirm, this is commit r220264, 220265, 220266 to -CURRENT?

Yes, r220264 and 220266. As it is stated in the commit log MFC is planned after 1 week.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Sat, Apr 2, 2011 at 1:44 AM, Pawel Jakub Dawidek wrote:
> On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote:
>> [Not sure which list is most appropriate since it's using HAST + ZFS
>> on -RELEASE, -STABLE, and -CURRENT. Feel free to trim the CC: on
>> replies.]
>>
>> I'm having a hell of a time making this work on real hardware, and am
>> not ruling out hardware issues as yet, but wanted to get some
>> reassurance that someone out there is using this combination (FreeBSD
>> + HAST + ZFS) successfully, without kernel panics, without core dumps,
>> without deadlocks, without issues, etc. I need to know I'm not
>> chasing a dead rabbit.
>
> I just committed a fix for a problem that might look like a deadlock.
> With trociny@ patch and my last fix (to GEOM GATE and hastd) do you
> still have any issues?

Just to confirm, this is commit r220264, 220265, 220266 to -CURRENT?

Looking through the commit logs, I don't see any of these MFC'd to -STABLE yet, so I can't test them directly. The storage box that was having the issues is running 8-STABLE r219754 at the moment (with ZFSv28 and Mikolaj's ggate patches).

I see there have been a lot of hast/ggate-related MFCs in the past week, but they don't include the deadlock patches.

Once the deadlock patches above are MFC'd to -STABLE, I can do an upgrade cycle and test them. I do have the previous 9-CURRENT install saved, just nothing to run it on atm.

--
Freddie Cash
fjwc...@gmail.com
Re: Any success stories for HAST + ZFS?
On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote:
> [Not sure which list is most appropriate since it's using HAST + ZFS
> on -RELEASE, -STABLE, and -CURRENT. Feel free to trim the CC: on
> replies.]
>
> I'm having a hell of a time making this work on real hardware, and am
> not ruling out hardware issues as yet, but wanted to get some
> reassurance that someone out there is using this combination (FreeBSD
> + HAST + ZFS) successfully, without kernel panics, without core dumps,
> without deadlocks, without issues, etc. I need to know I'm not
> chasing a dead rabbit.

I just committed a fix for a problem that might look like a deadlock. With trociny@ patch and my last fix (to GEOM GATE and hastd) do you still have any issues?

--
Pawel Jakub Dawidek    http://www.wheelsystems.com
FreeBSD committer      http://www.FreeBSD.org
Am I Evil? Yes, I Am!  http://yomoli.com
Re: Any success stories for HAST + ZFS?
On Fri, Apr 1, 2011 at 4:22 AM, Pete French wrote:
>> The other 5% of the time, the hastd crashes occurred either when
>> importing the ZFS pool, or when running multiple parallel rsyncs to
>> the pool. hastd was always shown as the last running process in the
>> backtrace onscreen.
>
> This is what I am seeing - did you manage to reproduce this with the patch,
> or does it fix the issue for you? Am doing more tests now, with only a single
> hast device to see if it is stable. Am OK to run without mirroring across
> hast devices for now, but wouldn't like to do so long term!

I have not been able to crash or hang the box since applying Mikolaj's patch. I've tried the following:

- destroy pool
- create pool
- destroy hast providers
- create hast providers
- switch from master to slave via hastctl using "role secondary all"
- switch from slave to master via hastctl using "role primary all"
- switch roles via hast-carp-switch, which does one provider per second
- import/export pool

I've been running 6 parallel rsyncs for the past 48 hours, getting a consistent 200 Mbps of transfers, with just under 2 TB of deduped data in the pool, without any lockups.

So far, so good.

--
Freddie Cash
fjwc...@gmail.com
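Assembled from the commands listed above, a complete failover between the two nodes would look roughly like the sketch below; the pool name "tank" is assumed, not taken from the thread:

    # On the current master: hand the providers over cleanly.
    zpool export tank
    hastctl role secondary all

    # On the new master: take over the providers and bring the pool up.
    hastctl role primary all
    zpool import tank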
Re: Any success stories for HAST + ZFS?
> This looks like a different problem. If you have this again please provide the
> output of 'procstat -kka'.

Will do...

-pete.
Re: Any success stories for HAST + ZFS?
On Fri, 01 Apr 2011 11:40:11 +0100 Pete French wrote:

>> Yes, you may hit it only on hast devices creation. The workaround is to avoid
>> using 'hastctl role primary all', start providers one by one instead.

PF> Interesting to note that I just hit a lockup in hast (the discs froze
PF> up - could not run hastctl or zpool import, and could not kill
PF> them). I have two hast devices instead of one, but I am starting them
PF> individually instead of using 'all'. The code includes all the latest
PF> patches which have gone into STABLE over the last few days, none of which
PF> look particularly controversial!

PF> I haven't tried your patch yet, nor been able to reproduce the lockup, but
PF> thought you might be interested to know that I also had problems with
PF> multiple providers.

This looks like a different problem. If you have this again please provide the output of 'procstat -kka'.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
> The other 5% of the time, the hastd crashes occurred either when
> importing the ZFS pool, or when running multiple parallel rsyncs to
> the pool. hastd was always shown as the last running process in the
> backtrace onscreen.

This is what I am seeing - did you manage to reproduce this with the patch, or does it fix the issue for you? Am doing more tests now, with only a single hast device to see if it is stable. Am OK to run without mirroring across hast devices for now, but wouldn't like to do so long term!

-pete.
Re: Any success stories for HAST + ZFS?
> Yes, you may hit it only on hast devices creation. The workaround is to avoid
> using 'hastctl role primary all', start providers one by one instead.

Interesting to note that I just hit a lockup in hast (the discs froze up - could not run hastctl or zpool import, and could not kill them). I have two hast devices instead of one, but I am starting them individually instead of using 'all'. The code includes all the latest patches which have gone into STABLE over the last few days, none of which look particularly controversial!

I haven't tried your patch yet, nor been able to reproduce the lockup, but thought you might be interested to know that I also had problems with multiple providers.

cheers,

-pete.
Re: Any success stories for HAST + ZFS?
On Sun, Mar 27, 2011 at 5:16 AM, Mikolaj Golub wrote:
> On Sat, 26 Mar 2011 10:52:08 -0700 Freddie Cash wrote:
>
> FC> hastd backtrace is here:
> FC> http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png
>
> It is not a hastd crash, but a kernel crash triggered by hastd process.

Ah, interesting.

> I am not sure I got the same crash as you but apparently the race is possible
> in g_gate on device creation.

95% of the time that it would crash would be when creating the /dev/hast/* devices (switching to primary role). Most of the crashes happened when doing "hastctl role primary all", but would occasionally happen when doing it manually for each resource. Creating the resources by hand, one every 2 seconds or so, would usually create them all without crashing.

The other 5% of the time, the hastd crashes occurred either when importing the ZFS pool, or when running multiple parallel rsyncs to the pool. hastd was always shown as the last running process in the backtrace onscreen.

> I got the following crash starting many hast providers simultaneously:
>
> fault virtual address = 0x0
>
> #8  0xc0c11adc in calltrap () at /usr/src/sys/i386/i386/exception.s:168
> #9  0xc086ac6b in g_gate_ioctl (dev=0xc6a24300, cmd=3374345472, addr=0xc9fec000 "\002", flags=3, td=0xc7ff0b80) at /usr/src/sys/geom/gate/g_gate.c:410
> #10 0xc0853c5b in devfs_ioctl_f (fp=0xc9b9e310, com=3374345472, data=0xc9fec000, cred=0xc8c9c200, td=0xc7ff0b80) at /usr/src/sys/fs/devfs/devfs_vnops.c:678
> #11 0xc09210cd in kern_ioctl (td=0xc7ff0b80, fd=3, com=3374345472, data=0xc9fec000 "\002") at file.h:262
> #12 0xc0921254 in ioctl (td=0xc7ff0b80, uap=0xf5edbcec) at /usr/src/sys/kern/sys_generic.c:679
> #13 0xc0916616 in syscallenter (td=0xc7ff0b80, sa=0xf5edbce4) at /usr/src/sys/kern/subr_trap.c:315
> #14 0xc0c2b9ff in syscall (frame=0xf5edbd28) at /usr/src/sys/i386/i386/trap.c:1086
> #15 0xc0c11b71 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:266
>
> Or just creating many ggate devices simultaneously:
>
> for i in `jot 100`; do
>     ./ggiocreate $i &
> done
>
> ggiocreate.c is attached.
>
> In my case the kernel crashes in g_gate_create() when checking for name
> collisions in strcmp():
>
>     /* Check for name collision. */
>     for (unit = 0; unit < g_gate_maxunits; unit++) {
>         if (g_gate_units[unit] == NULL)
>             continue;
>         if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
>             continue;
>         mtx_unlock(&g_gate_units_lock);
>         mtx_destroy(&sc->sc_queue_mtx);
>         free(sc, M_GATE);
>         return (EEXIST);
>     }
>
> I think the issue is the following. When preparing sc we take
> g_gate_units_lock, check for name collision, fill sc fields except
> sc->sc_provider, and register sc in g_gate_units[unit]. sc_provider is filled
> later, when g_gate_units_lock is released. So the scenario is possible:
>
> 1) Thread A registers sc in g_gate_units[unit] with
> g_gate_units[unit]->sc_provider still null and releases g_gate_units_lock.
>
> 2) Thread B traverses g_gate_units[] when checking for name collision and
> crashes accessing g_gate_units[unit]->sc_provider->name.
>
> The attached patch fixes the issue in my case.

Patch applied cleanly to 8-STABLE with the ZFSv28 patch also applied. Just to be safe, did a full buildworld/kernel cycle, running a GENERIC kernel.

So far, I have not been able to produce a crash in hastd, through several reboots, switching from primary to secondary and back, and just switching from primary to init and back. So far, so good.

Now to see if I can reproduce any of the ZFS crashes I had earlier.
--
Freddie Cash
fjwc...@gmail.com
Re: Any success stories for HAST + ZFS?
On Mon, 28 Mar 2011 10:47:22 +0100 Pete French wrote:

>> It is not a hastd crash, but a kernel crash triggered by hastd process.
>>
>> I am not sure I got the same crash as you but apparently the race is
>> possible in g_gate on device creation.
>>
>> I got the following crash starting many hast providers simultaneously:

PF> This is very interesting to me - my successful ZFS+HAST only had
PF> a single drive, but in my new setup I am intending to use two
PF> HAST processes and then mirror across them under ZFS, so I am
PF> likely to hit this bug. Are the processes stable once launched?

Yes, you may hit it only on hast devices creation. The workaround is to avoid using 'hastctl role primary all', start providers one by one instead.

--
Mikolaj Golub
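A sketch of that workaround, promoting each provider individually with a short pause rather than using 'all' (the resource names reuse the disk-a1/disk-a2 examples from elsewhere in the thread):

    #!/bin/sh
    # Promote providers one at a time to avoid the g_gate creation race.
    for res in disk-a1 disk-a2; do
        hastctl role primary ${res}
        sleep 1
    done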
Re: Any success stories for HAST + ZFS?
> It is not a hastd crash, but a kernel crash triggered by hastd process.
>
> I am not sure I got the same crash as you but apparently the race is possible
> in g_gate on device creation.
>
> I got the following crash starting many hast providers simultaneously:

This is very interesting to me - my successful ZFS+HAST only had a single drive, but in my new setup I am intending to use two HAST processes and then mirror across them under ZFS, so I am likely to hit this bug. Are the processes stable once launched?

I don't have a system on which to try your patch at the moment, but will do so when I get the opportunity!
Re: Any success stories for HAST + ZFS?
On Sun, 27 Mar 2011 15:16:15 +0300 Mikolaj Golub wrote to Freddie Cash:

MG> The attached patch fixes the issue in my case.

The patch is committed to current.

--
Mikolaj Golub
Re: Any success stories for HAST + ZFS?
On Sat, 26 Mar 2011 10:52:08 -0700 Freddie Cash wrote:

FC> hastd backtrace is here:
FC> http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png

It is not a hastd crash, but a kernel crash triggered by hastd process.

I am not sure I got the same crash as you but apparently the race is possible in g_gate on device creation.

I got the following crash starting many hast providers simultaneously:

fault virtual address = 0x0

#8  0xc0c11adc in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#9  0xc086ac6b in g_gate_ioctl (dev=0xc6a24300, cmd=3374345472, addr=0xc9fec000 "\002", flags=3, td=0xc7ff0b80) at /usr/src/sys/geom/gate/g_gate.c:410
#10 0xc0853c5b in devfs_ioctl_f (fp=0xc9b9e310, com=3374345472, data=0xc9fec000, cred=0xc8c9c200, td=0xc7ff0b80) at /usr/src/sys/fs/devfs/devfs_vnops.c:678
#11 0xc09210cd in kern_ioctl (td=0xc7ff0b80, fd=3, com=3374345472, data=0xc9fec000 "\002") at file.h:262
#12 0xc0921254 in ioctl (td=0xc7ff0b80, uap=0xf5edbcec) at /usr/src/sys/kern/sys_generic.c:679
#13 0xc0916616 in syscallenter (td=0xc7ff0b80, sa=0xf5edbce4) at /usr/src/sys/kern/subr_trap.c:315
#14 0xc0c2b9ff in syscall (frame=0xf5edbd28) at /usr/src/sys/i386/i386/trap.c:1086
#15 0xc0c11b71 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:266

Or just creating many ggate devices simultaneously:

    for i in `jot 100`; do
        ./ggiocreate $i &
    done

ggiocreate.c is attached.

In my case the kernel crashes in g_gate_create() when checking for name collisions in strcmp():

	/* Check for name collision. */
	for (unit = 0; unit < g_gate_maxunits; unit++) {
		if (g_gate_units[unit] == NULL)
			continue;
		if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
			continue;
		mtx_unlock(&g_gate_units_lock);
		mtx_destroy(&sc->sc_queue_mtx);
		free(sc, M_GATE);
		return (EEXIST);
	}

I think the issue is the following. When preparing sc we take g_gate_units_lock, check for name collision, fill sc fields except sc->sc_provider, and register sc in g_gate_units[unit]. sc_provider is filled later, when g_gate_units_lock is released. So the scenario is possible:

1) Thread A registers sc in g_gate_units[unit] with g_gate_units[unit]->sc_provider still null and releases g_gate_units_lock.

2) Thread B traverses g_gate_units[] when checking for name collision and crashes accessing g_gate_units[unit]->sc_provider->name.

The attached patch fixes the issue in my case.

--
Mikolaj Golub

Index: sys/geom/gate/g_gate.c
===
--- sys/geom/gate/g_gate.c	(revision 220050)
+++ sys/geom/gate/g_gate.c	(working copy)
@@ -407,13 +407,14 @@ g_gate_create(struct g_gate_ctl_create *ggio)
 	for (unit = 0; unit < g_gate_maxunits; unit++) {
 		if (g_gate_units[unit] == NULL)
 			continue;
-		if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0)
+		if (strcmp(name, g_gate_units[unit]->sc_name) != 0)
 			continue;
 		mtx_unlock(&g_gate_units_lock);
 		mtx_destroy(&sc->sc_queue_mtx);
 		free(sc, M_GATE);
 		return (EEXIST);
 	}
+	sc->sc_name = name;
 	g_gate_units[sc->sc_unit] = sc;
 	g_gate_nunits++;
 	mtx_unlock(&g_gate_units_lock);
@@ -432,6 +433,9 @@ g_gate_create(struct g_gate_ctl_create *ggio)
 	sc->sc_provider = pp;
 	g_error_provider(pp, 0);
 	g_topology_unlock();
+	mtx_lock(&g_gate_units_lock);
+	sc->sc_name = sc->sc_provider->name;
+	mtx_unlock(&g_gate_units_lock);
 
 	if (sc->sc_timeout > 0) {
 		callout_reset(&sc->sc_callout, sc->sc_timeout * hz,
Index: sys/geom/gate/g_gate.h
===
--- sys/geom/gate/g_gate.h	(revision 220050)
+++ sys/geom/gate/g_gate.h	(working copy)
@@ -76,6 +76,7 @@
  * 'P:' means 'Protected by'.
  */
 struct g_gate_softc {
+	char *sc_name;				/* P: (read-only) */
 	int sc_unit;				/* P: (read-only) */
 	int sc_ref;				/* P: g_gate_list_mtx */
 	struct g_provider *sc_provider;		/* P: (read-only) */
@@ -96,7 +97,6 @@ struct g_gate_softc {
 	LIST_ENTRY(g_gate_softc) sc_next;	/* P: g_gate_list_mtx */
 	char sc_info[G_GATE_INFOSIZE];		/* P: (read-only) */
 };
-#define sc_name	sc_provider->geom->name
 
 #define G_GATE_DEBUG(lvl, ...)	do {				\
 	if (g_gate_debug >= (lvl)) {				\
Re: Any success stories for HAST + ZFS?
On Fri, Mar 25, 2011 at 12:55 AM, Pawel Jakub Dawidek wrote:
> On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote:
>> I've tried with FreeBSD 8.2-RELEASE, 8-STABLE, 8-STABLE w/ZFSv28
>> patches, and 9-CURRENT (after the ZFSv28 commit). Things work well
>> until I start hastd. Then either the system locks up, or hastd causes
>> a kernel panic, or hastd dumps core.
>
> The minimum amount of information (as always) would be backtrace from
> the kernel and also hastd backtrace when it coredumps. There is really
> decent logging in hast, so I'm also sure it does log something
> interesting on primary or secondary. Another useful thing would be to
> turn on debugging in hast (single -d option for hastd).
>
> The best you can do is to give me the simplest and quickest procedure to
> reproduce the issue, eg. configure two hast resources, put ZFS mirror on
> top, start rsync /usr/src to the file system on top of hast and switch
> roles. The simpler the better.

FreeBSD 8-STABLE r219754 with the ZFSv28 patches applied.

hast.conf:

    resource disk-a1 {
        local /dev/label/disk-a1
        on omegadrive {
            remote tcp4://10.20.0.102
        }
        on alphadrive {
            remote tcp4://10.20.0.101
        }
    }

    resource disk-a2 {
        local /dev/label/disk-a2
        on omegadrive {
            remote tcp4://10.20.0.102
        }
        on alphadrive {
            remote tcp4://10.20.0.101
        }
    }

The following will crash hastd:

    service hastd onestart
    hastctl create disk-a1
    hastctl create disk-a2
    hastctl role primary all

hastd backtrace is here:
http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png

I'll try running it with -d to see if there's anything interesting there.

Running it with -d and -F, with output to a log file, everything works well using 2 disks. Hmm, running it with all 24 disks, I can't make it crash now. However, I did change the kernel hz from 100 to 1000. I'll see if I can switch it back to 100 and try the tests again using -dF. The backtrace listed above is with kern.hz=100.

--
Freddie Cash
fjwc...@gmail.com
Re: Any success stories for HAST + ZFS?
Hi,

2011/3/24 Freddie Cash:
> The hardware is fairly standard fare:
> - SuperMicro H8DGi-F motherboard
> - AMD Opteron 6100-series CPU (8-cores @ 2.0 GHz)
> - 8 GB DDR3 SDRAM
> - 64 GB Kingston V-Series SSD for the OS install (using ahci(4) and
>   the motherboard SATA controller)
> - 3x SuperMicro AOC-USAS2-8Li SATA controllers with IT firmware
> - 6x 1.5 TB Seagate 7200.11 drives (1x raidz2 vdev)
> - 12x 1.0 TB Seagate 7200.12 drives (2x raidz2 vdev)
> - 6x 0.5 TB WD RE3 drives (1x raidz2 vdev)

Just for info, Sun recommends 1 GB of RAM per terabyte of data. I see here ~16 TB of available data (roughly: the 6x1.5 TB raidz2 vdev gives ~6 TB usable, the two 6x1.0 TB raidz2 vdevs give ~8 TB, and the 6x0.5 TB raidz2 vdev gives ~2 TB), so I would recommend 16 GB for arc_size and 24 or 32 GB for the host.
Re: Any success stories for HAST + ZFS?
On Thu, Mar 24, 2011 at 01:36:32PM -0700, Freddie Cash wrote:
> I've tried with FreeBSD 8.2-RELEASE, 8-STABLE, 8-STABLE w/ZFSv28
> patches, and 9-CURRENT (after the ZFSv28 commit). Things work well
> until I start hastd. Then either the system locks up, or hastd causes
> a kernel panic, or hastd dumps core.

The minimum amount of information (as always) would be a backtrace from the kernel and also a hastd backtrace when it coredumps. There is really decent logging in hast, so I'm also sure it does log something interesting on primary or secondary. Another useful thing would be to turn on debugging in hast (a single -d option for hastd).

The best you can do is to give me the simplest and quickest procedure to reproduce the issue, e.g. configure two hast resources, put a ZFS mirror on top, start an rsync of /usr/src to the file system on top of hast, and switch roles. The simpler the better.

--
Pawel Jakub Dawidek    http://www.wheelsystems.com
FreeBSD committer      http://www.FreeBSD.org
Am I Evil? Yes, I Am!  http://yomoli.com
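A minimal sketch of the procedure Pawel describes; the resource names res0/res1 and the pool name "test" are assumed for illustration, and the resources would also need entries in hast.conf:

    # Bring up two hast resources and mirror them with ZFS.
    service hastd onestart
    hastctl create res0
    hastctl create res1
    hastctl role primary all
    zpool create test mirror /dev/hast/res0 /dev/hast/res1

    # Generate load, then switch roles while it runs.
    rsync -a /usr/src /test &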
Re: Any success stories for HAST + ZFS?
> So, please, someone, somewhere, share a success story, where you're
> using FreeBSD, ZFS, and HAST. Let me know that it does work. I'm
> starting to lose faith in my abilities here. :(

I ran our main database for the old company using ZFS on top of HAST without any problems at all. Had a single HAST disc with a zpool on top of it, and MySQL on top of that. All worked perfectly for us.

Am not running that currently as the company went under and we lost the hardware. But am working for a new business and am about to deploy the same configuration for the main database as it's "tried and tested" as far as I am concerned. Will be slightly different, as I will have a pair of HAST drives and do mirroring over the top with ZFS. But I shall report back how well, or not, it works.

-pete.
Any success stories for HAST + ZFS?
[Not sure which list is most appropriate since it's using HAST + ZFS on -RELEASE, -STABLE, and -CURRENT. Feel free to trim the CC: on replies.]

I'm having a hell of a time making this work on real hardware, and am not ruling out hardware issues as yet, but wanted to get some reassurance that someone out there is using this combination (FreeBSD + HAST + ZFS) successfully, without kernel panics, without core dumps, without deadlocks, without issues, etc. I need to know I'm not chasing a dead rabbit.

In tests using VirtualBox and FreeBSD 8-STABLE from when HAST was first MFC'd, everything worked wonderfully. The HAST-based pool would come up, data would sync to the slave node, fail-over worked nicely, bringing the other box back online as the slave worked, data synced back, etc. It was a thing of beauty.

Now, on real hardware, I cannot get the system to stay online for more than an hour. :( hastd causes kernel panics with "bufwrite: buffer not busy" errors. ZFS pools get corrupted. The system deadlocks (no log messages, no onscreen errors, not even the NumLock key works) at random points.

The hardware is fairly standard fare:
- SuperMicro H8DGi-F motherboard
- AMD Opteron 6100-series CPU (8-cores @ 2.0 GHz)
- 8 GB DDR3 SDRAM
- 64 GB Kingston V-Series SSD for the OS install (using ahci(4) and the motherboard SATA controller)
- 3x SuperMicro AOC-USAS2-8Li SATA controllers with IT firmware
- 6x 1.5 TB Seagate 7200.11 drives (1x raidz2 vdev)
- 12x 1.0 TB Seagate 7200.12 drives (2x raidz2 vdevs)
- 6x 0.5 TB WD RE3 drives (1x raidz2 vdev)

The motherboard BIOS is up-to-date. I do not see any way to update the firmware on the SATA controllers. Using the onboard IPMI-based sensors, CPU, motherboard, and RAM temps and voltages are in the nominal range.

I've tried with FreeBSD 8.2-RELEASE, 8-STABLE, 8-STABLE w/ZFSv28 patches, and 9-CURRENT (after the ZFSv28 commit). Things work well until I start hastd. Then either the system locks up, or hastd causes a kernel panic, or hastd dumps core.

Each harddrive is glabel'd as "disk-a1" through "disk-d6". hast.conf has 24 resources listed, one for each glabel'd device. The pool is created using the /dev/hast/* devices, with disk-a1 through disk-a6 being one raidz2 vdev, and so on through disk-b*, disk-c*, and disk-d*, for a total of 4 raidz2 vdevs of 6 drives each. A fairly standard setup, I would think.

Even using a GENERIC kernel, I can't keep things stable and running.

So, please, someone, somewhere, share a success story, where you're using FreeBSD, ZFS, and HAST. Let me know that it does work. I'm starting to lose faith in my abilities here. :( Or point out where I'm doing things wrong so I can correct the issues.

Thanks.

--
Freddie Cash
fjwc...@gmail.com
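For concreteness, the pool layout described above corresponds to something like the following command - a sketch only, as the pool name "tank" is assumed and not given in the original message:

    zpool create tank \
        raidz2 /dev/hast/disk-a1 /dev/hast/disk-a2 /dev/hast/disk-a3 \
               /dev/hast/disk-a4 /dev/hast/disk-a5 /dev/hast/disk-a6 \
        raidz2 /dev/hast/disk-b1 /dev/hast/disk-b2 /dev/hast/disk-b3 \
               /dev/hast/disk-b4 /dev/hast/disk-b5 /dev/hast/disk-b6 \
        raidz2 /dev/hast/disk-c1 /dev/hast/disk-c2 /dev/hast/disk-c3 \
               /dev/hast/disk-c4 /dev/hast/disk-c5 /dev/hast/disk-c6 \
        raidz2 /dev/hast/disk-d1 /dev/hast/disk-d2 /dev/hast/disk-d3 \
               /dev/hast/disk-d4 /dev/hast/disk-d5 /dev/hast/disk-d6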