Re: Threaded 6.4 code compiled under 9.0 uses a lot more memory?..

2012-10-30 Thread Jan Mikkelsen
Hi,


On 30/10/2012, at 10:12 PM, Karl Pielorz wrote:

> 
> Hi All,
> 
> Can anyone think of any quick pointers as to why some code originally written 
> under 6.4 amd64 - when re-compiled under 9.0-stable amd64 takes up a *lot* 
> more memory when running?
> 
> The code involved is a sendmail Milter, and a TCP server type program (that 
> runs up a large number of threads [~700] at startup).
> 
> Both were previously compiled with:
> 
> -O2 -pthread -lc_r
> 
> They're now compiled under 9.0-S with just:
> 
> -O2 -pthread

libc_r is a user-mode implementation of pthreads, so under 6.4 there was only 
one actual kernel thread with a stack. You now have ~700 kernel threads at 
startup. Per-thread stack allocation will be different, and that could quite 
easily explain the difference.
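
As a rough illustration of the stack point (a back-of-the-envelope sketch, not 
a measurement): the per-thread stack size below is an assumed 2 MiB, which is 
not a figure from this thread, so substitute whatever default your system uses 
or whatever you set with pthread_attr_setstacksize(). The result is address 
space reserved for stacks, which is not necessarily resident memory.

```sh
threads=700                          # thread count from the original post
stack_bytes=$((2 * 1024 * 1024))     # ASSUMPTION: 2 MiB per thread stack
total_mib=$((threads * stack_bytes / 1024 / 1024))
echo "approx. ${total_mib} MiB reserved for thread stacks"
```

If the programs do not need deep stacks, asking for a smaller stack per thread 
at creation time is one way to test whether this is where the memory went.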

Regards,

Jan.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: OS support for fault tolerance

2012-02-14 Thread Jan Mikkelsen

On 15/02/2012, at 3:57 AM, Julian Elischer wrote:

> On 2/14/12 6:23 AM, Maninya M wrote:
>> For multicore desktop computers, suppose one of the cores fails, the
>> FreeBSD OS crashes. My question is about how I can make the OS tolerate
>> this hardware fault.
>> The strategy is to checkpoint the state of each core at specific intervals
>> of time in main memory. Once a core fails, its previous state is retrieved
>> from the main memory, and the processes that were running on it are
>> rescheduled on the remaining cores.
>> 
>> I read that the OS tolerates faults in large servers. I need to make it do
>> this for a Desktop OS. I assume I would have to change the scheduler
>> program. I am using FreeBSD 9.0 on an Intel core i5 quad core machine.
>> How do I go about doing this? What exactly do I need to save for the
>> "state" of the core? What else do I need to know?
>> I have absolutely no experience with kernel programming or with FreeBSD.
>> Any pointers to good sources about modifying the source-code of FreeBSD
>> would be greatly appreciated.
> This question has always intrigued me, because I'm always amazed
> that people actually try.
> From my viewpoint, there's really not much you can do if the core
> that is currently holding the scheduler lock fails.
> And what do you mean by 'fails'? Do you run constant diagnostics?
> How do you tell when it has failed? It'd be hard to detect that 'multiply'
> has suddenly started giving bad results now and then.
> 
> if it just "stops" then you might be able to have a watchdog that
> notices,  but what do you do when it was half way through rearranging
> a list of items? First, you have to find out that it held
> the lock for the module and then you have to find out what it had
> done and clean up the mess.
> 
> This requires rewriting many many parts of the kernel to remove
> 'transient inconsistent states', and even then, what do you do if it
> was half way through manipulating some hardware..
> 
> and when you've figured that all out, how do you cope with the
> mess it made because it was dying?
> Say for example it had started calculating bad memory offsets
> before writing out some stuff and written data out over random memory?
> 
> but I'm interested in any answers people may have

Back in the '90s I spent a bunch of time looking at and using systems that 
dealt with this kind of failure.

There are two basic approaches: With software support and without. The basic 
distinction is what the hardware can do when something breaks. Is it able to 
continue, or must it stop immediately?

Tandem had systems with both approaches:

The NonStop proprietary operating system had nodes with lock-step processors 
and lots of error checking that would stop immediately when something broke. A 
CPU failure turned into a node halt. There was a bunch of work to have nodes 
move their state around so that terminal sessions would not be interrupted, 
transactions would be rolled back, and everything would be in a consistent 
state.

The Integrity Unix range was based on MIPS RISC/os, with a lot of work at 
Tandem. We had the R2000 and later the R3000 based systems. They had three CPUs 
all in lock step with voting ("triple modular redundancy"), and entirely 
duplicated memory, all with ECC. Redundant busses, separate cabinets for 
controllers and separate cabinets for each side of the disk mirror. You could 
pull out a CPU board and memory board, show a manager, and then plug them back 
in.

Tandem claimed to have removed 80% of panics from the kernel, and changed the 
device driver architecture so that they could recover from some driver faults 
by reinitialising driver state on a running system.

We still had some outages on this system, all caused by software. It was also 
expensive: AUD$1,000,000 for a system with the same underlying CPU/memory as a 
$30k MIPS workstation at the time. It was also slower because of the error 
checking overhead. However, it did crash much less than the MIPS boxes.

Coming back to the multicore issue:

The problem when a core fails is that it has affected more than its own state. 
It will be holding locks on shared resources and may have corrupted shared 
memory or asked a device to do the wrong thing. By the time you detect a fault 
in a core, it is too late. Checkpointing to main memory means that you need to 
be able to roll back to a checkpoint, and replay operations you know about. 
That involves more than CPU core state; it also includes process, file and 
device state.

The Tandem lesson is that it is much easier when you involve the higher level 
software in dealing with these issues. Building a system where you can make the 
application programmer ignorant of the need to deal with failure is much harder 
than when you expose units of work to the application programmer and can just 
fail a node and replay the work somewhere else. Transactions are your friend.

Lots of literature on this stuff. My favourite is "Transaction Processing: 

Re: sem(4) lockup in python?

2012-02-07 Thread Jan Mikkelsen

On 06/02/2012, at 3:49 AM, Attilio Rao wrote:

> 2012/2/5 Ivan Voras:
>> On 5 February 2012 11:44, Garrett Cooper wrote:
>> 
>>> 
>>>    'make MAKE_JOBS_NUMBER=1' is the workaround used right now..
>> 
>> David Xu suggested that it is a bug in Python - it doesn't set
>> process-shared attribute when it calls sem_init(), but i've tried
>> patching it (replacing the port patchfile with the one I've attached)
>> and I still get the hang.
> 
> Guys,
> it would be valuable if you do the following:
> 1) recompile your kernel with INVARIANTS, WITNESS and without WITNESS_SKIPSPIN
> 2a) If you have a serial console, please run the DDB stuff through it
> (go to point 3)
> 2b) If you don't have a serial console please run the DDB stuff in
> textdump (go to point 3)
> 3) Collect the following information:
> - show allpcpu
> - show alllocks
> - ps
> - alltrace
> 3a) If you had the serial console (thus not textdump) please collect
> the coredump with: call doadump
> 4) reset your machine
> 
> You will end up with the textdump or coredump + all the serial logs
> necessary to debug this.
> If you cannot reproduce your issue with WITNESS enabled, please remove it
> from your kernel config and avoid calling 'show alllocks' when in DDB.
> But try to leave INVARIANTS on.
> 
> Hope this helps,
> Attilio


This has just happened again, this time with MAKE_JOBS_NUMBER=1, so that 
workaround didn't work.

I don't have INVARIANTS or WITNESS compiled in, but I did fire up kgdb to poke 
around. The stack traces look identical. I don't know what to expect in these 
structures. If there's anything useful I can dig out here, please let me know.

However: A parent and child process both blocked waiting on semaphores smells 
like a user-level bug to me.

Jan.



(kgdb) proc 24969
[Switching to thread 648 (Thread 101022)]#0  sched_switch 
(td=0xfe003de43000, newtd=0xfe000b501000, flags=Variable "flags" is not 
available.
)
at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
1854            cpuid = PCPU_GET(cpuid);
(kgdb) where
#0  sched_switch (td=0xfe003de43000, newtd=0xfe000b501000, 
flags=Variable "flags" is not available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
#1  0x8083af24 in mi_switch (flags=260, newtd=0x0) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
#2  0x80872644 in sleepq_catch_signals (wchan=0xfe0015fca800, 
pri=0) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
#3  0x80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not 
available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
#4  0x8083b599 in _sleep (ident=0xfe0015fca800, 
lock=0x81114860, priority=Variable "priority" is not available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
#5  0x8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
#6  0x8084ad61 in __umtx_op_sem_wait (td=0xfe003de43000, 
uap=0xff8693d85bc0) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:3205
#7  0x80b17de0 in amd64_syscall (td=0xfe003de43000, traced=0) at 
subr_syscall.c:131
#8  0x80b03517 in Xfast_syscall () at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/amd64/amd64/exception.S:387
#9  0x0008010277fc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) proc 24970
[Switching to thread 665 (Thread 100553)]#0  sched_switch 
(td=0xfe02f7240460, newtd=0xfe000b501460, flags=Variable "flags" is not 
available.
)
at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
1854            cpuid = PCPU_GET(cpuid);
(kgdb) where
#0  sched_switch (td=0xfe02f7240460, newtd=0xfe000b501460, 
flags=Variable "flags" is not available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/sched_ule.c:1854
#1  0x8083af24 in mi_switch (flags=260, newtd=0x0) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:448
#2  0x80872644 in sleepq_catch_signals (wchan=0xfe0015fd7380, 
pri=0) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:425
#3  0x80872fb6 in sleepq_wait_sig (wchan=Variable "wchan" is not 
available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/subr_sleepqueue.c:631
#4  0x8083b599 in _sleep (ident=0xfe0015fd7380, 
lock=0x811145e0, priority=Variable "priority" is not available.
) at 
/home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_synch.c:232
#5  0x8084ac69 in do_sem_wait (td=Variable "td" is not available.
) at /home/janm/p4/freebsd-image-std-2011.1/FreeBSD/src/sys/kern/kern_umtx.c:513
#6  0x8084ad61 in 

Re: sem(4) lockup in python?

2012-02-05 Thread Jan Mikkelsen

On 05/02/2012, at 9:44 PM, Garrett Cooper wrote:

> On Sun, Feb 5, 2012 at 2:41 AM, Jan Mikkelsen
>  wrote:
>> 
>> On 12/01/2012, at 3:47 AM, Garrett Cooper wrote:
>> 
>>> [ builds hanging in python with waf … ]
>> 
>> Any progress on this? Or a workaround?
>> 
>> Just had another build stuck on this …
> 
>    'make MAKE_JOBS_NUMBER=1' is the workaround used right now..

Great, thanks.

Jan.




Re: sem(4) lockup in python?

2012-02-05 Thread Jan Mikkelsen

On 12/01/2012, at 3:47 AM, Garrett Cooper wrote:

> [ builds hanging in python with waf … ]

> Glad to see that iXsystems isn't the only one ([1] -- please add a "me
> too" to the PR). The problem is that we do FreeNAS nightlies and they
> frequently get stuck building tdb (10%~20% of the time) and it sticks
> when doing interactive builds as well. The issue appears to be
> exacerbated when we have more builds running in parallel on the same
> machine. I've also run into the same issue compiling talloc because it
> uses the same waf infrastructure as tdb, which was designed to "speed
> things up by forcing builds to be parallelized" (It builds
> kern.smp.ncpus jobs instead of -j 1). Furthermore, it seems to occur
> regardless of whether we have WITH_SEM enabled in python
> (build.ix's copy of python doesn't have it enabled, but
> streetfighter.ix, my system bayonetta, etc do).

Any progress on this? Or a workaround?

Just had another build stuck on this …

Jan.



Re: Hardware supported by ng_frame_relay?

2012-01-18 Thread Jan Mikkelsen

On 15/01/2012, at 6:00 PM, Roman Kurakin wrote:

> Hi,
> 
> Jan Mikkelsen wrote:
>> Hi,
>> 
>> I'm looking to upgrade a system running frame relay over a Sangoma A101 card 
>> and WANPIPE.
>> 
>> Sangoma do not support FreeBSD anymore, so I'm looking for alternatives.
>> 
>> What hardware does ng_frame_relay support now that ar(4) and sr(4) are not 
>> in FreeBSD 9?
>> 
>> Specifically, will ng_frame_relay work with a Digium TE121 and 
>> ports/dahdi-kmod?
>> 
>> Any suggestions welcome, G.703, X.21 or V.35 interfaces OK.
>>  
> Check also www.cronyx.ru for ce(4) and cp(4). As far as I know, old
> Digium adapters were using a software framer for HDLC, but I don't know
> the current state. If they don't provide a hardware framer now, I suggest
> checking for another adapter.


Thanks. Just had a look at their site; they support up to 6.x, there is a red 
note saying 7.x is not supported. I suspect that would also apply to 8.x and 
9.x ...

For now going with a standalone router. Might have a look at the Digium card 
more closely if I have a different application for it ...

Regards,

Jan.




Re: Hardware supported by ng_frame_relay?

2012-01-18 Thread Jan Mikkelsen

On 16/01/2012, at 2:34 PM, Julian Elischer wrote:

> On 1/13/12 11:00 PM, Jan Mikkelsen wrote:
>> Hi,
>> 
>> I'm looking to upgrade a system running frame relay over a Sangoma A101 card 
>> and WANPIPE.
>> 
>> Sangoma do not support FreeBSD anymore, so I'm looking for alternatives.
>> 
>> What hardware does ng_frame_relay support now that ar(4) and sr(4) are not 
>> in FreeBSD 9?
>> 
>> Specifically, will ng_frame_relay work with a Digium TE121 and 
>> ports/dahdi-kmod?
>> 
>> Any suggestions welcome, G.703, X.21 or V.35 interfaces OK.
> 
> Unfortunately, with the advent of Ethernet connected internet feeds (e.g. 
> dsl, cable etc),
> plain synchronous interfaces have become almost irrelevant for most of the 
> developers.
> If you can find a card that "would have" been usable with  the ar or sr
> drivers and you have one for testing, we could possibly resurrect it with 
> your help,
> but none of the current developers have such a card.. (that I know of).

Thanks. For now I'm going with a standalone router. This is a single link for 
communicating with a legacy application at a government body, so I'm not 
looking to invest significant amounts of time.

I was curious about what was involved in hooking ng_frame_relay up to one 
of the E1 based telephone cards to do frame relay. Perhaps if I get one of the 
cards for another purpose I'll have a look.

Regards,

Jan.


Hardware supported by ng_frame_relay?

2012-01-13 Thread Jan Mikkelsen
Hi,

I'm looking to upgrade a system running frame relay over a Sangoma A101 card 
and WANPIPE.

Sangoma do not support FreeBSD anymore, so I'm looking for alternatives.

What hardware does ng_frame_relay support now that ar(4) and sr(4) are not in 
FreeBSD 9?

Specifically, will ng_frame_relay work with a Digium TE121 and ports/dahdi-kmod?

Any suggestions welcome, G.703, X.21 or V.35 interfaces OK.

Thanks,

Jan Mikkelsen


Re: sem(4) lockup in python?

2012-01-13 Thread Jan Mikkelsen

On 12/01/2012, at 3:47 AM, Garrett Cooper wrote:
> Glad to see that iXsystems isn't the only one ([1] -- please add a "me
> too" to the PR).
> [ … ]
> 1. http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/163489

Also reported in:

ports/163467
ports/160717

Jan.



Re: mfi (Dell H700) + hot swapping doesn't appear to work with RC1

2011-12-18 Thread Jan Mikkelsen
On 15/12/2011, at 2:16 AM, Borja Marcos wrote:

> 
> On Dec 14, 2011, at 2:09 PM, Hugo Silva wrote:
> 
>> Hello,
>> 
>> First of all apologies if this has been fixed in RC3. I set this server
>> up with mfsbsd, which is RC1, and didn't get to update the system yet.
>> 
>> This box has 6 hdds, a 2-mirror zpool was set up as the root pool, with
>> 2 spares.
>> 
>> While testing hot swapping I noticed that while the controller detects
>> disk removal/insertion, the zpool will never recover. The problem seems
>> to be deeper than ZFS, as disklabel/fdisk/etc also fail on the
>> removed-and-reinserted disk.
>> 
>> At the ZFS level, doing a zpool clear yields more errors on the removed
>> disk; rebooting becomes the only option to make the pool healthy again.
>> 
>> 
>> Is this normal? Did I miss any step?
> 
> I assume that you have tried to use the H700 as a "JBOD" card, defining 
> a logical volume for each hard disk.
> 
> The problem is: that gorgeous, fantastic, masterful, Nobel award candidate 
> card, has a wonderful behavior in that case. If you extract one of the disks, 
> the logical volume associated to it is invalidated. So, you insert a 
> replacement disk, and the card refuses to recognize the volume. What is even 
> worse, in order to recover it's mandatory to reboot the complete system *AND* 
> go through the RAID configuration utility.
> 
> That's the problem. The card refuses  to work as a simple disk controller 
> without frills, and the frills get in the way.
> 
> To summarize: it isn't FreeBSD's fault, no matter which version you use. It's 
> a "feature" coming directly from the geniuses who designed the card.

Hugo: You missed a step. Borja: No reboot required.

For the mfi controllers I have been testing recently (MegaRAID 9261-8i), you 
need to install the sysutils/megacli port, and use that to clear the 
"foreignness" of the disk you just added. Something like:

MegaCli -CfgForeign -Clear -a0

You should be able to then recreate it as a JBOD device, and progress through 
whatever higher level recovery you need to do.

Regards,

Jan Mikkelsen





Re: mfi (Dell H700) + hot swapping doesn't appear to work with RC1

2011-12-15 Thread Jan Mikkelsen

On 16/12/2011, at 3:40 AM, Andrew Boyer wrote:

> 
> On Dec 15, 2011, at 4:19 AM, Jan Mikkelsen wrote:
> 
>> For the mfi controllers I have been testing recently (MegaRAID 9261-8i), you 
>> need to install the sysutils/megacli port, and use that to clear the 
>> "foreignness" of the disk you just added. Something like:
>> 
>>  MegaCli -CfgForeign -Clear -a0
> 
> I don't think that's what you want.  You want to use -Import, not -Clear, to 
> keep your data intact.

OK. When I did a -Clear and recreated the drive as a single disk raid0 volume, 
the data was still there, but I wanted it to go away.

> On Dec 15, 2011, at 11:03 AM, Hugo Silva wrote:
> 
>> On 12/15/11 15:28, Hugo Silva wrote:
>>> As Borja said, part of the difficulty is the H700 abstracting a single
>>> disk as a RAID-0, I guess. So far I've been unable to find a way to
>>> bring the drive back, except by rebooting and recreating.
>> 
>> Turns out no interaction is needed after reboot. It was something else
>> unrelated. The main issue then is convincing the controller to once
>> again accept the hard disk. I'm going through MegaCli "documentation"
>> (ie --help).. it's not a pretty place.
> 
> I'm not sure it would even be possible to come up with a worse interface.  It 
> boggles the mind.

I agree. It is insanely bad.

> I recommend you always run with this configuration:
> 
> # MegaCli -AdpSetProp AutoEnhancedImportEnbl -aALL
> # MegaCli -AdpSetProp MaintainPdFailHistoryEnbl -0 -aALL
> 
> AutoEnhancedImportEnbl will bring the foreign disk back in on a reboot.  LSI 
> recommends turning off MaintainPdFailHistory when using single-disk RAID0 
> configurations.

What does PD Fail History actually do?

> Adding these capabilities to mfiutil is on my list of things to do, but it's 
> not ready yet.

Thanks.

> Has anyone managed to get the real JBOD mode working on this controller?  It 
> advertises support in the firmware but doesn't seem to do anything.  The 
> documentation only lists JBOD mode as a feature of the lower-end controllers.

You mean using "MegaCli -PDMakeJBOD"? No, it doesn't work for me on the 
9281-8i. I get "Failed to change PD state". Single disk RAID-0 works fine.

> Hope this helps.

It does, thank you.

Regards,

Jan.



Re: mfi (Dell H700) + hot swapping doesn't appear to work with RC1

2011-12-15 Thread Jan Mikkelsen

On 16/12/2011, at 1:56 AM, John Baldwin wrote:

> On Thursday, December 15, 2011 4:19:58 am Jan Mikkelsen wrote:
>> For the mfi controllers I have been testing recently (MegaRAID 9261-8i), you 
>> need to install the sysutils/megacli port, and use that to clear the 
>> "foreignness" of the disk you just added. Something like:
>> 
>>   MegaCli -CfgForeign -Clear -a0
>> 
>> You should be able to then recreate it as a JBOD device, and progress 
>> through whatever higher level recovery you need to do.
> 
> Can you do this by marking it as 'good' via mfiutil and then using mfiutil
> to create a volume?

I was going to reply and say that mfiutil will complain about the drive being 
in the wrong state, but after reading the other replies I decided to test.

With a blank drive, yes, you can use mfiutil to recreate the jbod device. You 
don't even need to do an "mfiutil good" first.

If you use a drive that has previously been used by an mfi controller, it shows 
up as "bad". Doing "mfiutil good" makes it go to the "unconfigured good" state. 
Then creation of the jbod fails with this error:

mfiutil: Command failed: Wrong firmware or drive state
mfiutil: Failed to add volume: Input/output error

At this point you need to reach for "MegaCli -CfgForeign" and deal with the now 
foreign drive.

You can use -Import (as pointed out by Andrew Boyer) or -Clear. In my previous 
testing (on which my original reply was based), I used drives that were being 
moved between machines and so my procedure ended up being -Clear because I did 
not want the drive to have the same configuration as the last time it was used. 
That was followed by a dd from /dev/zero and then the higher level steps. I have 
just tested -Import for the same slot and it worked fine for me. I have not 
tested -Import when putting the drive into a different slot.

Regards,

Jan.



Re: read(1) garbage when input redirected from make incorrectly

2010-02-16 Thread Jan Mikkelsen
On 16/02/2010, at 10:49 PM, Dag-Erling Smørgrav wrote:

> The LHS of < is a command, the RHS is the name of the file to be read.
> After that, you can have further redirections, a command separator
> (semicolon, single or double ampersand, single or double pipe etc.), or,
> depending on context, various other stuff such as a paren, bracket,
> backquote etc.

A redirection doesn't terminate the argument list.

For example:

echo a b < /dev/null c d

Produces:

a b c d

And:

< /etc/passwd cat

Will emit /etc/passwd to stdout.
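
The same point can be checked by counting arguments rather than reading the 
output. This is a minimal sketch; count_args is a hypothetical helper 
introduced only for this illustration. The shell strips the redirection out 
before the command sees its arguments, wherever the redirection appears:

```sh
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

mid=$(count_args a b < /dev/null c d)     # redirection in the middle
front=$(< /dev/null count_args a b c d)   # redirection before the command name
echo "middle: $mid, front: $front"
```

Both forms pass the same four words (a, b, c and d) to the command.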

Regards,

Jan Mikkelsen



Re: read(1) garbage when input redirected from make incorrectly

2010-02-15 Thread Jan Mikkelsen
On 16/02/2010, at 11:55 AM, Garrett Cooper wrote:

> Hi Hackers,
>I accidentally reproduced the following after executing read
> properly in a pipeline with make:
> 
> [garrc...@garrcoop-fbsd /usr/home/garrcoop]$ read DESTDIR SRCCONF <
> /usr/bin/make -V DESTDIR -V SRCCONF
> bash: read: `-V': not a valid identifier
> [garrc...@garrcoop-fbsd /usr/home/garrcoop]$ echo $DESTDIR
>  ELF
> [garrc...@garrcoop-fbsd /usr/home/garrcoop]$ hexdump -C foo
> 00000000  7f 45 4c 46 01 01 01 0a                           |.ELF....|
> 00000008
> [garrc...@garrcoop-fbsd /usr/home/garrcoop]$
> 
>Is this an issue to be concerned about apart from cosmetic noise,
> i.e. potential buffer access problem? I see the same garbage from
> bash/coreutils on RHEL 4.6 as well as read(1) and /bin/sh on FreeBSD
> with RELENG_8, so the issue appears to be consistent on multiple
> OSes...
> Thanks,
> -Garrett

I think you meant to type:

make -V DESTDIR -V SRCCONF | read DESTDIR SRCCONF

What you are actually doing is feeding the contents of the make binary into:

read DESTDIR SRCCONF -V DESTDIR -V SRCCONF

and the shell is correctly complaining about '-V' not being a valid identifier, 
and populating DESTDIR with data it got from the binary.
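
One caveat with the pipeline form, sketched below under some assumptions: in 
bash and in FreeBSD's /bin/sh each element of a pipeline runs in a subshell, so 
variables set by read there are gone once the pipeline finishes. Feeding read 
from a here-document keeps it in the current shell. produce is a hypothetical 
stand-in that prints two words on one line; if the real command prints one 
value per line, one read per line would be needed instead.

```sh
# Hypothetical stand-in for the command whose output we want to capture.
produce() { echo "first second"; }

A= B=
produce | read -r A B      # read may run in a subshell; A and B can stay empty
piped="$A,$B"

# Here-document form: read runs in the current shell, so the values persist.
read -r A B <<EOF
$(produce)
EOF
heredoc="$A,$B"

echo "piped=[$piped] heredoc=[$heredoc]"
```

The value of piped depends on the shell in use; heredoc does not.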

Jan



Re: Kernel panic caused by fork

2009-09-08 Thread Jan Mikkelsen

Hi,

On 07/09/2009, at 8:59 PM, Ivan Radovanovic wrote:
> ...
>
> After running this program I got kernel panic with message
> "get_pv_entry: increase vm.pmap.shpgperproc"
> IMHO it is not a very good idea to bring the entire system down if one
> process misbehaves in this way; it is maybe much better to kill the
> offending process and to send this message to the system log. I am not
> sure whether the panic is actually caused by the process forking forever
> or when the system tries to create a new process when the maxproc limit is
> already reached (since the system is only printing a warning message that
> the maxproc limit is reached and it only panics when I try to start a new
> process (like ps)).


A quick observation: This is not "one process misbehaving", it is a  
large number of processes misbehaving.  From an administrative point  
of view, I think the response is "call setrlimit(RLIMIT_NPROC, ...)",  
otherwise the expected behaviour is for your machine to stop making  
forward progress.
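
From a shell, the usual front end to that call is the ulimit builtin. A minimal 
sketch, assuming a shell whose ulimit accepts -u for the process limit (bash 
and FreeBSD's /bin/sh do; some other shells spell the option differently); the 
value 64 is arbitrary and only for illustration:

```sh
# Lower the soft process limit inside a command substitution only, so the
# calling shell keeps its own limit. Both ulimit calls are builtins, so no
# new process is created after the limit has been lowered.
limit=$( { ulimit -S -u 64 && ulimit -S -u; } 2>/dev/null ) || limit="unsupported"
echo "soft process limit inside the subshell: $limit"
```

Anything started from a shell with such a limit inherits it, so a runaway 
fork loop should start getting EAGAIN from fork() once the limit is hit.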


Having said that, I agree that panics are bad and it would be nice if  
fork() returned EAGAIN, again and again and again.  Or perhaps the  
machine should just panic ...


Regards,

Jan.



RE: kqueue and libev

2007-12-16 Thread Jan Mikkelsen
Julian Elischer wrote:
> Julian Elischer wrote:
> > James Mansion wrote:
> >> [ On the libev being unhappy with kqueue ]
> >> ...
> >> It looks like a decent library, but these comments seem unfortunate.
> >> Does anyone know what the author is concerned about?
> > 
> > he's just plain misinformed
> > 
> 
> kqueue works well with aio to files and raw devices for 
> example. (Only using AIO really makes sense in these cases 
> anyhow, so I've never really tried using kqueue with non-aio calls.)

It also depends on the version of FreeBSD.  For example, in FreeBSD 4, kqueue
was non-functional with USB serial devices.  I ran into that exact problem
with kqueue when porting my event library.  Problems like that can make life
difficult for a library author; having a special case for one kind of handle
is a pain, to the point of leading to comments like this.

Of course, in this case the best thing to do is to ask the author, and to
see if the situation has changed.

Regards,

Jan Mikkelsen





RE: Australian cvs repository

2007-08-15 Thread Jan Mikkelsen
Hi,

Robert McKenzie wrote:
> Has anyone noted that the Australian cvs repository seems to be so
> hopelessly out of sync that you cannot do a clean build using a clean
> cvsup.

Yes, I've noticed this too.  I'm using mirror.pacific.net.au at the
moment, and it seems fine.  Pacific Internet is our ISP;  I don't
know if it is accessible from outside their network.

Regards,

Jan Mikkelsen




RE: i386 with PAE or AMD64 on PowerEdge with 4G RAM

2007-06-18 Thread Jan Mikkelsen
Martin Turgeon wrote:
> ...
> I am facing a difficult decision. Should I use i386 with PAE enabled in
> the kernel (I read a lot of warnings using it) or should I go with
> AMD64? Which branch should I follow?
> 
> These servers will be front-end/back-end MySQL(with replication) and 
> Apache servers with BIND, Postfix, Dovecot, PF.

Looks like an easy decision to me.

You have source for all of those things, and they are known to work on
amd64.  I suggest going amd64.  There are many advantages in going amd64,
and the primary disadvantage of going amd64 is the inability to run some
(but not all) 32-bit binaries at the moment.  I see no 32-bit binaries in
your list.

Regards,

Jan Mikkelsen.



RE: patchset-9 release (Re: [unionfs][patch] improvements of the unionfs - Problem Report, kern/91010)

2006-03-16 Thread Jan Mikkelsen
Daichi GOTO wrote:
> Everyone with an interest in the improved unionfs should keep paying
> attention and ask "how about merge?" at every turn :)

OK.  How about a merge?

I'd really like to see this in 6-STABLE.

Regards,

Jan Mikkelsen.



RE: how to flush out cache.?

2004-04-24 Thread Jan Mikkelsen
Julian Elischer wrote:
> Other than reading a few GB of data, is there a way to flush
> out the cache copy of a file I've written?

I don't know how this will fit into your application, but unmounting and
remounting the filesystem is a way that springs to mind.  Perhaps not as
isolated as you'd like, but still ...

Regards,

Jan Mikkelsen
[EMAIL PROTECTED]



RE: making CVS more convenient

2003-03-16 Thread Jan Mikkelsen
Nate Williams wrote:
> > The current version of Perforce has "p4proxy" which caches a local copy
> > of the depot files used.
> 
> Does it still require a working net link to the master repository?  When
> it was originally released, I remember it being useful for slow links,
> but not so good on non-existent links (i.e. airplane rides, etc.)

Yes, it still requires a working link.  Perforce depends on being able
to keep its database of client state up to date.
 
> > What is the status of Perforce in the FreeBSD project?
>
> See the archives for a more thorough discussion, but I believe the
> licensing is the biggest issue.  If we moved to use commercial software,
> it would make our development much more difficult for the average
> developer to track our progress.

I'll take a look.  Presumably something like a "p4up" could get around
that.

Regards,

Jan.



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message


RE: making CVS more convenient

2003-03-16 Thread Jan Mikkelsen
Nate Williams wrote:
> The other solution to the problem is the P4 route.  Making things so
> darn efficient that there's little need to have a local mirror.  Where
> this falls down is when the remote developer doesn't have a 24x7
> connection to the main repository.  From what I've been told ClearCase
> allows for 'mirrored read-only' repositories similar to what most of the
> open-source CVS developers have been doing with sup/CVSup for years,
> although it's nowhere near as efficient as CVSup at creating snapshots.

The current version of Perforce has "p4proxy" which caches a local copy
of the depot files used.  To the p4 client, it looks just like the
server.  The Perforce model makes this a bit easier with a significant
amount of client state stored on the server.

What is the status of Perforce in the FreeBSD project?  Is the issue the
absence of a "p4up"?  Licensing?  Inertia?

Regards,

Jan Mikkelsen





RE: High Avaliability Processes

2002-07-22 Thread Jan Mikkelsen

Johan Brodin wrote:
> I'm new to this list and I want to ask a rather simple question. Does
> FreeBSD contain a program (preferably a kernel process) that can see if
> another process (user defined) terminates and then restart this process?
> Or will I have to use an "external" program for this?

As well as init/ttys, look at supervise from Dan Bernstein's daemontools
package:

http://cr.yp.to/daemontools.html

There are other benefits like reliably sending signals to processes by
name, log file management and a bunch of other useful, well designed
stuff.
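
To make the "notice a process has terminated and restart it" idea concrete,
here is a minimal sketch in C.  It is not how supervise itself is written;
the function name supervise_sketch and the max_starts limit are my own
inventions so that the loop terminates, and a real supervisor would also
pause between restarts and handle signals.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0], wait for it to exit, start it again.
 * Stops after max_starts launches.  Returns the number of launches,
 * or -1 if fork() or waitpid() fails. */
static int supervise_sketch(char *const argv[], int max_starts)
{
    int starts = 0;

    while (starts < max_starts) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            execv(argv[0], argv);   /* child: become the supervised program */
            _exit(127);             /* only reached if execv() failed */
        }
        starts++;

        /* Parent: block until this child terminates, retrying on EINTR. */
        int status;
        pid_t r;
        do {
            r = waitpid(pid, &status, 0);
        } while (r == -1 && errno == EINTR);
        if (r == -1)
            return -1;
        /* The child is gone; the next loop iteration restarts it. */
    }
    return starts;
}
```

Called with an argv such as { "/bin/sh", "-c", "exit 1", NULL } and
max_starts of 3, it launches the shell three times in a row.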

Jan Mikkelsen






ATA Controller choices

2002-02-20 Thread Jan Mikkelsen

Søren Schmidt wrote:
> ... However the Serverworks ROSB4 chip is not one I would recommend
> using, if you need serious ATA support on such a board, install a
> Promise TX2 or later or a HPT370 or later ...

I think I've also seen you post that the Highpoint is better than the
Promise.

What is the "quality hierarchy" of ATA chips?  E.g. I know the VIA chips
have issues.  Where does the CMD 649 fit in that hierarchy?

Such a list (or even a list of known issues with particular chips) would
be useful for specifying new machines.

Thanks,

Jan Mikkelsen






Re: Undefined symbol "_ZTVN10__cxxabiv117__class_type_infoE"

2001-08-31 Thread Jan Mikkelsen

You probably have the system default libstdc++.so.3 in your library search
path before the GCC 3 libstdc++.so.3.  Try setting LD_LIBRARY_PATH to the
GCC 3 lib directory.

Jan Mikkelsen

-Original Message-
From: Benjamin Gross <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date: Wednesday, 29 August 2001 18:11
Subject: Undefined symbol "_ZTVN10__cxxabiv117__class_type_infoE"


>Hello,
>
>I've just installed gcc version 3.0 on a FreeBSD v4.4 system to work on a
>c++ project, and when I try to execute a program that has been successfully
>compiled and linked, I get the following message:
>
>/usr/libexec/ld-elf.so.1: Undefined symbol
>"_ZTVN10__cxxabiv117__class_type_infoE" referenced from COPY relocation in
>./test
>
>Does anyone know what the problem is ?
>
>thanks,
>
>ben



Re: C++ to C translator

2001-07-03 Thread Jan Mikkelsen

Comeau C++, http://www.comeaucomputing.com, might do what you want,
although you'll have to hack at the build system a bit.  Other products
based on the EDG front end might do similar things.  I seem to recall
that KAI C++ converted to C, but I'm not certain.

-Original Message-
From: Rayson Ho <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date: Wednesday, 4 July 2001 12:48
Subject: C++ to C translator


>Hi,
>
>I have written some code in C++. However, I want to run it on an old
>mainframe machine, for which a C++ compiler is not available.
>
>I know that the old g++ is a C++ to C compiler. Does anyone know which
>version it is? Also, does anyone know of other C++ to C compilers?
>
>Thanks,
>Rayson
>



Re: write() vs aio_write()

2001-04-30 Thread Jan Mikkelsen

Mike Silbersack <[EMAIL PROTECTED]> wrote:
[ On using aio on disks vs. sockets ]
>Sockets already support non-blocking IO, and have for a long while.
>Hence, the socket code is probably more optimized for non-blocking
>operation than AIO operation.  As a plus, using non-blocking socket
>operations will allow your code to run on any platform; aio isn't as
>portable.

I recall reading about possible zero copy I/O using the aio interface.  Is
anyone thinking about this?  And on a related note, how about something like
IRIX's O_DIRECT mode for files?

I'm sure there are lots of issues, but I'm curious.
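
For reference, the non-blocking socket behaviour Mike describes can be
sketched in a few lines of C.  This is only an illustration under
assumptions of mine: it uses a local socketpair() instead of a TCP
connection, and the helper names set_nonblocking and
demo_nonblocking_read are made up for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode.  Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns 1 if read() on an empty non-blocking socket comes back at once
 * with EAGAIN/EWOULDBLOCK, 0 if it behaved differently, -1 on setup failure. */
static int demo_nonblocking_read(void)
{
    int sv[2];
    char buf[16];
    int result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (set_nonblocking(sv[0]) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nothing has been written to sv[1], so there is no data to read.
     * A blocking socket would sleep here; this one returns immediately. */
    ssize_t n = read(sv[0], buf, sizeof buf);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 1;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

In a real server the EAGAIN case is where you would go back to
select(), poll() or kqueue and wait for the descriptor to become ready.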

Jan Mikkelsen






Re: atomic operations

2000-10-03 Thread Jan Mikkelsen

John Baldwin <[EMAIL PROTECTED]> wrote:

>On 03-Oct-00 Jan Mikkelsen wrote:
>> There shouldn't be a need for a loop like the one you describe for a
>> simple atomic increment.
>
>The trick is that I want to increment and read at the same time.


I don't know the exact semantics of atomic_cmpset_int, but it looks like a
compare and swap operation which returns zero if the operation failed, some
other value on success.

Unless I've missed something, the basic operation of your loop can be done
(on a '486 or better) without the loop by using the xadd instruction.  Of
course, if the code needs to run on earlier processors, xadd is not
available there and a loop is necessary.
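
To show the difference between the two approaches, here is a sketch
using C11 <stdatomic.h>.  That is an assumption on my part: it is not
the kernel's atomic_cmpset_int, and the function names are made up.
The first version is a single fetch-and-add, which is the operation
xadd provides; the second builds the same result from a
compare-and-swap retry loop.

```c
#include <stdatomic.h>

/* Increment and return the new value with one fetch-and-add. */
static int increment_fetch_add(atomic_int *p)
{
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(p, 1) + 1;
}

/* The same result built from compare-and-swap, retrying on contention. */
static int increment_cas_loop(atomic_int *p)
{
    int old = atomic_load(p);

    /* If *p no longer equals 'old', the exchange fails, 'old' is
     * refreshed with the current value, and the loop tries again. */
    while (!atomic_compare_exchange_weak(p, &old, old + 1))
        ;
    return old + 1;
}
```

Both return the incremented value; only the second can spin when other
threads are modifying the same counter.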

Jan Mikkelsen







Re: atomic operations

2000-10-03 Thread Jan Mikkelsen

John Baldwin <[EMAIL PROTECTED]> wrote:
>Uh, there is no xaddl instruction in the x86 instruction set.

It was introduced in the '486.  I've been using it for some years now, so I
am confident of its existence.

A quick test using my example:

$ objdump -d jan.o

jan.o: file format elf32-i386

Disassembly of section .text:

 :
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 4d 08                mov    0x8(%ebp),%ecx
   6:   b8 ff ff ff ff          mov    $0xffffffff,%eax
   b:   f0 0f c1 01             lock xadd %eax,(%ecx)
   f:   48                      dec    %eax
  10:   c9                      leave
  11:   c3                      ret

There shouldn't be a need for a loop like the one you describe for a simple
atomic increment.

I'm pretty new to FreeBSD:  what is changing in -current which alters the
behaviour of your code?

Jan Mikkelsen







Re: atomic operations

2000-09-25 Thread Jan Mikkelsen

Kevin Mills <[EMAIL PROTECTED]> wrote:
>I found the atomic_* functions in , but noticed that they
>have no return value.  What I need is a function that increments/decrements
>the given value *and* returns the new value in an atomic operation.  I
>suppose this is possible, yes?  How would one modify the assembly to make
>this work?


Atomic decrement, in the Intel style:

long atomic_decrement(volatile long* address)
{
  asm {
    mov ecx, [address]    ; load the pointer
    mov eax, -1           ; addend
    lock xadd [ecx], eax  ; eax receives the old value
    dec eax               ; old value - 1 = new value
  }
  /* Return value in EAX */
}

An untested conversion into the GNU/AT&T style:

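long atomic_decrement(volatile long* address)
{
 /* Assumes i386 with a frame pointer: first argument at 8(%ebp) */
 asm("movl 8(%ebp),%ecx");
 asm("movl $-1, %eax");
 asm("lock xaddl %eax,(%ecx)");  /* %eax receives the old value */
 asm("decl %eax");               /* old value - 1 = new value */
 /* Return value in %eax */
}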

Deriving increment is straightforward.

I haven't looked at the GNU inline assembler notation for indicating
register usage.  I'd be curious to see what it should look like.
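
For what it's worth, a sketch of how the register usage could be
expressed with GCC's extended asm follows.  Treat it as an illustration
under assumptions, not a tested kernel routine: it uses int, not long,
so that xaddl matches the operand size on both i386 and amd64, and the
non-x86 branch falling back to the __atomic_sub_fetch builtin is my own
addition so the function still compiles elsewhere with GCC or Clang.

```c
/* Atomically decrement *address and return the new value. */
static inline int atomic_decrement(volatile int *address)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int value = -1;               /* addend going in, old *address coming out */

    __asm__ __volatile__(
        "lock; xaddl %0, %1"
        : "+r" (value),           /* %0: any general register, read and written */
          "+m" (*address)         /* %1: the memory operand, read and written */
        :                         /* no input-only operands */
        : "memory", "cc");        /* compiler barrier; condition codes change */

    return value - 1;             /* old value minus one is the new value */
#else
    /* Not x86: let the compiler builtin do the work. */
    return __atomic_sub_fetch(address, 1, __ATOMIC_SEQ_CST);
#endif
}
```

The constraints let the compiler pick the register and addressing mode
itself, so nothing depends on the frame pointer or on the value being
left in %eax by convention.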

Jan Mikkelsen



