Rebooting from loader causes a fault in VMware Workstation

2013-04-19 Thread Jeremy Chadwick
(Please keep me CC'd as I'm not subscribed to -hackers)

When running FreeBSD under VMware Workstation (I'm using 9.0.1, but this
issue has existed for many years now, I remember it occurring on
Workstation 6.x), the following is reproducible:

1. Power on + boot FreeBSD VM
2. At loader menu, press 3 to reboot
3. Loader prints Rebooting...
4. VMware proceeds to show the following message in a dialog box:

A fault has occurred causing a virtual CPU to enter the shutdown state.
If this fault had occurred outside of a virtual machine, it would have
caused the physical machine to restart. The shutdown state can be
reached by incorrectly configuring the virtual machine, a bug in the guest
operating system, or a problem in VMware Workstation.

It can also happen when dropping to the loader prompt and doing
reboot.

It *does not* happen when booting fully into FreeBSD and issuing
shutdown -r now.  Likewise, hw.acpi.disable_on_reboot and
hw.acpi.handle_reboot have no bearing (e.g. changing either of those to
1 (default = 0) then doing shutdown -r now to try and induce the
problem).

So it seems the issue is specific to the bootstrap/loader env.

FreeBSD 9.1-STABLE is being used, but I've seen this happen with FreeBSD
8.x as well as 7.x.  It does not happen with other OSes like Linux and
Solaris.  I have not tried other VM systems (VirtualBox, etc.) but I
imagine they might just silently deal with the situation rather than
provide a useful message (although I know VirtualBox has an amazingly
detailed debugger).

I've looked at sys/boot/i386/loader/main.c (func command_reboot()) and
the actual magic seems to happen inside of __exit.

__exit comes from sys/boot/i386/btx/lib/btxsys.s, which zeros eax then
issues INT 0x30 (syscall interrupt).  That led me to this:

http://www.freebsd.org/doc/en/books/arch-handbook/book.html

Eek.  x86 architecture is a lot different than I remember it being in
my 386 days, so this is all a bit over my head.

-- 
| Jeremy Chadwick                                  j...@koitsu.org |
| UNIX Systems Administrator               http://jdc.koitsu.org/ |
| Mountain View, CA, US                                           |
| Making life hard for others since 1977.            PGP 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Rebooting from loader causes a fault in VMware Workstation

2013-04-19 Thread Jeremy Chadwick
On Fri, Apr 19, 2013 at 05:49:34PM -0500, Joshua Isom wrote:
 Basically, the loader finds a simple safe way to reboot that's
 worked since the 286, and VMWare doesn't like it.  It's called a
 triple fault.  FreeBSD and Linux even use it to reboot as a fail
 safe.  Read sys/i386/i386/vm_machdep.c and cpu_reset_real to see how
 FreeBSD handles it.  VMWare at least says that it would have caused
 the physical machine to restart.  Blame VMWare.
 

I'm happy to open up a ticket with VMware about the issue as I'm a
customer, but I find it a little odd that other operating systems do not
exhibit this problem, including another BSD.  Ones which reboot just
fine from their bootloaders:

- Linux -- so many that I don't know where to begin: ArchLinux
  2012.10.06, CentOS 6.3, Debian 6.0.7, Finnix 1.0.5, Knoppix 7.0.4,
  Slackware 14.0, and Ubuntu 11.10
- OpenBSD 5.2
- OpenIndiana -- build 151a7 (server version)

So when you say "Blame VMware", I'd be happy to, except there must be
something FreeBSD's bootstraps are doing differently than everyone else
that causes this oddity.  Would you not agree?



Re: retry mounting with ro when rw fails

2011-04-07 Thread Jeremy Chadwick
On Thu, Apr 07, 2011 at 01:20:53PM -0700, Garrett Cooper wrote:
 On Thu, Apr 7, 2011 at 10:25 AM, Andriy Gapon a...@freebsd.org wrote:
 
  [sorry for double post, it should have been hackers not hardware]
 
  Guys,
  could you please review and comment on the following patch?
  http://people.freebsd.org/~avg/mount-retry-ro.diff
  Thank you!
 
  The patch consists of two parts.
 
  The first part is in CAM/SCSI to make sure that ENODEV is consistently
  returned to signal that an operation is not supported by a device (in
  accordance with intro(2)) and specifically to return ENODEV on a write
  attempt to read-only or write-protected media.  Making this change in SCSI
  should cover real SCSI devices, as well as ATAPI through
  ahci/siis/atapicam or similar, plus the majority (all?) of USB Mass
  Storage devices.
 
  The second part is in the vfs_mount code.  The idea is to retry a mount
  call if we get the ENODEV error, and mounting was not already in read-only
  mode, and there was no explicit rw or noro option; the second try is
  changed to ro.
 
  I did only basic testing with an SD card in write-protected mode and a USB
  card reader.  Since I am not very familiar with the vfs_mount code, I might
  have missed some important details.
 
 As a generic question / observation, maybe we should just
 implement 'errors=remount-ro' (or a reasonable facsimile) like Linux
 has in our mount(8) command? Doesn't look like NetBSD, OpenBSD, or
 [Open]Solaris sported similar functionality.

I was going to recommend exactly this.  :-)

I like the idea of Andriy's patch, but would feel more comfortable if it
were only used if a mount option was specified (-o errors=remount-ro).
Why:

Are there any conditions where ENODEV is returned to the underlying vfs
layer for things like unexpected hardware issues?  I would imagine the
latter would be ENXIO, but I'm not certain.  An example situation:

1. User inserts USB flash drive/etc.
2. User tries to mount disk R/W manually
3. Weird/bizarre hardware issue happens mid-mount (drive falling off
   the bus, or maybe even the user yanking the drive right in the
   middle) -- could this ever return ENODEV?
4. Kernel attempts re-mount, which also fails, or possibly panics
   due to some underlying condition which nobody predicted
5. User mails mailing list

If I'm worrying over nothing, then perfect.  :-)  My other concern is
whether or not this mechanism change could cause some sort of infinite
loop within devd(8)/devctl(4) where the daemon gets very confused as to
what's going on or some automated commands get run when they shouldn't.
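
The retry rule under discussion, gated behind an opt-in as suggested above,
can be sketched as follows.  This is a minimal model in Python, not the
actual patch or kernel code: do_mount is a hypothetical callable that
returns 0 or an errno value, and allow_fallback stands in for something
like "-o errors=remount-ro".  It retries exactly once and only on ENODEV,
so any other error (ENXIO, for example) is passed through untouched.

```python
import errno


def mount_with_ro_fallback(do_mount, options, allow_fallback=False):
    """Try a mount; optionally retry once as read-only after ENODEV.

    do_mount(options) is a hypothetical callable returning 0 on success
    or an errno value on failure.  options is a set of option strings.
    Returns (error, options_used_for_the_last_attempt).
    """
    error = do_mount(options)
    if error != errno.ENODEV:
        # Success, or an error that must not be masked by a retry.
        return error, options
    if not allow_fallback:
        # Fallback is opt-in in this sketch; default is current behaviour.
        return error, options
    if {"ro", "rw", "noro"} & set(options):
        # Already read-only, or the user explicitly asked for read-write.
        return error, options
    retry_options = set(options) | {"ro"}
    # Single retry only: no loop, so a failing device cannot cause repeats.
    return do_mount(retry_options), retry_options
```

Because there is only one retry and it is skipped unless requested, a
device that disappears mid-mount simply reports its error a second time at
worst; whether ENODEV can be produced by such a failure is still the open
question raised above.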

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: quotas an essential feature? (was: svn commit: r218953 - stable/8/usr.sbin/sysinstall)

2011-02-26 Thread Jeremy Chadwick
On Fri, Feb 25, 2011 at 11:25:00PM -0800, Tim Kientzle wrote:
 On Feb 25, 2011, at 3:46 PM, Steven Hartland wrote:
 
  While I can understand that some may want it, it's not something we use on
  any of our machines, and I suspect that's the case for many others.
  
  Given adding it means the kernel will be doing extra work and hence a
  drop in performance...
 
 Does anyone have benchmark results to measure the performance hit?

I imagine there wouldn't be any (or extremely negligible)
unless you used quotaon(8).



Re: random FreeBSD panics

2010-03-29 Thread Jeremy Chadwick
On Mon, Mar 29, 2010 at 05:01:02PM +, Masoom Shaikh wrote:
 On Sun, Mar 28, 2010 at 5:38 PM, Ivan Voras ivo...@freebsd.org wrote:
  On 28 March 2010 16:42, Masoom Shaikh masoom.sha...@gmail.com wrote:
 
  Let's assume this is a h/w problem; how, then, can other OSes overcome
  it?  Is there a way to make FreeBSD ignore it as well, even if that
  results in a reasonable performance penalty?
 
  Very probably, if only we could detect where the problem is.
  Try adding "options PRINTF_BUFR_SIZE=128" to the kernel
 
 this option is already there

The key words in Ivan's phrase are "less mangled".  Neither enabling nor
increasing PRINTF_BUFR_SIZE solves the problem of interspersed console
output.  I've been ranting/raving about this problem for years now; it
truly looks like a mutex lock issue (or lack of such a lock), but I've
been told numerous times that isn't the case.

To developers: what incentives would help get this issue some much-needed
attention?  This problem makes kernel debugging, panic analysis, and
other console-oriented viewing basically impossible.



Re: random FreeBSD panics

2010-03-29 Thread Jeremy Chadwick
On Mon, Mar 29, 2010 at 02:27:34PM -0400, John Baldwin wrote:
 I was recently going to look at it.  The somewhat drastic approach I was
 going to take was to add a simple serializing lock around trap_fatal() and
 a few other places that do similar block prints (e.g. mca_log()).  One of
 the issues with fixing this in printf itself is that you'd probably want to
 serialize complete lines of text on a per-thread basis.  You would want to
 be able to accumulate this line of text across multiple calls to printf
 (think of it as line-buffering a la stdio).  However, some folks may be
 nervous about printf not printing things immediately.
 
 The other issue is that lots of code assumes it can call printf from
 anywhere and everywhere.  Mostly this just means that if you add locking
 and line-buffering to printf(9) you have to be very careful to make sure it
 works in odd places.  Probably a lot of this could be solved by deferring
 things like trap_fatal() until panic() has already been called (which is
 bde's preferred solution, I think).
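
The per-thread line-buffering idea described above can be illustrated with
a small user-space model.  This is only a sketch in Python, assuming a
hypothetical LineBufferedConsole class and a sink callable that receives
one complete line at a time; it is not printf(9) and ignores the hard part
(being callable from interrupt context and other odd places).

```python
import threading


class LineBufferedConsole:
    """Accumulate partial output per thread; emit only whole lines."""

    def __init__(self, sink):
        self._sink = sink                # callable taking one line of text
        self._lock = threading.Lock()    # serializes complete lines only
        self._local = threading.local()  # per-thread pending fragment

    def printf(self, text):
        pending = getattr(self._local, "pending", "") + text
        # Everything before the last newline is complete; keep the rest.
        *lines, rest = pending.split("\n")
        self._local.pending = rest
        for line in lines:
            with self._lock:
                self._sink(line + "\n")

    def flush(self):
        """Force out an unterminated fragment (e.g. just before a panic)."""
        rest = getattr(self._local, "pending", "")
        self._local.pending = ""
        if rest:
            with self._lock:
                self._sink(rest)
```

The trade-off mentioned in the message is visible here: text without a
trailing newline stays in the per-thread buffer until the line is finished
or flush() is called, so nothing appears "immediately".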

John,

Thanks for the insights, they're greatly appreciated.

I went looking this morning to see how Linux addressed this issue (if at
all), and it's been discussed a few times in the past.  The longest lkml
thread I could find that mentioned the problem was circa 2002.  Probably
not worth reading as there was work done in 2009 to solve the issue.

http://lkml.indiana.edu/hypermail/linux/kernel/0204.1/index.html#161

Work done by Red Hat in 2009 details how they implemented a lockless
version of their kernel ring buffer (similar to our system message
buffer, but probably a lot more complex):

http://lwn.net/Articles/340400/
http://lwn.net/Articles/340443/

Supposedly having multiple writers to the ring is 100% safe; no
interspersed output.  The same goes for interrupt-generated output.  There
are some comments in the technical document (2nd link) that imply there's
an individual ring buffer for each CPU; possibly per-CPU kernel message
buffers would solve our issue?
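
As a rough illustration of the per-CPU idea, the following Python sketch
keeps one buffer per CPU and merges them by sequence number when read.  It
is a toy model under stated assumptions, not a description of the Linux
implementation in the linked articles: the PerCpuLog class, the shared
sequence counter, and the unbounded lists (instead of fixed-size rings)
are all invented here for illustration.

```python
import heapq
import itertools


class PerCpuLog:
    """One append-only buffer per CPU, merged in order on read."""

    def __init__(self, ncpu):
        self._buffers = [[] for _ in range(ncpu)]
        # Assumption: a cheap global sequence number orders the records.
        self._seq = itertools.count()

    def log(self, cpu, line):
        # Each writer touches only its own CPU's buffer, so complete
        # records from different CPUs never interleave mid-line.
        self._buffers[cpu].append((next(self._seq), line))

    def dump(self):
        # Every buffer is already sorted by sequence number, so a k-way
        # merge reconstructs the global order.
        return [line for _, line in heapq.merge(*self._buffers)]
```

A short usage example:

```python
log = PerCpuLog(2)
log.log(0, "panic: page fault")
log.log(1, "cpuid = 1")
log.log(0, "fault virtual address = 0x0")
print("\n".join(log.dump()))
```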



Re: [Testers wanted] /dev/console cleanups

2008-11-20 Thread Jeremy Chadwick
On Wed, Nov 19, 2008 at 11:48:36PM -0800, Nate Eldredge wrote:
 On Wed, 19 Nov 2008, Jeremy Chadwick wrote:

 On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:

 I hope that never gets committed - it will make debugging kernel
 problems much harder.  There is already a kern.msgbuf_clear sysctl and
 maybe people who are concerned about msgbuf leakage need to learn to
 use it.

 And this sysctl is only usable *after* the kernel loads, which means
 you lose all of the messages shown from the time the kernel loads to
 the time the sysctl is set (e.g. hardware detected/configured).  This is
 even less acceptable, IMHO.

 But surely you can arrange that the contents are written out to  
 /var/log/messages first?

 E.g. a sequence like

 - mount /var
 - write buffer contents via syslogd
 - clear buffer via sysctl
 - allow user logins

This has two problems, but I'm probably missing something:

1) See my original post, re: users of our systems use dmesg to find
out what the status of the system is.  By "status" I don't mean from
the point the kernel finished to now, I literally mean they *expect*
to see the kernel device messages and all that jazz.  No, I'm not
making this up, nor am I arguing just to hear myself talk (despite
popular belief).  I can bring these users into the discussion if people
feel it would be useful.

2) I don't understand how this would work (meaning, technically and
literally: I do not understand).  How do messages like "CPU: Intel(R)
Core(TM)2 Duo CPU E8400 @ 3.00GHz (2992.52-MHz K8-class CPU)" get
written to syslog when syslogd isn't even running (and no filesystems
are mounted) at that time?  There must be some magic involved there
(since syslog == libc, not syscall) when syslogd starts, but I don't know
how it works.

 This way the buffer is cleared before any unprivileged users get to do  
 anything.  No kernel changes needed, just a little tweaking of the init  
 scripts at most.

 If you should have a crash and suspect there is useful data in the 
 buffer, you can boot to single-user mode (avoiding the clear) and 
 retrieve it manually.

 Seems like this should make everyone happy.

What I'm not understanding is the resistance towards Rink's patch,
assuming the tunable defaults to disabled/off.



Re: [Testers wanted] /dev/console cleanups

2008-11-20 Thread Jeremy Chadwick
On Thu, Nov 20, 2008 at 10:53:07AM +0100, Dag-Erling Smørgrav wrote:
 Jeremy Chadwick [EMAIL PROTECTED] writes:
  Peter Jeremy [EMAIL PROTECTED] writes:
   This is deliberate.  If the system panics, stuff that was in the
   message buffer (and might not be on disk) can be read when the
   system reboots.  If there is no crashdump, this might be the only
   record of what happened.
  That doesn't sound deliberate at all -- it sounds like a quirk that
  people (you?) are relying on.
 
 No, it is deliberate.  Just because you don't like it doesn't mean we're
 morons.

I said nothing about liking/disliking it, nor did I namecall or
condescend.

Thanks for being a complete prick, des.  Jesus christ...



Re: Core i7 anyone else?

2008-11-19 Thread Jeremy Chadwick
On Wed, Nov 19, 2008 at 08:44:03PM +0900, Takanori Watanabe wrote:
 Hi, I recently bought a Core i7 machine (for 145,000 JPY: about $1500)
 and it sometimes hangs oddly.
 When in that state, only some specific processes keep working and it
 replies to ping, but it does not return any useful information.
 
 I suspect it may be caused by CPU power management, so I turned off
 almost all CPU power management features in the BIOS settings.
 
 Has anyone else encountered such trouble?
 Also, on this machine buildworld under SCHED_ULE (15 min.) is slower
 than under SCHED_4BSD (12 min.).
 
 
 ===dmesg===
 http://www.init-main.com/corei7.dmesg
 or
 http://pastebin.com/m187f77aa
 (if host is down)
 
 ===DSDT===
 http://www.init-main.com/corei7.asl
 or
 http://pastebin.com/m6879984a
 
 ==some sysctls==
 hw.machine: i386
 hw.model: Intel(R) Core(TM) i7 CPU 920  @ 2.67GHz
 hw.ncpu: 8
 hw.byteorder: 1234
 hw.physmem: 3202322432
 hw.usermem: 2956083200
 hw.pagesize: 4096
 hw.floatingpoint: 1
 hw.machine_arch: i386
 hw.realmem: 3211264000
 ==
 machdep.enable_panic_key: 0
 machdep.adjkerntz: -32400
 machdep.wall_cmos_clock: 1
 machdep.disable_rtc_set: 0
 machdep.disable_mtrrs: 0
 machdep.guessed_bootdev: 2686451712
 machdep.idle: acpi
 machdep.idle_available: spin, mwait, mwait_hlt, hlt, acpi, 
 machdep.hlt_cpus: 0
 machdep.prot_fault_translation: 0
 machdep.panic_on_nmi: 1
 machdep.kdb_on_nmi: 1
 machdep.tsc_freq: 2684011396
 machdep.i8254_freq: 1193182
 machdep.acpi_timer_freq: 3579545
 machdep.acpi_root: 1024240
 machdep.hlt_logical_cpus: 0
 machdep.logical_cpus_mask: 254
 machdep.hyperthreading_allowed: 1
 ==
 kern.sched.preemption: 0
 kern.sched.topology_spec: <groups>
  <group level="1" cache-level="0">
   <cpu count="8" mask="0xff">0, 1, 2, 3, 4, 5, 6, 7</cpu>
   <flags></flags>
  </group>
 </groups>
 
 kern.sched.steal_thresh: 3
 kern.sched.steal_idle: 1
 kern.sched.steal_htt: 1
 kern.sched.balance_interval: 133
 kern.sched.balance: 1
 kern.sched.affinity: 1
 kern.sched.idlespinthresh: 4
 kern.sched.idlespins: 1
 kern.sched.static_boost: 160
 kern.sched.preempt_thresh: 0
 kern.sched.interact: 30
 kern.sched.slice: 13
 kern.sched.name: ULE
 ===

When building world/kernel, do you see odd behaviour (on CURRENT) such
as the load average being absurdly high, or processes (anything; sh,
make, mutt, etc.) getting stuck in bizarre states?  These things are
what caused my buildworld/buildkernel times to increase (compared to
RELENG_7).  I was using ULE entirely (on CURRENT and RELENG_7), but
did not try 4BSD.  I documented my experience.

http://wiki.freebsd.org/JeremyChadwick/Bizarre_CURRENT_experience

I have no idea if your problem is the same as mine.  This is purely
speculative on my part.  (And readers of that Wiki article should
note that the problem was not hardware-related.)



Re: [Testers wanted] /dev/console cleanups

2008-11-19 Thread Jeremy Chadwick
On Wed, Nov 19, 2008 at 02:02:42AM -0800, Garrett Cooper wrote:
 On Tue, Nov 18, 2008 at 1:49 PM, David Wolfskill [EMAIL PROTECTED] wrote:
  On Tue, Nov 18, 2008 at 10:34:10PM +0100, Ed Schouten wrote:
  ...
  One solution would be to let xconsole just display /var/log/messages.
 
  Errr... it may be rather a pathological case, but you might want to
  check the content of /etc/syslog.conf on the local machine before
  getting too carried away with that approach.
 
  For example, on my firewall box at home (where I really do not want to
  log anything to local disk files, though I do have a serial console on it):
 
  janus(6.4-P)[1] grep -v '^#' /etc/syslog.conf
  *.* @bunrab.catwhisker.org
  janus(6.4-P)[2]
 
  And then consider the fate of bunrab -- with stuff getting logged to
  /var/log/messages from various machines
 
  ...
  I'll discuss this with others to decide if we should take such an
  approach.
 
  I'm not trying to be obstructionist, here.  If the above case is really
  too pathological to consider -- or if it's a case of me bringing that
  fate upon myself, I suppose -- that's actually something I can live
  with.  It would be nice to be forewarned about it, though.  :-}
 
  Peace,
  david
 
 Uh, I second that. /var/log/messages shouldn't necessarily be
 accessible by non-root users. Also, OSX 10.5 protects against non-root
 access to dmesg. Not saying we should go that far, but it's already
 being implemented, so I don't see any harm in hiding the contents of
 `messages', as required by the sysadmin.

Footnote (not really applicable to the thread, but I want to point it
out to users/admins reading): inhibiting users viewing the kernel
message buffer (dmesg) can be accomplished by setting the
security.bsd.unprivileged_read_msgbuf sysctl to 0.

However, note that this can piss users off.  We have numerous users
on our system who rely on this information to see if anything weird is
going on with the box.  I set that sysctl one day (see below for why),
and I got flames in my mailbox within 48 hours.  Just something to keep
in mind if you have technically-savvy users.

There's a known issue with the kernel message buffer though: it's not
NULL'd out upon reboot.  Meaning, in some cases (depends on the BIOS or
system), the kernel message buffer from single-user mode is retained
even after a reboot!  A user can then do dmesg and see all the nifty
stuff you've done during single-user, which could include unencrypted
passwords if mergemaster was tinkering with passwd/master.passwd, etc..
I've brought this up before, and people said "Yeah, we know, moving on."
Rink Springer created a patch where the kernel message buffer will start
with NULL to keep this from happening, but it needs to be made into a
loader.conf tunable.

Also, /var/log/messages is explicitly set to 0644 in newsyslog.conf.  I'm
not sure what security hole we'd be plugging if it was set to 0600,
especially given that many userland programs log at the LOG_NOTICE level
via syslog.  If people want to debate those default perms, be my guest.
I would rather people debate the default syslog.conf layout altogether;
I'm surprised we haven't moved to syslog-ng (as part of the base system)
by now.  :-)



Re: Reccomendation for tools to use on FreeBSD for a wiki ?

2008-11-19 Thread Jeremy Chadwick
On Wed, Nov 19, 2008 at 10:08:03AM -0800, Garrett Cooper wrote:
 On Wed, Nov 19, 2008 at 9:24 AM, Julian Stacey [EMAIL PROTECTED] wrote:
  Hi hackers,
  Maybe some of you might suggest some software I might install?  A wiki, I
  guess.
  I got zero response from ports@; I could use some recommendations, please.
  PS From http://wiki.freebsd.org/HelpContents I tried
 cd /usr/ports/www ; vi *iki*/pkg-descr
 or is /usr/ports/www/moinmoin  the way to go ?
  Thanks.
  ---
 
  Subject: Reccomendation for ports for web based club events forthcoming 
  diary ?
 
  Can anyone reccomend some ports to install on a FreeBSD web server,
  for a club of mostly non technical people, to support:
   - All club members can add events to a forthcoming calendar,
   - All club members can request the server to prepare a listing
 of the next upcoming events, to download (probably in PDF,
 or perhaps tbl to a pipe, or ?)
   - A list of moderators can delete fake events from robots and the malicious.
   - Preferably moderators should not themselves be capable of
 deleting logged event submission, but only capable of deleting
 events formatted to the ouput printable programme sheet. (To
 autopsy for suspect rogue moderators)
   - I guess first entry criteria might be a fuzzy picture for human
 to decode password from). 2nd might be mail return for confirm password,
   -  3rd, A majordomo (later mailman) maintained list of club members 
 moderators etc is available for automated validation.
   - I hope there will be some packages available,
 http  probably wiki based etc, that will come close enough ?
  I'm hoping this has been done often enough that people can suggest
  names of ports already existing ? If not I dont mind creating a
  port if I have to, but dont want to write something from scratch.
 
  PS
  - - I've had apache up for years, but no wiki yet, so if any tips,
   shout please, even if just RTFM URL= :-)
  - - Web based forums I don't care about, but others may, so I suppose if
   some software does or does not support web forums, it'd be good to know.
 
  Suggestions welcome please ! Thanks in advance.
 
  Cheers,
  Julian
 
 Julian,
  FWIW, ports@ or questions@ would be better lists than [EMAIL PROTECTED]

He asked this on -ports only 2 days ago and received no response.  So he
appears to be going from list to list hoping someone will answer him.



Re: [Testers wanted] /dev/console cleanups

2008-11-19 Thread Jeremy Chadwick
On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:
 On 2008-Nov-19 02:47:31 -0800, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 There's a known issue with the kernel message buffer though: it's not
 NULL'd out upon reboot.
 
 This is deliberate.  If the system panics, stuff that was in the
 message buffer (and might not be on disk) can be read when the system
 reboots.  If there is no crashdump, this might be the only record of
 what happened.

That doesn't sound deliberate at all -- it sounds like a quirk that
people (you?) are relying on.  I do not think any piece of the FreeBSD
system (e.g. savecore, etc.) relies on this behaviour.

You're working on the assumption that the information is *always* available
after a panic/reboot -- it isn't.  I have 4 different Supermicro
motherboards (all from different years) which will most of the time
lose the msgbuf after rebooting from single-user -- but sometimes the
msgbuf is retained.  And no, bad hardware is not responsible for the
randomness of the problem.

I think it's been discussed in the past how/why this can happen.  It has
to do with what each BIOS manufacturer chooses to do with some parts of
memory during start-up.  I'm sure the Quick Boot (e.g. no extensive
memory test, which really doesn't test anything these days) option plays
a role, and that option is enabled by default on all motherboards I've
used in the past 10 years.

   Meaning, in some cases (depends on the BIOS or
 system), the kernel message buffer from single-user mode is retained
 even after a reboot!  A user can then do dmesg and see all the nifty
 stuff you've done during single-user, which could include unencrypted
 passwords if mergemaster was tinkering with passwd/master.passwd, etc..
 
 There shouldn't be unencrypted passwords, though there might be encrypted
 passwords visible.

Sorry, that's what I meant.

The point is that a lot of things can go on in single-user mode which
can/will disclose information or data in files which users do not have
access to.  Once the system is rebooted, a non-root user can do dmesg
-a and see this buffer, getting access to data he/she normally does not
have access to.

Do you and I agree that this is in fact a security risk/problem?

 Rink Springer created a patch where the kernel message buffer will start
 with NULL to keep this from happening, but it needs to be made into a
 loader.conf tunable.
 
 I hope that never gets committed - it will make debugging kernel
 problems much harder.  There is already a kern.msgbuf_clear sysctl and
 maybe people who are concerned about msgbuf leakage need to learn to
 use it.

And this sysctl is only usable *after* the kernel loads, which means
you lose all of the messages shown from the time the kernel loads to
the time the sysctl is set (e.g. hardware detected/configured).  This is
even less acceptable, IMHO.

I would like to see Rink's patch committed, as long as the loader
tunable defaults to *off* (e.g. current/historic behaviour).  I'll also
ask Rink to chime in here with his thoughts/opinions.
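
The behaviour being argued over can be modelled in a few lines.  This is a
hedged sketch in Python rather than kernel code: the MsgBuf class, the
MAGIC placeholder value, the CRC checksum, and the clear_on_boot parameter
(standing in for the proposed loader tunable) are all hypothetical names
chosen for illustration, and the real validation logic in the kernel is
not reproduced here.

```python
import zlib

MAGIC = 0x4D534742  # placeholder marker ("MSGB"); not the kernel's value


class MsgBuf:
    """Toy message buffer that may survive a warm reboot in RAM."""

    def __init__(self):
        self.magic = 0
        self.text = ""
        self.cksum = 0

    def append(self, s):
        self.magic = MAGIC
        self.text += s
        self.cksum = zlib.crc32(self.text.encode())

    def looks_valid(self):
        # If the BIOS scribbled over this memory, the check fails.
        return (self.magic == MAGIC
                and self.cksum == zlib.crc32(self.text.encode()))


def boot(old, clear_on_boot=False):
    """Return the buffer a freshly booted kernel starts with.

    clear_on_boot defaults to False, i.e. the historic behaviour of
    reusing whatever survived in memory.
    """
    if clear_on_boot or not old.looks_valid():
        return MsgBuf()
    return old
```

With the default, output from a previous single-user session is carried
over whenever the memory happens to survive; with clear_on_boot=True it
never is, at the cost of losing post-panic messages as well.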



Re: NET.ISR and CPU utilization performance w/ HP DL 585 using FreeBSD 7.1 Beta2

2008-11-15 Thread Jeremy Chadwick
On Sat, Nov 15, 2008 at 04:59:16AM -0800, Won De Erick wrote:
 Hello,
 
 I tested HP DL 585 (16 CPUs, w/ built-in Broadcom NICs) running FreeBSD 7.1 
 Beta2 under heavy network traffic (TCP).
 
 SCENARIO A : Bombarded w/ TCP traffic:
 
 When net.isr.direct=1,
 
   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    WCPU COMMAND
    52 root        1 -68    -     0K    16K CPU11  b  38:43  95.36% irq32: bce1
    51 root        1 -68    -     0K    16K CPU10  a  25:50  85.16% irq31: bce0
    16 root        1 171 ki31     0K    16K RUN    a  65:39  15.97% idle: cpu10
    28 root        1 -32    -     0K    16K WAIT   8  12:28   5.18% swi4: clock sio
    15 root        1 171 ki31     0K    16K RUN    b  52:46   3.76% idle: cpu11
    45 root        1 -64    -     0K    16K WAIT   7   7:29   1.17% irq17: uhci0
    47 root        1 -64    -     0K    16K WAIT   6   1:11   0.10% irq16: ciss0
    27 root        1 -44    -     0K    16K WAIT   0  28:52   0.00% swi1: net
 
 When net.isr.direct=0,
 
    16 root        1 171 ki31     0K    16K CPU10  a 106:46  92.58% idle: cpu10
    19 root        1 171 ki31     0K    16K CPU7   7 133:37  89.16% idle: cpu7
    27 root        1 -44    -     0K    16K WAIT   0  52:20  76.37% swi1: net
    25 root        1 171 ki31     0K    16K RUN    1 132:30  70.26% idle: cpu1
    26 root        1 171 ki31     0K    16K CPU0   0 111:58  64.36% idle: cpu0
    15 root        1 171 ki31     0K    16K CPU11  b  81:09  57.76% idle: cpu11
    52 root        1 -68    -     0K    16K WAIT   b  64:00  42.97% irq32: bce1
    51 root        1 -68    -     0K    16K WAIT   a  38:22  12.26% irq31: bce0
    45 root        1 -64    -     0K    16K WAIT   7  11:31  12.06% irq17: uhci0
    47 root        1 -64    -     0K    16K WAIT   6   1:54   3.66% irq16: ciss0
    28 root        1 -32    -     0K    16K WAIT   8  16:01   0.00% swi4: clock sio
 
 Overall CPU utilization has dropped significantly, but I noticed that swi1
 has taken over CPU0 with high utilization when net.isr.direct=0.
 What does this mean?
 
 SCENARIO B : Bombarded w/ more TCP traffic:
 
 Worse, the box became unresponsive (can't be PINGed, inaccessible
 through SSH) after more traffic was added while keeping net.isr.direct=0.
 This may be due to the 100% utilization on CPU0 by swi1: net (see the first
 line of the result below). The bce and swi threads seem to race each other,
 judging by the result when net.isr.direct=1.
 The rest of the CPUs are sitting pretty (100% idle). Can you shed some light
 on this?
 
 When net.isr.direct=0:
    27 root        1 -44    -     0K    16K CPU0   0   5:45 100.00% swi1: net
    11 root        1 171 ki31     0K    16K CPU15  0   0:00 100.00% idle: cpu15
    13 root        1 171 ki31     0K    16K CPU13  0   0:00 100.00% idle: cpu13
    17 root        1 171 ki31     0K    16K CPU9   0   0:00 100.00% idle: cpu9
    18 root        1 171 ki31     0K    16K CPU8   0   0:00 100.00% idle: cpu8
    21 root        1 171 ki31     0K    16K CPU5   5 146:17  99.17% idle: cpu5
    22 root        1 171 ki31     0K    16K CPU4   4 146:17  99.07% idle: cpu4
    14 root        1 171 ki31     0K    16K CPU12  0   0:00  99.07% idle: cpu12
    16 root        1 171 ki31     0K    16K CPU10  a 109:33  98.88% idle: cpu10
    15 root        1 171 ki31     0K    16K CPU11  b  86:36  93.55% idle: cpu11
    52 root        1 -68    -     0K    16K WAIT   b  59:42  13.87% irq32: bce1
 
 When net.isr.direct=1,
    52 root        1 -68    -     0K    16K CPU11  b  55:04  97.66% irq32: bce1
    51 root        1 -68    -     0K    16K CPU10  a  33:52  73.88% irq31: bce0
    16 root        1 171 ki31     0K    16K RUN    a 102:42  26.86% idle: cpu10
    15 root        1 171 ki31     0K    16K RUN    b  81:20   3.17% idle: cpu11
    28 root        1 -32    -     0K    16K WAIT   e  13:40   0.00% swi4: clock sio
 
 With regards to bandwidth in all scenarios above, the result is extremely low 
 (expected is several hundred Mb/s). Why? 
 
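      iface             Rx             Tx          Total
 =======================================================
      bce0:      4.69 Mb/s     10.49 Mb/s     15.18 Mb/s
      bce1:     20.66 Mb/s      4.68 Mb/s     25.34 Mb/s
       lo0:      0.00  b/s      0.00  b/s      0.00  b/s
 -------------------------------------------------------
     total:     25.35 Mb/s     15.17 Mb/s     40.52 Mb/s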
 
 
 Thanks,
 
 Won

And does this behaviour change if you use some other brand of NIC?


Re: assigning interrupts

2008-11-13 Thread Jeremy Chadwick
On Thu, Nov 13, 2008 at 06:03:20PM +0800, Ronnel P. Maglasang wrote:
 Hi All,

 Is there a way to explicitly assign an interrupt
 of a device? I'm running on 6.3 and the two NICs
 share the same interrupt. Obviously this will affect
 the performance if the NICs are exposed to heavy network
 traffic.

 # vmstat -i
 interrupt                          total       rate
 sniff
 irq11: em0 vr0+                  1081099         77
 sniff
 Total                           16958562       1222


 Looking at the driver's code, my initial thought is
 that this is the place where I could modify it.

This is the responsibility of the BIOS or ACPI configuration.  There is
no way to do this via OS software, as far as I know.

Try looking at the motherboard manual for which PCI interrupt pins (A/B/C/D)
share IRQs with what slot, then move cards around.
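
Before moving cards around, it helps to see which IRQs are actually shared.
The following is a minimal sketch in Python that scans vmstat -i output for
rows naming more than one device.  The function name and the regular
expression are illustrative assumptions; the row layout is taken from the
"irq11: em0 vr0+" line quoted above, and other FreeBSD releases may format
the columns differently.

```python
import re

# Matches vmstat -i rows such as "irq11: em0 vr0+  1081099 77".
# The layout is assumed from the sample quoted in this thread.
_IRQ_ROW = re.compile(r"\s*irq(\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(vmstat_output):
    """Return {irq_number: [device, ...]} for IRQs that list several devices."""
    shared = {}
    for line in vmstat_output.splitlines():
        match = _IRQ_ROW.match(line)
        if match is None:
            continue  # header, "Total" row, or anything unrecognised
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[int(match.group(1))] = devices
    return shared


sample = """interrupt  total   rate
irq11: em0 vr0+  1081099 77
Total   16958562   1222
"""
print(shared_irqs(sample))
```

Device names are kept exactly as vmstat printed them, including the trailing
"+" on vr0.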

Otherwise, consider purchasing a motherboard that has an APIC (this is
not a typo) increasing the IRQ count to 256.



Re: assigning interrupts

2008-11-13 Thread Jeremy Chadwick
On Thu, Nov 13, 2008 at 04:40:03PM +0100, Joerg Sonnenberger wrote:
 On Thu, Nov 13, 2008 at 02:40:54AM -0800, Jeremy Chadwick wrote:
  Otherwise, consider purchasing a motherboard that has an APIC (this is
  not a typo) increasing the IRQ count to 256.
 
 This is wrong. The first IO-APIC gives you 8 additional interrupts to
 the 16 ISA interrupt lines. Every additional IO-APIC gives you 24 more.
 Most modern chipsets have one IO-APIC, at least for non-embedded
 systems. It doesn't mean you don't get interrupt sharing though.
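
Joerg's numbers can be written out as simple arithmetic.  This is only a
sketch of the counting he describes (16 ISA lines, 8 more with the first
IO-APIC, 24 more per additional IO-APIC); the function name and the
24-inputs-per-IO-APIC default are assumptions for illustration, not a
statement about any particular chipset.

```python
def interrupt_lines(io_apics, pins_per_io_apic=24):
    """Interrupt lines available for a given number of IO-APICs.

    With no IO-APIC only the 16 legacy ISA lines exist.  The first IO-APIC
    brings the total to 24 (16 + 8); each further one adds 24 more.
    """
    if io_apics < 0:
        raise ValueError("io_apics must be non-negative")
    if io_apics == 0:
        return 16
    return io_apics * pins_per_io_apic


for count in (0, 1, 2):
    print(count, "IO-APIC(s):", interrupt_lines(count), "lines")
```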

I think the problem is that I was thinking of local APICs, which provide
a few hundred (I don't remember the exact number) IRQs to an I/O APIC.

For what it's worth, the devices he listed are exclusively on the PCI
bus.

Regarding "it doesn't mean you don't get interrupt sharing": I'd like to
hear more about why/how that's possible with a system sporting at least
one I/O APIC.



Re: ukbd attachment and root mount

2008-11-12 Thread Jeremy Chadwick
On Wed, Nov 12, 2008 at 02:49:15PM +0200, Andriy Gapon wrote:
 on 12/11/2008 14:33 Jeremy Chadwick said the following:
  On Wed, Nov 12, 2008 at 02:20:41PM +0200, Andriy Gapon wrote:
  on 12/11/2008 14:14 Jeremy Chadwick said the following:
  On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote:
  [snip]
  2. if ukbd driver is not attached then I don't see any way USB keyboard
  would work in non-legacy way
  Regarding #2: at which stage?  boot0/boot2/loader require an AT or PS/2
  keyboard to work.  None of these stages use ukbd(4) or anything -- there
  is no kernel loaded at this point!!  Meaning: if you have a USB keyboard,
  your BIOS will need to have a USB Legacy option to cause it to act as
  a PS/2 keyboard, for typing in boot0/boot2/loader to work.
 
  Device hints are for kernel drivers, once the kernel is loaded.
  Jeremy,
 
  I understand all of this.
  In subject line and earlier messages I say that I am interested in
  mountroot prompt - the prompt where kernel can ask about what device to
  use for root filesystem.
  Essentially I would like kernel to recognize USB keyboard (and disable
  all the legacy stuff if needed) before it prompts for the root device.
  
  I fully understand that fact.  However, I don't see the logic in that
  statement.  You should be able to remove and add a keyboard at any time
  and be able to type immediately.  Meaning: I don't see why the timing of
  keyboard recognition (e.g. before or after the mountroot prompt is
  printed) matters.  It should not.  I think this is a red herring.
 
 I think that this does matter because keyboard recognition is performed
 after the 'mounting from' log line *only if* root mount is done
 automatically.
 If there is an actual interactive prompt then recognition is not
 performed, at least I do not see any relevant lines on the screen and I
 am stuck at the prompt.
 
  I've seen the problem where I have a fully functional USB keyboard in
  boot0/boot2/loader
 
 For me it even randomly dies at these stages.
 I reported this in a different thread.
 But this should not be related to kernel behavior.
 
 and in multi-user,
 
 For me this always works.
 
  but when booting into single-user
 
 For me this always works.
 
  or when getting a mountroot prompt, the keyboard does not function.
  Whether the mountroot prompt is printed before or after ukbd attaches
  makes no difference for me in this scenario -- I tested it many times.
 
 For me ukbd lines are never printed if I get actual interactive
 mountroot prompt.
 
  It's very possible that something (kbdcontrol?) is getting run only
  during late stages of multi-user, which makes the keyboard work.  But
  prior to that something being run (but AFTER boot2/loader), the
  keyboard is not truly usable.
 
 For me this is not true. My keyboard always works after ukbd lines
 appear on screen.

I've pointed you to evidence where this isn't true, especially when
using the USB4BSD stack.  There is something called the "boot legacy
protocol" which USB keyboards have to support to properly be interfaced
with in FreeBSD using the USB4BSD stack; in the case of the Microsoft
Natural Ergo 4000 keyboard, it does not play well with USB4BSD (it DOES
work with the old USB stack, but none of the multimedia keys work, and
worse, the F-Lock key does not work; this is because those keys use
uhid(4) and not ukbd(4)).

Linux has a _20 page Wiki document_ on *just this keyboard*.  That
should give you some idea of how complex the situation with USB
keyboards is in general.

http://www.gentoo-wiki.info/HOWTO_Microsoft_Natural_Ergonomic_Keyboard_4000

  I hope everyone here is also aware of the fact that not all keyboards
  are created equal.  Case in point (and this reason is exactly why I
  am purchasing a native PS/2 keyboard, as USB4BSD doesn't work with
  all USB keyboards right now):
 
 For me this is not an option, no PS/2 ports.

I don't know what to say to *ANY* of the above, other than this:

No one is doing anything about this problem because there does not
appear to be a 100% reproducible always-screws-up-when-I-do-this
scenario that happens to *every FreeBSD user*.

Until we settle down, stop replying to Emails with one-liner injections,
and compile a list of test scenarios/cases that people can perform, and
get these people to provide both 1) full hardware details, 2) full
kernel configuration files, 3) full loader.conf files, and 4) full
device.hints files, we're not going to get anywhere.
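
As a starting point for gathering items 3 and 4 (and a kernel config, if its
location is known), here is a minimal Python sketch that reads a set of files
into one report.  /boot/loader.conf and /boot/device.hints are the usual
locations; the kernel config path shown is purely hypothetical, and hardware
details would still have to be collected separately.

```python
import os

# Paths to gather.  The kernel config entry is a hypothetical example;
# substitute the file actually used to build the kernel.
DEFAULT_PATHS = {
    "loader.conf": "/boot/loader.conf",
    "device.hints": "/boot/device.hints",
    "kernel config": "/usr/src/sys/i386/conf/MYKERNEL",  # hypothetical
}


def collect_report(paths):
    """Return {label: file contents}, noting files that cannot be read."""
    report = {}
    for label, path in paths.items():
        try:
            with open(path, "r", errors="replace") as handle:
                report[label] = handle.read()
        except OSError as err:
            report[label] = "unavailable: " + os.strerror(err.errno)
    return report


if __name__ == "__main__":
    for label, text in collect_report(DEFAULT_PATHS).items():
        print("==== %s ====" % label)
        print(text)
```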

  http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000219.html
  
  The bottom line:
  
  FreeBSD cannot be reliably used with a USB keyboard in all
  circumstances.  And that is a very sad reality, because 90% of the
  keyboards you find on the consumer and enterprise market are USB --
  native PS/2 keyboards are now a scarcity.
 
 I agree that this is a sad reality but only for boot stages where we
 depend on external entity named BIOS to help us.
 This doesn't have to be a sad reality

Re: ukbd attachment and root mount

2008-11-12 Thread Jeremy Chadwick
On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote:
 on 12/11/2008 13:53 Nate Eldredge said the following:
  On Wed, 12 Nov 2008, Andriy Gapon wrote:
  
  on 05/11/2008 17:24 Andriy Gapon said the following:
  [...]
  I have a legacy-free system (no PS/2 ports, only USB) and I wanted to
  try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was
  bitten hard when I made a mistake and kernel could not find/mount root
  filesystem.
 
  So I stuck at mountroot prompt without a keyboard to enter anything.
  This was repeatable about 10 times after which I resorted to live cd.
 
  Since then I put back atkbdc into my kernel. I guess BIOS or USB
  hardware emulate AT or PS/2 keyboard, so the USB keyboard works before
  the driver attaches. I guess I need such emulation e.g. for loader or
  boot0 configuration. But I guess I don't have to have atkbd driver in
  kernel.
 
  This turned out not to be a complete solution as it seems that there are
  some quirks about legacy USB here, sometimes keyboard stops working even
  at loader prompt (this is described in a different thread).
 
  ukbd attachment still puzzles me a lot.
  I look at some older dmesg, e.g. this 7.0-RELEASE one:
  http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973
  and see that ukbd attaches along with ums before mountroot.
 
  I look at newer dmesg and I see that ums attaches at about the same time
  as before but ukbd consistently attaches after mountroot.
  I wonder what might cause such behavior and how to fix it.
  I definitely would like to see ukbd attach before mountroot, I can debug
  this issue, but need some hints on where to start.
  
  I haven't been following this thread, and I'm pretty sleepy right now,
  so sorry if this is irrelevant, but I had a somewhat similar problem
  that was fixed by adding
  
  hint.atkbd.0.flags=0x1
  
  to /boot/device.hints .

To those reading, the above setting enables the following option:

   bit 0 (FAIL_IF_NO_KBD)
  By default the atkbd driver will install even if a keyboard is not
  actually connected to the system.  This option prevents the driver
  from being installed in this situation.
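
To make the bit arithmetic concrete, here is a small Python sketch that
parses a device.hints line and tests bit 0.  The constant name mirrors the
FAIL_IF_NO_KBD label quoted above; the helper function is hypothetical and
only illustrates how the flags value is read as a bit mask.

```python
FAIL_IF_NO_KBD = 0x1  # bit 0, per the description quoted above


def parse_hint(line):
    """Split 'hint.atkbd.0.flags=0x1' into (name, integer value).

    The value may be quoted and may be decimal or 0x-prefixed hex.
    """
    name, _, value = line.strip().partition("=")
    return name.strip(), int(value.strip().strip('"'), 0)


name, flags = parse_hint('hint.atkbd.0.flags="0x1"')
print(name, "FAIL_IF_NO_KBD set:", bool(flags & FAIL_IF_NO_KBD))
```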

 I can try this, but I think this wouldn't help for two reasons:
 1. I already tried kernel without atkbd at all
 2. if ukbd driver is not attached then I don't see any way USB keyboard
 would work in non-legacy way

Regarding #2: at which stage?  boot0/boot2/loader require an AT or PS/2
keyboard to work.  None of these stages use ukbd(4) or anything -- there
is no kernel loaded at this point!!  Meaning: if you have a USB keyboard,
your BIOS will need to have a USB Legacy option to cause it to act as
a PS/2 keyboard, for typing in boot0/boot2/loader to work.

Device hints are for kernel drivers, once the kernel is loaded.



Re: ukbd attachment and root mount

2008-11-12 Thread Jeremy Chadwick
On Wed, Nov 12, 2008 at 02:20:41PM +0200, Andriy Gapon wrote:
 on 12/11/2008 14:14 Jeremy Chadwick said the following:
  On Wed, Nov 12, 2008 at 01:58:58PM +0200, Andriy Gapon wrote:
 [snip]
  2. if ukbd driver is not attached then I don't see any way USB keyboard
  would work in non-legacy way
  
  Regarding #2: at which stage?  boot0/boot2/loader require an AT or PS/2
  keyboard to work.  None of these stages use ukbd(4) or anything -- there
  is no kernel loaded at this point!!  Meaning: if you have a USB keyboard,
  your BIOS will need to have a USB Legacy option to cause it to act as
  a PS/2 keyboard, for typing in boot0/boot2/loader to work.
  
  Device hints are for kernel drivers, once the kernel is loaded.
 
 Jeremy,
 
 I understand all of this.
 In subject line and earlier messages I say that I am interested in
 mountroot prompt - the prompt where kernel can ask about what device to
 use for root filesystem.
 Essentially I would like kernel to recognize USB keyboard (and disable
 all the legacy stuff if needed) before it prompts for the root device.

I fully understand that fact.  However, I don't see the logic in that
statement.  You should be able to remove and add a keyboard at any time
and be able to type immediately.  Meaning: I don't see why the timing of
keyboard recognition (e.g. before or after the mountroot prompt is
printed) matters.  It should not.  I think this is a red herring.

I've seen the problem where I have a fully functional USB keyboard in
boot0/boot2/loader and in multi-user, but when booting into single-user
or when getting a mountroot prompt, the keyboard does not function.
Whether the mountroot prompt is printed before or after ukbd attaches
makes no difference for me in this scenario -- I tested it many times.

It's very possible that something (kbdcontrol?) is getting run only
during late stages of multi-user, which makes the keyboard work.  But
prior to that something being run (but AFTER boot2/loader), the
keyboard is not truly usable.

I hope everyone here is also aware of the fact that not all keyboards
are created equal.  Case in point (and this reason is exactly why I
am purchasing a native PS/2 keyboard, as USB4BSD doesn't work with
all USB keyboards right now):

http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000219.html

The bottom line:

FreeBSD cannot be reliably used with a USB keyboard in all
circumstances.  And that is a very sad reality, because 90% of the
keyboards you find on the consumer and enterprise market are USB --
native PS/2 keyboards are now a scarcity.

Do not even for a minute tell me "buy a USB-to-PS2 adapter", because the
green ones that come with USB mice do not work with USB keyboards.  I
have even bought a purple USB-to-PS2 keyboard adapter from Amazon,
specifically for this purpose, and it *does not work*.  I found out
weeks later the adapters only work on CERTAIN models of USB keyboards,
depending upon how they're engineered.

What really needs to happen here should be obvious: we need some form of
inexpensive keyboard-only USB support in boot2/loader.

I would *love* to know how Linux and Windows solve this problem.



Re: strange behaviour with /sbin/init and serial console

2008-10-31 Thread Jeremy Chadwick
On Fri, Oct 31, 2008 at 05:46:23PM +0100, Thierry Herbelot wrote:
 with the following patch on /sbin/init, I have two different behaviours 
 depending on the console type (on a i386/32 PC) :
 - on a video console, I see the expected two messages,
 - on a serial console, the messages are not displayed (init silently finishes 
 its job and gets to start /etc/rc and everything)

I thought this was normal behaviour on FreeBSD, but it's very likely I'm
misunderstanding.  The charts in Section 27.6.4 describe what level of
logging is shown where and at what stage, depending upon which boot
flags and device settings you use:

http://www.freebsd.org/doc/en/books/handbook/serialconsole-setup.html



Re: strange behaviour with /sbin/init and serial console

2008-10-31 Thread Jeremy Chadwick
On Fri, Oct 31, 2008 at 01:28:02PM -0600, Scott Long wrote:
 Ed Schouten wrote:
 Hello Thierry,

 * Thierry Herbelot [EMAIL PROTECTED] wrote:
 with the following patch on /sbin/init, I have two different 
 behaviours depending on the console type (on a i386/32 PC) :
 - on a video console, I see the expected two messages,
 - on a serial console, the messages are not displayed (init silently 
 finishes its job and gets to start /etc/rc and everything)

 I assume that the writev system call is implemented in  
 src/sys/kern/tty_cons.c::cnwrite(), but I could not parse the code to 
 find an explanation.

 any taker ?

 TfH

 PS : this is initially for a RELENG_6 machine, but the code is quite 
 similar under RELENG_7 or Current

 Any data written to /dev/console is not multiplexed to all console
 devices, but only the first active device in the list. The reason behind
 this, is because it adds a real lot of complexity to the console code,
 especially related to polling and reading on /dev/console.

 This weekend I'm going to commit a replacement implementation of
 /dev/console, which also has this restriction.


 The multiplexed console feature is one thing that linux got right.  In a
 corporate setting, you really need both a serial console and a video
 console in order to effectively manage the machines, as you want to be
 able to access them both remotely and locally.

I know this comment isn't much help, but, I am in full agreement with
Scott.  FreeBSD's lack of *true* multi (or even dual) console during all
stages is a big disappointment to server administrators.  The common
reaction is: "What do you mean I can only get some messages on serial or
some messages on VGA?!  That's retarded!"

I believe DragonFly has addressed this (offering a true dual console
mechanism), and if I remember correctly, Matt Dillon discussed the code
changes in great detail, citing a large amount of re-engineering
required to accomplish it.

 While it might be hard to build multiplexing into the console driver,
 do you think it would be possible to layer a multiplexer on top of it,
 similar to how the kbdmux driver works?

Let's make sure that we don't implement it identically though, as there
are many of us who have major problems with kbdmux (reports of LORs, and
even more reports of incredibly slow keyboard input when a USB keyboard
is used; workarounds are either disabling atkbd/atkbdc entirely, or
disabling kbdmux entirely.  In my case, I found the latter to be
preferable).  :-)
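
For readers unfamiliar with the idea, a layered multiplexer of the kind Scott
suggests can be pictured in user space as a writer that fans each message out
to every attached console.  This Python sketch is only an analogy, not kernel
code: the class name is made up, and it covers writes only, leaving aside the
polling and reading on /dev/console that Ed identifies as the hard part.

```python
import io


class ConsoleMux:
    """Fan every write out to all attached console-like objects."""

    def __init__(self, *consoles):
        self.consoles = list(consoles)

    def write(self, data):
        # Deliver the same data to each backend in attachment order.
        for console in self.consoles:
            console.write(data)
        return len(data)


# Stand-ins for a serial console and a video console.
serial, video = io.StringIO(), io.StringIO()
console = ConsoleMux(serial, video)
console.write("init: starting /etc/rc\n")
print(serial.getvalue() == video.getvalue())
```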



open(2) and O_NOATIME

2008-10-30 Thread Jeremy Chadwick
I've recently been reading about Linux's O_NOATIME flag to open(2), and
I'm curious why we haven't implemented this.  There seem to be a lot of
good reasons to implement such a thing.

Chances are it's due to lack of time/interest, which is expected, but I
was wondering if there were other reasons.

I realise mount's noatime trumps this, but there are lots of scenarios
where atime is desired as a default, but disabled in specific cases.



Re: open(2) and O_NOATIME

2008-10-30 Thread Jeremy Chadwick
On Thu, Oct 30, 2008 at 07:16:42PM -0700, Xin LI wrote:
 
 Jeremy Chadwick wrote:
  I've recently been reading about Linux's O_NOATIME flag to open(2), and
  I'm curious why we haven't implemented this.  There seem to be a lot of
  good reasons to implement such a thing.
  
  Chances are it's due to lack of time/interest, which is expected, but I
  was wondering if there were other reasons.
  
  I realise mount's noatime trumps this, but there are lots of scenarios
  where atime is desired as a default, but disabled in specific cases.
 
 Em...  Allowing administrators to disable NOATIME would be a good thing,
 but wouldn't allowing arbitrary program to decide whether atime should
 be changed, be a serious security disaster?

How?

There's only one condition I can think of: where a system administrator
is, for some reason, relying upon atimes as a form of proof of something
bad happening (which is a horrible concept in general, being as the
amount of false positives seen would be tremendous; using atime as a
security auditing method is stupid).

If that's what you were referring to, then possibly making O_NOATIME
available only to root would be a suitable compromise.

 Disclaimer: I'm not a big atime fan myself, actually I disable atime on
 a lot of my servers for performance reasons :)

I can't disable atime on any systems I maintain, because they all
provide access to classic UNIX mbox spools where atime is used to
determine if new mail has arrived.  The instant filesystem-level
backups run, atime is lost, and users have no way of knowing if
they have new mail or not.  Switching to Maildir is an option, but
the performance hit of readdir() + stat() on thousands of files is
tremendous (which is why mail clients like mutt have features like
header caching via Oracle/Sleepycat DB).
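
The mbox convention referred to here is usually a comparison of two
timestamps: the spool has new mail if it was modified after it was last read.
A minimal Python sketch, with hypothetical function names, assuming that
convention:

```python
import os


def mbox_has_new_mail(atime, mtime):
    """Classic mbox heuristic: modified more recently than last accessed."""
    return mtime > atime


def check_spool(path):
    """Apply the heuristic to a spool file on disk."""
    st = os.stat(path)
    return mbox_has_new_mail(st.st_atime, st.st_mtime)


# Mail delivered at t=200 after the user last read the spool at t=100.
print(mbox_has_new_mail(atime=100, mtime=200))
# A backup then reads the file at t=300, bumping atime past mtime.
print(mbox_has_new_mail(atime=300, mtime=200))
```

The second call shows the problem: once a backup has read the file, the
spool no longer looks like it holds unread mail.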

Anyway, I was just reading about it and realised that a lot of backup
solutions out there can make use of O_NOATIME if available, which it
isn't on FreeBSD.
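
For illustration, this is roughly how a backup tool might use the flag where
it exists and fall back where it does not.  It is a sketch in Python, whose
os module exposes O_NOATIME only on platforms that define it; the function
name is hypothetical.  The fallback on a permission error reflects Linux
behaviour, where the flag is refused for files the caller does not own.

```python
import os


def open_for_backup(path):
    """Open path read-only, avoiding an atime update when the OS allows it."""
    flags = os.O_RDONLY
    noatime = getattr(os, "O_NOATIME", 0)  # 0 where the platform lacks it
    if noatime:
        try:
            return os.open(path, flags | noatime)
        except PermissionError:
            pass  # not the file's owner: retry without the flag
    return os.open(path, flags)


fd = open_for_backup(os.devnull)
os.close(fd)
```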



Re: neophyte: tcsetattr() gives 22 error in i386, not in amd64?

2008-10-25 Thread Jeremy Chadwick
On Sat, Oct 25, 2008 at 06:06:38PM +1030, en0f wrote:
 Nate Eldredge wrote:
  On Fri, 24 Oct 2008, Steve Franks wrote:
  
  Hi,
 
  I'm getting a 22 errno from tcsetattr() on 7-STABLE i386 in code which
  was working under 7-STABLE amd64.  Serial device is a ucom (silabs
  cp2103).  Permissions on /dev/cuaU0 look fine.  Cutecom/Minicom
  appears to open the port without error...
  
  I don't see anything obviously wrong, but I'd bet a bug related to
  32/64-bit types.  Can you post a complete piece of code that can be
  compiled and run and demonstrates the problem?  Also, try compiling with
  -Wall -W and investigate any warnings that are produced.
  
  By the way, errno 22 is EINVAL, Invalid argument.  perror() is your
  friend.
 
 Strange freebsd doesnt document error numbers. On POSIX, errno 22 is
 EINVAL as well (documented in errno(3)). Is this applicable to freebsd?

/usr/include/errno.h isn't documentation of error numbers?
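
For anyone who would rather not read the header, the same mapping can be
looked up programmatically.  A minimal sketch in Python, whose errno module
mirrors the platform's errno.h; the helper name is an illustrative choice:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


name, message = describe_errno(22)
print(name, "-", message)
```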



Re: zfs waiting on zio-io_cv

2008-10-25 Thread Jeremy Chadwick
On Sat, Oct 25, 2008 at 09:48:15AM +0200, Danny Braniss wrote:
  In the last episode (Oct 24), Danny Braniss said:
   there is a big delay (probably more than 1 sec.) when doing simple tasks
   on this zfs, like ls(1), or 'zfs list', long enough to hit ^T
   and get the same [zio-io_cv)], any hints?
   
   store-01# zfs list
   (hitting ^T)load: 0.00  cmd: zfs 88376 [zio-io_cv)] 0.00u 0.00s 0% 1672k
   (hitting ^T)load: 0.00  cmd: zfs 88376 [zio-io_cv)] 0.00u 0.00s 0% 1684k
   NAME                 USED  AVAIL  REFER  MOUNTPOINT
   h                    472G  11.2T    23K  /h
   h/home               466G  11.2T   466G  /h/home
   h/[EMAIL PROTECTED]   54K      -   466G  -
   h/root                18K  11.2T    18K  /h/root
   h/src                 18K  11.2T    18K  /h/src
   h/system            5.64G  11.2T  5.64G  /h/system
  
  That's sort of the equivalent to waiting in biord on a UFS
  filesystem, I think.  ZFS is just waiting for the disk to return a
  block.  If you happen to do something during the window where ZFS is
  committing its transaction group, it has to wait until the sync
  finishes.  If some other process is doing a lot of writes, or you only
  have one disk in your zpool, or your pool is close to full, it may take
  a couple seconds to sync.
  
  There's a couple of things you can try to improve interactive
  performance.  Raising zfs's arc_max is the easiest to do, and will let
  ZFS cache more stuff, increasing the likelihood that an ls will be
  able to read from cache instead of having to go to disk.  Setting it at
  1/4 your physical RAM is probably as high as you can go without causing
  panics.
  
  Raising txg_time ( in /sys/cddl/.../zfs/txg.c ) from 5 to
  say 30 will tell zfs to sync less often, which can be a win if you
  don't actually do that much writing.  With a single spindle, it may
  take a substantial fraction of a second just to sync a tiny txg due to
  the number of copies of metadata ZFS writes for redundancy.
  
  If you do a lot of writing, lowering zfs_vdev_max_pending ( in
  /sys/cddl/.../zfs/vdev_queue.c ) from 35 down to 16 or less will reduce
  the number of simultaneous I/Os ZFS will try to send to each disk,
  which will let your reads compete a little better with other I/O.  On
  ATA or SATA disks, you might want to set it to 2.
  
 ok, forgot to mention a small detail: the machine is a quad core with 8GB
 of main memory; the disks are 14x1TB connected via a PERC/RAID5.
 Tests show that disk access is quite fast, over 200MB/s.
 
 The 'delays' are seen when the machine is totally idle (it's not in
 production yet) and has been up for some time. BTW, I can't reproduce the
 'delay', so I think it has to do with caching.
 
 I guess this beast needs some tuning; are there any tools out there
 to monitor/tune ZFS? 

Monitor ZFS: sysctl
Tune ZFS: vi /boot/loader.conf or sysctl

I'm not sure what you're looking for.  :-)
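
As a worked example of the "1/4 of physical RAM" suggestion quoted above,
the following Python sketch computes that figure and formats it as a
loader.conf line.  It assumes the tunable in question is vfs.zfs.arc_max and
that a value with an "M" suffix is accepted; the function name is made up,
and the 1/4 ratio is the earlier poster's rule of thumb rather than a
recommendation.

```python
def arc_max_loader_line(physical_ram_bytes, fraction=0.25):
    """Format a loader.conf line capping the ARC at a fraction of RAM."""
    mib = int(physical_ram_bytes * fraction) // (1024 * 1024)
    return 'vfs.zfs.arc_max="%dM"' % mib


# Danny's machine: 8 GB of main memory.
print(arc_max_loader_line(8 * 1024 ** 3))
```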



Re: neophyte: tcsetattr() gives 22 error in i386, not in amd64?

2008-10-25 Thread Jeremy Chadwick
On Sat, Oct 25, 2008 at 06:29:54PM +1030, en0f wrote:
 Jeremy Chadwick wrote:
  On Sat, Oct 25, 2008 at 06:06:38PM +1030, en0f wrote:
  Nate Eldredge wrote:
  On Fri, 24 Oct 2008, Steve Franks wrote:
 
  Hi,
 
  I'm getting a 22 errno from tcsetattr() on 7-STABLE i386 in code which
  was working under 7-STABLE amd64.  Serial device is a ucom (silabs
  cp2103).  Permissions on /dev/cuaU0 look fine.  Cutecom/Minicom
  appears to open the port without error...
  I don't see anything obviously wrong, but I'd bet a bug related to
  32/64-bit types.  Can you post a complete piece of code that can be
  compiled and run and demonstrates the problem?  Also, try compiling with
  -Wall -W and investigate any warnings that are produced.
 
  By the way, errno 22 is EINVAL, Invalid argument.  perror() is your
  friend.
  Strange freebsd doesnt document error numbers. On POSIX, errno 22 is
  EINVAL as well (documented in errno(3)). Is this applicable to freebsd?
  
  /usr/include/errno.h isn't documentation of error numbers?
 
 Gah! But Jeremy, I dont have magic brains to work me way out of
 source code :)

You're confusing me.  :P  The errors in errno.h are commented, and it's
quite readable.  Of course, it matches what's in errno(3).

If you're wanting to track down how/why tcsetattr(3) results in EINVAL,
using truss or ktrace might come in handy.  Otherwise, you literally
will have to throw some debugging code into the ucom(4) driver to
try and figure out what function is kicking out code 22.



Re: Laptop suggestions?

2008-10-22 Thread Jeremy Chadwick
On Wed, Oct 22, 2008 at 01:06:20PM -0700, Nate Eldredge wrote:
 On Wed, 22 Oct 2008, Gary Kline wrote:

 On Wed, Oct 22, 2008 at 01:06:29PM +0200, Dag-Erling Smørgrav wrote:
 martinko [EMAIL PROTECTED] writes:
 I have always thought that Fn key in left most bottom corner of the
 keyboard is, especially for programmers, a very bad idea.  :-(

 Seconded.  Worse still, on my Lenovo T60, if the Fn key is held down
 longer than a fraction of a second, it generates an input event which
 just happens to correspond to Gnome's default key binding for the next
 track function in media players...

  I've seen that Fn key, but don't know what it is for.  What? you press
  it, then follow with the integers [ 1, 2, 3 ... ]?   At any rate, maybe
  you can remap the key with ~/.xmodmaprc.

 Fn is usually used on laptop keyboards to allow two logical keys to share 
 a single physical key.  For example, see the keyboard pictured at
 http://www.notebookreview.com/assets/3415.jpg .  On the extreme lower  
 right is a key with - in white and End in blue.  Pressing it by  
 itself sends the keycode corresponding to an ordinary keyboard's - 
 key. Holding Fn and pressing that key sends the keycode corresponding to 
 an ordinary keyboard's End key.  On many keyboards, pressing Fn by 
 itself sends no keycode at all, so it cannot be remapped.

 It is also sometimes used to control hardware features which on a desktop 
 machine might have a different interface.  For instance, on the laptop  
 pictured, holding Fn and pressing F6 would increase the screen 
 brightness, probably without sending a keycode.  A desktop machine would 
 probably have a button on the monitor itself to do this.

I always figured Fn was a good name for the key, given that it
resembles the expletive that comes forth from my mouth when intending to
hit Control.

http://www.notebookreview.com/assets/9328.jpg

;-)



Re: ZFS boot

2008-10-12 Thread Jeremy Chadwick
On Sat, Oct 11, 2008 at 04:21:55PM -0700, Nate Eldredge wrote:
 On Sat, 11 Oct 2008, Pegasus Mc Cleaft wrote:

 FWIW, my system is amd64 with 1 G of memory, which the page implies is
 insufficient.  Is it really?

  This may be purely subjective, as I have never benchmarked the speeds, but
 when I was first testing zfs on an i386 machine with 1gig ram, I thought the
 performance was mediocre. However, when I loaded the system on a quad core -
 core2 with 8 gigs ram, I was quite impressed. I put localized changes in my
 /boot/loader.conf to give the kernel more breathing room and disabled the
 prefetch for zfs.

 #more loader.conf
 vm.kmem_size_max=1073741824
 vm.kmem_size=1073741824
 vfs.zfs.prefetch_disable=1

 I was somewhat confused by the suggestions on the wiki.  Do the kmem_size 
 sysctls affect the allocation of *memory* or of *address space*?

The Wiki is somewhat vague and doesn't give you all the knowledge you
need.

The kmem_* sysctls do not define pre-allocated amounts.  They define the
amount of memory which can be used by the kernel for allocation.

I strongly advocate tuning two other sysctls, which can help greatly
in ensuring no system lock-ups and no kmem exhaustion panics:

vfs.zfs.arc_min
vfs.zfs.arc_max

The following values are what I use, but others have reported better
performance with arc_max set to 128M:

vfs.zfs.arc_min=16M
vfs.zfs.arc_max=64M
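
As a rough sanity check on values like these, here is a small Python sketch. It assumes the K/M/G suffixes mean powers of 1024 and that the ARC limits are expected to fit inside vm.kmem_size; both are assumptions made for the example, and parse_size and check_arc_limits are hypothetical helper names, not FreeBSD tools:

```python
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '64M' or '1073741824' to a byte count."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def check_arc_limits(arc_min, arc_max, kmem_size):
    """Return a list of problems; an empty list means the values look consistent."""
    low = parse_size(arc_min)
    high = parse_size(arc_max)
    kmem = parse_size(kmem_size)
    problems = []
    if low > high:
        problems.append("arc_min is larger than arc_max")
    if high >= kmem:
        problems.append("arc_max does not fit inside kmem_size")
    return problems


if __name__ == "__main__":
    # Values quoted in this thread: 16M/64M ARC limits, kmem_size of 1073741824.
    print(check_arc_limits("16M", "64M", "1073741824"))
```

A check like this only catches inconsistent settings; it says nothing about which values perform best on a given machine.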

 It seems a bit much to reserve 1 G of memory solely for the use of the
 kernel, especially in my case when that's all I have :)  But on amd64, 
 it's welcome to have terabytes of address space if it will help.

ZFS is a memory hog, period.  That's just the nature of the beast.
You probably should not be using it on a system with 1GB.  I'll remind
you that memory right now is *incredibly* cheap; you can get 4GB of
brand-name lifetime-warranty RAM for around US$40-50.

Secondly, with regards to amd64:

RELENG_6 and RELENG_7 amd64 cannot handle more than 2GB of kmem.  Yes,
you read that correctly; it's not a typo.  It's an implementation issue
which cannot be easily solved on those releases.  CURRENT can address up
to 512GB.  I've fully documented this on my Wiki; see the "Kernel" section.

http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues



Re: continuous backup solution for FreeBSD

2008-10-11 Thread Jeremy Chadwick
On Sat, Oct 11, 2008 at 01:07:44PM +0200, Danny Braniss wrote:
  On Sat, Oct 11, 2008 at 12:35:16PM +0200, Danny Braniss wrote:
On Fri, 10 Oct 2008 08:42:49 -0700
Jeremy Chadwick [EMAIL PROTECTED] wrote:

 On Fri, Oct 10, 2008 at 11:29:52AM -0400, Mike Meyer wrote:
  On Fri, 10 Oct 2008 07:41:11 -0700
  Jeremy Chadwick [EMAIL PROTECTED] wrote:
  
   On Fri, Oct 10, 2008 at 03:53:38PM +0300, Evren Yurtesen wrote:
Mike Meyer wrote:
On Fri, 10 Oct 2008 02:34:28 +0300
[EMAIL PROTECTED] wrote:
   
Quoting Oliver Fromme [EMAIL PROTECTED]:
   
These features are readily available right now on FreeBSD.
You don't have to code anything.
Well with 2 downsides,
   
Once you actually try and implement these solutions, you'll 
see that
your downsides are largely figments of your imagination.
   
So if it is my imagination, how can I actually convert UFS to 
ZFS  
easily? Everybody seems to say that this is easy and that is 
easy.
   
   It's not that easy.  I really don't know why people are telling 
   you it
   is.
  
  Maybe because it is? Of course, it *does* require a little prior
  planning, but anyone with more than a few months experience as a
  sysadmin should be able to deal with it without too much hassle.
  
   Converting some filesystems is easier than others; /home (if you
   create one) for example is generally easy:
   
   1) ZFS fs is called foo/home, mounted as /mnt
   2) fstat, ensure nothing is using /home -- if something is, shut it
      down or kill it
   3) rsync or cpdup /home files to /mnt
   4) umount /home
   5) zfs set mountpoint=/home foo/home
   6) Restart said processes or daemons
   
   See! It's like I said! EASY!  You can do this with /var as well.
  
  Yup. Of course, if you've done it that way, you're not thinking 
  ahead,
  because:
  
   Now try /usr.  Hope you've got /rescue available, because once 
   /usr/lib
   and /usr/libexec disappear, you're in trouble.  Good luck doing 
   this in
   multi-user, too.
  
  Oops. You F'ed up. If you'd done a little planning, you would have
  realized that / and /usr would be a bit of extra trouble, and 
  planned
  accordingly.
  
   And finally, the root fs.  Whoever says this is easy is kidding
   themselves; it's a pain.
  
  Um, no, it wasn't. Of course, I've been doing this long enough to 
  have
  a system set up to make this kind of thing easy. My system disk is 
  on
  a mirror, and I do system upgrades by breaking the mirror and
  upgrading one disk, making everything work, then putting the mirror
  back together. And moving to zfs on root is a lot like a system
  upgrade:
  
  1) Break the mirror (mirrors actually, as I mirrored file systems).
  2) Repartition the unused drive into /boot, swap & data
  3) Build zfs & /boot according to the instructions on ZFSOnRoot
     wiki, just copying /boot and / at this point.
  4) Boot the zfs disk in single user mode.
  5) If 4 fails, boot back to the ufs disk so you're operational while
     you contemplate what went wrong, then repeat step 3. Otherwise, go
     on to step 6.
  6) Create zfs file systems as appropriate (given that zfs file
     systems are cheap, and have lots of cool features that ufs
     file systems don't have, you probably want to create more than
     you had before, doing things like putting SQL servers on their
     own file system with appropriate blocking, etc, but you'll want to
     have figured all this out before starting step 1).
  7) Copy data from the ufs file systems to their new homes,
     not forgetting to take them out of /etc/fstab.
  8) Reboot on the zfs disk.
  9) Test until you're happy that everything is working properly,
     and be prepared to reboot on the ufs disk if something is broken.
  10) Reformat the ufs disk to match the zfs one. Gmirror /boot,
      add the data partition to the zfs pool so it's mirrored, and
      you should have already been using swap.
  
  This is 10 steps to your easy 6, but two of the extra steps are
  testing you didn't include, and 1 of the steps is a failure recovery
  step that shouldn't be necessary. So - one more step than your easy
  process.
 
 Of course, the part you seem to be (intentionally?) forgetting: most
 people are not using gmirror.  There is no 2nd disk.  They have one 
 disk
 with a series of UFS2 filesystems, and they want to upgrade.  That's 
 how
 I read Evren's "how do I do this? You say it's easy..." comment, and I
 think his viewpoint is very reasonable.

Granted

Re: continuous backup solution for FreeBSD

2008-10-11 Thread Jeremy Chadwick
On Sat, Oct 11, 2008 at 12:35:16PM +0200, Danny Braniss wrote:
  On Fri, 10 Oct 2008 08:42:49 -0700
  Jeremy Chadwick [EMAIL PROTECTED] wrote:
  
   On Fri, Oct 10, 2008 at 11:29:52AM -0400, Mike Meyer wrote:
On Fri, 10 Oct 2008 07:41:11 -0700
Jeremy Chadwick [EMAIL PROTECTED] wrote:

 On Fri, Oct 10, 2008 at 03:53:38PM +0300, Evren Yurtesen wrote:
  Mike Meyer wrote:
  On Fri, 10 Oct 2008 02:34:28 +0300
  [EMAIL PROTECTED] wrote:
 
  Quoting Oliver Fromme [EMAIL PROTECTED]:
 
  These features are readily available right now on FreeBSD.
  You don't have to code anything.
  Well with 2 downsides,
 
  Once you actually try and implement these solutions, you'll see 
  that
  your downsides are largely figments of your imagination.
 
  So if it is my imagination, how can I actually convert UFS to ZFS  
  easily? Everybody seems to say that this is easy and that is easy.
 
 It's not that easy.  I really don't know why people are telling you it
 is.

Maybe because it is? Of course, it *does* require a little prior
planning, but anyone with more than a few months experience as a
sysadmin should be able to deal with it without too much hassle.

 Converting some filesystems is easier than others; /home (if you
 create one) for example is generally easy:
 
 1) ZFS fs is called foo/home, mounted as /mnt
 2) fstat, ensure nothing is using /home -- if something is, shut it
down or kill it
 3) rsync or cpdup /home files to /mnt
 4) umount /home
 5) zfs set mountpoint=/home foo/home
 6) Restart said processes or daemons
 
 See! It's like I said! EASY!  You can do this with /var as well.

Yup. Of course, if you've done it that way, you're not thinking ahead,
because:

 Now try /usr.  Hope you've got /rescue available, because once 
 /usr/lib
 and /usr/libexec disappear, you're in trouble.  Good luck doing this 
 in
 multi-user, too.

Oops. You F'ed up. If you'd done a little planning, you would have
realized that / and /usr would be a bit of extra trouble, and planned
accordingly.

 And finally, the root fs.  Whoever says this is easy is kidding
 themselves; it's a pain.

Um, no, it wasn't. Of course, I've been doing this long enough to have
a system set up to make this kind of thing easy. My system disk is on
a mirror, and I do system upgrades by breaking the mirror and
upgrading one disk, making everything work, then putting the mirror
back together. And moving to zfs on root is a lot like a system
upgrade:

1) Break the mirror (mirrors actually, as I mirrored file systems).
2) Repartition the unused drive into /boot, swap & data
3) Build zfs & /boot according to the instructions on ZFSOnRoot
   wiki, just copying /boot and / at this point.
4) Boot the zfs disk in single user mode.
5) If 4 fails, boot back to the ufs disk so you're operational while
   you contemplate what went wrong, then repeat step 3. Otherwise, go
   on to step 6.
6) Create zfs file systems as appropriate (given that zfs file
   systems are cheap, and have lots of cool features that ufs
   file systems don't have, you probably want to create more than
   you had before, doing things like putting SQL servers on their
   own file system with appropriate blocking, etc, but you'll want to
   have figured all this out before starting step 1).
7) Copy data from the ufs file systems to their new homes,
   not forgetting to take them out of /etc/fstab.
8) Reboot on the zfs disk.
9) Test until you're happy that everything is working properly,
   and be prepared to reboot on the ufs disk if something is broken. 
10) Reformat the ufs disk to match the zfs one. Gmirror /boot,
add the data partition to the zfs pool so it's mirrored, and
you should have already been using swap.

This is 10 steps to your easy 6, but two of the extra steps are
testing you didn't include, and 1 of the steps is a failure recovery
step that shouldn't be necessary. So - one more step than your easy
process.
   
   Of course, the part you seem to be (intentionally?) forgetting: most
   people are not using gmirror.  There is no 2nd disk.  They have one disk
   with a series of UFS2 filesystems, and they want to upgrade.  That's how
   I read Evren's "how do I do this? You say it's easy..." comment, and I
   think his viewpoint is very reasonable.
  
  Granted, most people don't think about system upgrades when they build
  a system, so they wind up having to do extra work. In particular,
  Evren is talking about spending thousands of dollars on proprietary
  software, not to mention the cost of the server that all this data is
  going to flow to, for a backup solution. Compared to that, the cost of
  a few spare disks

Re: continuous backup solution for FreeBSD

2008-10-10 Thread Jeremy Chadwick
On Fri, Oct 10, 2008 at 03:53:38PM +0300, Evren Yurtesen wrote:
 Mike Meyer wrote:
 On Fri, 10 Oct 2008 02:34:28 +0300
 [EMAIL PROTECTED] wrote:

 Quoting Oliver Fromme [EMAIL PROTECTED]:

 These features are readily available right now on FreeBSD.
 You don't have to code anything.
 Well with 2 downsides,

 Once you actually try and implement these solutions, you'll see that
 your downsides are largely figments of your imagination.

 So if it is my imagination, how can I actually convert UFS to ZFS  
 easily? Everybody seems to say that this is easy and that is easy.

It's not that easy.  I really don't know why people are telling you it
is.  Converting some filesystems is easier than others; /home (if you
create one) for example is generally easy:

1) ZFS fs is called foo/home, mounted as /mnt
2) fstat, ensure nothing is using /home -- if something is, shut it
   down or kill it
3) rsync or cpdup /home files to /mnt
4) umount /home
5) zfs set mountpoint=/home foo/home
6) Restart said processes or daemons

See! It's like I said! EASY!  You can do this with /var as well.
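
Spelled out as commands, the six steps above look roughly like the following. This is a dry-run sketch in Python that only builds and prints the command lines; it runs nothing. The fstat and rsync flags and the initial "mountpoint=/mnt" command are assumptions added for the example, foo/home is the example dataset name from the list, and home_migration_commands is a hypothetical helper name:

```python
import shlex


def home_migration_commands(dataset="foo/home", old_mount="/home", staging="/mnt"):
    """Build (but do not run) the commands for moving old_mount onto a ZFS dataset."""
    return [
        # 1) Mount the new dataset on a staging mountpoint.
        ["zfs", "set", f"mountpoint={staging}", dataset],
        # 2) List open files on the old filesystem; stop whatever shows up.
        ["fstat", "-f", old_mount],
        # 3) Copy the data (trailing slashes copy directory contents).
        ["rsync", "-a", f"{old_mount}/", f"{staging}/"],
        # 4) Unmount the old UFS filesystem.
        ["umount", old_mount],
        # 5) Move the dataset onto the old mountpoint.
        ["zfs", "set", f"mountpoint={old_mount}", dataset],
        # 6) Restarting the stopped daemons is site-specific and left out.
    ]


if __name__ == "__main__":
    for argv in home_migration_commands():
        print(shlex.join(argv))
```

Printing the plan first makes it easier to review before anything touches a live filesystem; the old /home entry in /etc/fstab would still need to be dealt with by hand.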

Now try /usr.  Hope you've got /rescue available, because once /usr/lib
and /usr/libexec disappear, you're in trouble.  Good luck doing this in
multi-user, too.

And finally, the root fs.  Whoever says this is easy is kidding
themselves; it's a pain.  You get to make a new filesystem called /boot,
and have all sorts of fun.  It's really not a snap-fingers-voila thing,
and I will gladly argue with anyone who thinks otherwise.  Is it do-able
though?  Yes.



Re: continuous backup solution for FreeBSD

2008-10-10 Thread Jeremy Chadwick
On Fri, Oct 10, 2008 at 11:29:52AM -0400, Mike Meyer wrote:
 On Fri, 10 Oct 2008 07:41:11 -0700
 Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Fri, Oct 10, 2008 at 03:53:38PM +0300, Evren Yurtesen wrote:
   Mike Meyer wrote:
   On Fri, 10 Oct 2008 02:34:28 +0300
   [EMAIL PROTECTED] wrote:
  
   Quoting Oliver Fromme [EMAIL PROTECTED]:
  
   These features are readily available right now on FreeBSD.
   You don't have to code anything.
   Well with 2 downsides,
  
   Once you actually try and implement these solutions, you'll see that
   your downsides are largely figments of your imagination.
  
   So if it is my imagination, how can I actually convert UFS to ZFS  
   easily? Everybody seems to say that this is easy and that is easy.
  
  It's not that easy.  I really don't know why people are telling you it
  is.
 
 Maybe because it is? Of course, it *does* require a little prior
 planning, but anyone with more than a few months experience as a
 sysadmin should be able to deal with it without too much hassle.
 
  Converting some filesystems is easier than others; /home (if you
  create one) for example is generally easy:
  
  1) ZFS fs is called foo/home, mounted as /mnt
  2) fstat, ensure nothing is using /home -- if something is, shut it
 down or kill it
  3) rsync or cpdup /home files to /mnt
  4) umount /home
  5) zfs set mountpoint=/home foo/home
  6) Restart said processes or daemons
  
  See! It's like I said! EASY!  You can do this with /var as well.
 
 Yup. Of course, if you've done it that way, you're not thinking ahead,
 because:
 
  Now try /usr.  Hope you've got /rescue available, because once /usr/lib
  and /usr/libexec disappear, you're in trouble.  Good luck doing this in
  multi-user, too.
 
 Oops. You F'ed up. If you'd done a little planning, you would have
 realized that / and /usr would be a bit of extra trouble, and planned
 accordingly.
 
  And finally, the root fs.  Whoever says this is easy is kidding
  themselves; it's a pain.
 
 Um, no, it wasn't. Of course, I've been doing this long enough to have
 a system set up to make this kind of thing easy. My system disk is on
 a mirror, and I do system upgrades by breaking the mirror and
 upgrading one disk, making everything work, then putting the mirror
 back together. And moving to zfs on root is a lot like a system
 upgrade:
 
 1) Break the mirror (mirrors actually, as I mirrored file systems).
 2) Repartition the unused drive into /boot, swap & data
 3) Build zfs & /boot according to the instructions on ZFSOnRoot
    wiki, just copying /boot and / at this point.
 4) Boot the zfs disk in single user mode.
 5) If 4 fails, boot back to the ufs disk so you're operational while
    you contemplate what went wrong, then repeat step 3. Otherwise, go
    on to step 6.
 6) Create zfs file systems as appropriate (given that zfs file
    systems are cheap, and have lots of cool features that ufs
    file systems don't have, you probably want to create more than
    you had before, doing things like putting SQL servers on their
    own file system with appropriate blocking, etc, but you'll want to
    have figured all this out before starting step 1).
 7) Copy data from the ufs file systems to their new homes,
    not forgetting to take them out of /etc/fstab.
 8) Reboot on the zfs disk.
 9) Test until you're happy that everything is working properly,
    and be prepared to reboot on the ufs disk if something is broken.
 10) Reformat the ufs disk to match the zfs one. Gmirror /boot,
     add the data partition to the zfs pool so it's mirrored, and
     you should have already been using swap.
 
 This is 10 steps to your easy 6, but two of the extra steps are
 testing you didn't include, and 1 of the steps is a failure recovery
 step that shouldn't be necessary. So - one more step than your easy
 process.

Of course, the part you seem to be (intentionally?) forgetting: most
people are not using gmirror.  There is no 2nd disk.  They have one disk
with a series of UFS2 filesystems, and they want to upgrade.  That's how
I read Evren's "how do I do this? You say it's easy..." comment, and I
think his viewpoint is very reasonable.

 Yeah, this isn't something you do on a whim. On the other hand, it's
 not something that any competent sysadmin would consider a pain. For a
 good senior admin, it's a lot easier than doing an OS upgrade from
 source, which should be the next step up from trivial.

I guess you have a very different definition of "easy".  :-)

The above procedure, in no way shape or form, will be classified as
"easy" by the user (or even junior sysadmin) community, I can assure you
of that.

I'll also throw this in the mix: the fact that we are *expecting* users
to know how to do this is unreasonable.  It's even *more* rude to expect
that mid-level or senior SAs have to do it the hard way.  Why?  I'll
explain:

I'm an SA of 16+ years.  I'm quite familiar with PBR/MBR, general disk
partitioning, sectors vs. blocks, slices, filesystems

Re: SSH Brute Force attempts

2008-09-30 Thread Jeremy Chadwick
On Tue, Sep 30, 2008 at 09:56:32AM +0200, Jeroen Ruigrok van der Werven wrote:
 -On [20080930 05:14], Rich Healey ([EMAIL PROTECTED]) wrote:
 What do you BSD guys use for this purpose?
 
 I actually use blockhosts, which is a Python solution you tie into
 hosts.allow.
 
 http://www.aczoom.com/cms/blockhosts

In no way shape or form does this solve the problem of the attackers
being able to establish a TCP connection to you -- they are still tying
up sockets, mbufs, and extra network I/O (coming from you when you
respond and close the socket).

TCP wrappers are absolutely 100% worthless in this day and age.



Re: How do I unchown a directory after I: chown -R /etc ???

2008-09-30 Thread Jeremy Chadwick
On Tue, Sep 30, 2008 at 02:25:54AM -0700, Mike Price wrote:
 How do I unchown a directory after I: chown -R /etc

You can't.  Restore /etc from backups.

And ***please*** stop posting this stuff to -hackers.  It is not the
appropriate list for it.  Start using -questions.



Re: atacontrol broken in 7.1-PR

2008-09-29 Thread Jeremy Chadwick
On Mon, Sep 29, 2008 at 04:19:43PM +0200, Dag-Erling Smørgrav wrote:
 Jeremy Chadwick [EMAIL PROTECTED] writes:
  I see the system has an Intel AHCI-based controller (probably an ICH10
  chip, since the ICH10 is the first to support 6 SATA channels).
 
 No.  I have an ICH8 with six channels, of which five are in use.

Thanks for the clarification.  The breakdown is as follows:

ICH7    = 4 ports
ICH7R   = 6 ports
ICH7DH  = 6 ports
ICH7M   = unknown
ICH7MDH = unknown
ICH7U   = unknown

ICH8    = 4 ports
ICH8R   = 6 ports
ICH8DH  = 6 ports
ICH8DO  = 6 ports
ICH8M   = 3 ports
ICH8ME  = 3 ports

ICH9    = 4 ports
ICH9R   = 4 ports
ICH9DH  = 4 ports
ICH9DO  = 4 ports
ICH9M   = unknown
ICH9ME  = unknown

ICH10   = 6 ports
ICH10R  = 6 ports



Re: ATA Security patch to atacontrol

2008-09-29 Thread Jeremy Chadwick
On Tue, Sep 30, 2008 at 01:06:55AM +0200, Daniel Roethlisberger wrote:
 I've added experimental support for the ATA Security command set to
 atacontrol.  Please test and review.  If you have some spare disk(s)
 with ATA Security support and a BIOS which does not freeze the security
 configuration, I'd like to hear about any results of playing with this
 patch.  See the changes to the manual page for details on the commands.
 
 Note that you may render disks unusable using the ATA Security commands.
 Use with great care.
 
 http://daniel.roe.ch/code/ata/atasecurity-20080930-complete.diff

Daniel,

Can you point me to a datasheet or technical reference material on what
ATA Security is?  Which ATA specification is this documented in?  I'd
like to read it.

Thanks!



Re: SSH Brute Force attempts

2008-09-29 Thread Jeremy Chadwick
On Tue, Sep 30, 2008 at 10:10:59AM +1000, Rich Healey wrote:
 Recently I'm getting a lot of brute force attempts on my server, in the
 past I've used various tips and tricks with linux boxes but many of them
 were fairly linux specific.
 
 What do you BSD guys use for this purpose?

This probably should've gone to -security, correct.

There are 3 ports which people often use for solving this:

ports/security/blocksshd
ports/security/sshblock
ports/security/sshguard-(pf|ipfw|ipfilter)

The latter depends on which firewalling stack you use, and I believe
one of the other two only works with ipfw (I forget which).

I have great reservations using any of these, because they dynamically
add firewalling rules/tables to your machines based on data in log
files.  For me, it smells of an accident waiting to happen.

I'm an advocate of simply blocking large netblocks where most of these
attacks come from (Latin America, eastern Europe, Asia, and Russia).
This requires that you appropriately tune things over time, and *be
intelligent* about what you're doing.  :-)

What we use in our pf.conf on our production systems:

table <ssh-allow> persist file /conf/ME/pf.conf.ssh-allow
table <ssh-deny> persist file /conf/ME/pf.conf.ssh-deny

block in on $ext_if proto tcp from <ssh-deny> to any port ssh
pass  in on $ext_if proto tcp from <ssh-allow> to any port ssh flags S/SA keep state

pf.conf.ssh-deny contains a list of IPs or CIDRs which are to be
blocked.  I can provide this file if desired.

pf.conf.ssh-allow contains a list of IPs or CIDRs which override
blocks in the previous block rule.  The reason we have this is due to
one Russian user who wasn't able to SSH into our boxes due to the
previous block rule.
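
To make the override behaviour concrete, here is a small Python model of how those two rules interact. It is a sketch only: it assumes pf's usual last-matching-rule-wins evaluation for rules without "quick", it models just these two rules (the rest of the ruleset decides everything else), and the addresses are documentation-range placeholders rather than entries from our real files. in_table and ssh_verdict are hypothetical names:

```python
import ipaddress


def in_table(addr, table):
    """True if addr falls inside any IP or CIDR entry of the table."""
    ip = ipaddress.ip_address(addr)
    return any(ip in ipaddress.ip_network(entry) for entry in table)


def ssh_verdict(addr, allow, deny):
    """Apply the block rule, then the pass rule; the last match wins."""
    verdict = "no match"          # left to the rest of the ruleset
    if in_table(addr, deny):      # block in ... from ssh-deny
        verdict = "block"
    if in_table(addr, allow):     # pass in ... from ssh-allow
        verdict = "pass"
    return verdict


if __name__ == "__main__":
    deny = ["203.0.113.0/24"]     # placeholder netblock
    allow = ["203.0.113.45"]      # placeholder single host inside it
    for addr in ("203.0.113.45", "203.0.113.9", "192.0.2.1"):
        print(addr, ssh_verdict(addr, allow, deny))
```

The point of the model is the ordering: a host listed in both tables ends up passed because the pass rule comes after the block rule.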

You naturally have to keep pf.conf.ssh-* in sync if you have multiple
machines.  You can use pfsync(4) to accomplish this task (I think), or
you can do it the obvious way (make a central distribution box that
scp/rsync's the files out and runs /etc/rc.d/pf reload).



Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Sun, Sep 28, 2008 at 10:43:58AM +, Pegasus McCleaft wrote:
   I was wondering if anyone else is experiencing this problem. I have 
 recently reloaded my machine (due to a meltdown of my primary boot  
 drive) and noticed that under 7.0-rel the atacontrol command seems to 
 work great, however, under 7.1 I get an error

 atacontrol: ioctl(IOCATADEVICES): Device not configured

What arguments did you give atacontrol?

   Has anyone else seen this error? I wouldn't be concerned if it weren't 
 for the fact that it worked under 7.0-rel but now doesn't. The machine is 
 using both the:

 atapci0: SiI SiI 3132 SATA300 controller
 atapci1: JMicron JMB363 SATA300 controller

atapci is just the PCI portion, and doesn't show any sign of the ATA
driver being attached.  Do you have ataX (e.g. ata0) devices showing
up in dmesg?
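
One quick way to answer that is to filter the dmesg text for channel lines. The following Python sketch assumes channel lines begin with "ataN:" (as opposed to the "atapciN:" controller lines); the sample text is illustrative, and ata_channels is a hypothetical helper name:

```python
import re

# Matches "ata2:", "ata10:", ... but not "atapci0:".
CHANNEL_RE = re.compile(r"^\s*(ata\d+):")


def ata_channels(dmesg_text):
    """Return the distinct ataN channel names, in order of first appearance."""
    seen = []
    for line in dmesg_text.splitlines():
        match = CHANNEL_RE.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    sample = "\n".join([
        "atapci0: SiI SiI 3132 SATA300 controller",
        "atapci0: [ITHREAD]",
        "ata2: ATA channel 0 on atapci0",
        "ata2: [ITHREAD]",
        "ata3: ATA channel 1 on atapci0",
    ])
    print(ata_channels(sample))
```

An empty result would suggest the ATA channel driver never attached, which is the situation I was asking about.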



Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Sun, Sep 28, 2008 at 10:32:48PM +0100, Pegasus Mc Cleaft wrote:
 On Sunday 28 September 2008 21:42:41 Jeremy Chadwick wrote:
  On Sun, Sep 28, 2008 at 10:43:58AM +, Pegasus McCleaft wrote:
 I was wondering if anyone else is experiencing this problem. I have
   recently reloaded my machine (due to a meltdown of my primary boot
   drive) and noticed that under 7.0-rel the atacontrol command seems to
   work great, however, under 7.1 I get an error
  
   atacontrol: ioctl(IOCATADEVICES): Device not configured
 
  What arguments did you give atacontrol?

 Sorry, forgot to add that in. I was typing from a terminal window while xorg
 was rebuilding.  Perhaps some more useful information to follow:

 feathers# atacontrol list
 atacontrol: ioctl(IOCATADEVICES): Device not configured

 
 Has anyone else seen this error? I wouldn't be concerned if it weren't
   for the fact that it worked under 7.0-rel but now doesn't. The machine is
   using both the:
  
   atapci0: SiI SiI 3132 SATA300 controller
   atapci1: JMicron JMB363 SATA300 controller
 
  atapci is just the PCI portion, and doesn't show any sign of the ATA
  driver being attached.  Do you have ataX (e.g. ata0) devices showing
  up in dmesg?

 Yea.. The machine is otherwise running fine, and also loaded the driver for
 the ata raid controller (I made the machine boot off a raid-1 pack and made
 slices on the pack for /, /usr, /var . The rest zfs for /usr/home)

I don't know what filesystems you have assigned to what drives, but
please be aware there are known major problems with Silicon Image
controllers on FreeBSD, Linux, and Windows.  The most common problem is
silent data corruption.  I can refer you to previous discussions of this
problem if you'd like.  If at all possible, disable this controller in
the BIOS, and do not use it.

JMicron controllers are known to behave OK, but have a history of
serious driver problems under Windows.  I doubt that'll affect you, but
it's something to keep in mind.

I see the system has an Intel AHCI-based controller (probably an ICH10
chip, since the ICH10 is the first to support 6 SATA channels).  I would
recommend using that for as much as you can (I see you have disks ad14
through ad24 off that controller).

Just things you should be aware of.

 Thinking about it, I also added atapicd in the kernel config so I could use
 things like K3B and xcdroast. I don't know if maybe that shim might be causing
 issues. I'll try making another kernel without it and giving that a try.

I highly doubt that's the problem.

Can you please include your kernel configuration file here?

Also, please do not copy/paste the file; I noticed in your dmesg|grep
ata output, all of the lines had trailing spaces (I've stripped them
off).

I've CC'd [EMAIL PROTECTED] who maintains ata(4).  I see no reason why
atacontrol list would be returning such an error.

 feathers# dmesg | grep ata
 atapci0: SiI SiI 3132 SATA300 controller port 0x9000-0x907f mem
 0xe7004000-0xe700407f,0xe700-0xe7003fff irq 16 at device 0.0 on pci3
 atapci0: [ITHREAD]
 ata2: ATA channel 0 on atapci0
 ata2: [ITHREAD]
 ata3: ATA channel 1 on atapci0
 ata3: [ITHREAD]
 atapci1: JMicron JMB363 SATA300 controller mem 0xec10-0xec101fff irq 19
 at device 0.0 on pci5
 atapci1: [ITHREAD]
 atapci1: AHCI called from vendor specific driver
 atapci1: AHCI Version 01.00 controller with 2 ports detected
 ata4: ATA channel 0 on atapci1
 ata4: [ITHREAD]
 ata5: ATA channel 1 on atapci1
 ata5: [ITHREAD]
 atapci2: JMicron JMB363 UDMA133 controller port
 0xb000-0xb007,0xb100-0xb103,0xb200-0xb207,0xb300-0xb303,0xb400-0xb40f irq 16
 at device 0.1 on pci5
 atapci2: [ITHREAD]
 ata6: ATA channel 0 on atapci2
 ata6: [ITHREAD]
 atapci3: Intel AHCI controller port
 0xe600-0xe607,0xe700-0xe703,0xe800-0xe807,0xe900-0xe903,0xea00-0xea1f mem
 0xec406000-0xec4067ff irq 19 at device 31.2 on pci0
 atapci3: [ITHREAD]
 atapci3: AHCI Version 01.20 controller with 6 ports detected
 ata7: ATA channel 0 on atapci3
 ata7: [ITHREAD]
 ata8: ATA channel 1 on atapci3
 ata8: [ITHREAD]
 ata9: ATA channel 2 on atapci3
 ata9: [ITHREAD]
 ata10: ATA channel 3 on atapci3
 ata10: [ITHREAD]
 ata11: ATA channel 4 on atapci3
 ata11: [ITHREAD]
 ata12: ATA channel 5 on atapci3
 ata12: [ITHREAD]
 acd0: DVDR ASUS DRW-2014L1T/1.02 at ata3-master SATA150
 ad8: 476940MB Hitachi HDP725050GLA360 GM4OA52A at ata4-master SATA300
 ad10: 476940MB Hitachi HDP725050GLA360 GM4OA52A at ata5-master SATA300
 ad14: 476938MB WDC WD5000AAKS-00TMA0 12.01C01 at ata7-master SATA300
 ad16: 476940MB SAMSUNG HD502IJ 1AA01109 at ata8-master SATA300
 ad18: 476940MB SAMSUNG HD502IJ 1AA01112 at ata9-master SATA300
 ad20: 476940MB SAMSUNG HD502IJ 1AA01109 at ata10-master SATA300
 ad22: 476940MB SAMSUNG HD502IJ 1AA01109 at ata11-master SATA300
 ad24: 476940MB SAMSUNG HD502IJ 1AA01109 at ata12-master SATA300
 ar0: disk0 READY (master) using ad8 at ata4-master
 ar0: disk1 READY (mirror) using ad10 at ata5-master
 cd0 at ata1 bus 0 target 0 lun 0

Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Mon, Sep 29, 2008 at 12:21:42AM +0100, Pegasus Mc Cleaft wrote:
 On Sunday 28 September 2008 23:37:09 Jeremy Chadwick wrote:
 
 snip
   Yea.. The machine is otherwise running fine, and also loaded the driver
   for the ata raid controller (I made the machine boot off a raid-1 pack
   and made slices on the pack for /, /usr, /var . The rest zfs for
   /usr/home)
 
  I don't know what filesystems you have assigned to what drives, but
  please be aware there are known major problems with Silicon Image
  controllers on FreeBSD, Linux, and Windows.  The most common problem is
  silent data corruption.  I can refer you to previous discussions of this
  problem if you'd like.  If at all possible, disable this controller in
  the BIOS, and do not use it.
 
   I didn't know that, so thanks for telling me. As luck would have it, I
 just have the DVD-RW drive on the Silicon Image controller.  I originally
 tried to use the SI controller for the primary boot, but the driver
 support was not there for it with the 7.0 boot disks, so I played with the
 cables and put the two raid-1 drives on the JMicron controller. It was
 only after I got a 7.1 kernel installed that it showed back up and I got
 use of the DVD-ROM drive.

This often happens when a change is made between two versions of FreeBSD
but there's no available ISO which contains the drivers/changes which
provide the hardware support you need.  You might be interested in the
snapshots/ directory on the FTP mirrors:

ftp://ftp4.freebsd.org/pub/FreeBSD/snapshots/200809/

The allbsd.org site is also quite useful when you need something that's
been added in the past few days/weeks, and not within the past month:

http://pub.allbsd.org/FreeBSD-snapshots/

  I see the system has an Intel AHCI-based controller (probably an ICH10
  chip, since the ICH10 is the first to support 6 SATA channels).  I would
  recommend using that for as much as you can (I see you have disks ad14
  through ad24 off that controller).
 
   It's actually an ICH9R-based motherboard by Gigabyte (GS-X48-DS5).

Ahh, right.  The ICH9R, ICH9DH, and ICH9DO contain 6 ports, while the
ICH9 only provides 4 ports (confirmed in the data sheet).

The ICH10 is the first to include 6 ports on the non-RAID version of the
chip, that's why I made that assumption.

 It seems to be a decent board, with the exception of, what I believe
 to be a hardware fault with the watchdog timer. I think they have put
 a pull-up resistor on the speaker line on the ICH9 that on release of
 reset is hardware disabling the watchdog timer function. I have gone
 round and round with them on this, but I can not adequately explain
 the problem to their tech support, as there is a language barrier.

One has to remember that Gigabyte is predominantly a consumer product
vendor.  Chances of getting through to an engineer are slim; Tier 1
support shields customers from engineers -- understandable (you don't
want some irate customer wasting engineering's time on something that's
simple), but it's also a problem because most T1 folks often lack the
ability to make the judgement call as to when they should forward the
request.  Instead, the (false) conclusion they reach is "This is just
some one-off, what a waste of time, /dev/null it."

Supermicro is the one company I've had luck with, where their Tier 1
guys understand that certain topics should be forwarded directly to
engineers, while other topics remain in T1's hands.

   Thinking about it, I also added atapicd in the kernel config so I could
   use things like K3B and xcdroast. I don't know if maybe that shim might
   be causing issues. I'll try making another kernel without it and giving
   that a try.
 
  I highly doubt that's the problem.
 
  Can you please include your kernel configuration file here?
 
  Also, please do not copy/paste the file; I noticed in your dmesg|grep
  ata output, all of the lines had trailing spaces (I've stripped them
  off).
 
   Sure, I'll attach the config file and also a full dmesg. Forgive the
 state of the config file. It's basically a copy of GENERIC with a bunch
 of stuff added at the end.
 
  I've CC'd [EMAIL PROTECTED] who maintains ata(4).  I see no reason why
  atacontrol list would be returning such an error.
 
   Thank you... 
 
   I'm not sure why I'm having the problem either. I just thought it was
 strange that it worked using the GENERIC kernel from 7.0-REL but once I
 built the latest 7.1-PR it stopped working.  My machine (feathers) is
 actually a sister machine to one I built at work so I could test things
 out at home and then make changes to the production machine. The other
 machine does not have the SI controller in it (everything else is the
 same) and atacontrol list fails there in the same way.

7.0-RELEASE to 7.1-PRERELEASE is pretty major in terms of changes;
there's been a lot.  I'd like to blame ata(4), but I don't know of any
changes there (and I follow

Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Sun, Sep 28, 2008 at 11:24:38PM +0100, Bruce Cran wrote:
 On Sun, 28 Sep 2008 10:43:58 + (UTC)
 Pegasus McCleaft [EMAIL PROTECTED] wrote:
 
  Hello everyone.
  
  I was wondering if anyone else is experiencing this problem.
  I have recently reloaded my machine (due to a meltdown of my primary
  boot drive) and noticed that under 7.0-rel the atacontrol command
  seems to work great, however, under 7.1 I get an error
  
  atacontrol: ioctl(IOCATADEVICES): Device not configured
  
  Has anyone else seen this error? I wouldn't be concerned if
  it weren't for the fact that it worked under 7.0-rel but now doesn't.
  The machine is using both the:
  
  atapci0: SiI SiI 3132 SATA300 controller
  atapci1: JMicron JMB363 SATA300 controller
 
 I'm also seeing this problem on my amd64 7.1-PRERELEASE system:
 
  atacontrol list
 ATA channel 0:
 Master: acd0 HL-DT-ST DVD+/-RW GSA-T11N/A102 ATA/ATAPI revision 5
 Slave:   no device present
 atacontrol: ioctl(IOCATADEVICES): Device not configured
 
 I've attached the dmesg, and truss output from atacontrol list.

Your dmesg output implies you're not using atapicam, while Pegasus is.
So I believe that rules that out.

Are you using ATA_STATIC_ID?  If not, then I'm out of simple ideas as
to what could be causing this.

 open("/dev/ata",O_RDWR,03766320)   = 3 (0x3)
 ioctl(3,IOCATAGMAXCHANNEL,0xec20)  = 0 (0x0)
 ioctl(3,IOCATADEVICES,0xe590)      = 0 (0x0)
 fstat(1,{ mode=-rw-r--r-- ,inode=307828,size=2281,blksize=4096 }) = 0 (0x0)
 __sysctl(0x7fffdba0,0x2,0x800845b48,0x7fffdbb8,0x0,0x0) = 0 (0x0)
 __sysctl(0x7fffd6f0,0x2,0x8008547d8,0x7fffd6e8,0x0,0x0) = 0 (0x0)
 __sysctl(0x7fffd730,0x2,0x7fffd74c,0x7fffd740,0x0,0x0) = 0 (0x0)
 readlink("/etc/malloc.conf",0x7fffd790,1024) ERR#2 'No such file or directory'
 issetugid(0x80071c2aa,0x7fffd790,0x,0x0,0x80ac1c40,0x7fffd768) = 0 (0x0)
 break(0x60)                        = 0 (0x0)
 break(0x70)                        = 0 (0x0)
 ioctl(3,IOCATADEVICES,0xe590)      ERR#6 'Device not configured'

I've snipped the truss output to the relevant piece.

fd 3 points to /dev/ata, and there are no man pages which document
the IOCATADEVICES ioctl.  I'll have to look at the source.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Sun, Sep 28, 2008 at 05:02:26PM -0700, Jeremy Chadwick wrote:
 On Sun, Sep 28, 2008 at 11:24:38PM +0100, Bruce Cran wrote:
  On Sun, 28 Sep 2008 10:43:58 + (UTC)
  Pegasus McCleaft [EMAIL PROTECTED] wrote:
  
   Hello everyone.
   
 I was wondering if anyone else is experiencing this problem.
   I have recently reloaded my machine (due to a meltdown of my primary
   boot drive) and noticed that under 7.0-rel the atacontrol command
   seems to work great, however, under 7.1 I get an error
   
   atacontrol: ioctl(IOCATADEVICES): Device not configured
   
 Has anyone else seen this error? I wouldn't be concerned if
   it weren't for the fact that it worked under 7.0-rel but now doesn't.
   The machine is using both the:
   
   atapci0: SiI SiI 3132 SATA300 controller
   atapci1: JMicron JMB363 SATA300 controller
  
  I'm also seeing this problem on my amd64 7.1-PRERELEASE system:
  
   atacontrol list
  ATA channel 0:
  Master: acd0 HL-DT-ST DVD+/-RW GSA-T11N/A102 ATA/ATAPI revision 5
  Slave:   no device present
  atacontrol: ioctl(IOCATADEVICES): Device not configured
  
  I've attached the dmesg, and truss output from atacontrol list.
 
 Your dmesg output implies you're not using atapicam, while Pegasus is.
 So I believe that rules that out.
 
 Are you using ATA_STATIC_ID?  If not, then I'm out of simple ideas as
 to what could be causing this.
 
  open("/dev/ata",O_RDWR,03766320)   = 3 (0x3)
  ioctl(3,IOCATAGMAXCHANNEL,0xec20)  = 0 (0x0)
  ioctl(3,IOCATADEVICES,0xe590)      = 0 (0x0)
  fstat(1,{ mode=-rw-r--r-- ,inode=307828,size=2281,blksize=4096 }) = 0 (0x0)
  __sysctl(0x7fffdba0,0x2,0x800845b48,0x7fffdbb8,0x0,0x0) = 0 (0x0)
  __sysctl(0x7fffd6f0,0x2,0x8008547d8,0x7fffd6e8,0x0,0x0) = 0 (0x0)
  __sysctl(0x7fffd730,0x2,0x7fffd74c,0x7fffd740,0x0,0x0) = 0 (0x0)
  readlink("/etc/malloc.conf",0x7fffd790,1024) ERR#2 'No such file or directory'
  issetugid(0x80071c2aa,0x7fffd790,0x,0x0,0x80ac1c40,0x7fffd768) = 0 (0x0)
  break(0x60)                        = 0 (0x0)
  break(0x70)                        = 0 (0x0)
  ioctl(3,IOCATADEVICES,0xe590)      ERR#6 'Device not configured'
 
 I've snipped the truss output to the relevant piece.
 
 fd 3 points to /dev/ata, and there are no man pages which document
 the IOCATADEVICES ioctl.  I'll have to look at the source.

Bruce and Pegasus,

Can you please apply the below patch to src/sbin/atacontrol.c and let me
know what the output is when doing atacontrol list?

This won't solve the problem, but it will help in determining which
piece of code in src/sys/dev/ata/ata-all.c is returning an error to
ioctl() (different pieces of the code return different errors, either
ENXIO, ENODEV, or another error depending upon what gets returned
from ata_raid_ioctl_func()).

Thanks.

--- atacontrol.c.orig	2008-04-08 03:48:20.0 -0700
+++ atacontrol.c	2008-09-28 17:32:39.0 -0700
@@ -261,12 +261,14 @@
 static void
 info_print(int fd, int channel, int prchan)
 {
+	int ret;
 	struct ata_ioc_devices devices;
 
 	devices.channel = channel;
 
-	if (ioctl(fd, IOCATADEVICES, &devices) < 0)
-		err(1, "ioctl(IOCATADEVICES)");
+	if ((ret = ioctl(fd, IOCATADEVICES, &devices)) < 0) {
+		err(1, "ioctl(IOCATADEVICES) returned %d", ret);
+	}
 
 	if (prchan)
 		printf("ATA channel %d:\n", channel);


Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Mon, Sep 29, 2008 at 03:07:48AM +0100, Bruce Cran wrote:
 On Sun, 28 Sep 2008 17:36:03 -0700
 Jeremy Chadwick [EMAIL PROTECTED] wrote:
  Bruce and Pegasus,
  
  Can you please apply the below patch to src/sbin/atacontrol.c and let
  me know what the output is when doing atacontrol list?
  
  This won't solve the problem, but it will help in determining which
  piece of code in src/sys/dev/ata/ata-all.c is returning an error to
  ioctl() (different pieces of the code return different errors, either
  ENXIO, ENODEV, or another error depending upon what gets returned
  from ata_raid_ioctl_func()).

I misread part of the code.  ata_raid_ioctl() only gets called if the
ata_raid_ioctl_func pointer is non-NULL (it defaults to NULL unless your
system is found to need/require ataraid support; need/require does not
mean compiled in, I assume it means we found devices/metadata that
ataraid can handle).
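
The "only called if the pointer is non-NULL" pattern described above can be
modelled in a few lines of C.  This is a simplified sketch and not the code
from ata-all.c: the names ioctl_hook_t, raid_ioctl_hook, fake_raid_ioctl and
dispatch_ioctl are hypothetical, and returning ENXIO when no hook is
installed is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type: a handler that returns 0 or an errno value. */
typedef int (*ioctl_hook_t)(unsigned long cmd, void *data);

/* Defaults to NULL; a RAID layer would set it when it attaches. */
ioctl_hook_t raid_ioctl_hook = NULL;

/* Hypothetical handler standing in for a RAID layer's ioctl routine. */
int
fake_raid_ioctl(unsigned long cmd, void *data)
{
	(void)data;
	return (cmd == 1) ? 0 : ENODEV;
}

/* Call the hook only when one is installed; otherwise report ENXIO. */
int
dispatch_ioctl(unsigned long cmd, void *data)
{
	if (raid_ioctl_hook != NULL)
		return raid_ioctl_hook(cmd, data);
	return ENXIO;
}
```

With no hook installed, dispatch_ioctl() never reaches the RAID handler, so
whatever error comes back has to originate in the dispatcher itself.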

In your case, there are no arX devices, and the only ATA device you have
is an ATAPI CD/DVD drive.

 ATA channel 0:
 Master: acd0 HL-DT-ST DVD+/-RW GSA-T11N/A102 ATA/ATAPI revision 5
 Slave:   no device present
 atacontrol: ioctl(IOCATADEVICES) returned -1: Device not configured

Right, silly me.  Here I was hoping I could get the return code of
ata_ioctl(), but that's not the case.  There's no way for me to get that
information; ioctl() returns -1 on failure, and 0 on success.
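
For reference, the -1 from ioctl() carries no detail by itself; the error
code arrives in errno, and err(3) turns that into the message text (the
ERR#6 in the truss output earlier in the thread is ENXIO on FreeBSD, whose
message is "Device not configured").  A minimal sketch, assuming only
POSIX: the helper try_ioctl is hypothetical, and it is meant to be tried
against an invalid descriptor (which yields EBADF) rather than /dev/ata so
that it runs anywhere.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: issue an ioctl and return 0 on success or the
 * errno value on failure, instead of the bare -1 that ioctl() returns.
 */
int
try_ioctl(int fd, unsigned long request, void *arg)
{
	if (ioctl(fd, request, arg) < 0)
		return errno;
	return 0;
}
```

Calling try_ioctl(-1, FIONREAD, &n) should come back with EBADF.  The error
number alone still does not say which code path in the driver produced it,
which is the part that needs a look inside the kernel.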

truss isn't going to be enough for this, because I need to see into the
kernel ioctl() layer to find out what's going on in the ATA code.

Simply put, I don't know how to efficiently debug this problem under
FreeBSD.  dtrace is available on 7.1-PRERELEASE, but I'm unfamiliar with
it.

 This laptop's running GENERIC, so ATA_STATIC_ID is in my kernel config.

Thanks.  I realised on all of my systems I also use ATA_STATIC_ID (I
must've missed it when I was skimming the config), so neither atapicam
nor ATA_STATIC_ID are responsible for this.



Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Sun, Sep 28, 2008 at 08:07:44PM -0700, Jeremy Chadwick wrote:
 On Mon, Sep 29, 2008 at 03:07:48AM +0100, Bruce Cran wrote:
  On Sun, 28 Sep 2008 17:36:03 -0700
  Jeremy Chadwick [EMAIL PROTECTED] wrote:
   Bruce and Pegasus,
   
   Can you please apply the below patch to src/sbin/atacontrol.c and let
   me know what the output is when doing atacontrol list?
   
   This won't solve the problem, but it will help in determining which
   piece of code in src/sys/dev/ata/ata-all.c is returning an error to
   ioctl() (different pieces of the code return different errors, either
   ENXIO, ENODEV, or another error depending upon what gets returned
   from ata_raid_ioctl_func()).
 
 I misread part of the code.  ata_raid_ioctl() only gets called if the
 ata_raid_ioctl_func pointer is non-NULL (it defaults to NULL unless your
 system is found to need/require ataraid support; need/require does not
 mean compiled in, I assume it means we found devices/metadata that
 ataraid can handle).
 
 In your case, there are no arX devices, and the only ATA device you have
 is an ATAPI CD/DVD drive.
 
  ATA channel 0:
  Master: acd0 HL-DT-ST DVD+/-RW GSA-T11N/A102 ATA/ATAPI revision 5
  Slave:   no device present
  atacontrol: ioctl(IOCATADEVICES) returned -1: Device not configured
 
 Right, silly me.  Here I was hoping I could get the return code of
 ata_ioctl(), but that's not the case.  There's no way for me to get that
 information; ioctl() returns -1 on failure, and 0 on success.
 
 truss isn't going to be enough for this, because I need to see into the
 kernel ioctl() layer to find out what's going on in the ATA code.
 
 Simply put, I don't know how to efficiently debug this problem under
 FreeBSD.  dtrace is available on 7.1-PRERELEASE, but I'm unfamiliar with
 it.

Lucky!

While working on some other ATA-related code on a test/dev box I just
built about 30 minutes ago, I decided to do atacontrol list to see
what would happen:

testbox# atacontrol list
ATA channel 0:
Master:  ad0 ST3120026AS/3.05 Serial ATA v1.0
Slave:   no device present
ATA channel 1:
Master: acd0 CD-224E/1.9A ATA/ATAPI revision 0
Slave:   no device present
atacontrol: ioctl(IOCATADEVICES): Device not configured
testbox#

This box I have physical + serial access to, so I should be able to try
and track this down, now that I have something to work with.  :-)

I'll let you guys know what I find.



Re: atacontrol broken in 7.1-PR

2008-09-28 Thread Jeremy Chadwick
On Mon, Sep 29, 2008 at 08:07:13AM +0400, Andrey V. Elsukov wrote:
 Bruce Cran wrote:
 I'm also seeing this problem on my amd64 7.1-PRERELEASE system:

 atacontrol list
 ATA channel 0:
 Master: acd0 HL-DT-ST DVD+/-RW GSA-T11N/A102 ATA/ATAPI revision 5
 Slave:   no device present
 atacontrol: ioctl(IOCATADEVICES): Device not configured


 I've attached the dmesg, and truss output from atacontrol list.

 This is a known problem and it is fixed in CURRENT.
 You need to apply this patch:
 http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/atacontrol/atacontrol.c.diff?r1=1.47;r2=1.48

 I cc'ed the person who committed this fix.
 Hi, Poul-Henning, I think it should be MFCed before release.

I agree, it should be MFC'd.

Cute bug too; never would've guessed it.  Saves me from the effort of
trying to get kgdb over serial working and all that jazz.  :-)

Thanks, Andrey!



Re: freebsd-update missed?

2008-09-27 Thread Jeremy Chadwick
On Sat, Sep 27, 2008 at 10:17:33AM +0100, [EMAIL PROTECTED] wrote:
 On 20080927 00:07:57, Julian Elischer wrote:
  I'm sure it's there..
  it may be a different problem of course.
 
 
 I don't know... Checking with ident gives:
 
 $FreeBSD: src/lib/libpthread/thread/thr_kern.c,v 1.116.2.1 2006/03/16 
 23:29:07 deischen Exp $
 
 The patch claims 1.116.2.1.6.1
 
 Are these the same revisions?
 
 I mean, if I can determine that I definitely have the patch, then I have
 another problem to worry about!

The advisory explicitly goes over what files were changed, and what
revisions include the fix.  The below versions include the fix.  If you
have older versions, then the answer is no, you do not have the fix.

http://security.freebsd.org/advisories/FreeBSD-EN-08:01.libpthread.asc

src/UPDATING                            1.416.2.37.2.6
src/sys/conf/newvers.sh                 1.69.2.15.2.5
src/lib/libpthread/sys/lock.c           1.9.2.1.8.1
src/lib/libpthread/thread/thr_kern.c    1.116.2.1.6.1

These are for CVS tag RELENG_6_3.
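
To make the "same or older?" check concrete, here is a rough sketch that
compares two dotted revision strings component by component.  The function
name rev_cmp is hypothetical, and the ordering is a simplification: CVS
revision trees branch, so a numeric comparison is only meaningful along one
line of descent (as with 1.116.2.1 and 1.116.2.1.6.1 above).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Compare two dotted revision strings component by component.
 * Returns <0, 0 or >0.  A revision that is a strict prefix of the
 * other (e.g. a branch point) sorts first.
 */
int
rev_cmp(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0') {
		char *ea, *eb;
		unsigned long x = strtoul(a, &ea, 10);
		unsigned long y = strtoul(b, &eb, 10);

		/* Not a number: fall back to a plain string compare. */
		if (ea == a || eb == b)
			return strcmp(a, b);
		if (x != y)
			return (x < y) ? -1 : 1;
		a = (*ea == '.') ? ea + 1 : ea;
		b = (*eb == '.') ? eb + 1 : eb;
	}
	if (*a == '\0' && *b == '\0')
		return 0;
	return (*a == '\0') ? -1 : 1;
}
```

rev_cmp("1.116.2.1", "1.116.2.1.6.1") is negative, i.e. the revision
reported by ident sorts before the one named in the advisory.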

I do not use freebsd-update.  That said:

The man page for it states that it's a binary updater for pieces in the
base system, so you looking at your *source* files would indicate
absolutely nothing, other than when you last ran csup to update your
/usr/src tree.

I do not know of a way to verify if your libpthread library actually
contains the fix.  We will have to wait for Colin's answer.



Re: Increasing partition size by removing partitions

2008-09-27 Thread Jeremy Chadwick
On Sat, Sep 27, 2008 at 07:22:20PM -0400, Aryeh M. Friedman wrote:
 I have a disk that is laid out with partition 0 being NTFS and 1 being  
 FreeBSD.  I want to remove the NTFS partition and grow the FreeBSD one  
 but all the docs I have seen only talk about how to do this if the new  
 part of the partition is at the end of the partition you wish to grow.
 How do I go about this?

There isn't a way to do this, as far as I know.



Re: bad NFS/UDP performance

2008-09-26 Thread Jeremy Chadwick
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
 Hi,
   There seems to be some serious degradation in performance.
 Under 7.0 I get about 90 MB/s (on write), while, on the same machine
 under 7.1 it drops to 20!
 Any ideas?

1) Network card driver changes,

2) This could be relevant, but rwatson@ will need to help determine
   that.
   http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html



Re: bad NFS/UDP performance

2008-09-26 Thread Jeremy Chadwick
On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
  On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
   Hi,
 There seems to be some serious degradation in performance.
   Under 7.0 I get about 90 MB/s (on write), while, on the same machine
   under 7.1 it drops to 20!
   Any ideas?
  
  1) Network card driver changes,
 could be, but at least iperf/tcp is ok - can't get udp numbers, do you
 know of any tool to measure udp performance?
 BTW, I also checked on different hardware, and the badness is there.

According to INDEX, benchmarks/iperf does UDP bandwidth testing.

benchmarks/nttcp should as well.

What network card is in use?  If Intel, what driver version (should be
in dmesg).

  2) This could be relevant, but rwatson@ will need to help determine
 that.
 
  http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
 
 gut feeling is that it's somewhere else:
 
 Writing 16 MB file
        BS  Count  /---- 7.0 -----/  /---- 7.1 -----/
     1*512  32768  0.16s  98.11MB/s  0.43s  37.18MB/s
     2*512  16384  0.17s  92.04MB/s  0.46s  34.79MB/s
     4*512   8192  0.16s 101.88MB/s  0.43s  37.26MB/s
     8*512   4096  0.16s  99.86MB/s  0.44s  36.41MB/s
    16*512   2048  0.16s 100.11MB/s  0.50s  32.03MB/s
    32*512   1024  0.26s  61.71MB/s  0.46s  34.79MB/s
    64*512    512  0.22s  71.45MB/s  0.45s  35.41MB/s
   128*512    256  0.21s  77.84MB/s  0.51s  31.34MB/s
   256*512    128  0.19s  82.47MB/s  0.43s  37.22MB/s
   512*512     64  0.18s  87.77MB/s  0.49s  32.69MB/s
  1024*512     32  0.18s  89.24MB/s  0.47s  34.02MB/s
  2048*512     16  0.17s  91.81MB/s  0.30s  53.41MB/s
  4096*512      8  0.16s 100.56MB/s  0.42s  38.07MB/s
  8192*512      4  0.82s  19.56MB/s  0.80s  19.95MB/s
 16384*512      2  0.82s  19.63MB/s  0.95s  16.80MB/s
 32768*512      1  0.81s  19.69MB/s  0.96s  16.64MB/s

 Average:                 75.86             33.00
 
 the NFS filer is a Network Appliance, and is in use, so I get fluctuations
 in the measurements, but the relations are similar: good on 7.0, bad on 7.1
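
The two averages can be re-derived from the per-row throughput figures.  A
small sketch: the array and function names are made up for illustration,
and the numbers are copied from the table above.

```c
#include <stddef.h>

/* Write throughput in MB/s per block size, copied from the table. */
const double mbps_70[] = {
	98.11, 92.04, 101.88, 99.86, 100.11, 61.71, 71.45, 77.84,
	82.47, 87.77, 89.24, 91.81, 100.56, 19.56, 19.63, 19.69
};
const double mbps_71[] = {
	37.18, 34.79, 37.26, 36.41, 32.03, 34.79, 35.41, 31.34,
	37.22, 32.69, 34.02, 53.41, 38.07, 19.95, 16.80, 16.64
};

/* Arithmetic mean of n samples. */
double
mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}
```

Over the 16 rows this gives about 75.86 MB/s for 7.0 and about 33.00 MB/s
for 7.1, so 7.1 averages a bit under half of the 7.0 write throughput.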

Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?



Re: Rare problems in upgrade process (corrupted FS?)

2008-09-26 Thread Jeremy Chadwick
On Fri, Sep 26, 2008 at 12:22:55PM +0200, Jordi Espasa Clofent wrote:
 Hi all,

 I'm trying to update a FreeBSD server box from 6.3p11 to 7.0 and I've
 found some odd problems.

 1) I do the sync process with csup(1); next I go into
 /usr/src/sys/amd64/conf to edit the GENERIC file (I use customized
 kernels) and this file doesn't exist. Mmmm. I decide to repeat the
 process against another cvsup mirror but I get the same results: the
 GENERIC file isn't there.

 2) I go to FreeBSD CVSWeb, locate the GENERIC file under the 7_0 tag,
 copy and paste. Yes, I know: a very nasty process. The big problem
 appears when I try to do 'make cleandir' and others. I get the following
 output:

 # pwd
 /usr/src
 # make cleandir
 make: don't know how to make cleandir. Stop
 # make buildworld
 make: don't know how to make buildworld. Stop
 # ls -l /usr/bin/make
 -r-xr-xr-x  1 root  wheel  351024 Aug 18 13:19 /usr/bin/make
 # file /usr/bin/make
 /usr/bin/make: ELF 64-bit LSB executable, AMD x86-64, version 1  
 (FreeBSD), for FreeBSD 6.3, statically linked, stripped

Looks to me like you have no /usr/src/Makefile.

 * After the theoretical FS repair I'm back at point 1.

None of the information you provided in your above output, however,
shows anything about the filesystem (other than /usr/bin/make).  But
this sounds honestly like some sort of corrupted supdb, or a cvsup
mirror that's broken.

I would do the following:

rm -fr /usr/src/*
rm -fr /var/db/sup/src-all
csup -h cvsupserver -L 2 -g /usr/share/examples/stable-supfile

I can assure you /sys/amd64/conf/GENERIC exists, and is on the cvsup
mirrors.

 * I reboot the machine (because I suspect a very weird FS problem),
 boot in single user mode and do a 'fsck -fy'. Indeed, fsck(8) found
 and repaired several errors. One error in particular caught my
 attention: SUPERBLOCK.

Superblock problems wouldn't explain this; there are hundreds of
superblocks available (you wouldn't be able to use your machine if they
were all horked).



Re: bad NFS/UDP performance

2008-09-26 Thread Jeremy Chadwick
On Fri, Sep 26, 2008 at 04:35:17PM +0300, Danny Braniss wrote:
  On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
 Hi,
   There seems to be some serious degradation in performance.
 Under 7.0 I get about 90 MB/s (on write), while, on the same machine
 under 7.1 it drops to 20!
 Any ideas?

1) Network card driver changes,
   could be, but at least iperf/tcp is ok - can't get udp numbers, do you
   know of any tool to measure udp performance?
   BTW, I also checked on different hardware, and the badness is there.
  
  According to INDEX, benchmarks/iperf does UDP bandwidth testing.
 
 I know, but I get about 1mgb, which seems somewhat low :-(
 
  
  benchmarks/nttcp should as well.
  
  What network card is in use?  If Intel, what driver version (should be
  in dmesg).
 
 bge: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x9003 
 and
 bce: Broadcom NetXtreme II BCM5708 1000Base-T (B2)
 and intels, but haven't tested there yet.

Both bge(4) and bce(4) claim to support checksum offloading.  You might
try disabling it (ifconfig ... -txcsum -rxcsum) to see if things
improve.  If not, more troubleshooting is needed.  You might also try
turning off TSO if it's supported (check your ifconfig output for TSO in
the options= section.  Then use ifconfig ... -tso)

  Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
  
 no, but diffing the sysctl show:
 
   -vfs.nfs.realign_test: 22141777
   +vfs.nfs.realign_test: 498351
 
   -vfs.nfsrv.realign_test: 5005908
   +vfs.nfsrv.realign_test: 0
 
   +vfs.nfsrv.commit_miss: 0
   +vfs.nfsrv.commit_blks: 0
 
 changing them did nothing - or at least with respect to nfs throughput :-)

I'm not sure what any of these do, as NFS is a bit out of my league.
:-)  I'll be following this thread though!



Re: Major SMP problems with lstat/namei

2008-09-24 Thread Jeremy Chadwick
On Wed, Sep 24, 2008 at 09:26:55AM +0200, Daniel Gerzo wrote:
 Hello Jeff,
 
 On Wed, 24 Sep 2008 00:52:59 -0400, Jeff Wheelhouse
 [EMAIL PROTECTED] wrote:
  
  We have encountered some serious SMP performance/scalability problems  
  that we've tracked back to lstat/namei calls.  I've written a quick  
 
 this all seems to explain the very poor performance of PHP when used with
 open_basedir and safe_mode enabled. It would be nice to see if there's
 something that could be done to make it better.

Both of which are features which will, thankfully, be removed in PHP 6.
Whoever uses these features in PHP deserves the pain -- they're
worthless and provide no security whatsoever.  Consider using suPHP
or an MPM like mpm-itk.

Also, PHP and performance shouldn't be put in the same sentence. /rant



Re: proper types for printf()-ing pointers on amd64 that won't break i386?

2008-09-18 Thread Jeremy Chadwick
On Thu, Sep 18, 2008 at 10:41:42AM -0700, Steve Franks wrote:
 I'm trying to correct some warnings in a port marked
 ONLY_FOR_ARCHS=i386.  They stem from casting a pointer (which I assume
 is a 64-bit unsigned) to unsigned int which is apparently 32 bits?
 I sort of thought int was supposed to be the atomic register size, but
 no doubt that would break more than it would help, so it's 32-bits.
 Anyways, what's the right way to fix this?  The port actually works
 fine as-is on amd64, so I can only assume something was fixed for 7.1,
 or someone was being extra cautious with the i386 tag.
 
 The code:
 
typedef unsigned int cardinal;
...
fprintf(stderr, "Mode Table Offset: $C + $%x\n",
 ((cardinal)map->mode_table) - ((cardinal)map->bios_ptr));
 
 Can I just ditch the cast+%x and use %p?  I don't have an i386 system
 to test on, and I don't want to break anything if I submit a patch...

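Yes, use %p for printing pointers!  It works fine on all platforms; just
cast the argument to (void *).  Note that the expression above subtracts
two addresses, so the result is an offset rather than a pointer: convert
both operands to uintptr_t (from <stdint.h>) and print the difference
with PRIxPTR (from <inttypes.h>) rather than passing it to %p.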



Re: Error: Can't find libjava.so

2008-09-15 Thread Jeremy Chadwick
On Sun, Sep 14, 2008 at 11:19:13AM +0200, Marcel Grandemange wrote:
 I do realize this is probably better suited for freebsd-questions , however
 haven't received any response and was simply hoping someone would be kind
 enough.
 
 I recently obtained a very decent ups, however it is not supported by NUT.
 
 It does however come with winpower software that does run on FreeBSD.
 
 However it required java.
 
 So installed from ports
 
 And was presented with following error:
 
 Error: can't find libjava.so
 
 This is on system in folder /usr/local/Diablo-jre1.6.0/lib/amd64/libjava.so

Can you provide the output of ldconfig -r from that box?  I have
a feeling the ld.so pathing hints might lack a directory or two.



Re: Error: Can't find libjava.so

2008-09-15 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 03:30:06PM +0200, Marcel Grandemange wrote:
  I do realize this is probably better suited for freebsd-questions ,
 however
  haven't received any response and was simply hoping someone would be kind
  enough.
  
  I recently obtained a very decent ups, however it is not supported by NUT.
  
  It does however come with winpower software that does run on FreeBSD.
  
  However it required java.
  
  So installed from ports
  
  And was presented with following error:
  
  Error: can't find libjava.so
  
  This is on system in folder
 /usr/local/Diablo-jre1.6.0/lib/amd64/libjava.so
 
 Can you provide the output of ldconfig -r from that box?  I have
 a feeling the ld.so pathing hints might lack a directory or two.
 
 
 /var/run/ld-elf.so.hints:
   search directories: /lib:/usr/lib:/usr/lib/compat:/usr/local/lib

This is the problem as I see it.  ld.so, which is used for finding and
loading shared libraries, is not configured to look in
/usr/local/Diablo-jre1.6.0/lib/amd64 for libraries.
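
A quick way to confirm this from the shell, sketched in sh.  The helper names hint_dirs and has_dir are made up for illustration; the only thing assumed about ldconfig -r is the "search directories:" line quoted above:

```sh
#!/bin/sh
# hint_dirs: read `ldconfig -r` output on stdin and print the
# colon-separated list from its "search directories:" line.
hint_dirs() {
    sed -n 's/^[[:space:]]*search directories:[[:space:]]*//p'
}

# has_dir LIST DIR: succeed if DIR is one element of colon-separated LIST.
has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Possible use on the affected machine (directory taken from this thread):
#   dirs=$(ldconfig -r | hint_dirs)
#   has_dir "$dirs" /usr/local/Diablo-jre1.6.0/lib/amd64 || echo "not in hints"
```

If the directory is reported as missing, ldconfig(8) documents a -m (merge) option that adds a directory to the existing hints; treat that as a stopgap until the port itself is fixed.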

I'd like to know which port you installed, and how you installed it.

Based on the above, it appears to me the port itself has a bug -- it
should be adding that directory to the hints path, but is not.  Please
note I am in no way, shape, or form familiar with Java or this port.

I do not know if this is specific to your machine or not -- however,
this is the first time I've seen it mentioned, and I am quite active on
freebsd-ports.  (I'm subscribed to 15 separate FreeBSD mailing lists,
and I read/follow them all.)

Regarding the problem itself: there are ways to work around this by
using the environment variable LD_LIBRARY_PATH.  I do not recommend
this, though -- properly configuring the ld.so search path when a
program (or port) is installed is the proper method.

Cross-posting to multiple lists is generally frowned upon, so answers to
the above questions will help determine whether the discussion should be
moved to freebsd-ports@.  I've a feeling it should be.

Thanks!



Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-15 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 05:02:39PM +0200, Fabian Keil wrote:
 Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Mon, Sep 15, 2008 at 10:37:18AM +0800, Wilkinson, Alex wrote:
   0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 
   
About the only real improvement I'd like to see in this setup
is the ability to spin down idle drives.  That would be an
ideal setup for the home RAID array.
   
   There is a FreeBSD port which handles this, although such a
   feature should ideally be part of the ata(4) system (as should
   TCQ/NCQ and a slew of other things -- some of those are being
   worked on).
   
   And the port is ?
  
  Is it that hard to use 'make search' or grep?  :-)  sysutils/ataidle
 
 You also might want to have a look at atacontrol(8)'s spindown command.

The appropriate ata(4) changes and the extension of atacontrol(8) to
support spindown were MFC'd (to RELENG_7) only 5 weeks ago.  It's fairly
unlikely that most users know this feature was MFC'd (case in point, I
did not).

http://www.freebsd.org/cgi/cvsweb.cgi/src/sbin/atacontrol/atacontrol.c
has the details, see Revision 1.43.2.2.



Re: Error: Can't find libjava.so

2008-09-15 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 09:34:39PM +0200, Marcel Grandemange wrote:
   I do realize this is probably better suited for freebsd-questions ,
  however
   haven't received any response and was simply hoping someone would be
 kind
   enough.
   
   I recently obtained a very decent ups, however it is not supported by
 NUT.
   
   It does however come with winpower software that does run on FreeBSD.
   
   However it required java.
   
   So installed from ports
   
   And was presented with following error:
   
   Error: can't find libjava.so
   
   This is on system in folder
  /usr/local/Diablo-jre1.6.0/lib/amd64/libjava.so
  
  Can you provide the output of ldconfig -r from that box?  I have
  a feeling the ld.so pathing hints might lack a directory or two.
  
  
  /var/run/ld-elf.so.hints:
  search directories: /lib:/usr/lib:/usr/lib/compat:/usr/local/lib
 
 This is the problem as I see it.  ld.so, which is used for finding and
 loading shared libraries, is not configured to look in
 /usr/local/Diablo-jre1.6.0/lib/amd64 for libraries.
 
 I'd like to know which port you installed, and how you installed it.
 
 I did a cvsup on ports to update to latest on FreeBSD7.0 release amd64
 Used port /usr/ports/java/Diablo-jre16
 Simply did
 Make
 Make install
 Make clean
 

Can you please apply the below patch and tell me if it solves your
problem?  Proper procedure should be:

# cd /usr/ports/java/diablo-jre16
# patch < /wherever/the/patch/is
# make clean
# make
# make deinstall
# make install

After this is done, run ldconfig -r and look at the search path
shown at the top; /usr/local/diablo-jre1.6.0/lib/amd64 should be
there, and libjava.so should hopefully be found.

 Regarding the problem itself: there are ways to work around this by
 using the environment variable LD_LIBRARY_PATH.  I do not recommend
 this, though -- properly configuring the ld.so search path when a
 program (or port) is installed is the proper method.
 
 Could you advise me how to do this? Hope you don't mind!

Set the LD_LIBRARY_PATH environment variable to the search paths
you desire.  Colon-delimited, and it overrides the defaults.  E.g.

export LD_LIBRARY_PATH=/lib:/usr/lib:/usr/lib/compat:/usr/local/lib:/usr/local/diablo-jre1.6.0/lib/amd64

But the below patch, assuming it works (and I got the paths right),
should not require you to do that.  LD_LIBRARY_PATH is somewhat evil,
and it's not recommended you use it.
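
If you do end up using it as a stopgap, it is less intrusive to set the variable for a single command than to export it shell-wide.  A rough sh sketch; path_append is a made-up helper and the java invocation is only an assumed example:

```sh
#!/bin/sh
# path_append LIST DIR: print LIST with DIR appended, unless DIR is already
# an element of the colon-separated LIST.  An empty LIST yields just DIR.
# (Helper name is hypothetical.)
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        "::")     printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1:$2" ;;
    esac
}

# Scope the variable to one command instead of exporting it
# (the command being run is an assumption for illustration):
#   LD_LIBRARY_PATH=$(path_append "$LD_LIBRARY_PATH" \
#       /usr/local/diablo-jre1.6.0/lib/amd64) java -version
```

This keeps the setting out of every other program started from the same shell.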


Index: Makefile
===================================================================
RCS file: /home/pcvs/ports/java/diablo-jre16/Makefile,v
retrieving revision 1.3
diff -u -r1.3 Makefile
--- Makefile	20 Aug 2008 04:13:02 -  1.3
+++ Makefile	16 Sep 2008 04:24:27 -
@@ -43,6 +43,8 @@
 
 INSTALL_DIR=   ${PREFIX}/${PKGNAMEPREFIX}jre${JRE_VERSION}
 
+USE_LDCONFIG=  ${PREFIX}/${PKGNAMEPREFIX}jre${JRE_VERSION}/lib/${ARCH}
+
 .include <bsd.port.pre.mk>
 
 .if ${OSVERSION} = 70


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-14 Thread Jeremy Chadwick
On Mon, Sep 15, 2008 at 01:23:39PM +0800, Wilkinson, Alex wrote:
 0n Sun, Sep 14, 2008 at 09:28:28PM -0700, Jeremy Chadwick wrote: 
 
 On Mon, Sep 15, 2008 at 10:37:18AM +0800, Wilkinson, Alex wrote:
  0n Fri, Sep 12, 2008 at 09:32:07AM -0700, Jeremy Chadwick wrote: 
  
   About the only real improvement I'd like to see in this setup 
 is the ability
   to spin down idle drives.  That would be an ideal setup for the 
 home RAID
   array.
  
  There is a FreeBSD port which handles this, although such a 
 feature
  should ideally be part of the ata(4) system (as should TCQ/NCQ 
 and a
  slew of other things -- some of those are being worked on).
  
  And the port is ?
 
 Is it that hard to use 'make search' or grep?  :-)  sysutils/ataidle
 
 When you dont know the string to search on ... yes.

Give me a break.  :-)  idle|sleep|suspend|spin
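
For the archives, the ports tree offers "cd /usr/ports && make search key=<string>", and the same kind of lookup can be done against the INDEX file directly.  A small sh sketch, assuming the usual pipe-separated INDEX layout (port path in field 2, one-line comment in field 4); search_index is a made-up helper name:

```sh
#!/bin/sh
# search_index REGEX [INDEXFILE]: print "path: comment" for every port whose
# one-line comment (field 4 of the pipe-separated INDEX) matches REGEX,
# case-insensitively.  Reads stdin when no INDEXFILE is given.
# (Helper name is hypothetical.)
search_index() {
    re=$1; shift
    awk -F'|' -v re="$re" \
        'tolower($4) ~ tolower(re) { print $2 ": " $4 }' "$@"
}

# Possible use (the INDEX file name depends on the release):
#   search_index 'idle|sleep|suspend|spin' /usr/ports/INDEX-7
```

Matching on the comment field means you do not need to know the port's name in advance, only a word that describes what it does.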




Re: loading multi threaded library into executable enabled for single thread

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 07:41:14AM -0400, Barry Andrews wrote:
 Do you know if this is documented in Release Notes or Known Issues or  
 somewhere?

Why would it be an issue?  gcc -pthread and libpthread linking is
documented pretty much everywhere on the web.  There isn't anything
broken about it; that's simply how it's done on older FreeBSD.

Note that all of this has significantly changed in later FreeBSD
versions, and that the 5.x series was deprecated a very long time ago.

 On Thu, 11 Sep 2008, Barry Andrews wrote:

 Hi All,

 I have a multi-threaded library that is linked against libpthread.  
 When I
 load this lib into a tclsh process on FreeBSD, I get this error,  
 Recurse on
 private mutex. and crash. I understand that I can have this issue  
 when the
 executable is not linked against libpthread but one of the loaded  
 libs is.
 Basically, it thinks it's in single threaded mode.

 This must be an older version of FreeBSD.  I think you must
 link your application (tclsh or whatever) against libpthread
 in order for this to work.  The libc functions won't get properly
 overloaded by their equivalents in libpthread unless you do
 this.



Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 10:45:24AM +0100, Karl Pielorz wrote:
 Recently, a ZFS pool on my FreeBSD box started showing lots of errors on  
 one drive in a mirrored pair.

 The pool consists of around 14 drives (as 7 mirrored pairs), hung off of 
 a couple of SuperMicro 8 port SATA controllers (1 drive of each pair is 
 on each controller).

 One of the drives started picking up a lot of errors (by the end of 
 things it was returning errors pretty much for any reads/writes issued) - 
 and taking ages to complete the I/O's.

 However, ZFS kept trying to use the drive - e.g. as I attached another  
 drive to the remaining 'good' drive in the mirrored pair, ZFS was still  
 trying to read data off the failed drive (and remaining good one) in 
 order to complete its re-silver to the newly attached drive.

 Having posted on the Open Solaris ZFS list - it appears, under Solaris  
 there's an 'FMA Engine' which communicates drive failures and the like to 
 ZFS - advising ZFS when a drive should be marked as 'failed'.

 Is there anything similar to this on FreeBSD yet? - i.e. Does/can
 anything on the system tell ZFS "This drive's experiencing failures"
 rather than ZFS just seeing lots of timed out I/O 'errors'? (as appears
 to be the case).

As far as I know, there is no such standard mechanism in FreeBSD.  If
the drive falls off the bus entirely (e.g. detached), I would hope ZFS
would notice that.  It might also depend on whether the disk subsystem
you're using is utilising CAM or not (e.g. disks should be daX, not
adX); Scott Long might know if something like this is implemented in
CAM.  I'm fairly certain nothing like this is implemented in ata(4).

Ideally, it would be the job of the controller and controller driver to
report the success or failure of I/O operations to the layers above
them.  Do you agree?

I hope this FMA Engine on Solaris only *tells* underlying pieces of
I/O errors, rather than acting on them (e.g. automatically yanking the
disk off the bus for you).  I'm in no way shunning Solaris, I'm simply
saying such a mechanism could be as risky/deadly as it could be useful.

 In the end, the failing drive was timing out literally every I/O - I did  
 recover the situation by detaching it from the pool (which hung the 
 machine - probably caused by ZFS having to update the meta-data on all 
 drives, including the failed one). A reboot brought the pool back, minus
 the 'failed' drive, so enough of the 'detach' must have completed.

 The newly attached drive completed the re-silver in half an hour (as  
 opposed to an estimated 755 hours and climbing with the other drive still 
 in the pool, limping along).



Re: loading multi threaded library into executable enabled for single thread

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 09:26:37AM -0400, Barry Andrews wrote:
 I don't understand. If it was not broken, then why did it change in later
 FreeBSD versions?

I should be more explicit: the threading library and implementations
have changed over time.  There was libc_r, then there was libthr, then
there was libkse.  This is what we call evolution.  :-)

http://www.unobvious.com/bsd/freebsd-threads.html
http://kerneltrap.org/node/624
http://www.freebsd.org/kse/

The gcc -pthread flag is still there on present-day FreeBSD (6 through
HEAD), and *should* be used.  You can choose not to use it but you must
ensure during linktime that you explicitly link to -lpthread.

 On Fri, Sep 12, 2008 at 9:10 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Fri, Sep 12, 2008 at 07:41:14AM -0400, Barry Andrews wrote:
   Do you know if this is documented in Release Notes or Known Issues or
   somewhere?
 
  Why would it be an issue?  gcc -pthread and libpthread linking is
  documented pretty much everywhere on the web.  There isn't anything
  broken about it, it's how it's done on older FreeBSD.
 
  Note that all of this has significantly changed in later FreeBSD
  versions, and that the 5.x series was deprecated a very long time ago.
 
   On Thu, 11 Sep 2008, Barry Andrews wrote:
  
   Hi All,
  
   I have a multi-threaded library that is linked against libpthread.
   When I
   load this lib into a tclsh process on FreeBSD, I get this error,
   Recurse on
   private mutex. and crash. I understand that I can have this issue
   when the
   executable is not linked against libpthread but one of the loaded
   libs is.
   Basically, it thinks it's in single threaded mode.
  
   This must be an older version of FreeBSD.  I think you must
   link your application (tclsh or whatever) against libpthread
   in order for this to work.  The libc functions won't get properly
   overloaded by their equivalents in libpthread unless you do
   this.
 



Re: loading multi threaded library into executable enabled for single thread

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 11:00:18AM -0400, Barry Andrews wrote:
 Thanks for the links! But I'm not sure what any of this has to do with  
 this particular issue. I have an exe that does not use threads that  
 loads a lib that is linked with libpthread. Why does different threading  
 implementations affect what I am seeing here? Is there no way for this  
 to work in FreeBSD v5.5? Why would this go away if I upgraded to 6.x or  
 better?

You're confusing me.  Earlier you said:

 I have a multi-threaded library that is linked against libpthread.
 When I
 load this lib into a tclsh process on FreeBSD, I get this error,

So what is the exe?  Are you referring to tclsh?  If so, you need to
rebuild tclsh from source to link with libpthread.  If not, you need to
contact whoever provided the binary and ask them to rebuild it from
source.

Additionally, please ensure that the tclsh binary is linked to the same
version of libpthread library as your own library.  You want to make
sure they're both built and linked on the same machine (from the same
source code) if possible; the simple .so.X versioning method works
great for major changes, but there are often minor changes that don't
result in X being increased.

I'm getting the impression that the tclsh binary you have was not built
on the same machine / from the same source as what your library (the one
linked with libpthread) was.
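
One way to compare the two, sketched in sh.  pthread_lib is a made-up helper, the paths in the usage comment are placeholders, and the only assumption about ldd is its usual "name => path (address)" line format:

```sh
#!/bin/sh
# pthread_lib: read `ldd <file>` output on stdin and print the libpthread
# shared object name it resolves (for example libpthread.so.2), or nothing
# if the file is not linked against libpthread.
# (Helper name is hypothetical.)
pthread_lib() {
    sed -n 's/^[[:space:]]*\(libpthread\.so\.[0-9][0-9]*\) =>.*/\1/p'
}

# Possible use (both paths are placeholders):
#   a=$(ldd /usr/local/bin/tclsh | pthread_lib)
#   b=$(ldd /path/to/your/library.so | pthread_lib)
#   [ -n "$a" ] && [ "$a" = "$b" ] || echo "libpthread missing or mismatched"
```

A matching .so.X only rules out the major-version case; as noted above, minor changes do not bump X, so this check is necessary rather than sufficient.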

 Jeremy Chadwick wrote:
 On Fri, Sep 12, 2008 at 09:26:37AM -0400, Barry Andrews wrote:
   
 I don't understand. If it was not broken, then why did it change in later
 FreeBSD versions?
 

 I should be more explicit: the threading library and implementations
 have changed over time.  There was libc_r, then there was libthr, then
 there was libkse.  This is what we call evolution.  :-)

 http://www.unobvious.com/bsd/freebsd-threads.html
 http://kerneltrap.org/node/624
 http://www.freebsd.org/kse/

 The gcc -pthread flag is still there on present-day FreeBSD (6 through
 HEAD), and *should* be used.  You can choose not to use it but you must
 ensure during linktime that you explicitly link to -lpthread.

   
 On Fri, Sep 12, 2008 at 9:10 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote:

 
 On Fri, Sep 12, 2008 at 07:41:14AM -0400, Barry Andrews wrote:
   
 Do you know if this is documented in Release Notes or Known Issues or
 somewhere?
 
 Why would it be an issue?  gcc -pthread and libpthread linking is
 documented pretty much everywhere on the web.  There isn't anything
 broken about it, it's how it's done on older FreeBSD.

 Note that all of this has significantly changed in later FreeBSD
 versions, and that the 5.x series was deprecated a very long time ago.

   
 On Thu, 11 Sep 2008, Barry Andrews wrote:

   
 Hi All,

 I have a multi-threaded library that is linked against libpthread.
 When I
 load this lib into a tclsh process on FreeBSD, I get this error,
 Recurse on
 private mutex. and crash. I understand that I can have this issue
 when the
 executable is not linked against libpthread but one of the loaded
 libs is.
 Basically, it thinks it's in single threaded mode.
 
 This must be an older version of FreeBSD.  I think you must
 link your application (tclsh or whatever) against libpthread
 in order for this to work.  The libc functions won't get properly
 overloaded by their equivalents in libpthread unless you do
 this.
   



Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 03:34:30PM +0100, Karl Pielorz wrote:
 --On 12 September 2008 06:21 -0700 Jeremy Chadwick [EMAIL PROTECTED]  
 wrote:

 As far as I know, there is no such standard mechanism in FreeBSD.  If
 the drive falls off the bus entirely (e.g. detached), I would hope ZFS
 would notice that.  I can imagine it (might) also depend on if the disk
 subsystem you're using is utilising CAM or not (e.g. disks should be daX
 not adX); Scott Long might know if something like this is implemented in
 CAM.  I'm fairly certain nothing like this is implemented in ata(4).

 For ATA, at the moment - I don't think it'll notice even if a drive  
 detaches. I think like my system the other day, it'll just keep issuing 
 I/O commands to the drive, even if it's disappeared (it might get much 
 'quicker failures' if the device has 'gone' to the point of FreeBSD just 
 quickly returning 'fail' for every request).

I know ATA will notice a detached channel, because I myself have done
it: administratively, that is -- atacontrol detach ataX.  But the only
time that can happen automatically is if the actual controller does
so itself, or if FreeBSD is told to do it administratively.

What this does to other parts of the kernel and userland applications is
something I haven't tested.  I *can* tell you that there are major,
major problems with detach/reattach/reinit on ata(4) causing kernel
panics and other such things.  I've documented this quite thoroughly in
my Common FreeBSD issues wiki:

http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

I am also very curious to know the exact brand/model of 8-port SATA
controller from Supermicro you are using, *especially* if it uses ata(4)
rather than CAM and da(4).  Such Supermicro controllers were recently
discussed on freebsd-stable (or was it -hardware?), and no one was able
to reach a firm conclusion as to whether they were decent or even
remotely trustworthy.  Supermicro provides a few different SATA HBAs.

 Ideally, it would be the job of the controller and controller driver to
 announce to underlying I/O operations fail/success.  Do you agree?

 I hope this FMA Engine on Solaris only *tells* underlying pieces of
 I/O errors, rather than acting on them (e.g. automatically yanking the
 disk off the bus for you).  I'm in no way shunning Solaris, I'm simply
 saying such a mechanism could be as risky/deadly as it could be useful.

 Yeah, I guess so - I think the way it's meant to happen (and this is only 
 AFAIK) is that FMA 'detects' a failing drive by applying some 
 configurable policy to it. That policy would also include notifying ZFS, 
 so that ZFS could then decide to stop issuing I/O commands to that 
 device.

It sounds like that is done very differently than on FreeBSD.  If such a
condition happens on FreeBSD (disk errors scrolling by, etc.), the only
way I know of to get FreeBSD to stop sending commands through the ATA
subsystem is to detach the channel (atacontrol detach ataX).

 None of this seems to be in place, at least for ATA under FreeBSD - when 
 a drive goes bad, you can just end up with 'hours' worth of I/O timeouts, 
 until someone intervenes.

I can see the usefulness in Solaris's FMA thing.  My big concern is
whether or not FMA actually pulls the disk off the channel, or if it
just leaves the disk/channel connected and simply informs kernel pieces
not to use it.  If it pulls the disk off the channel, I have serious
qualms with it.

There are also chips on SATA and SCSI controllers which can cause chaos
-- specifically, SES/SES2 chips (I'm looking at you, QLogic).
These are supposed to be "smart" chips that detect when there are a
large number of transport or hardware errors (implying cabling issues,
etc.) and *automatically* yank the disk off the bus.  Sounds great on
paper, but in the field, I see these chips pulling disks off the
bus, changing SCSI IDs on devices, or inducing what appear to be full SCSI
subsystem timeouts (e.g. the SES/SES2 chip has locked up/crashed in some
way, and now your entire bus is dead in the water).  I have seen all of
the above bugs with onboard Adaptec 320 controllers on systems running
Solaris 8, 9, and OpenSolaris.  Most times it turns out to be the
SES/SES2 chip getting in the way.

 I did enquire on the Open Solaris list about setting limits for 'errors' 
 in ZFS, which netted me a reply that it's FMA (at least in Solaris) 
 that's responsible for this - it just then informs ZFS of the condition. 
 We don't appear (again at least for ATA) to have anything similar for 
 FreeBSD yet :(

My recommendation to people these days is to avoid ata(4) on FreeBSD at
all costs if they expect to encounter disk or hardware failures.  The
ata(4) layer is in no way, shape, or form reliable in the case of
transport or disk failures, and sometimes not even in the case of
hot-swapping.  Try your hardest to find a physical controller that supports
SATA disks and uses CAM/da(4), which WILL provide that reliability

Re: loading multi threaded library into executable enabled for single thread

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 11:55:01AM -0400, Barry Andrews wrote:
 Yes, the exe is tclsh. I understand that linking tclsh with libpthread is
 what would work. However this is very impractical. A user of my library
 shouldn't have to rebuild their tclsh to match my library specs. Another
 option would be to ship tclsh with my lib, but that also is a little weird.
 It seems like the only somewhat practical option I have is to use
 LD_PRELOAD, which is also weird but better than nothing.

This really isn't a FreeBSD problem, as the same sort of issue plagues
other operating systems.  When it comes to threading, you want
*everything* threaded as much as possible -- mixing threaded and
non-threaded code usually does not work.  The only OS I have seen where
that kind of environment works reliably is Solaris.  I still feel
threading is too new a technology on UNIX.

Your options as I see them:

1) Require your users to ensure they have a threaded TCL installation,
   and do not promise support in the case they try to use your library
   on a non-threaded installation,
2) Provide two versions of your library -- a threaded and non-threaded
   version.  This may be impractical for performance reasons,
3) Require LD_PRELOAD, which is ugly, agreed.

I think those are pretty much the only options you have at this point.
Not a great set, I know, but it's reality.
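
If you go with option 3, a small wrapper at least keeps the preload confined to the one process that needs it.  A minimal sh sketch; with_preload is a made-up helper, and the library path and script name in the usage comment are assumptions:

```sh
#!/bin/sh
# with_preload LIB CMD [ARGS...]: run CMD with LD_PRELOAD set to LIB for that
# one process only.  Refuses to run if LIB is not a readable file.
# (Helper name is hypothetical.)
with_preload() {
    lib=$1; shift
    if [ ! -r "$lib" ]; then
        echo "with_preload: cannot read $lib" >&2
        return 1
    fi
    env LD_PRELOAD="$lib" "$@"
}

# Possible use (library path and script name are placeholders):
#   with_preload /usr/lib/libpthread.so tclsh yourscript.tcl
```

Shipping something like this alongside the library spares users from exporting LD_PRELOAD in their login environment, where it would affect every program they start.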

 On Fri, Sep 12, 2008 at 11:45 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Fri, Sep 12, 2008 at 11:00:18AM -0400, Barry Andrews wrote:
   Thanks for the links! But I'm not sure what any of this has to do with
   this particular issue. I have an exe that does not use threads that
   loads a lib that is linked with libpthread. Why does different threading
   implementations affect what I am seeing here? Is there no way for this
   to work in FreeBSD v5.5? Why would this go away if I upgraded to 6.x or
   better?
 
  You're confusing me.  Earlier you said:
 
   I have a multi-threaded library that is linked against libpthread.
   When I
   load this lib into a tclsh process on FreeBSD, I get this error,
 
  So what is the exe?  Are you referring to tclsh?  If so, you need to
  rebuild tclsh from source to link with libpthread.  If not, you need to
  contact whoever provided the binary and ask them to rebuild it from
  source.
 
  Additionally, please ensure that the tclsh binary is linked to the same
  version of libpthread library as your own library.  You want to make
  sure they're both built and linked on the same machine (from the same
  source code) if possible; the simple .so.X versioning method works
  great for major changes, but there are often minor changes that don't
  result in X being increased.
 
  I'm getting the impression that the tclsh binary you have was not built
  on the same machine / from the same source as what your library (the one
  linked with libpthread) was.
 
   Jeremy Chadwick wrote:
   On Fri, Sep 12, 2008 at 09:26:37AM -0400, Barry Andrews wrote:
  
   I don't understand. If it was not broken, then why did it change in
  later
   FreeBSD versions?
  
  
   I should be more explicit: the threading library and implementations
   have changed over time.  There was libc_r, then there was libthr, then
   there was libkse.  This is what we call evolution.  :-)
  
   http://www.unobvious.com/bsd/freebsd-threads.html
   http://kerneltrap.org/node/624
   http://www.freebsd.org/kse/
  
   The gcc -pthread flag is still there on present-day FreeBSD (6 through
   HEAD), and *should* be used.  You can choose not to use it but you must
   ensure at link time that you explicitly link with -lpthread.
  
  
   On Fri, Sep 12, 2008 at 9:10 AM, Jeremy Chadwick [EMAIL PROTECTED]
  wrote:
  
  
   On Fri, Sep 12, 2008 at 07:41:14AM -0400, Barry Andrews wrote:
  
   Do you know if this is documented in Release Notes or Known Issues or
   somewhere?
  
   Why would it be an issue?  gcc -pthread and libpthread linking is
   documented pretty much everywhere on the web.  There isn't anything
   broken about it, it's how it's done on older FreeBSD.
  
   Note that all of this has significantly changed in later FreeBSD
   versions, and that the 5.x series was deprecated a very long time ago.
  
  
   On Thu, 11 Sep 2008, Barry Andrews wrote:
  
  
   Hi All,
  
   I have a multi-threaded library that is linked against libpthread.
   When I load this lib into a tclsh process on FreeBSD, I get this error,
   "Recurse on private mutex." and crash. I understand that I can have this
   issue when the executable is not linked against libpthread but one of
   the loaded libs is. Basically, it thinks it's in single threaded mode.
  
   This must be an older version of FreeBSD.  I think you must
   link your application (tclsh or whatever) against libpthread
   in order for this to work.  The libc functions won't get properly
   overloaded by their equivalents in libpthread unless you do
   this.
  
   --
   | Jeremy Chadwickjdc

Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 09:04:22AM -0700, Jeremy Chadwick wrote:
 What this does to other parts of the kernel and userland applications is
 something I haven't tested.  I *can* tell you that there are major,
 major problems with detach/reattach/reinit on ata(4) causing kernel
 panics and other such things.  I've documented this quite thoroughly in
 my Common FreeBSD issues wiki:
 
 http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

This should have read: ... in my ATA/SATA issues and troubleshooting
methods page:

http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 12:04:27PM -0400, Zaphod Beeblebrox wrote:
 On Fri, Sep 12, 2008 at 11:44 AM, Oliver Fromme [EMAIL PROTECTED] wrote:
  Did you try atacontrol detach to remove the disk from
  the bus?  I haven't tried that with ZFS, but gmirror
  automatically detects when a disk has gone away, and
  doesn't try to do anything with it anymore.  It certainly
  should not hang the machine.  After all, what's the
  purpose of a RAID when you have to reboot upon drive
  failure.  ;-)
 
 To be fair, many home users run RAID without the expectation of being able
 to hot swap the drives.  While RAID can provide high availability, it can
 also provide simple data security.

RAID only ensures a very, very tiny part of data security, and it
depends greatly on what RAID implementation you use.  No RAID
implementation I know of protects against transparent data corruption
(bit-rot), and many RAID controllers and RAID drivers have bugs that
induce corruption (to date, that's very old ATA Highpoint chips,
nVidia/nForce chips, JMicron or Silicon Image chips -- all of these are
used on consumer boards).

A big problem is also that end-users *still* think RAID is a replacement
for doing backups.  :-(

 To your point... I suppose you have to reboot at some point after the drive
 failure, but my experience has been that the reboot has been under my
 control some time after the failure (usually when I have the replacement
 drive).

For home use, sure.  Since most home/consumer systems do not include
hot-swappable drive bays, rebooting is required.  Although more and more
consumer motherboards are offering AHCI -- which is the only reliable
way you'll get that capability with SATA.

In my case with servers in a co-lo, it's not acceptable.  Our systems
contain SATA backplanes that support hot-swapping, and it works how it
should (yank the disk, replace with a new one) on Linux -- there is no
need to do a bunch of hoopla like on FreeBSD.  On FreeBSD, with that
hoopla, you also take the risk of inducing a kernel panic.  That risk does
not sit well with me, but thankfully I've only been in that situation
(replacing a bad disk + using hot-swapping) once -- and it did work.

At my home, I have a pseudo-NAS system running FreeBSD.  The case is
from Supermicro, a mid-tower, and has a SATA backplane that supports
hot-swapping.  I use ZFS on this system, sporting 3 disks and one
(non-ZFS) for boot/OS.  But because I'm using ata(4) -- see above.

Individuals on -stable and other lists using ZFS have posted their
experiences with disk failures.  I believe to date I've seen one which
worked flawlessly; the others reported strange issues with resilvering
or, in a couple of cases, lost all their zpools permanently.
Of course, it's very rare in this day and age for people to mail a
mailing list reporting *successes* with something -- people usually only
mail if something *fails*.  :-)

That said, pjd@'s dedication to getting ZFS working reliably on FreeBSD
is outstanding.  It's a great filesystem replacement, and even the Linux
folks are a bit jealous over how simple and painless it is.  I can
share their jealousy -- I've looked at the LVM docs... never again.

 About the only real improvement I'd like to see in this setup is the ability
 to spin down idle drives.  That would be an ideal setup for the home RAID
 array.

There is a FreeBSD port which handles this, although such a feature
should ideally be part of the ata(4) system (as should TCQ/NCQ and a
slew of other things -- some of those are being worked on).

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: ZFS w/failing drives - any equivalent of Solaris FMA?

2008-09-12 Thread Jeremy Chadwick
On Fri, Sep 12, 2008 at 10:12:09AM -0700, Freddie Cash wrote:
 On September 12, 2008 09:32 am Jeremy Chadwick wrote:
  For home use, sure.  Since most home/consumer systems do not include
  hot-swappable drive bays, rebooting is required.  Although more and
  more consumer motherboards are offering AHCI -- which is the only
  reliable way you'll get that capability with SATA.
 
  In my case with servers in a co-lo, it's not acceptable.  Our systems
  contain SATA backplanes that support hot-swapping, and it works how it
  should (yank the disk, replace with a new one) on Linux -- there is no
  need to do a bunch of hoopla like on FreeBSD.  On FreeBSD, with that
  hoopla, you also take the risk of inducing a kernel panic.  That risk does
  not sit well with me, but thankfully I've only been in that situation
  (replacing a bad disk + using hot-swapping) once -- and it did work.
 
 Hrm, is this with software RAID or hardware RAID?

I do not use either, but have tried software RAID (Intel MatrixRAID) in
the past (and major, MAJOR bugs are why I do not any longer).  Speaking
(mostly) strictly of FreeBSD, let me list off the problems with both:

Software RAID:

1) Buggy as hell.  Using Intel MatrixRAID as an example, even with
   RAID 1, due to ata(4) driver bugs, you are practically guaranteed
   to lose your data
2) Limited userland interface to RAID BIOS; many operations do not
   work with atacontrol, requiring a system reboot + entering BIOS
   to do things like add/remove disks or rebuild an array
3) SMART monitoring lost; if the card or BIOS supports passthrough
   (basically ATA version of pass(4)), FreeBSD will see the disks
   natively (e.g. arX for the RAID, ad4 and ad8 for the disks), and
   you can use smartmontools.  Otherwise, you're screwed
4) Support is questionable; numerous mainstream chips unsupported,
   including Adaptec HostRAID

Hardware RAID:

1) You are locked in to that controller.  Your data is at the
   mercy of the company who makes the HBA; if your controller dies
   and is no longer made, your data is dead in the water.  Chances
   are a newer model/revision of controller will not understand the
   disk metadata from the previous controller
2) Performance problems as a result of excessive caching levels;
   onboard hardware cache vs. system memory cache vs. disk layer
   cache in OS vs. other kernel caching mechanisms
3) Controller firmware upgrades are risky -- 3Ware has a very nasty
   history of this, for sake of example.  I've heard of some upgrades
   changing the metadata format, requiring complete array re-creation
   
I can pull Ade Lovett [EMAIL PROTECTED] into this conversation if you
think any of the above is exaggerated.  :-)

The only hardware RAID controller I'd trust at this point would be
Areca -- but hardware RAID is not what I want.  On the other hand, I
really want Areca to make a standard 4 or 8-port SATA controller --
no RAID, but full driver support under arcmsr(4) (which uses CAM and
da(4)).  This would be perfect.

 With our hardware RAID systems, the process has always been the same, 
 regardless of which OS (Windows 2003 Servers, Debian Linux, FreeBSD) is 
 on the system:
   - go into RAID management GUI, remove drive
   - pull dead drive from system
   - insert new drive into system
   - go into RAID management GUI, make sure it picked up new drive and 
 started the rebuild

The simplicity there is correct -- that's really how simple it should
be.  But a GUI?  What card is this that requires a GUI?  Does it require
a reboot?  No command-line support?

 We've been lucky so far, and not had to do any drive replacements on our 
 non-ZFS software RAID systems (md on Debian, gmirror on FreeBSD).  I'm 
 not looking forward to a drive failing, as these systems have 
 non-hot-pluggable SATA setups.

I'm hearing you loud and clear.  :-)

 On the ZFS systems, we just zpool offline the drive, physically replace 
 the drive, and zpool replace the drive.  On one system, this was done 
 via hot-pluggable SATA backplane, on another, it required a reboot.

If this was done on the hardware RAID controller (presuming it uses
CAM and da(4)), I'm not surprised it worked perfectly.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: Increasing KVM on amd64

2008-09-10 Thread Jeremy Chadwick
On Wed, Sep 10, 2008 at 04:12:25PM -0700, Artem Belevich wrote:
 Alan,
 
 Thanks a lot for the patch. I've applied it to RELENG_7 and it seems
 to work great - make -j8 buildworld succeeds, linux emulation seems
 to work well enough to run linux-sun-jdk14 binaries, ZFS ARC size is
 bigger, too. So far I didn't see any ZFS-related KVM shortages either.
 
 The only problem is that everything is fine as long as vm.kmem_size is
 set to less than or equal to 4096M. As soon as I set it to 4100M or
 anything larger, kernel crashes on startup. I'm unable to capture
 exact crash messages as they keep scrolling really fast on the screen
 for a few seconds until the box reboots. Unfortunately  the box does
 not have built-in serial ports, so the messages are gone before I can
 see them. :-(
 
 Is there a way to bump up KVM size even further - beyond 6GB? I've got
 a box with 8GB of RAM and would like to let ZFS ARC use most of it which
 would require pretty large vm.kmem_max to fit it in.

I was told fairly recently (a few days ago) that the 6GB limit was
increased to 512GB on HEAD/CURRENT.  The 6GB limit was during a
transitional phase of addressing the problem.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: IPFW uid logging...

2008-09-08 Thread Jeremy Chadwick
On Mon, Sep 08, 2008 at 04:03:29PM -0400, Dan Mahoney, System Admin wrote:
 On Mon, 8 Sep 2008, Dan Nelson wrote:

 In the last episode (Sep 08), Dan Mahoney, System Admin said:
 I have the following rule set up in ipfw to limit the exposure of bad
 php scripts and trojans that try to send mail directly.

 allow tcp from any to any dst-port 25 uid root
 deny log tcp from any to any dst-port 25 out

 However, the log messages I get look like this:

 Sep  8 13:21:11 security.info prime kernel: ipfw: 610 Deny TCP 
 72.9.101.130:58117 209.85.133.114:25 out via em0
 Sep  8 13:21:16 security.info prime kernel: ipfw: 610 Deny TCP 
 72.9.101.130:56672 202.12.31.144:25 out via em0

 Which is to say, they don't include the UID -- and I have several hundred
 sites, each with its own UID.

 Yes, I could go ahead and set up a thousand deny rules, one for
 each UID -- but being able to log this info (since it IS being
 checked) would be great.

 It should be possible to add a couple more arguments to ipfw_log() so
 that ipfw_chk() can pass it the ugid_lookup flag and a pointer to the
 fw_ugid_cache struct.  Then you can edit ipfw_log to print the contents
 of that struct if ugid_lookup==1.  That would result in the logging of
 uid for any failed packet that had to go through a uid check on the way
 to the deny rule.

 Okay, so if it's fairly easy to do, the question would be since I don't  
 feel right hacking in this change myself -- how could I propose this as a 
 feature?  It's not a BUG per-se, but I think it could be useful to 
 others as well.

send-pr it.  Category=kern, Class=change-request.

Reference this thread in the Fix section:

http://lists.freebsd.org/pipermail/freebsd-hackers/2008-September/025920.html

FWIW, I think it's also a good idea.  The output formatting of the log
line might need to be adjusted carefully though, since any programs
which grep on a very strict regex will start failing.  I'm inclined
to recommend the string ", UID xxx" be appended to the existing string,
e.g.

Sep  8 13:21:11 security.info prime kernel: ipfw: 610 Deny TCP 
72.9.101.130:58117 209.85.133.114:25 out via em0, UID 6592
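
To make the "append only" idea concrete, here is a minimal userland
sketch in C.  It is not the real ipfw_log() code: append_uid() and its
parameters are hypothetical names chosen for illustration, and the
ugid_lookup argument merely mimics the flag mentioned earlier in the
thread.  The point is that the existing text is left byte-for-byte
intact, so a parser anchored on the start of the line keeps matching.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the real ipfw_log()): appends ", UID <n>" to
 * an already formatted log line, but only when a uid lookup was done.
 * Assumes line holds a NUL-terminated string shorter than size.
 * Returns the resulting string length, or -1 if the suffix did not fit.
 */
static int
append_uid(char *line, size_t size, int ugid_lookup, unsigned int uid)
{
	size_t used;
	int n;

	assert(line != NULL && size > 0);
	used = strlen(line);
	if (!ugid_lookup)
		return (int)used;	/* line is left exactly as before */
	n = snprintf(line + used, size - used, ", UID %u", uid);
	if (n < 0 || (size_t)n >= size - used) {
		line[used] = '\0';	/* undo the partial append */
		return -1;
	}
	return (int)(used + (size_t)n);
}
```

Calling append_uid(line, sizeof(line), 1, 6592) on the first sample
line above would yield the proposed "... out via em0, UID 6592" form,
while passing 0 for ugid_lookup leaves the line untouched.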

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: Temp files in /etc

2008-09-06 Thread Jeremy Chadwick
On Fri, Sep 05, 2008 at 08:31:35PM -0700, Jeremy Chadwick wrote:
 ...
 If they still attempt to use /tmp, said programs could probably be
 modified to support TMPDIR.

This should have read /etc, not /tmp.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: Temp files in /etc

2008-09-06 Thread Jeremy Chadwick
On Fri, Sep 05, 2008 at 07:40:13PM -0700, Joshua Piccari wrote:
 Hi all,
 I am setting up a few jails and I want them all to use the same /etc files
 (with the exception of the files related to the password files and
 databases), so I mounted a shared /etc folder as a nullfs with read-only
 permissions. The problem is that using utilities like pw or chpass create
 temporary files in /etc and that file system is mounted read-only.
 So is there a way to force any utilities that create temp files in /etc to
 use another location, something like /usr/local/etc for example?

I've had a chat with another user off-list about this, and the
conclusion reached is that your mounting of /etc read-only is a bad
idea, for many different reasons.  Let's step through things slowly, so
that hopefully it'll make sense.

Foremost, /etc is mounted read-only, so what purpose does it serve to be
using passwd or group-editing utilities on that system?  You'd need r/w
access to be able to accomplish that.

Secondly, utilities like vipw(8), chpass(1), pw(8), and many others all
create temporary files in /etc for security reasons: the temporary files
*must* be on the same filesystem.  In your case, /etc is its own
filesystem, mounted read-only.  So, placing the temporary files (e.g.
/etc/pw.XX when using vipw(8)) on a separate filesystem or separate
location is not plausible.  Regarding the security implications, others
will have to chime in here.
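
One commonly cited reason for keeping the temporary file on the same
filesystem is that rename(2) can only replace a file atomically within
a single filesystem (across filesystems it fails with EXDEV).  Whether
that is the whole rationale behind vipw(8) and friends is not confirmed
here.  A minimal sketch of the general write-then-rename pattern
follows; replace_file() is a hypothetical helper, not code taken from
any of those utilities.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace dir/name with new contents by writing a temporary file in the
 * same directory and renaming it over the target.  Returns 0 on success
 * and -1 on failure.
 */
static int
replace_file(const char *dir, const char *name, const char *contents)
{
	char tmp[1024], dst[1024];
	size_t len;
	int fd;

	assert(dir != NULL && name != NULL && contents != NULL);
	len = strlen(contents);
	if (snprintf(tmp, sizeof(tmp), "%s/%s.XXXXXX", dir, name) >=
	    (int)sizeof(tmp) ||
	    snprintf(dst, sizeof(dst), "%s/%s", dir, name) >= (int)sizeof(dst))
		return -1;
	if ((fd = mkstemp(tmp)) == -1)
		return -1;	/* e.g. the directory is read-only */
	if (write(fd, contents, len) != (ssize_t)len || fsync(fd) != 0) {
		close(fd);
		unlink(tmp);
		return -1;
	}
	close(fd);
	/* tmp and dst share a directory, hence a filesystem. */
	if (rename(tmp, dst) != 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

With a read-only /etc the mkstemp() call is where such a scheme would
fail, which matches the behaviour described in the original question.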

Thirdly, some (but not all) of the utilities support command-line flags
that allow an alternative directory to /etc:

pw(8)   -V flag
vipw(8) -d flag
pwd_mkdb(8) -d flag
chpass(1)   no support
passwd(1)   no support
rmuser(8)   no support
adduser(8)  no support

Fourthly, there are periodic(8) scripts which explicitly refer to
/etc/master.passwd and do not support an alternative directory.  Those
scripts will break, and disabling them is not recommended.

Finally, some other caveats/situations which will likely arise:

- The administrator (you) will have to remember to use the above flags
  every time they use said utilities; chances are you'll forget,
  especially since the flags aren't all the same,
- A user of your jail may become very surprised when they find
  passwd, group, or other files missing from /etc,
- Third-party software which reads /etc/passwd or related files will
  fail since you'd be using an alternative /etc directory.  I'm
  pretty sure we have some ports which use rmuser/adduser (meaning
  the software itself, not necessarily the port installation part).

Hope this sheds some light on things.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Extending find(1) to support -printf

2008-09-05 Thread Jeremy Chadwick
I've been working on $SUBJECT for the past few hours, and have managed
to implement a very crude subset of GNU find's features:

http://www.gnu.org/software/findutils/manual/html_node/find_html/Format-Directives.html#Format-Directives

I've implemented %f and %p (which appear identical to GNU find), and
some escaped characters.

Things I need help with, as string parsing in C has never been my forte
(which will become quite obvious):

1) Getting %h to behave like GNU find.  The GNU find code is
significantly different than ours.  As it stands, %h is broken.

2) find . -printf '\' results in odd output (SHELL=/usr/local/bin/bash
on my box).  Not sure why this is happening, but it's a big concern.

3) Security issues.  I believe use of a large number of formatting
variables could exceed the calloc()'d buffer (of MAXPATHLEN), causing
a segfault at bare minimum.  I'm not sure how to work around this.
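
For what it's worth, one possible way to address points 2 and 3 is to
bounds-check every append and to stop when a '%' or '\' is the last
character of the format.  The following is a standalone sketch under
those assumptions, not a drop-in replacement for f_printf():
expand_format() and put() are hypothetical names, plain strings stand
in for the FTSENT fields, and unknown directives are rejected rather
than skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append len bytes of src at *pos, refusing to overrun out[outsize]. */
static int
put(char *out, size_t outsize, size_t *pos, const char *src, size_t len)
{
	if (len >= outsize - *pos)	/* keep one byte for the NUL */
		return -1;
	memcpy(out + *pos, src, len);
	*pos += len;
	out[*pos] = '\0';
	return 0;
}

/*
 * Expand %f, %p, %%, \\, \n and \t from fmt into out.
 * Returns 0 on success, -1 on overflow or a malformed format.
 */
static int
expand_format(const char *fmt, const char *name, const char *path,
    char *out, size_t outsize)
{
	const char *scan;
	size_t pos = 0;

	assert(out != NULL && outsize > 0);
	out[0] = '\0';
	for (scan = fmt; *scan != '\0'; scan++) {
		const char *src = scan;	/* default: copy the literal byte */
		size_t len = 1;

		if (*scan == '%') {
			scan++;
			switch (*scan) {
			case '%': src = "%"; break;
			case 'f': src = name; len = strlen(name); break;
			case 'p': src = path; len = strlen(path); break;
			default: return -1;	/* includes a dangling '%' */
			}
		} else if (*scan == '\\') {
			scan++;
			switch (*scan) {
			case '\\': src = "\\"; break;
			case 'n': src = "\n"; break;
			case 't': src = "\t"; break;
			default: return -1;	/* includes a dangling '\' */
			}
		}
		if (put(out, outsize, &pos, src, len) != 0)
			return -1;
	}
	return 0;
}
```

The expanded buffer would then be written with fputs(buf, stdout)
rather than printf(buf), so that a literal '%' in a file name is not
interpreted as a conversion a second time.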

Also, some folks on #bsdports asked why I was bothering with this in the
first place: mutt supports backticks to run shell commands inside of
a muttrc file.  See Building a list of mailboxes on the fly below:

http://wiki.mutt.org/?ConfigTricks

Note the find ... -printf '%h ' method.  I can accomplish (just
about) the same using `echo $HOME/Maildir/*`, but if I want to
exclude an entry, I can't use | grep -v, because mutt doesn't support
pipes within backticks.  :-)
  
-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |

diff -ruN find.orig/extern.h find/extern.h
--- find.orig/extern.h  2006-05-14 13:23:00.0 -0700
+++ find/extern.h   2008-09-04 20:55:17.0 -0700
@@ -73,6 +73,7 @@
 creat_fc_nouser;
 creat_fc_perm;
 creat_fc_print;
+creat_fc_printf;
 creat_fc_regex;
 creat_fc_simple;
 creat_fc_size;
@@ -107,6 +108,7 @@
 exec_f f_perm;
 exec_f f_print;
 exec_f f_print0;
+exec_f f_printf;
 exec_f f_prune;
 exec_f f_regex;
 exec_f f_size;
diff -ruN find.orig/function.c find/function.c
--- find.orig/function.c  2006-05-27 11:27:41.0 -0700
+++ find/function.c 2008-09-05 03:01:36.0 -0700
@@ -1272,6 +1272,86 @@
 /* c_print0 is the same as c_print */
 
 /*
+ * -printf functions --
+ *
+ * Always true, manipulates output based on printf()-like
+ * formatting characters.
+ */
+int
+f_printf(PLAN *plan, FTSENT *entry)
+{
+	char *scan;
+	char *outptr;
+	char *outidx;
+
+	if ((outptr = calloc(MAXPATHLEN, 1)) == NULL)
+		err(1, NULL);
+
+	outidx = outptr;
+
+	for (scan = plan->c_data; *scan; scan++) {
+		if (*scan == '%') {
+			if (scan[1] == 0) {
+				errx(1, "missing format character");
+			}
+			else if (scan[1] == '%') {
+				*outidx++ = '%';
+			}
+			else if (scan[1] == 'f') {
+				strcpy(outidx, entry->fts_name);
+				outidx += entry->fts_namelen;
+			}
+			/* XXX - needs to behave like GNU find %h */
+			/*
+			else if (scan[1] == 'h') {
+				strcpy(outidx, entry->fts_path);
+				outidx += entry->fts_pathlen;
+			}
+			*/
+			else if (scan[1] == 'p') {
+				strcpy(outidx, entry->fts_path);
+				outidx += entry->fts_pathlen;
+			}
+			scan++;
+		}
+		else if (*scan == '\\') {
+			if (scan[1] == '\\') {
+				*outidx++ = '\\';
+			}
+			else if (scan[1] == 'n') {
+				*outidx++ = '\n';
+			}
+			else if (scan[1] == 't') {
+				*outidx++ = '\t';
+			}
+			scan++;
+		}
+		else {
+			*outidx++ = *scan;
+		}
+	}
+
+	(void)printf(outptr);
+	free(outptr);
+	return 1;
+}
+
+PLAN *
+c_printf(OPTION *option, char ***argvp)
+{
+	char *argstring;
+	PLAN *new;
+
+	argstring = nextarg(option, argvp);
+	ftsoptions &= ~FTS_NOSTAT;
+	isoutput = 1;
+
+	new = palloc(option);
+	new->c_data = argstring;
+	return new;
+}
+
+/*
  * -prune functions --
  *
  * Prune a portion of the hierarchy.
diff -ruN find.orig/option.c find/option.c
--- find.orig/option.c  2006-04-05 16:06:11.0 -0700

Re: Extending find(1) to support -printf

2008-09-05 Thread Jeremy Chadwick
On Fri, Sep 05, 2008 at 03:12:53AM -0700, Jeremy Chadwick wrote:
 Also, some folks on #bsdports asked why I was bothering with this in the
 first place: mutt supports backticks to run shell commands inside of
 a muttrc file.  See Building a list of mailboxes on the fly below:
 
 http://wiki.mutt.org/?ConfigTricks
 
 Note the find ... -printf '%h ' method.  I can accomplish (just
 about) the same using `echo $HOME/Maildir/*`, but if I want to
 exclude an entry, I can't use | grep -v, because mutt doesn't support
 pipes within backticks.  :-)

Follow-up:

mutt's backtick support does in fact respect pipes.  My echo|grep -v was
doing exactly what I requested: the grep -v was removing all output of
the echo, since echo returned the results in a space-delimited format,
not one per line.  Hence, mailboxes was being executed without any
arguments.

Equally as frustrating, mutt's backtick support will only honour the
first line of input.  If a backticked command returns multiple lines,
only the first is read; the rest are ignored.  This makes using BSD find
annoying, since find always outputs results terminated with a newline.
One of my peers uses find | perl -ne 'chomp; print "=", $_, " "' to deal
with this limit, which is quite disgusting.

I realise there are workarounds for the dilemma (e.g. write a shell
script that provides the exact output needed), but it seems like one
could kill two birds with one stone by extending BSD find to support
-printf, which does not output a newline unless \n is used within the
output formatting.  (This also explains why the Mutt Wiki entry uses
-printf '%h ', note the space.)

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: Temp files in /etc

2008-09-05 Thread Jeremy Chadwick
On Fri, Sep 05, 2008 at 07:40:13PM -0700, Joshua Piccari wrote:
 Hi all,
 I am setting up a few jails and I want them all to use the same /etc files
 (with the exception of the files related to the password files and
 databases), so I mounted a shared /etc folder as a nullfs with read-only
 permissions. The problem is that using utilities like pw or chpass create
 temporary files in /etc and that file system is mounted read-only.
 So is there a way to force any utilities that create temp files in /etc to
 use another location, something like /usr/local/etc for example?

It depends entirely on how each individual program makes temporary
files; there is no standard.

libc offers many different methods of creating temporary files:
tmpfile(3), tmpnam(3), tempnam(3), mktemp(3), and mkstemp(3).  You can
read the manpages to get an idea of how chaotic the situation is.

Other programs may implement their own temporary file creation methods
entirely, and may/may not support TMPDIR.

I would try export TMPDIR=/some/place and then attempt using pw and
chpass, and see what happens.  If they still attempt to use /tmp,
said programs could probably be modified to support TMPDIR.
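
As a rough illustration of what "supporting TMPDIR" could look like in
a program that currently hardcodes its temporary directory, here is a
minimal C sketch.  open_temp() and the "demo.XXXXXX" template are
hypothetical; this is not how pw(8) or chpass(1) are actually written.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a temporary file under $TMPDIR (falling back to /tmp) and store
 * its name in path.  Returns an open descriptor, or -1 on failure.
 */
static int
open_temp(char *path, size_t size)
{
	const char *dir = getenv("TMPDIR");
	int n;

	assert(path != NULL && size > 0);
	if (dir == NULL || *dir == '\0')
		dir = "/tmp";
	n = snprintf(path, size, "%s/demo.XXXXXX", dir);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return mkstemp(path);	/* fills in the XXXXXX part of path */
}
```

A caller would unlink(path) when done.  Note that this only changes
where the scratch file lives; it does nothing for utilities that need
the temporary file to sit next to the file being edited.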

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: lighttpd failing to accept new connections ( connection reset )

2008-08-28 Thread Jeremy Chadwick
On Thu, Aug 28, 2008 at 01:13:57PM +0100, Steven Hartland wrote:
 We're using lighttpd here for a new project and we're having issues
 whereby it simply stops processing after 1-2 days.

 Having looked at it in some detail this morning it seems that
 the kernel is resetting the connection without notifying the
 lighttpd process there is a new connection attempt. I assume
 that the listen queue is full but why kevent is not notifying
 lighttpd that there are outstanding events is beyond me.


 The following is a truss of the process which is currently in
 this state:-
 kevent(6,0x0,0,{},11096,{1.0})   = 0 (0x0)
 gettimeofday({1219920575.149428},0x0)= 0 (0x0)
 kevent(6,0x0,0,{},11096,{1.0})   = 0 (0x0)
 gettimeofday({1219920576.150443},0x0)= 0 (0x0)

 ktrace of the operation as well:-
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)
 28363 lighttpd GIO   fd 6 wrote 0 bytes
   
 28363 lighttpd GIO   fd 6 read 0 bytes
   
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)
 28363 lighttpd GIO   fd 6 wrote 0 bytes
   
 28363 lighttpd GIO   fd 6 read 0 bytes
   
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)
 28363 lighttpd GIO   fd 6 wrote 0 bytes
   
 28363 lighttpd GIO   fd 6 read 0 bytes
   
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)
 28363 lighttpd GIO   fd 6 wrote 0 bytes
   
 28363 lighttpd GIO   fd 6 read 0 bytes
   
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)
 28363 lighttpd GIO   fd 6 wrote 0 bytes
   
 28363 lighttpd GIO   fd 6 read 0 bytes
   
 28363 lighttpd RET   kevent 0
 28363 lighttpd CALL  gettimeofday(0x7fffeb20,0)
 28363 lighttpd RET   gettimeofday 0
 28363 lighttpd CALL  kevent(0x6,0,0,0x800e66000,0x2b58,0x7fffeb20)


 tcpdump shows:-
 12:10:29.475255 IP (tos 0x10, ttl  64, id 9536, offset 0, flags [DF],
 proto: TCP (6), length: 64) client.61224 > server.80: S, cksum 0x6d22
 (incorrect (-> 0xedfa), 291994449:291994449(0) win 65535 <mss
 1460,nop,wscale 1,nop,nop,timestamp 3661727139 0,sackOK,eol>
 12:10:29.481396 IP (tos 0x0, ttl  61, id 25503, offset 0, flags [DF],
 proto: TCP (6), length: 60) server.80 > client.61224: S, cksum 0xbf22
 (correct), 3444532576:3444532576(0) ack 291994450 win 65535 <mss
 1460,nop,wscale 9,sackOK,timestamp 3136311843 3661727139>
 12:10:29.481419 IP (tos 0x10, ttl  64, id 9538, offset 0, flags [DF],
 proto: TCP (6), length: 52) client.61224 > server.80: ., cksum 0x6d16
 (incorrect (-> 0x6bd2), 1:1(0) ack 1 win 33304 <nop,nop,timestamp
 3661727145 3136311843>
 12:10:29.487519 IP (tos 0x10, ttl  61, id 25504, offset 0, flags [DF],
 proto: TCP (6), length: 40) server.80 > client.61224: R, cksum 0x20c7
 (correct), 3444532577:3444532577(0) win 0

 This may have been raised before, back in 2003, as bug kern/57380
 but it was closed after no response from the reporter.

 Another possibly related issue is:-
 http://trac.lighttpd.net/trac/ticket/1734


 I've currently got one of the production machines offline
 with this error ( hence the important flag ) in the hope
 that someone can suggest a test which will shed more light
 on the issue before I restart it.

Can you change the polling method in lighttpd to use poll or select
instead of kqueue?  This would help in determining if the problem is
with the daemon itself or the kevent system.
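
For reference, the check a poll-based event handler performs on the
listening socket boils down to something like the C sketch below.  This
is only an illustration of the mechanism, not lighttpd's code:
pending_connection() and listen_loopback() are hypothetical helpers,
and the loopback listener exists purely so the example is
self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a connection is waiting, 0 on timeout, -1 on error. */
static int
pending_connection(int listen_fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	assert(listen_fd >= 0);
	pfd.fd = listen_fd;
	pfd.events = POLLIN;	/* readable == accept() will not block */
	pfd.revents = 0;
	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1;
	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Open a TCP listener on 127.0.0.1 with a kernel-chosen port. */
static int
listen_loopback(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr->sin_port = 0;
	if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) != 0 ||
	    listen(fd, 8) != 0 ||
	    getsockname(fd, (struct sockaddr *)addr, &len) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

If a daemon using poll or select sees new connections while the kqueue
variant does not, that would point toward the kevent path rather than
the daemon's accept logic.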

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: netstat: kvm_read: Bad address

2008-08-25 Thread Jeremy Chadwick
On Mon, Aug 25, 2008 at 05:39:52PM +0530, vasanth raonaik wrote:
 Hello Hackers,
 
 I am facing this issue. Though netstat -a does show some output, the
 error is consistently seen. Does anyone have some pointers to the cause and
 fix for the same?

I've seen this message when a user upgrades the kernel to newer sources
(e.g. csup/cvsup), and rebuilds/reinstalls the kernel, but **does not**
rebuild/reinstall the userland programs (i.e. world).

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |


Re: netstat: kvm_read: Bad address

2008-08-25 Thread Jeremy Chadwick
On Mon, Aug 25, 2008 at 07:22:19PM +0530, vasanth raonaik wrote:
 Both kernel and utility are in sync. Any more ideas?
 
 On Mon, Aug 25, 2008 at 6:19 PM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  On Mon, Aug 25, 2008 at 05:39:52PM +0530, vasanth raonaik wrote:
   Hello Hackers,
  
   I am facing this issue.  Although netstat -a does show some output,
   the error is consistently seen.  Does anyone have any pointers to
   the cause and a fix for it?
 
  I've seen this message when a user upgrades the kernel to newer sources
  (e.g. csup/cvsup), and rebuilds/reinstalls the kernel, but **does not**
  rebuild/reinstall the userland programs (i.e. world).

Nope, don't have any.  Others will have to help.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: local network throughput issues

2008-08-16 Thread Jeremy Chadwick
On Sat, Aug 16, 2008 at 02:43:11PM -0500, Rick C. Petty wrote:
 Hello.  I've been having some serious network throughput issues recently.
 Prior to these issues, I was running 7.0-STABLE from a few months back.  I
 recently built a world/kernel from csup(1) on 2008-Jul-25 and started
 noticing the issues.  I'm currently on a RELENG_7 from 2008-Aug-02 and am
 building another one today.
 
 The server is running samba, dnsmasq, an NFS server, and is the gateway for
 a couple of freebsd machines.  I first noticed the issue when I was unable
 to play mp3s served over samba.  I was getting a throughput of about 3
 KBytes/sec over my gigabit switched network.  After a few weeks of trying
 different versions of samba and watching tcpdump/trafshow, I decided to try
 another test:
 
 % ssh gateway cat /dev/zero | dd of=/dev/null
 load: 0.00  cmd: dd 68286 [runnable] 0.00u 0.00s 0% 844k
 3680+0 records in
 3680+0 records out
 1884160 bytes transferred in 512.856684 secs (3674 bytes/sec)
 ^C3680+0 records in
 3680+0 records out
 1884160 bytes transferred in 513.392858 secs (3670 bytes/sec)
 Killed by signal 2.
 
 Again, not even 4 KB/s throughput over ssh.  What's weird is that not all
 network traffic is slow.  I'm able to download at almost my full DSL speed
 (7168 Kbps) and here's the strange bit:  NFS is fast!  From the same
 machine as above, I have an NFS mountpoint:
 
 % dd if=/nfs/some_large_file of=/dev/null
 55980+0 records in
 55980+0 records out
 28661760 bytes transferred in 0.147159 secs (194767476 bytes/sec)
 
 Which is what I would expect for a gigabit network.  Even if I perform the
 same test over ssh:
 
 % ssh workstation cat /nfs/some_large_file | dd of=/dev/null
 55980+0 records in
 55980+0 records out
 28661760 bytes transferred in 2.791392 secs (10267909 bytes/sec)
 
 This is reasonable considering ssh overhead.  The kernel config for the
 server contains:
 
 include GENERIC
 ident   DDB
 options KDB
 options KDB_TRACE
 options DDB
 options DDB_NUMSYM
 
 My internal network is on 172.23.20.x and the DSL modem is connected to the
 same NIC on 192.168.0.1:
 
 # ifconfig nfe0
 nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=10b<RXCSUM,TXCSUM,VLAN_MTU,TSO4>
         ether 00:15:f2:17:0c:20
         inet 172.23.20.1 netmask 0xffffff00 broadcast 172.23.20.255
         inet 192.168.0.3 netmask 0xffffff00 broadcast 192.168.0.255
         media: Ethernet autoselect (1000baseTX <full-duplex,flag0,flag1>)
         status: active
 
 # ipfw show
 00100    62968     8170100 allow ip from any to any via lo0
 00110        0           0 deny ip from any to 127.0.0.0/8
 00120        0           0 deny ip from 127.0.0.0/8 to any
 00130 25444031 23419621364 divert 8668 ip from any to any via 192.168.0.3
 00140        0           0 deny ip from not 172.23.20.0/24 to 172.23.20.0/24 dst-port 137-149,445
 00150 20639141 15067620538 allow tcp from any to any established
 00160       21       10683 allow ip from any to any frag
 00170    79791     4696696 allow tcp from any to any setup
 65530  5713367  8864760207 allow ip from any to any
 65535        0           0 deny ip from any to any
 
 Other than that, I'm not doing anything out of the ordinary.  Why is NFS
 behaving correctly and why are ssh/smbd connections so slow?  I've pasted
 my dmesg output below.  I've used this configuration for years and it
 wasn't until a recent RELENG_7 upgrade that I've had any problems.  The box
 was 99-100% idle during those tests and I don't see an interrupt storm or
 anything funny like that.  Any ideas?

1) Please provide netstat -in output.

2) NFS (unless you're explicitly disabling it) is UDP-based, while SSH
and Samba are TCP-based.  Your nfe0 device has TSO4 enabled on it, so
I'm left wondering if the TCP offloading support for your nfe(4) device
is broken.

Can you try disabling it by adding -tso to your ifconfig_nfe0 line in
/etc/rc.conf?  If you're using DHCP on that interface, that may pose
somewhat of a problem.

3) Can you disable the firewall (disable ipfw entirely) and see if the
problem continues?
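
For anyone wanting to confirm whether TSO is actually on before and after
the change, here is a small sketch that pulls the capability names out of
the options= line of ifconfig output.  It assumes the usual ifconfig layout,
with the capability list in angle brackets; the function names are made up
for the example, and the "after" sample is hypothetical output for the same
interface once -tso has been applied.

```python
import re


def interface_capabilities(ifconfig_output):
    """Return the capability names from the options=...<...> line of ifconfig output."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


def tso_enabled(ifconfig_output):
    """True if any TSO capability (e.g. TSO4) is listed as enabled."""
    return any(name.startswith("TSO") for name in interface_capabilities(ifconfig_output))


BEFORE = """nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=10b<RXCSUM,TXCSUM,VLAN_MTU,TSO4>
        ether 00:15:f2:17:0c:20
"""

# Hypothetical output after "ifconfig nfe0 -tso".
AFTER = """nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        ether 00:15:f2:17:0c:20
"""

print("before:", tso_enabled(BEFORE))
print("after: ", tso_enabled(AFTER))
```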

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: local network throughput issues

2008-08-16 Thread Jeremy Chadwick
On Sat, Aug 16, 2008 at 05:28:31PM -0500, Rick C. Petty wrote:
  2) NFS (unless you're explicitly disabling it) is UDP-based, while SSH
  and Samba are TCP-based.  Your nfe0 device has TSO4 enabled on it, so
  I'm left wondering if the TCP offloading support for your nfe(4) device
  is broken.
  
  Can you try disabling it by adding -tso to your ifconfig_nfe0 line in
  /etc/rc.conf?  If you're using DHCP on that interface, that may pose
  somewhat of a problem.
 
 Yes, that seems to have made all the difference in the world:
 
 % ssh gateway cat /dev/zero | dd of=/dev/null
 load: 0.08  cmd: ssh 68698 [runnable] 1.20u 0.25s 11% 3020k
 94384+0 records in
 94384+0 records out
 48324608 bytes transferred in 5.314707 secs (9092619 bytes/sec)
 load: 0.08  cmd: ssh 68698 [runnable] 1.81u 0.33s 15% 3020k
 147664+0 records in
 147664+0 records out
 75603968 bytes transferred in 7.652768 secs (9879297 bytes/sec)
 
 So I'm thinking TSO wasn't an option in the older 7-stable I was running
 and now it is, but the support for it is broken.

TSO support is something that's implemented in each network driver.  TSO
in nfe(4) was committed to HEAD (CURRENT) on June 12th 2007.  See
revision 1.17 below:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/nfe/if_nfe.c

That would have been what is now considered RELENG_7 (since FreeBSD
7.0-RELEASE was announced and made available on February 27th, 2008).

I'm under the impression that TSO was available/enabled for you before,
but I'm not sure because there's not a lot of historic data available
here.  I don't know why/how it broke, or what has changed.

I've CC'd one of the nfe(4) maintainers, PYUN Yong-Hyeon, who should be
able to help determine what's going on.  Yong-Hyeon, his dmesg output is
available here, but you'll probably need more than that:

http://lists.freebsd.org/pipermail/freebsd-hackers/2008-August/025706.html

 Your comment about DHCP, would that affect dhcpd or dhclient?  This is my
 server machine so I don't run dhclient on it.  I hardcode the IP
 connecting to the DSL modem.  I'm currently hardcoding all the other
 machines also so I should be okay.

My comment about DHCP is WRT the FreeBSD box acting as a DHCP client
(e.g. fetching an IP from your ISP).  I believe dhclient (when getting a
new IP) might override any interface options you set in rc.conf; purely
speculative on my part, but I wanted to mention it in case you didn't
have a static IP configured in rc.conf.

  3) Can you disable the firewall (disable ipfw entirely) and see if the
  problem continues?
 
 Well the firewall is primarily for NAT and port forwarding.  There's
 nothing special about it.  It looks like the TSO disabling fixed my
 problems.  Thank you for the suggestion!

No problem.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: kern/98388: [ata] FreeBSD 6.1 - WDC WD1200JS SATA II disks are seen as older SATA

2008-08-14 Thread Jeremy Chadwick
On Thu, Aug 14, 2008 at 10:57:53AM +0400, sam wrote:
 Andrey V. Elsukov wrote:
 sam wrote:
 FreeBSD  7.0-STABLE FreeBSD 7.0-STABLE #5: Tue Aug 12 13:54:27 MSD  
 2008root@:/usr/obj/usr/src/sys/GENERIC  i386
 |

 please, any solution ?

 Probably speed is limited via jumpers on your hard drive.

 http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1409p_created=#jumper
 tried it

 without results

FWIW, the only time I've seen this happen is when there's a jumper
limiting the capability.  You should have **removed** the OPT1 jumper,
and left any other jumpers alone.

If you're absolutely sure the jumper is removed, I'll purchase one of
these drives and test it on an ICH7 (Supermicro PDSMi+) to confirm your
findings.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: kern/98388: [ata] FreeBSD 6.1 - WDC WD1200JS SATA II disks are seen as older SATA

2008-08-14 Thread Jeremy Chadwick
On Thu, Aug 14, 2008 at 12:16:16AM -0700, Jeremy Chadwick wrote:
 On Thu, Aug 14, 2008 at 10:57:53AM +0400, sam wrote:
  Andrey V. Elsukov wrote:
  sam wrote:
  FreeBSD  7.0-STABLE FreeBSD 7.0-STABLE #5: Tue Aug 12 13:54:27 MSD  
  2008root@:/usr/obj/usr/src/sys/GENERIC  i386
  |
 
  please, any solution ?
 
  Probably speed is limited via jumpers on your hard drive.
 
  http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1409p_created=#jumper
  tried it
 
  without results
 
 FWIW, the only time I've seen this happen is when there's a jumper
 limiting the capability.  You should have **removed** the OPT1 jumper,
 and left any other jumpers alone.
 
 If you're absolutely sure the jumper is removed, I'll purchase one of
 these drives and test it on an ICH7 (Supermicro PDSMi+) to confirm your
 findings.

Actually, I don't need to -- I'm using WD5000AAKS disks myself on
that exact system:

atapci1: <Intel AHCI controller> port 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf mem 0xe8600400-0xe86007ff irq 19 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 4 ports detected
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci1
ata5: [ITHREAD]

ad6: 476940MB <WDC WD5000AAKS-00YGA0 12.01C02> at ata3-master SATA300
ad8: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata4-master SATA300

icarus# atacontrol cap ad6

Protocol              Serial ATA II
device model          WDC WD5000AAKS-00YGA0
serial number         WD-WCAS83974519
firmware revision     12.01C02
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       976773168 sectors
dma supported
overlap not supported

Feature                        Support  Enable  Value      Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes      -       31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      no      254/0xFE   128/0x80

icarus# atacontrol cap ad8

Protocol              Serial ATA II
device model          WDC WD5000AAKS-00TMA0
serial number         WD-WCAPW2137942
firmware revision     12.01C01
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       976773168 sectors
dma supported
overlap not supported

Feature                        Support  Enable  Value      Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes      -       31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      no      254/0xFE   128/0x80

icarus# atacontrol mode ad6
current mode = SATA300

icarus# atacontrol mode ad8
current mode = SATA300

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: kern/98388: [ata] FreeBSD 6.1 - WDC WD1200JS SATA II disks are seen as older SATA

2008-08-14 Thread Jeremy Chadwick
On Thu, Aug 14, 2008 at 12:02:50PM +0400, sam wrote:
 Jeremy Chadwick wrote:
 On Thu, Aug 14, 2008 at 12:16:16AM -0700, Jeremy Chadwick wrote:
   
 On Thu, Aug 14, 2008 at 10:57:53AM +0400, sam wrote:
 
 Andrey V. Elsukov wrote:
   
 sam wrote:
 
 FreeBSD  7.0-STABLE FreeBSD 7.0-STABLE #5: Tue Aug 12 13:54:27 
 MSD  2008root@:/usr/obj/usr/src/sys/GENERIC  i386
 |

 please, any solution ?
   
 Probably speed is limited via jumpers on your hard drive.

 
 http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1409p_created=#jumper
 tried it

 without results
   
 FWIW, the only time I've seen this happen is when there's a jumper
 limiting the capability.  You should have **removed** the OPT1 jumper,
 and left any other jumpers alone.

 If you're absolutely sure the jumper is removed, I'll purchase one of
 these drives and test it on an ICH7 (Supermicro PDSMi+) to confirm your
 findings.
 

 Actually, I don't need to -- I'm using WD5000AAKS disks myself on
 that exact system:

 atapci1: Intel AHCI controller port 
 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf mem 
 0xe8600400-0xe86007ff irq 19 at device 31.2 on pci0
 atapci1: [ITHREAD]
 atapci1: AHCI Version 01.10 controller with 4 ports detected
 ata2: ATA channel 0 on atapci1
 ata2: [ITHREAD]
 ata3: ATA channel 1 on atapci1
 ata3: [ITHREAD]
 ata4: ATA channel 2 on atapci1
 ata4: [ITHREAD]
 ata5: ATA channel 3 on atapci1
 ata5: [ITHREAD]

 ad6: 476940MB WDC WD5000AAKS-00YGA0 12.01C02 at ata3-master SATA300
 ad8: 476940MB WDC WD5000AAKS-00TMA0 12.01C01 at ata4-master SATA300
   
 Maybe the issue is in the Intel ICH7 SATA300 controller driver?

 
 atapci1: Intel ICH7 SATA300 controller port  
 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f  
 irq 19 at device 31.2 on pci0
 

Possibly.  All my Intel ICH7 boards have AHCI capability, and I use it.
See Chapter 4 here:

http://www.supermicro.com/manuals/motherboard/3000/MNL-0889.pdf

If your motherboard does, I'd recommend enabling it as well and seeing
if things change.

Regarding the Enhanced vs. Compatible mode: use Enhanced.  On
my boards, choosing Enhanced makes the AHCI and Intel MatrixRAID
options appear.

I'm fairly certain you don't need AHCI to get SATA300, though.

I would recommend you re-check the jumpers on your disks to make sure
you didn't make a mistake when adjusting things.

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: kern/98388: [ata] FreeBSD 6.1 - WDC WD1200JS SATA II disks are seen as older SATA

2008-08-14 Thread Jeremy Chadwick
On Thu, Aug 14, 2008 at 01:01:44PM +0400, sam wrote:
 Jeremy Chadwick wrote:
 On Thu, Aug 14, 2008 at 12:02:50PM +0400, sam wrote:
   
 Jeremy Chadwick wrote:
 
 On Thu, Aug 14, 2008 at 12:16:16AM -0700, Jeremy Chadwick wrote:
 
 On Thu, Aug 14, 2008 at 10:57:53AM +0400, sam wrote:
 
 Andrey V. Elsukov wrote:
 
 sam wrote:
 
 FreeBSD  7.0-STABLE FreeBSD 7.0-STABLE #5: Tue Aug 12 
 13:54:27 MSD  2008root@:/usr/obj/usr/src/sys/GENERIC  
 i386
 |

 please, any solution ?
 
 Probably speed is limited via jumpers on your hard drive.

 
 http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1409p_created=#jumper
 tried it

 without results
 
 FWIW, the only time I've seen this happen is when there's a jumper
 limiting the capability.  You should have **removed** the OPT1 jumper,
 and left any other jumpers alone.

 If you're absolutely sure the jumper is removed, I'll purchase one of
 these drives and test it on an ICH7 (Supermicro PDSMi+) to confirm your
 findings.
 
 Actually, I don't need to -- I'm using WD5000AAKS disks myself on
 that exact system:

 atapci1: Intel AHCI controller port 
 0x30e8-0x30ef,0x30dc-0x30df,0x30e0-0x30e7,0x30d8-0x30db,0x30b0-0x30bf mem 
 0xe8600400-0xe86007ff irq 19 at device 31.2 on pci0
 atapci1: [ITHREAD]
 atapci1: AHCI Version 01.10 controller with 4 ports detected
 ata2: ATA channel 0 on atapci1
 ata2: [ITHREAD]
 ata3: ATA channel 1 on atapci1
 ata3: [ITHREAD]
 ata4: ATA channel 2 on atapci1
 ata4: [ITHREAD]
 ata5: ATA channel 3 on atapci1
 ata5: [ITHREAD]

 ad6: 476940MB WDC WD5000AAKS-00YGA0 12.01C02 at ata3-master SATA300
 ad8: 476940MB WDC WD5000AAKS-00TMA0 12.01C01 at ata4-master SATA300
 
 may issue in Intel ICH7 SATA300 controller driver ?

 
 atapci1: Intel ICH7 SATA300 controller port   
 0xc880-0xc887,0xc800-0xc803,0xc480-0xc487,0xc400-0xc403,0xc080-0xc08f 
  irq 19 at device 31.2 on pci0
 
 

 Possibly.  All my Intel ICH7 boards have AHCI capability, and I use it.
 See Chapter 4 here:

 http://www.supermicro.com/manuals/motherboard/3000/MNL-0889.pdf

 If your motherboard does, I'd recommend enabling it as well and see if
 things change.

 Regarding the Enhanced vs. Compatible mode: use Enhanced.  On
 my boards, choosing Enhanced makes the AHCI and Intel MatrixRAID
 options appear.

 I'm fairly certain you don't need AHCI to get SATA300, though.

 I would recommend you re-check the jumpers on your disks to make sure
 you didn't make a mistake when adjusting things.

   
 - HDD on position: all jumpers removed;
 - SATA controller in Enhanced mode;
 - no AHCI option in BIOS

 without results

I'm out of ideas.  Others will have to continue helping...

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: kern/98388: [ata] FreeBSD 6.1 - WDC WD1200JS SATA II disks are seen as older SATA

2008-08-14 Thread Jeremy Chadwick
On Thu, Aug 14, 2008 at 03:56:32PM +0400, Andrey V. Elsukov wrote:
 sam wrote:
 Can you apply attached patch, rebuild your kernel, reboot in verbose
 mode and show /var/run/dmesg.boot ?

 http://cs.udmvt.ru/files/temp/dmesg.boot_0814

 It seems that the driver couldn't allocate the IO resource at BAR5, and
 without this resource it can't read the SATA Status register and
 determine the negotiated speed. I think the problem is in your BIOS.
 If your BIOS doesn't have any AHCI or RAID specific options,
 I don't know how to correctly fix this problem.

Andrey, please correct me if I'm wrong here.  I'm not familiar with these
kernel functions, but assuming pci_read_config() handles proper byte
order, and device_printf() prints it in correct order, then I believe
you may be missing something important.

I haven't looked for any product Errata, but see Section 12.1 below
(specific to ICH7): http://www.intel.com/assets/pdf/datasheet/307013.pdf

Index 0x94 = SIR   (SATA Index Register)
Index 0xAC = SCAP1 (SATA Capability Register 1)

I'm not sure why you called SIR SCRD in your device_printf().  For
SIR's description, see Section 12.1.35.  For SCAP1, see Section 12.1.39.

The SIR value is 0x40000180, broken down into binary nibbles:

  %0100 0000 0000 0000 0000 0001 1000 0000
    ^
    |
This indicates bit 30 is set.  According to Intel's docs, bit 30
disables SCAP0 and SCAP1, thus will cause them to always return 0:


Bit 30
SATA Capability Registers Disable (SCRD)

When this bit is set, the SATA Capability Registers are disabled. That
is, SATA Capability Registers 0 and 1 are both changed to Read Only with
the value of 00000000h. Also, the Next Capability bits in the PCI Power
Management Capability Information Register (D31:F2;Offset 70h bits 15:8)
are changed to 00h, to indicate that the PCI Power Management Capability
structure is the last PCI capability structure in the SATA controller.
When this bit is cleared, the SATA Capability Registers are enabled.


A quick glance seems to indicate we're not initialising some of the SATA
registers at all, case in point.

Someone should make a patch for the user that zeros out bit 30 of SIR,
then check the xBAR and LBAR values; zeroing bit 30 might get him
SATA300 support (I haven't looked at the rest of the FreeBSD ATA code
yet).
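
For clarity, here is the bit arithmetic spelled out as a small sketch.  It
assumes SIR is a 32-bit register holding 0x40000180 (i.e. bit 30 set, as
described above); the helper names are made up for illustration and this is
not the FreeBSD ATA driver code, which would do the equivalent with
pci_read_config()/pci_write_config().

```python
SCRD_BIT = 30  # SATA Capability Registers Disable, per the datasheet text quoted above


def scrd_set(sir):
    """True if bit 30 (SCRD) is set in the SIR value."""
    return bool(sir & (1 << SCRD_BIT))


def clear_scrd(sir):
    """Return the SIR value with bit 30 cleared, kept to 32 bits."""
    return sir & ~(1 << SCRD_BIT) & 0xFFFFFFFF


def nibbles(value):
    """Format a 32-bit value as eight binary nibbles, most significant first."""
    return " ".join(f"{(value >> shift) & 0xF:04b}" for shift in range(28, -4, -4))


sir = 0x40000180  # assumed value, see above
print(nibbles(sir), "SCRD set:", scrd_set(sir))

patched = clear_scrd(sir)
print(nibbles(patched), "SCRD set:", scrd_set(patched))
```

The value that would be written back in this example is 0x00000180; all
other bits are left untouched.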

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Debugging reboot with Linux emulation

2008-08-13 Thread Jeremy Chadwick
On Wed, Aug 13, 2008 at 04:03:53PM +0200, Alexander Leidinger wrote:
 Quoting Kostik Belousov [EMAIL PROTECTED] (from Wed, 13 Aug 2008  
 14:54:13 +0300):

 On Wed, Aug 13, 2008 at 01:28:22PM +0200, Alexander Leidinger wrote:
 Quoting Nate Eldredge [EMAIL PROTECTED] (from Tue, 12 Aug
 2008 23:52:35 -0700 (PDT)):

 Hi folks,
 
 I recently tried to run a Linux binary of Maple (commercial math
 software) on my FreeBSD 7.0-RELEASE/amd64 box, and the machine
 rebooted.  I tried it again while watching the console, and no panic
 message appeared to be produced.  Does anyone have any ideas on how
 to debug problems of this nature?  I realize I may not be able to
 get Maple to work, but in any case the system should not die like
 this, so I can at least try to fix that bug.
 
 Incidentally, is it possible to run kdb with a USB keyboard?
 Hitting Ctrl-Alt-Esc gives me the kdb prompt, but I can't type, so I
 can do nothing except hit the power button.  I do have
 hint.atkbd.0.flags=0x1 in /boot/device.hints.  Unfortunately I
 don't have a PS/2 keyboard on hand, though I can try and get a hold
 of one if all else fails.

 A guess out of my crystal ball:
 That's one of the cases which happen if you run a linux program
 without branding it as a linux program first. People tend to think it
 is not needed, but in some rare circumstances it just causes what you
 see, a reboot. So go and identify all binaries (IMPORTANT: but not the
 libraries!), e.g. with the file(1), and use brandelf -t Linux on
 those programs.

 That would be an enormous local hole, assuming a native FreeBSD binary
 could cause a system crash. I actually doubt that a non-branded ELF binary
 would ever start, due to unsatisfied dynamic dependencies.

 You see this behavior only for static binaries. In the non-branded case 
 the image activator takes the FreeBSD image and unfortunately there's a 
 common syscall in linux which matches the syscall number in FreeBSD which 
 causes the reboot (IIRC reboot syscall, do we have something like this?). 
 It's not a system crash (kernel panic), it's a real reboot. AFAIR this 
 also only works if you run the program as root. So...

There is indeed a reboot syscall, #55:

/usr/include/sys/syscall.h:
#define SYS_reboot  55
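
To make the number collision concrete, here is a rough sketch that parses
syscall.h-style #define lines and shows what number 55 means under each ABI.
Only SYS_reboot = 55 comes from the header quoted above; the Linux names
(fcntl on i386, getsockopt on x86_64) are from my recollection of the Linux
syscall tables and should be treated as an assumption.

```python
import re

# Excerpt in the style of FreeBSD's <sys/syscall.h>.
FREEBSD_SYSCALL_H = """
#define SYS_reboot  55
"""

# Assumption: what syscall number 55 is on Linux, per architecture.
LINUX_SYSCALL_55 = {"i386": "fcntl", "x86_64": "getsockopt"}


def parse_syscalls(header_text):
    """Map syscall number -> name from '#define SYS_<name> <number>' lines."""
    table = {}
    for match in re.finditer(r"^#define\s+SYS_(\w+)\s+(\d+)\s*$", header_text, re.M):
        table[int(match.group(2))] = match.group(1)
    return table


freebsd = parse_syscalls(FREEBSD_SYSCALL_H)
for arch, linux_name in LINUX_SYSCALL_55.items():
    # A static Linux binary run under the FreeBSD syscall table would have
    # its call to linux_name dispatched as the FreeBSD syscall below.
    print(f"Linux {arch} {linux_name}() -> FreeBSD {freebsd[55]}()")
```

In other words, an unbranded static Linux binary making a perfectly ordinary
call ends up asking the FreeBSD kernel for reboot(2), which only succeeds
when run as root.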

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Idea for FreeBSD

2008-08-07 Thread Jeremy Chadwick
On Wed, Aug 06, 2008 at 07:14:51PM -0400, [EMAIL PROTECTED] wrote:
 To whom it may concern,
 
 I am a FreeBSD administrator as well as a Solaris administrator. I use
 BSD at home but Solaris at work. I love both OSes but I would like to
 increase the administrative capability of FreeBSD.
 
 In Solaris 10 the Service Management Facility (SMF) was introduced.
 Basically what it does is take all the rc.d scripts and put them into
 a database to manage. Everything is converted to XML and two basic
 commands (svcs and svcadm) are used to manage everything.

I highly recommend you and anyone advocating the use of XML for such
things read the following whitepaper/study, in full:

http://www.cs.kent.ac.uk/pubs/2004/2102/content.pdf

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: USB key kernel: da0: Attempt to query device size failed: UNIT ?ATTENTION, Medium not present

2008-08-07 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 09:54:53AM +0200, Matthias Apitz wrote:
 On Wednesday, August 06, 2008 at 07:29:31PM +0200, Oliver Fromme
 wrote:
 
  Matthias Apitz wrote:
   I've updated usb/80361, see
   http://www.freebsd.org/cgi/query-pr.cgi?pr=80361
   because I have the same problem as well: a USB key attaches fine
   when plugged in at boot time, but not later:
  
  I'm just wondering what happens if you enforce a rescan
  on the (virtual) SCSI bus.  That is, after you have
  plugged in the USB stick and the problem occurred, type
  camcontrol rescan 0.
  
 
 this did not help; I tried it a lot of times; also reading with dd(1)
 from /dev/da0 did not help;
 
  If that doesn't help, please try this patch:
   ...
 
 The problem is that this was a USB stick belonging to a friend of mine,
 on which I created a bootable FreeBSD so he can install it on an
 eeePC; I will try to get this USB stick back from him for further
 tests...

Can we get the brand and model of USB stick, and any specific
model/version numbers that are on the device?  The dmesg you provided
doesn't have very good vendor strings in it (that's not your fault).

I can tell you that in the case of *some* SanDisk USB sticks (and
despite this, I still buy/use their products -- I like them), there are
versions with buggy USB code on them.

I spent quite some time with Supermicro trying to find out why a SanDisk
USB stick would not boot on some of their servers -- it turned out to be
broken/buggy firmware code inside of the USB stick itself.  Replacing it
with a different version (same size/model though) fixed the issue.

Just something to be aware of...

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: USB key kernel: da0: ...

2008-08-07 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 02:01:22PM +0200, Michel Talon wrote:
  Matthias Apitz wrote:
  Aug  6 10:06:12 rebelion kernel: umass0: Verbatim Store'n'go, class 0/0, rev 2.00/2.00, addr 2 on uhub4
  Aug  6 10:06:12 rebelion root: Unknown USB device: vendor 0x08ec product 0x0020 bus uhub4
  Aug  6 10:06:12 rebelion kernel: da0 at umass-sim0 bus 0 target 0 lun 0
  Aug  6 10:06:12 rebelion kernel: da0: <VBTM Store'n'go 6.51> Removable Direct Access SCSI-0 device
  Aug  6 10:06:12 rebelion kernel: da0: 40.000MB/s transfers
  Aug  6 10:06:12 rebelion kernel: da0: Attempt to query device size failed: UNIT ATTENTION, Medium not present
 
 Here is another example:
 
 Aug  5 14:48:59 niobe kernel: umass0: KINGSTON DataTraveler 2.0, class 0/0, rev 2.00/2.00, addr 2 on uhub3
 Aug  5 14:48:59 niobe root: Unknown USB device: vendor 0x0951 product 0x1603 bus uhub3
 Aug  5 14:48:59 niobe kernel: da1 at umass-sim0 bus 0 target 0 lun 0
 Aug  5 14:48:59 niobe kernel: da1: <KINGSTON DataTraveler 2.0 1.00> Removable Direct Access SCSI-2 device
 Aug  5 14:48:59 niobe kernel: da1: 40.000MB/s transfers
 Aug  5 14:48:59 niobe kernel: da1: 1905MB (3902464 512 byte sectors: 255H 63S/T 242C)
 Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR 
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR 
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, IOERROR
 Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, IOERROR
 Aug  5 14:55:57 niobe kernel: umass0: at uhub3 port 5 (addr 2) disconnected
 Aug  5 14:55:57 niobe kernel: (da1:umass-sim0:0:0:0): lost device
 Aug  5 14:55:57 niobe kernel: (da1:umass-sim0:0:0:0): removing device entry
 Aug  5 14:55:57 niobe kernel: umass0: detached
 
 Needless to say, this stick works perfectly OK under Windows and Linux.

I have the 4GB model of this USB stick/drive.  I'll give it a try on my
FreeBSD RELENG_7 box when I get home in about an hour.

If I can reproduce the issue, I will be more than happy to send it to
someone who wants to debug it (and they can keep it as my way of saying
thanks).

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: USB key kernel: da0: ...

2008-08-07 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 05:50:45AM -0700, Jeremy Chadwick wrote:
 On Thu, Aug 07, 2008 at 02:01:22PM +0200, Michel Talon wrote:
   Matthias Apitz wrote:
   Aug  6 10:06:12 rebelion kernel: umass0: Verbatim Store'n'go, class 0/0, rev 2.00/2.00, addr 2 on uhub4
   Aug  6 10:06:12 rebelion root: Unknown USB device: vendor 0x08ec product 0x0020 bus uhub4
   Aug  6 10:06:12 rebelion kernel: da0 at umass-sim0 bus 0 target 0 lun 0
   Aug  6 10:06:12 rebelion kernel: da0: <VBTM Store'n'go 6.51> Removable Direct Access SCSI-0 device
   Aug  6 10:06:12 rebelion kernel: da0: 40.000MB/s transfers
   Aug  6 10:06:12 rebelion kernel: da0: Attempt to query device size failed: UNIT ATTENTION, Medium not present
  
  Here is another example:
  
  Aug  5 14:48:59 niobe kernel: umass0: KINGSTON DataTraveler 2.0, class 
  0/0, rev 2.00/2.00, addr 2 on uhub3
  Aug  5 14:48:59 niobe root: Unknown USB device: vendor 0x0951 product 
  0x1603 bus uhub3
  Aug  5 14:48:59 niobe kernel: da1 at umass-sim0 bus 0 target 0 lun 0
  Aug  5 14:48:59 niobe kernel: da1: KINGSTON DataTraveler 2.0 1.00 
  Removable Direct Access SCSI-2 device 
  Aug  5 14:48:59 niobe kernel: da1: 40.000MB/s transfers
  Aug  5 14:48:59 niobe kernel: da1: 1905MB (3902464 512 byte sectors: 255H 
  63S/T 242C)
  Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR 
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB reset failed, IOERROR 
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-in clear stall failed, 
  IOERROR
  Aug  5 14:49:25 niobe kernel: umass0: BBB bulk-out clear stall failed, 
  IOERROR
  Aug  5 14:55:57 niobe kernel: umass0: at uhub3 port 5 (addr 2) disconnected
  Aug  5 14:55:57 niobe kernel: (da1:umass-sim0:0:0:0): lost device
  Aug  5 14:55:57 niobe kernel: (da1:umass-sim0:0:0:0): removing device entry
  Aug  5 14:55:57 niobe kernel: umass0: detached
  
  Needless to say, this stick works perfectly OK under Windows and Linux.
 
 I have the 4GB model of this USB stick/drive.  I'll give it a try on my
 FreeBSD RELENG_7 box when I get home in about an hour.
 
 If I can reproduce the issue, I will be more than happy to send it to
 someone who wants to debug it (and they can keep it as my way of saying
 thanks).

As promised, when I got home I inserted the Kingston I have.  I should
note this disk was formatted as FAT32 on a Windows machine, and was also
made bootable via a Windows utility made by Hewlett Packard.

This is what I got upon inserting the device:

umass0: Kingston DataTraveler 2.0, class 0/0, rev 2.00/2.00, addr 2 on uhub4
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Kingston DataTraveler 2.0 1.00 Removable Direct Access SCSI-2 device
da0: 40.000MB/s transfers
da0: 3836MB (7856128 512 byte sectors: 255H 63S/T 489C)
GEOM_LABEL: Label for provider da0s1 is msdosfs/KINGSTON.

icarus# camcontrol devlist
Kingston DataTraveler 2.0 1.00   at scbus0 target 0 lun 0 (da0,pass0)

icarus# camcontrol inquiry da0
pass0: Kingston DataTraveler 2.0 1.00 Removable Direct Access SCSI-2 device
pass0: Serial Number
40.000MB/s transfers

icarus# mount_msdosfs /dev/da0s1 /mnt
icarus# df -k /mnt
Filesystem 1024-blocks Used   Avail Capacity  Mounted on
/dev/da0s1 3920364   12 3920352 0%/mnt

This looks correct (there is no data on the FAT32 filesystem).

icarus# umount /mnt
icarus#

I then removed the stick, and got this:


umass0: at uhub4 port 6 (addr 2) disconnected
(da0:umass-sim0:0:0:0): lost deviceGEOM_LABEL
(da0:umass-sim0:0:: 0:Label 0): msdosfs/KINGSTON removeremoving device
entryd.

umass0: detached

Kernel messages being printed atop one another is a known bug (it
really needs to get fixed already, since increasing PRINTF_BUFR_SIZE
to 256 only makes the problem slightly better), but as you can see, it
worked fine.

I'm thinking this may boil down to a problem with udbp(4) getting in the
way, since it's responsible for the bulk (BBB) stuff.  I yank udbp(4)
out of my kernel because I don't see the point in including support
for something I'll never use.  (And are bulk pipes even part of the
USB standard?  I don't remember reading about them as part of the USB
1.0

Re: strange issue reading /dev/null

2008-08-07 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 04:46:37PM +0200, Gabor Kovesdan wrote:
 Hello,

 I'm wondering why fgetc() returns 0xff if called with /dev/null:

 #include <stdio.h>
 #include <stdlib.h>

 int
 main(void)
 {
     int  c;
     FILE *f;

     f = fopen("/dev/null", "r");

     if (c != EOF)
         printf("%c\n", fgetc(f));
 }

  gcc foo.c
  ./a.out
 ÿ

 This causes a bug in BSD grep as /dev/null is not distinguished from  
 ordinary files in the code, thus I was expecting it just returned EOF,  
 but in reality this is not the case. How such cases should be handled?

Your code is wrong -- you're not calling feof().  Please read
the RETURN VALUES section of fgetc(3) in full, and slowly.  :-)

And your if() statement serves no purpose there.
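
For what it's worth, the ÿ in the output is consistent with EOF itself
being printed: the %c conversion takes an int and converts it to unsigned
char, and on most systems EOF is -1, which becomes 0xFF (ÿ in ISO 8859-1).
A minimal sketch of that conversion follows.  The helper names are made up
for illustration, and EOF == -1 is an assumption that holds on FreeBSD and
glibc but is not guaranteed by the C standard.

```c
#include <stdio.h>

/* Hypothetical helper: the byte that printf("%c", v) would emit.
 * C specifies that %c converts its int argument to unsigned char. */
unsigned char
byte_written_by_percent_c(int v)
{
	return (unsigned char)v;
}

/* Returns 1 if printing EOF with %c would yield 0xFF on this platform
 * (assumes EOF is -1 and char is 8 bits wide). */
int
eof_prints_as_0xff(void)
{
	return byte_written_by_percent_c(EOF) == 0xFF;
}
```

So the return value of fgetc() needs to be kept in an int and compared
against EOF before it is ever handed to %c.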



Re: strange issue reading /dev/null

2008-08-07 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 11:54:10AM -0500, Sean C. Farley wrote:
 On Thu, 7 Aug 2008, Gabor Kovesdan wrote:

 Sean C. Farley ha scritto:
 You are testing c which has not been set.  It works OK if you set c
 then do the test:

 +   c = fgetc(f);
     if (c != EOF)
 -   printf("%c\n", fgetc(f));
 +   printf("%c\n", c);
 Yes, you are right, this is what I meant, I'm just a bit
 disorganised
 Thanks!

 You are welcome.

 Actually, what I found odd was that the base gcc did not warn about
 using an uninitialized variable using -Wall.

Probably because you didn't use -O.  -Wall includes -Wuninitialized, but
-Wuninitialized only applies if you use optimisation.  gcc won't bail if
you use -Wall without -O, for obvious reasons.  Case in point:

$ gcc -Wall -o x x.c
x.c: In function 'main':
x.c:14: warning: control reaches end of non-void function

$ gcc -Wuninitialized -o x x.c
cc1: warning: -Wuninitialized is not supported without -O

$ gcc -Wall -O -o x x.c
x.c: In function 'main':
x.c:14: warning: control reaches end of non-void function
x.c:12: warning: 'c' is used uninitialized in this function

gcc -- finding new ways every day to drive programmers crazy.  :-)
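
For reference, here is a sketch of the program with Sean's assignment
folded in, written as a function so that it also returns a value on every
path.  The name read_first_byte and its return convention are my own
invention, not anything from the thread; the point is only that c is
assigned before it is tested, so nothing is read uninitialized.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read the first byte of `path`.
 * Returns 1 and stores the byte in *out on success,
 * 0 on end-of-file (e.g. /dev/null), -1 on open or read error.
 */
int
read_first_byte(const char *path, unsigned char *out)
{
	FILE *f;
	int c;		/* int, not char, so EOF stays distinguishable */
	int result;

	f = fopen(path, "r");
	if (f == NULL)
		return -1;

	c = fgetc(f);	/* assign before testing, as in Sean's diff */
	if (c != EOF) {
		*out = (unsigned char)c;
		result = 1;
	} else {
		/* EOF covers both cases; ferror() tells them apart. */
		result = ferror(f) ? -1 : 0;
	}

	fclose(f);
	return result;
}
```

Calling read_first_byte("/dev/null", &b) should hit the end-of-file branch,
which is the case the original program mishandled.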



Re: Q: case studies about scalable, enterprise-class firewall w/ IPFilter

2008-08-06 Thread Jeremy Chadwick
On Wed, Aug 06, 2008 at 10:21:51AM +0200, Jordi Espasa Clofent wrote:
 Well, there are always Juniper Networks boxes :-)

 I do the same (even more in some points) as Juniper boxes with simple  
 standard boxes with OpenBSD and PF.

 At present day my central FWs are simply standard 2 boxes (each one cost  
 1000 euros aprox); I remember the Juniper guy offering me a 'cheap'  
 7000/12000 euros solution.. :P

I'm amazed at the fact that people are actually comparing FreeBSD with
pf to Juniper routers.  I've a bit of experience with M20s and M40s, and
I can assure you they're VERY different than a little x86 PC routing
packets, and are significantly faster due to hardware routing.

For example, you should be aware of a pf(4) bug that was only recently
fixed.  Our FreeBSD systems only use ACLs + state tracking, and have low
network I/O (600kbit/sec) -- yet this sort of thing impacts production
packets on a webserver:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/125261
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/contrib/pf/net/pf.c

Max committed the fix to CURRENT, and it should be MFC'd on the 11th.  I
hope it gets backported to RELENG_6 as well, since it's pretty major
(IMHO).

My point isn't to insult or poke fun at pf or FreeBSD.  I'm simply
stating that if you really think an x86 box with pf is better than a
Juniper, you're sadly mistaken.  I'm not telling you to go out and buy
a Juniper either, especially if it's out of your price range -- but you
really need to be more aware of the differences before touting the "my
FreeBSD box can do the job better!" attitude.  I'm glad FreeBSD with pf
works for you, though.

 Moreover, as far I know, the core of Juniper devices is BSD (FreeBSD  
 especially) based.

Correct, JunOS is FreeBSD 4.x-based.

On the other hand, I find it amusing that Juniper's routers use ATA
disks.  A single disk failure results in the system becoming unusable
administratively (requiring a reboot), while the routing engine still
works fine (e.g.  packets are still routed properly, ACLs applied,
etc.).  Config data is kept on CF, so that isn't lost.  You just can't
SSH into it, and all you'll see on serial console is repetitive ATA and
SMART errors.  I've seen this happen on three separate routers on three
separate occasions at my workplace.

For something that costs so much money, you'd have expected them to go
with some form of disk redundancy, SCSI disks, or SSDs.



Re: em0: The EEPROM Checksum Is Not Valid

2008-08-06 Thread Jeremy Chadwick
On Thu, Aug 07, 2008 at 08:34:44AM +0400, Vladimir Ermakov wrote:
 Hello

 my trouble with nic

 part of `dmesg`  output
 -
 em0: Intel(R) PRO/1000 Network Connection 6.9.5 port 0xec00-0xec3f mem  
 0xfebc-0xfebd,0xfeb8-0xfebb irq 19 at device 2.0 on pci2
 em0: The EEPROM Checksum Is Not Valid
 device_attach: em0 attach returned 5
 --

 part of `pciconf -lv` output
 --
 [EMAIL PROTECTED]:2:2:0: class=0x02 card=0x10018086 chip=0x10268086 
 rev=0x01  
 hdr=0x00
vendor = 'Intel Corporation'
device = '82545GM Gigabit Ethernet Controller'
class  = network
subclass   = ethernet
 --

 uname output
 --
 FreeBSD  7.0-STABLE FreeBSD 7.0-STABLE #2: Wed Jul 16 20:36:12 UTC 2008   
   root@:/usr/obj/usr/src/sys/STONE  i386
 --

 please, any solution?

Intel probably has a utility to reset the EEPROM settings on the NIC.
Jack Vogel may know where to get such a utility.

I do not believe this problem is FreeBSD-related.



Re: IPv6 CVS

2008-08-05 Thread Jeremy Chadwick
On Tue, Aug 05, 2008 at 12:04:33PM +0100, Pegasus Mc Cleaft wrote:
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:owner-freebsd-
  [EMAIL PROTECTED] On Behalf Of Stefan Sperling
  Sent: 05 August 2008 11:51
  To: Maxim Konovalov
  Cc: freebsd-hackers@freebsd.org; Pegasus Mc Cleaft; Tim Clewlow
  Subject: Re: IPv6 CVS
  
  On Tue, Aug 05, 2008 at 02:16:35PM +0400, Maxim Konovalov wrote:
   On Tue, 5 Aug 2008, 19:52+1000, Tim Clewlow wrote:
  
   
 Hi all,

 Does anyone know if there are any IPv6 CVS servers for FreeBSD (as in,
 for receiving the STABLE and ports branches)? I currently use
 cvs.freebsd.org, but it doesn't have an AAAA record.

 Ta

 Peg
   
 dig AAAA cvsup4.freebsd.org
   
   cvs != cvsup.  Speaking of cvsup -- cvsup4.ru.freebsd.org has an ipv6
   address as well.
  
  AFAIK the Modula3 runtime does not support IPv6.
  
  Stefan
 
 Thanks everyone, 
 
   Looks like Tim is correct: I am able to ping cvsup4, but
 unfortunately the csup utility reports a failure (Connection Refused) when
 it tries to connect to the V6 address. It will quite happily connect to the
 same machine over V4.

csup is written in C; it does not use Modula3/ezm3.  cvsup uses Modula3/ezm3.

cvsup4, despite having a public IPv6 address, does not have the cvsup
server bound to IPv6.  Meaning: it's IPv4-only.

Try a different server.  Get a list (in sh/bash):

for i in `jot 30 1`; do echo == cvsup$i ; (host cvsup$i.freebsd.org) | grep -i ipv6; done
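
Keep in mind the loop only shows whether a host name has an IPv6 address
in DNS; as with cvsup4, that says nothing about whether the cvsup server
is actually listening on it.  The same name lookup can be sketched in C
with getaddrinfo(3).  The function name has_ipv6_address is hypothetical,
and this checks name resolution only, not connectivity.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `host` resolves to at least one
 * IPv6 address, 0 otherwise.  It does not try to connect, so a host
 * can pass this check and still refuse connections over IPv6.
 */
int
has_ipv6_address(const char *host)
{
	struct addrinfo hints, *res = NULL;
	int found;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET6;	/* ask for IPv6 results only */
	hints.ai_socktype = SOCK_STREAM;

	if (getaddrinfo(host, NULL, &hints, &res) != 0)
		return 0;

	found = (res != NULL);
	freeaddrinfo(res);
	return found;
}
```

To rule out the "Connection Refused" case as well, one would have to
follow the lookup with an actual connect(2) to the service port.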



Re: Laptop suggestions?

2008-07-31 Thread Jeremy Chadwick
On Thu, Jul 31, 2008 at 11:13:03AM +0300, Stefan Lambrev wrote:
 Matt Olander wrote:
 On Jul 25, 2008, at 3:23 PM, Jeremy Messenger wrote:

 On Thu, 24 Jul 2008 09:34:32 -0500, Frank Mayhar [EMAIL PROTECTED] wrote:

 My old Dell Inspiron 5160 has developed problems that I can't fix,  
 sigh,
 so it's time to replace it.  I'm hoping for some good suggestions from
 this list (cc'd to hackers for the exposure, I know everyone doesn't
 read -mobile).

 My criteria:
  * 3D acceleration.
  * MiniPCI wireless (don't care which card, I'll replace it
anyway).
  * At least 15" screen.
  * Decent power consumption.
  * Plays well with FreeBSD 7-stable.

 Nice to have:
  * Dual core.
  * 4GB memory.
  * Working suspend/hibernate mode (and no, I'm not holding my
breath).

 So, suggestions?  BTW, if I get a decent response I'll summarize it for
 the list, along with the one I chose and my experience after
 ordering/installing it.

 Maybe you can wait for this:

 http://www.ixsystems.com/products/bsd-laptop.html

 Hi everyone! I actually had our prototype of this laptop up at the  
 OSCON show in Portland and it was pretty well received.
 Everything works for the most part although we're still tweaking some  
 things for ACPI.

 I'll have one at the FreeBSD booth at LinuxWorld in San Francisco next  
 week, August 5-7. We'll announce as soon as this thing is 100% and  
 we're comfortable bringing the product line up as an item that we're  
 comfortable supporting long term. Most likely, available to the  
 general public in September.
 Does it have a webcam, btw? I did not see one in the spec, but in the
 picture it looks like it has one.

FreeBSD has support for webcams?  News to me.



Re: Laptop suggestions?

2008-07-31 Thread Jeremy Chadwick
On Thu, Jul 31, 2008 at 11:17:54AM +0200, Achim Patzner wrote:
 Drivers? Who cares. Serial port? Just plug in an USB-to-serial.

You've obviously never used a USB-to-serial adapter.  Are you aware of
the fact that there is no serial device class as part of the USB
specification?  (Quite a great irony, if you ask me.  Universal SERIAL
Bus, yet no serial device class...)  AFAIK, there isn't even a draft
proposal for such.

You *must* have drivers for a USB-to-serial adapter.  And every adapter
is different, depending upon the adapter chipset used, many of which are
not disclosed in product specifications, so there's no way to guarantee
it'll work with FreeBSD.  On -stable (I believe) some people have
mentioned which USB-to-serial adapters work great under FreeBSD and
Windows, while others are horrible (dropping characters, broken flow
control, interrupt issues, and many other problems).

 It's a perfect machine for the desktop; I've forbidden FreeBSD to come
 creeping out the server room some years ago. I need it for keeping the
 penguins away, it's really good at that (no wonder - pitchforks do  
 hurt).
 But it's a pain for desktoppy things - so why shouldn't I use something
 less useful? And the other way round: Running Mac OS X Server is the
 most painful thing I've ever been paid for; I've been replacing a lot of
 them with FreeBSD-based servers.

The amount of rhetoric in these two paragraphs is amazing; I literally
cannot tell if you're trolling with anti-FreeBSD propaganda, or if
you're trolling with pro-FreeBSD propaganda.  Congratulations, you've
confused at least one reader.



Re: Laptop suggestions?

2008-07-31 Thread Jeremy Chadwick
On Thu, Jul 31, 2008 at 12:48:02PM +0200, Achim Patzner wrote:
 I don't care.

I can see that; thanks for summing it up.

 The amount of rhetoric in these two paragraphs is amazing; I literally
 cannot tell if you're trolling with anti-FreeBSD propaganda, or if
 you're trolling with pro-FreeBSD propaganda.  Congratulations, you've
 confused at least one reader.

 Wrong on both counts. I'm just using the appropriate tools for the jobs
 that need to be done. And on the desktop FreeBSD just plain sucks in
 comparison to Mac OS. And after all, Mac users need FreeBSD - who else
 should provide them with all the nice things from ipfw to the user land?
 Would you really expect Apple to do it all on its own?

 Face it: The real difference between servers and desktops is the who
 has to bend over-question. Servers are adapted to the software they
 are going to run while on personal computers the software has to adapt
 to the machine (I want that shiny Sony. I don't care if the hardware
 sucks, it's beautiful.). And Chuck is quite definitely lacking at  
 bending
 over...

You just did it again -- anti-FreeBSD propaganda and pro-FreeBSD
propaganda in a single paragraph, followed by an oddly-skewed
server-to-desktop comparison, something about computer cosmetics, then a
strange comment about the beastie/Chuck which seems to be negative but
could be positive depending on how you look at it.



Re: forcefsck on booting stage

2008-07-28 Thread Jeremy Chadwick
On Mon, Jul 28, 2008 at 04:11:49PM +0400, sam wrote:
 Hello,

 How to make 'fsck -f' on booting stage of remote system?

I believe by setting background_fsck=no in /etc/rc.conf?  That's the
only way I know of, besides booting single user and doing it manually.



Re: Laptop suggestions?

2008-07-25 Thread Jeremy Chadwick
On Fri, Jul 25, 2008 at 01:51:35PM +0800, Wilkinson, Alex wrote:
  If cost is not a big problem, then IBM/Lenovo Thinkpad Series (I prefer
  T-series) is the best from my past experiences. And you may check 
 following
 As someone who used (and use) 360, 701C, T30, T42p, X60 and T61p, I
 wholeheartedly agree with past experiences... with past being a key
 word. While I could not complain about FreeBSD support (none of the
 FreeBSD problems I have are ThinkPad-specific), manufacturing quality
 has gone down considerably. My not-two-years-old X60 chipped in places
 and my wife's 8-months-old T61p is no longer capable of keeping the
 screen upright. This is in the stark contrast with T42p I (ab)used for
 $work for more than three years, with the only visible outcome being
 loss of the caption on the Enter key.
 
 So if Thinkpads are no longer the go ... what is ?

"I'm buying a new computer, what should I buy?"

Buy whatever suits your needs, and feels comfortable for you.  I'd
recommend, if at all possible, going to a major computer store or
electronics outlet and trying out a Lenovo.  Spend 30 minutes with it.
I realise you can't run FreeBSD on them, but get a feel for the machine
itself -- if the keyboard works well with your fingers, if you like the
mixed touchpad/fingertip mouse, if it feels sturdy to you, if you like
the LCD, etc...

Example:

I really did not like the weight of the T60p.  I'm a cyclist and do not
drive, so hauling a laptop around means I prefer it to be light.  My
employer requires that all the T60ps use the larger battery, which plays
a significant role.  I tried a smaller model (I believe one with a 14"
screen), and the weight was wonderful -- but after 20 minutes of use, I
started experiencing headaches and nausea.  The backlighting on the LCD
was the cause, while I had no such problems using the T60p.

My point is, you gotta use the machine for a little bit (even if in
Windows) and get a feel for it.  I know this is hard to do when most
vendors nowadays expect people to just click-and-buy, but when spending
that kind of money on something, it's worth trying first.

With regards to OS compatibility, this is a difficult one.  Googling to
see what other people have experienced is pretty much the only option,
or you get to find out yourself.

Ideally in this day and age, you shouldn't have to worry about hardware
compatibility with an OS; the OS should work with what you have, not
the other way around.



Re: Laptop suggestions?

2008-07-25 Thread Jeremy Chadwick
On Fri, Jul 25, 2008 at 01:36:37PM +0300, Aggelidis Nikos wrote:
 On Fri, Jul 25, 2008 at 9:44 AM, Jeremy Chadwick [EMAIL PROTECTED] wrote:
 
  I'm buying a new computer, what should I buy?
 
  Buy whatever suits your needs, and feels comfortable for you.
 
 
  With regards to OS compatibility, this is a difficult one.  Googling to
  see what other people have experienced is pretty much the only option,
  or you get to find out yourself.
 
 But this is the reason Frank asks which computer to buy. I don't think
 he is expecting to be told about the weight of a laptop; he can figure
 this himself. But how can you figure OS compatibility from 20minutes
 test drive? That's why you ask for other persons experiences.
 Personally if i were to buy a laptop right now, i would buy one  that
 would be fully compatible with bsd or linux even if this meant paying
 a few more euros or getting something heavier... Unfortunately this
 kind of info {OS-- compatibility} isn't advertised, or written in
 specs.
 
 From my perspective freebsd should advertise(*)  the laptops that
 work with it, out of the box, so that new users {like me} know what to
 buy; and large corporations have a benefit for promoting OS
 compatibility other than Windows(tm).
 
 
 -best regards
 nikos
 
 (*) when i say advertise , i mean make this info publicly available
 and easily accessible from the website

You really have no idea how granular (or extreme) the changes laptop
vendors make to their laptops can be.  Do not, even for a moment, think
that every time they make a hardware modification they change a model
number or increase a version number: they don't.  Hell, it's hard enough
getting ASIC manufacturers to do this (Realtek, I'm looking at you).

Here's a real example: do you know how many actual hardware
revisions/models of the T60p there are, just in the United States?
Let's look:

http://www-307.ibm.com/pc/support/site.wss/homeLenovo.do?country=us

Select "Notebooks and Handhelds" from the pulldown.
For "Family", select "ThinkPad T60p".
Now look at how many entries there are under "Type".  Choose one.
Now look at how many entries there are under "Model".

Now do you still feel what you want is reasonable?  :-)

I understand where you're coming from -- you essentially want the
same thing Microsoft touts with their "Certified for xxx" logos on
hardware -- which, as I'm sure you also know, amounts to nothing more than
marketing schmooze.

User X would report that FreeBSD works on their laptop, but then 3
months later, user Y would report X feature doesn't work on their
laptop, which then raises the question "is this laptop really compatible
with FreeBSD?"  Etc. etc...

I'm sure you see where I'm coming from.



Re: Laptop suggestions?

2008-07-25 Thread Jeremy Chadwick
On Fri, Jul 25, 2008 at 08:35:44PM +0300, Razmig K wrote:
 How about Dell models which come with Ubuntu preinstalled? (Inspiron  
 1525N and 1420N, XPS M1330). Don't they have higher chances of running  
 FreeBSD smoothly? A quick glance over the hardware notes of 7.0-RELEASE  
 and some googling around show that wireless, video and audio are 
 supported.

A co-worker of mine has a Dell (I forget which model; I'll ask him this
coming week), running Kubuntu.  The overall compatibility is quite good,
and I haven't heard any complaints from him.


