latest memstick image fails to mountroot by default on Thinkpad e565

2016-06-03 Thread Matthew Macy

In order to boot from USB reliably on recent laptop hardware (both my ThinkPad
and XPS13 need this) you need to add the following to the installer image's
loader.conf:

kern.cam.boot_delay="1"
kern.cam.scsi_delay="3000"

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: buildworld: /usr/bin/ar segfault

2016-06-03 Thread Tim Kientzle

> On Jun 3, 2016, at 10:23 AM, Eric van Gyzen  wrote:
> 
> My buildworld just failed very early with a segfault from /usr/bin/ar:
> 
> --
> >>> stage 1.1: legacy release compatibility shims
> --
> ...
> --- libegacy.a ---
> building static egacy library
> ar -crD libegacy.a `NM='nm' NMFLAGS='' lorder dummy.o  | tsort -q`
> Segmentation fault (core dumped)
> *** [libegacy.a] Error code 139
> 
> 
> In __archive_write_allocate_filter(), a->filter_last was pointing to
> archive_write_ar_header().  a->format_write_header should have pointed
> to this function.  The offset between these two fields in struct archive
> is 48 bytes.  Sure enough, that structure recently grew by 48 bytes.
> 
> This would seem to indicate that ar (or libarchive.a) was built with
> mismatched objects.  Unfortunately, I don't have good records of what
> build options and flags I used.  I /think/ I used either
> -DWITH_SYSTEM_COMPILER or no options at all.

The build of 'ar' shouldn't matter since it's a client of libarchive
and libarchive clients do not ever see or manipulate the internals of
struct archive_write.

The problem would be with the build of the libarchive library.
It sounds like you somehow had a stale archive_write_set_format_ar.o
that did not get rebuilt when archive_write_private.h got updated recently.

If you still have the /usr/obj tree around, could you check the dates on these
files:
   archive_write_set_format_ar.o (in /usr/obj)
   archive_write_private.h (in /usr/src)

If those dates are in the wrong order (the .o should be newer), then
the make definitely went awry somewhere.
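
The date comparison described above can be scripted. This is only a
sketch: object_is_stale() is a hypothetical helper name, and both paths
are passed in by the caller because the exact locations under /usr/obj
and /usr/src are not given here.

```python
import os
import sys


def object_is_stale(object_path, header_path):
    """Return True if the object file is older than the header it depends on."""
    obj_mtime_ns = os.stat(object_path).st_mtime_ns
    hdr_mtime_ns = os.stat(header_path).st_mtime_ns
    # A correctly rebuilt .o should be newer than the header it includes.
    return obj_mtime_ns < hdr_mtime_ns


if __name__ == "__main__":
    # Usage: script.py <path to the .o> <path to the .h>
    # Nothing is assumed about where the two files live.
    if len(sys.argv) == 3:
        print("stale" if object_is_stale(sys.argv[1], sys.argv[2]) else "ok")
```

A result of "stale" would point to the wrong ordering, i.e. the .o was
not rebuilt after the header changed.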

Tim



thread suspension when dumping core

2016-06-03 Thread Mark Johnston
Hi,

I've recently observed a hang in a multi-threaded process that had hit
an assertion failure and was attempting to dump core. One thread was
sleeping interruptibly on an advisory lock with TDF_SBDRY set (our
filesystem sets VFCF_SBDRY). SIGABRT caused the recipient thread to
suspend other threads with thread_single(SINGLE_NO_EXIT), which fails
to interrupt the sleeping thread, resulting in the hang.

My question is, why does the SA_CORE handler not force all threads to
the user boundary before attempting to dump core? It must do so later
anyway in order to exit. As I understand it, TDF_SBDRY is intended to
avoid deadlocks that can occur when stopping a process, but in this
case we don't stop the process with the intention of resuming it, so it
seems erroneous to apply this flag.

Thanks,
-Mark


Re: VirtualBox network connectivity broken on recent -CURRENT

2016-06-03 Thread Don Lewis
On  3 Jun, Don Lewis wrote:
> It looks like something changed in -CURRENT to break network
> connectivity to VirtualBox guests.  This was last known to work with
> r299139 (May 6th) and is definitely broken with r301229.  The VirtualBox
> port revisions are:
>   virtualbox-ose-4.3.38_1
>   virtualbox-ose-kmod-4.3.38
> It looks like there was one change to the VirtualBox on May 9th, but it
> looks unlikely to be the cause of the problem.
> 
> The network settings are:
>   Attached to: Bridged Adapter
>   Name: re0
>   Adapter Type: Paravirtualized Network (virtio-net)
>   Promiscuous Mode: Deny
>   MAC Address: [snip]
> Ifconfig says that the interface is up, but I am unable to ping either
> the host or anything else on the LAN from the guest.  It looks like the
> problem is with outbound traffic.  If I attempt to ping the guest, the
> source IP address and MAC address show up in the guest's arp table, but
> ping reports:
>   ping: sendto: Host is down
> That makes me think that the arp responses from the guest are not
> getting transmitted.  None of the machines involved are running
> firewalls.  If I ping from the guest, I don't see any arp requests on
> the wire and the arp command shows the table entry as incomplete.
> 
> The problem shows up with both FreeBSD -CURRENT and Debian guests.

I see the same behaviour if I set:
  Attached to: NAT
or
  Adapter Type: 82540EM



VirtualBox network connectivity broken on recent -CURRENT

2016-06-03 Thread Don Lewis
It looks like something changed in -CURRENT to break network
connectivity to VirtualBox guests.  This was last known to work with
r299139 (May 6th) and is definitely broken with r301229.  The VirtualBox
port revisions are:
  virtualbox-ose-4.3.38_1
  virtualbox-ose-kmod-4.3.38
It looks like there was one change to the VirtualBox port on May 9th, but it
looks unlikely to be the cause of the problem.

The network settings are:
  Attached to: Bridged Adapter
  Name: re0
  Adapter Type: Paravirtualized Network (virtio-net)
  Promiscuous Mode: Deny
  MAC Address: [snip]
Ifconfig says that the interface is up, but I am unable to ping either
the host or anything else on the LAN from the guest.  It looks like the
problem is with outbound traffic.  If I attempt to ping the guest, the
source IP address and MAC address show up in the guest's arp table, but
ping reports:
  ping: sendto: Host is down
That makes me think that the arp responses from the guest are not
getting transmitted.  None of the machines involved are running
firewalls.  If I ping from the guest, I don't see any arp requests on
the wire and the arp command shows the table entry as incomplete.

The problem shows up with both FreeBSD -CURRENT and Debian guests.



New FreeBSD snapshots available: head (20160528 r301230)

2016-06-03 Thread Glen Barber
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

New FreeBSD development branch installation ISOs and virtual machine
disk images have been uploaded to the FTP mirrors.

As with any development branch, the installation snapshots are not
intended for use on production systems.  We do, however, encourage
testing on non-production systems as much as possible.

Please also consider installing the sysutils/panicmail port, which can
help in providing FreeBSD developers the necessary information regarding
system crashes.

Checksums for the installation ISOs and the VM disk images follow at
the end of this email.

=== Installation ISOs ===

Installation images are available for:

o 11.0-ALPHA2 amd64 GENERIC
o 11.0-ALPHA2 powerpc GENERIC
o 11.0-ALPHA2 powerpc64 GENERIC64
o 11.0-ALPHA2 sparc64 GENERIC
o 11.0-ALPHA2 armv6 BANANAPI
o 11.0-ALPHA2 armv6 BEAGLEBONE
o 11.0-ALPHA2 armv6 CUBIEBOARD
o 11.0-ALPHA2 armv6 CUBIEBOARD2
o 11.0-ALPHA2 armv6 CUBOX-HUMMINGBOARD
o 11.0-ALPHA2 armv6 GUMSTIX
o 11.0-ALPHA2 armv6 RPI-B
o 11.0-ALPHA2 armv6 RPI2
o 11.0-ALPHA2 armv6 PANDABOARD
o 11.0-ALPHA2 armv6 WANDBOARD
o 11.0-ALPHA2 aarch64 GENERIC

Note: The i386 build failed due to an issue generating the doc.txz
distribution, which is still undergoing investigation.

Note regarding arm/armv6 images: For convenience for those without
console access to the system, a freebsd user with a password of
freebsd is available by default for ssh(1) access.  Additionally,
the root user password is set to root.  It is strongly recommended
to change the password for both users after gaining access to the
system.

Snapshots may be downloaded from the corresponding architecture
directory from:

ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/

Please be patient if your local FTP mirror has not yet caught
up with the changes.

Problems, bug reports, or regression reports should be reported through
the Bugzilla PR system or the appropriate mailing list such as -current@
or -stable@ .

=== Virtual Machine Disk Images ===
 
VM disk images are available for the following architectures:

o 11.0-ALPHA2 amd64
o 11.0-ALPHA2 aarch64

Disk images may be downloaded from the following URL (or any of the
FreeBSD FTP mirrors):

ftp://ftp.freebsd.org/pub/FreeBSD/snapshots/VM-IMAGES/

Images are available in the following disk image formats:

~ RAW
~ QCOW2 (qemu)
~ VMDK (qemu, VirtualBox, VMWare)
~ VHD (qemu, xen)

The partition layout is:

~ 512k - freebsd-boot GPT partition type (bootfs GPT label)
~ 1GB  - freebsd-swap GPT partition type (swapfs GPT label)
~ ~17GB - freebsd-ufs GPT partition type (rootfs GPT label)

Note regarding arm64/aarch64 virtual machine images: a modified QEMU EFI
loader file is needed for qemu-system-aarch64 to be able to boot the
virtual machine images.  The file can be found here, for now, until
various patches are available upstream:

http://people.FreeBSD.org/~gjb/QEMU_EFI.fd

The checksums for this file are:

SHA256 (QEMU_EFI.fd) = 
a35335a418781fc0963c80ab12d548b6972d2c0b955f45664a4b780f4e5f48a2
MD5 (QEMU_EFI.fd) = ec03d51a3c4374a515cf32ab0c2721cf

To boot the VM image, run:

% qemu-system-aarch64 -m 4096M -cpu cortex-a57 -M virt \
    -bios QEMU_EFI.fd -serial telnet::,server -nographic \
    -drive if=none,file=VMDISK,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device virtio-net-device,netdev=net0 \
    -netdev user,id=net0

Be sure to replace "VMDISK" with the path to the virtual machine image.

=== Vagrant Images ===

FreeBSD/amd64 images are available on the Hashicorp Atlas site for the
VMWare Desktop and VirtualBox providers, and can be installed by
running:

% vagrant init freebsd/FreeBSD-11.0-ALPHA2
% vagrant up

== ISO CHECKSUMS ==

o 11.0-ALPHA2 amd64 GENERIC:
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-bootonly.iso) = 
b6ecbad09f01e1044343229ee93552c7c6adfc1c0cbe07d1a876a679544c626775be05534ec619ef5383b5419acc13110df7f47301522b6f0393e62626e0d3cb
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-bootonly.iso.xz) = 
3a5d9a57a38363d9c4def1df07ae13e814be72af77a1932e39c97ea11ffb28ac92c6290a50099ab8380cc6265f45214c307042a9f5c0727f87150a2b74479eb9
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-disc1.iso) = 
105c02b3736a2b7453a16a72b75be528362be5ebe0c5e1bb0a28f36814f298cb683e5de83d6813e249bbb27b820f55e1c93bb5b5f86cf07006efd07cdc80379c
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-disc1.iso.xz) = 
8ded96a1fd3ff4d9456918db469d114df01f94222751ca5028a58b08b851bc365887e761d10b8b9f66234e832f1657223d1e09b00533210cf520aa5582c5fd5d
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-dvd1.iso) = 
fb4af5a3fe0dc84c5768ae21e71bd21275a3bdd827acde2a36d2a610ae08113f24d60f6dc27a14ff2acac0b54a165385bbe75bedc2601a161df20a7e4630aeba
  SHA512 (FreeBSD-11.0-ALPHA2-amd64-20160528-r301230-dvd1.iso.xz) = 
70f55770d4aa9c9964be562224935db1674ef7dd01062347ac6680cfa82ddf989d33ca6da2b4e7a51f4a539719aac2d824651deaeebdd634dbc6cc26ca8

Re: amd64 11.0 -r301139 installworld (WITH_META_MODE=yes) fails for "Unable to determine compiler type"

2016-06-03 Thread Bryan Drewery
On 6/1/2016 12:38 PM, Bryan Drewery wrote:
>> WITHOUT_CROSS_COMPILER=
> It's likely related to this flag.  I'll look into it.

I've fixed this in r301287.

-- 
Regards,
Bryan Drewery





Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Adrian Chadd
On 3 June 2016 at 11:27, Adrian Chadd  wrote:

> That and the other NUMA stuff is something to address in -12.

And, I completely welcome continued development in NUMA scaling in
combination with discussion. The iterator changes I committed are a
more generic version of a patch people were applying on top of -10 and
-head for at least what, three years now? Maybe more if -9 also just
did round-robin and not first-touch?



-adrian


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Adrian Chadd
On 3 June 2016 at 10:55, Konstantin Belousov  wrote:
> On Fri, Jun 03, 2016 at 11:29:13AM -0600, Alan Somers wrote:
>> On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
>>  wrote:
>> > On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
>> >> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
>> >> patches never made it into head or ports.  Are they unsuitable for low
>> >> core-count machines, or is there some other reason not to commit them?
>> >>  If not, what would it take to get these into 11.0 or 11.1 ?
>> >
>> > The fast page fault handler was redesigned and committed in r269728
>> > and r270011 (with several follow-ups).
>> > Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
>> > to use uma, see r289279.  Other improvement to the buffer cache was
>> > committed as r267255.
>> >
>> > What was not committed is the aggressive pre-population of the phys objects
>> > mem queue, and a knob to further split NUMA domains into smaller domains.
>> > The later change is rotten.
>> >
>> > In fact, I think that with that load, what you would see right now on
>> > HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
>> > handle it.
>>
>> Thanks for the update.  Is it still recommended to enable the
>> multithreaded pagedaemon?
>
> Single-threaded pagedaemon cannot maintain the good system state even
> on non-NUMA systems, if machine has large memory.  This was the motivation
> for the NUMA domain split patch.  So yes, to get better performance you
> should enable VM_NUMA_ALLOC option.
>
> Unfortunately, there were some code changes of quite low quality which
> resulted in the NUMA-enabled system to randomly fail with NULL pointer
> deref in the vm page alloc path.  Supposedly that was fixed, but you
> should try that yourself.  One result of the mentioned changes was that
> nobody used/tested NUMA-enabled systems under any significant load, for
> quite long time.

The iterator bug was fixed, so it still behaves like it used to if
NUMA is enabled circa what, freebsd-9? If you'd like that older
behavior, you can totally flip back to the global policy being
round-robin only, and it's then a glorified, configurable-at-runtime
no-op.

The difference now is that you can tickle imbalances if you have too
many processes that need pages from a specific domain instead of round
robin, because the underlying tracking mechanisms still assume a
single global pool and global method of cleaning things.

That and the other NUMA stuff is something to address in -12.


-adrian


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Konstantin Belousov
On Fri, Jun 03, 2016 at 11:29:13AM -0600, Alan Somers wrote:
> On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
>  wrote:
> > On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
> >> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
> >> patches never made it into head or ports.  Are they unsuitable for low
> >> core-count machines, or is there some other reason not to commit them?
> >>  If not, what would it take to get these into 11.0 or 11.1 ?
> >
> > The fast page fault handler was redesigned and committed in r269728
> > and r270011 (with several follow-ups).
> > Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
> > to use uma, see r289279.  Other improvement to the buffer cache was
> > committed as r267255.
> >
> > What was not committed is the aggressive pre-population of the phys objects
> > mem queue, and a knob to further split NUMA domains into smaller domains.
> > The later change is rotten.
> >
> > In fact, I think that with that load, what you would see right now on
> > HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
> > handle it.
> 
> Thanks for the update.  Is it still recommended to enable the
> multithreaded pagedaemon?

A single-threaded pagedaemon cannot keep the system in a good state, even
on non-NUMA systems, if the machine has a large amount of memory.  This was
the motivation for the NUMA domain split patch.  So yes, to get better
performance you should enable the VM_NUMA_ALLOC option.

Unfortunately, there were some code changes of quite low quality which
caused NUMA-enabled systems to fail randomly with a NULL pointer
dereference in the vm page allocation path.  Supposedly that was fixed, but
you should try that yourself.  One result of the mentioned changes was that
nobody used or tested NUMA-enabled systems under any significant load for
quite a long time.


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Alan Somers
On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
 wrote:
> On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
>> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
>> patches never made it into head or ports.  Are they unsuitable for low
>> core-count machines, or is there some other reason not to commit them?
>>  If not, what would it take to get these into 11.0 or 11.1 ?
>
> The fast page fault handler was redesigned and committed in r269728
> and r270011 (with several follow-ups).
> Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
> to use uma, see r289279.  Other improvement to the buffer cache was
> committed as r267255.
>
> What was not committed is the aggressive pre-population of the phys objects
> mem queue, and a knob to further split NUMA domains into smaller domains.
> The later change is rotten.
>
> In fact, I think that with that load, what you would see right now on
> HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
> handle it.

Thanks for the update.  Is it still recommended to enable the
multithreaded pagedaemon?


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Konstantin Belousov
On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
> patches never made it into head or ports.  Are they unsuitable for low
> core-count machines, or is there some other reason not to commit them?
>  If not, what would it take to get these into 11.0 or 11.1 ?

The fast page fault handler was redesigned and committed in r269728
and r270011 (with several follow-ups).
Instead of lock-less buffer queue iterators, Jeff changed the buffer
allocator to use uma, see r289279.  Another improvement to the buffer cache
was committed as r267255.

What was not committed is the aggressive pre-population of the phys objects
mem queue, and a knob to further split NUMA domains into smaller domains.
The latter change is rotten.

In fact, I think that with that load, what you would see right now on
HEAD is contention on vm_page_queue_free_mtx.  There are plans to
handle it.


buildworld: /usr/bin/ar segfault

2016-06-03 Thread Eric van Gyzen
My buildworld just failed very early with a segfault from /usr/bin/ar:

--
>>> stage 1.1: legacy release compatibility shims
--
...
--- libegacy.a ---
building static egacy library
ar -crD libegacy.a `NM='nm' NMFLAGS='' lorder dummy.o  | tsort -q`
Segmentation fault (core dumped)
*** [libegacy.a] Error code 139


In __archive_write_allocate_filter(), a->filter_last was pointing to
archive_write_ar_header().  a->format_write_header should have pointed
to this function.  The offset between these two fields in struct archive
is 48 bytes.  Sure enough, that structure recently grew by 48 bytes.
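
To illustrate the kind of mismatch described above, here is a small
sketch using ctypes.  The two layouts are hypothetical stand-ins, not the
real libarchive structure: only the names filter_last and
format_write_header come from the report, and the other fields (magic,
added, other) are invented so that the structure grows by 48 bytes ahead
of filter_last.

```python
import ctypes


class OldLayout(ctypes.Structure):
    # Hypothetical layout before the structure grew.
    _fields_ = [
        ("magic", ctypes.c_uint64),
        ("filter_last", ctypes.c_uint64),
        ("other", ctypes.c_uint64 * 5),
        ("format_write_header", ctypes.c_uint64),
    ]


class NewLayout(ctypes.Structure):
    # Hypothetical layout after 48 bytes were added near the front.
    _fields_ = [
        ("magic", ctypes.c_uint64),
        ("added", ctypes.c_uint64 * 6),  # the 48 new bytes
        ("filter_last", ctypes.c_uint64),
        ("other", ctypes.c_uint64 * 5),
        ("format_write_header", ctypes.c_uint64),
    ]


def stale_write_lands_in(value=0xAB):
    """Store via the old layout, then read the same memory via the new one."""
    buf = bytearray(ctypes.sizeof(NewLayout))
    # A stale object compiled against the old header stores the pointer here.
    OldLayout.from_buffer(buf).format_write_header = value
    # Freshly built code reads the same memory with the new offsets.
    view = NewLayout.from_buffer(buf)
    return view.filter_last, view.format_write_header


if __name__ == "__main__":
    print(stale_write_lands_in())
```

With these made-up layouts, the value stored as format_write_header by
the "old" code is what the "new" code sees in filter_last, which is the
symptom reported above.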

This would seem to indicate that ar (or libarchive.a) was built with
mismatched objects.  Unfortunately, I don't have good records of what
build options and flags I used.  I /think/ I used either
-DWITH_SYSTEM_COMPILER or no options at all.  Well, I always use -j4. 
The "ar" that segfaulted seems to be /usr/bin/ar, so it's from r300692. 
Thanks to my list of Boot Environments (yay), I'm pretty sure I was
running r298525 when I built r300692 (and I was running r297692 when I
built r298525).  I installed r297692 from the snapshot memstick.img.

I recovered by restoring ar (and ranlib) from an old BE (yay again!). 
I'm reporting this in case there is a bug in the build and someone is
willing to go hunting for it based on this vague report.  :)

Eric


Re: CLOCK_MONOTONIC / CLOCK_UPTIME is not really monotonic between threads

2016-06-03 Thread Konstantin Belousov
On Fri, Jun 03, 2016 at 08:04:29AM -0700, Maxim Sobolev wrote:
> Konstantin,
> 
> Thanks for taking your time looking into it and sorry for somewhat messy
> problem report. I've been trying to fix that problem all day yesterday
> thinking it's just application logic that is broken and indeed has been
> able to find some bigger issues that were obscuring this one. But it got
> very frustrating when the bug popped out anew at a seemingly lower level
> now. The issue that triggered this is in some high level python code. Which
> makes it quite difficult to narrow and isolate. There is still slight
> chance that it's something about threading within the python that screws
> this up somehow, however I don't quite see how that could lead to a
> consistent result that is just off by few hundred microseconds and not in
> some random garbage.
> 
> So, I take from you message, that high level
> clock_gettime(CLOCK_MONOTONIC*) is supposed to be monotonic with respect to
> the wall time even when called in different threads? I always though it is,
> but was not 100% sure about that and wanted to confirm it before I dive
> deeper into this and spend more time writing a test case to expose this.
Yes, CLOCK_MONOTONIC should be monotonic across all processors.
Until time travel is made possible, of course.

> The test case you gave me is interesting, but somewhat low-level. What I
> would do if it comes to it, is to make something that uses pthreads and
> plain clock_gettime(2). Should not be too difficult to reproduce if it's
> real issue.
The test I gave you checks whether clock_gettime() goes backward across
several threads.

> 
> P.S. I've also tried kern.timecounter.fast_gettime=0, made no difference.
> Assuming it does not take a reboot to test it. Neither does
> switching kern.timecounter.hardware, I've tested TSC-low(1000)
> ACPI-fast(900) HPET(950) i8254(0), all are the same.
I am almost sure this is an app-level issue.

To make me confident, run the test I provided.
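
For reference, a threads-plus-clock_gettime check of the sort discussed
above could look like the sketch below.  It is not the test referred to
in this thread; count_backward_steps() and its parameters are
hypothetical names, and Python is used only because the affected
application is Python code.

```python
import threading
import time


def count_backward_steps(num_threads=4, duration_s=0.5):
    """Count readings of CLOCK_MONOTONIC that are lower than an earlier one."""
    lock = threading.Lock()
    state = {"last_ns": 0, "backward": 0}
    deadline = time.monotonic() + duration_s

    def worker():
        while time.monotonic() < deadline:
            # Read and compare under one lock so the order of readings is
            # well defined: a later reading must never be smaller.
            with lock:
                now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
                if now_ns < state["last_ns"]:
                    state["backward"] += 1
                state["last_ns"] = now_ns

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["backward"]


if __name__ == "__main__":
    print(count_backward_steps())
```

The lock matters: if a thread takes its timestamp first and compares it
against the shared value only later, another thread can store a newer
reading in between, and the comparison reports a regression that the
clock never produced.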


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Matthew Macy

 > >>> A couple small steps have been taken toward eliminating the need for
 > >>> this hack: the addition of the "page size index" field to struct
 > >>> vm_page and the addition of a similarly named parameter to
 > >>> pmap_enter().  However, at the moment, the only tangible effect is in
 > >>> the automatic prefaulting by mmap(2).  Instead of establishing 96 4KB
 > >>> page mappings, the automatic prefaulting establishes 96 page mappings
 > >>> whose size is determined by the size of the physical pages that it
 > >>> finds in the vm object.  So, the prefaulting overhead remains
 > >>> constant, but the coverage provided by the automatic prefaulting will
 > >>> vary with the underlying page size.
 > >> Yes, I think what we might actually want is what I mentioned in person
 > >> at BSDCan: some sort of flag to mmap() that malloc() could use to
 > >> assume that any reservations are fully used when they are reserved.
 > >> This would avoid the need to wait for all pages to be dirtied before
 > >> promotion provides a superpage mapping and would avoid demotions while
 > >> still allowing the kernel to gracefully fall back to regular pages if
 > >> a reservation can't be made.
 > >>
 > >
 > > I agree.
 >
 > I notice that, with the exception of the VM_PHYSSEG_MAX change, these
 > patches never made it into head or ports.  Are they unsuitable for low
 > core-count machines, or is there some other reason not to commit them?
 > If not, what would it take to get these into 11.0 or 11.1 ?
 >

I think the two big issues are: a) there's a lot more work that needs to be
done, and b) Adrian has had a lot of other things on his plate in the
meantime.  Adrian is hoping to get back to it post 11.0-RELEASE.



Re: No debug info for statically linked stuff

2016-06-03 Thread Ed Maste
On 3 June 2016 at 12:49, Eric van Gyzen  wrote:
> I'm running head from Tuesday (r301045).  I just noticed that "ar" is
> missing the debuginfo for libarchive:

I discussed this with Eric off this thread, but for the sake of others
this is happening because WITH_DEBUG_FILES enables -g when building
binaries and shared libs, but doesn't make any change for static
libraries.

Presumably we want WITH_DEBUG_FILES to just enable -g when building
static libs as well (and avoid stripping on install). Probably need to
leave it disabled for the Clang/LLVM/LLDB libs, because enabling that
would add a significant amount of time to buildworld. I think GNU ld
2.17.50 might not even be able to link a debug build of Clang on i386.


No debug info for statically linked stuff

2016-06-03 Thread Eric van Gyzen
I'm running head from Tuesday (r301045).  I just noticed that "ar" is
missing the debuginfo for libarchive:

$ /usr/local/bin/gdb /usr/bin/ar
GNU gdb (GDB) 7.11 [GDB v7.11 for FreeBSD]
[...]
Reading symbols from /usr/bin/ar...Reading symbols from
/usr/lib/debug//usr/bin/ar.debug...done.
done.
(gdb) ptype struct archive
No struct type named archive.
(gdb) p archive_write_open_filename
$1 = {} 0x406be0


Is this a known issue?

Has libarchive.a already been stripped when ar is linked?

Eric



Re: MMap in FreeBSD 11

2016-06-03 Thread Adrian Chadd
On 3 June 2016 at 05:34, Kubiak, Patryk  wrote:
> Hi ,
>
> I have a question about mmap and /dev/mem in FreeBSD 11. We found that
> our applications do not work correctly on 11: there is a problem with
> mapping physical memory via mmap and the /dev/mem device, and we are
> receiving a "Permission Denied" error. Is there any important change in
> the new version that prohibits this operation, or is it just a bug in the
> OS or the app?

Can you provide any further information? Maybe some example code?
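
Something as small as the sketch below would already help, since it shows
whether open() or mmap() is the call that fails.  It is not the
reporter's application; try_map() is a hypothetical helper, and the
offset and length in the example call are placeholders.

```python
import errno
import mmap
import os


def try_map(path, offset, length):
    """Try a read-only shared mapping; return (stage, errno name or None)."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return "open", errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        # The offset must be a multiple of mmap.ALLOCATIONGRANULARITY.
        mapping = mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                            prot=mmap.PROT_READ, offset=offset)
    except OSError as exc:
        return "mmap", errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
    mapping.close()
    return "ok", None


if __name__ == "__main__":
    # Placeholder arguments: first page of /dev/mem, one page long.
    print(try_map("/dev/mem", 0, mmap.PAGESIZE))
```

Reporting the stage together with the errno name, plus the user the
program runs as, would narrow things down considerably.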



-adrian

> Regards,
> Patryk


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Alan Somers
On Thu, Aug 14, 2014 at 12:19 PM, Alan Cox  wrote:
> On 08/14/2014 10:47, John Baldwin wrote:
>> On Wednesday, August 13, 2014 1:00:22 pm Alan Cox wrote:
>>> On Tue, Aug 12, 2014 at 1:09 PM, John Baldwin  wrote:
>>>
>>>> On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
>>>>> Hi!
>>>>>
>>>>>
>>>>> On 16 July 2014 06:29, Konstantin Belousov  wrote:
>>>>>> On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
>>>>>>> Hi,
>>>>>>> I did some measurements and hacks to see about the performance and
>>>>>>> scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
>>>>>>> Foundation.
>>>>>>>
>>>>>>> The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
>>>>>>> The uncommitted patches, referenced in the article, are available as
>>>>>>> https://kib.kiev.ua/kib/pig1.patch.txt
>>>>>>> https://kib.kiev.ua/kib/patch-2
>>>>>> A followup to the original paper.
>>>>>>
>>>>>> Most importantly, I identified the cause for the drop on the graph
>>>>>> after the 30 clients, which appeared to be the debugging version
>>>>>> of malloc(3) in libc.
>>>>>>
>>>>>> Also there are some updates on the patches.
>>>>>>
>>>>>> New version of the paper is available at
>>>>>> https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
>>>>>> The changes are marked as 'update for version 2.0'.
>>>>> Would you mind trying a default (non-PRODUCTION) build, but with junk
>>>>> filling turned off?
>>>>>
>>>>> adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
>>>>>
>>>>> lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
>>>>>
>>>>> That fixes almost all of the malloc debug performance issues that I
>>>>> see without having to recompile.
>>>>>
>>>>> I'd like to know if you see any after that.
>>>> OTOH, I have actually seen junk profiling _improve_ performance in certain
>>>> cases as it forces promotion of allocated pages to superpages since all
>>>> pages are dirtied.  (I have a local hack that adds a new malloc option to
>>>> explicitly memset() new pages allocated via mmap() that gives the same
>>>> benefit without the junking overhead on each malloc() / free(), but it
>>>> does increase physical RAM usage.)


>>> John,
>>>
>>> A couple small steps have been taken toward eliminating the need for this
>>> hack: the addition of the "page size index" field to struct vm_page and the
>>> addition of a similarly named parameter to pmap_enter().  However, at the
>>> moment, the only tangible effect is in the automatic prefaulting by
>>> mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
>>> prefaulting establishes 96 page mappings whose size is determined by the
>>> size of the physical pages that it finds in the vm object.  So, the
>>> prefaulting overhead remains constant, but the coverage provided by the
>>> automatic prefaulting will vary with the underlying page size.
>> Yes, I think what we might actually want is what I mentioned in person at
>> BSDCan: some sort of flag to mmap() that malloc() could use to assume that
>> any reservations are fully used when they are reserved.  This would avoid
>> the need to wait for all pages to be dirtied before promotion provides a
>> superpage mapping and would avoid demotions while still allowing the kernel
>> to gracefully fall back to regular pages if a reservation can't be made.
>>
>
> I agree.
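
The coverage point in the quoted text can be worked through with a little
arithmetic. This is only a sketch: the 96-mapping count and the 4KB base page
come from the message above, while the 2MB superpage size is an assumption
(the usual value on amd64), and real prefaulting may find a mix of page sizes,
so the second figure is an upper bound.

```python
# Worked arithmetic for the automatic prefaulting coverage discussed above.
# PREFAULT_MAPPINGS and BASE_PAGE are taken from the quoted message;
# SUPERPAGE is an assumption (typical amd64 superpage size).

PREFAULT_MAPPINGS = 96
BASE_PAGE = 4 * 1024
SUPERPAGE = 2 * 1024 * 1024


def prefault_coverage(page_size, mappings=PREFAULT_MAPPINGS):
    """Bytes of address space covered by `mappings` mappings of `page_size`."""
    return mappings * page_size


if __name__ == "__main__":
    for label, size in (("4KB pages", BASE_PAGE), ("2MB superpages", SUPERPAGE)):
        print(f"{label}: {prefault_coverage(size) // 1024} KB covered")
```

The number of mappings established stays at 96 either way, which is why the
overhead is constant while the covered range grows with the page size.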

I notice that, with the exception of the VM_PHYSSEG_MAX change, these
patches never made it into head or ports.  Are they unsuitable for low
core-count machines, or is there some other reason not to commit them?
If not, what would it take to get these into 11.0 or 11.1?

-Alan


Re: CLOCK_MONOTONIC / CLOCK_UPTIME is not really monotonic between threads

2016-06-03 Thread Maxim Sobolev
Konstantin,

Thanks for taking the time to look into it, and sorry for the somewhat messy
problem report. I spent all day yesterday trying to fix this, thinking it
was just broken application logic, and indeed found some bigger issues that
were obscuring this one. But it got very frustrating when the bug popped up
anew, now at a seemingly lower level. The issue that triggered this is in
some high-level Python code, which makes it quite difficult to narrow down
and isolate. There is still a slight chance that it's something about
threading within Python that screws this up somehow; however, I don't quite
see how that could lead to a consistent result that is off by just a few
hundred microseconds rather than some random garbage.

So, I take from your message that high-level
clock_gettime(CLOCK_MONOTONIC*) is supposed to be monotonic with respect to
the wall time even when called in different threads? I always thought it was,
but was not 100% sure and wanted to confirm it before I dive deeper into
this and spend more time writing a test case to expose it. The test case you
gave me is interesting, but somewhat low-level. What I would do, if it comes
to it, is write something that uses pthreads and plain clock_gettime(2). It
should not be too difficult to reproduce if it's a real issue.
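
A minimal sketch of such a test, written here in Python rather than C with
pthreads, might look like the following. It is an illustration only: the
function name and the queue handoff are hypothetical, it exercises just the
thread-to-thread handoff and not the UDP round trip, and the Python
interpreter serializes much of the work, so a C version would be closer to
the real application.

```python
import queue
import threading
import time


def count_negative_deltas(iterations=100_000):
    """Count how often a timestamp taken in thread B is earlier than the
    timestamp thread A took before handing the item over to B."""
    handoff = queue.Queue()
    negatives = 0

    def receiver():
        # Thread B: timestamp each item as soon as it is received.
        nonlocal negatives
        while True:
            t1 = handoff.get()
            if t1 is None:  # sentinel: thread A has finished
                return
            t2 = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
            if t2 < t1:
                negatives += 1

    thread_b = threading.Thread(target=receiver)
    thread_b.start()
    # Thread A: take t1 first, then pass it on, so t2 >= t1 is expected.
    for _ in range(iterations):
        handoff.put(time.clock_gettime_ns(time.CLOCK_MONOTONIC))
    handoff.put(None)
    thread_b.join()
    return negatives


if __name__ == "__main__":
    print("negative deltas:", count_negative_deltas())
```

Any nonzero count would mean the clock went backwards between the two
threads, since t1 is always taken before the item is queued.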

P.S. I've also tried kern.timecounter.fast_gettime=0; it made no difference
(assuming it does not take a reboot to take effect). Neither did switching
kern.timecounter.hardware: I've tested TSC-low(1000), ACPI-fast(900),
HPET(950) and i8254(0), and all behave the same.

In any case thanks for your valuable input, I think I have enough
information for now to investigate it further.

-Max
On Jun 2, 2016 10:06 PM, "Konstantin Belousov"  wrote:

> On Thu, Jun 02, 2016 at 09:16:47PM -0700, Maxim Sobolev wrote:
> > Hi there, we have an application here which is trying to measure UDP
> > command/response round-trip-time. It runs two posix threads (more
> > actually, but that's probably irrelevant), one (let's call it A) that
> > does high-level logic and the second one (B) that does network packet
> > I/O.
> >
> > The sending side is done by first thread A forming the request, then
> > calling the function clock_gettime(CLOCK_MONOTONIC) and passing the
> > packet into the thread B. Obtained timestamp is stored with some logical
> > transaction ID allowing us to pull that stored value later on when
> > response arrives. Then we have a separate process that receives those
> > requests, processing them and sending back some form of response.
> >
> > Upon receiving a response from the network, the network I/O thread (B)
> > timestamps it by running clock_gettime(CLOCK_MONOTONIC) and passes the
> > packet data along with that value via queue to the thread A for
> > processing.
> >
> > So if we put things into timeline, what our app does would probably look
> > something like the following:
> >
> > 1. Thread A generates request.
> > 2. A calls clock_gettime(CLOCK_MONOTONIC), storing value as t1 internally
> > 3. A passes packet to thread B
> > 4. B sends out packet via sendto() to server process running on the same
> > box (fully separate, not a thread)
> >
> > [some microseconds later]
> >
> > 5. B receives response from server with recvfrom()
> > 6. B instantly calls clock_gettime(CLOCK_MONOTONIC), assigns returned
> > value to t2
> > 7. B passes packet data along with t2 to the A via queue
> > 8. A picks up packet, parses it and retrieves corresponding t1 stored at
> > step 2.
> > 9. A calculates RTT by doing t2 - t1 assuming it's going to be positive...
> >
> > As you might have guessed if you are still reading, from time to time
> > t2 - t1 comes out slightly negative! Provided it's not some obscure bug
> > in our app, there is no way this could happen if
> > clock_gettime(CLOCK_MONOTONIC) worked as advertised. Event (6) could not
> > possibly happen earlier than (2), which is guaranteed by the fact that
> > the request needs to be processed by the external entity first in order
> > for the response to be seen by our app at all. I've added some logs and
> > they seem to confirm that the server only sees a single request, so
> > there is no chance for the client to receive some other packet and
> > confuse it. I've also confirmed with tcpdump, which shows a reasonable
> > delay of a few hundred microseconds between request and reply.
> >
> > I've checked all the logic and could not find any mistakes on my end
> > here, so I added some logging for such events. The distribution appears
> > to be centered around 0.6s, but there are some events that go as far up
> > as 0.012s.
> >
> > I've also tried using CLOCK_UPTIME_PRECISE instead, but it makes no
> > difference whatsoever.
> >
> > My questions therefore are:
> >
> > 1. Is it intended/expected behavior of the said API?
> No.
>
> > 2. Has anyone else bumped into this?
> Not that I am aware of.
>
> > 3. I know we are doing some clever optimizations using

Re: CLOCK_MONOTONIC / CLOCK_UPTIME is not really monotonic between threads

2016-06-03 Thread Maxim Sobolev
a. multiple cores.
b. makes no difference
c. yes, I believe so

kern.timecounter.tsc_shift: 1
kern.timecounter.smp_tsc_adjust: 0
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
machdep.tsc_freq: 2658118740
machdep.disable_tsc_calibration: 0
machdep.disable_tsc: 0

d. no, single socket Intel Q6700. I've also seen this problem on a Core
i7-4770 running VirtualBox 5.x. I have hints that this also happens on
our bigger production boxes, but I have no specifics yet.

On Thu, Jun 2, 2016 at 10:05 PM, Adrian Chadd wrote:

> [snip]
>
> a) is it on one core, or multiple cores?
> b) CLOCK_MONOTONIC_FAST?
> c) is it on a system that /has/ invariant-TSC ?
> d) is this a multi-socket system?
>
>
>
> -adrian
>
>


MMap in FreeBSD 11

2016-06-03 Thread Kubiak, Patryk
Hi,

I have a question about mmap and /dev/mem in FreeBSD 11. We found that our
applications do not work correctly on 11: there is a problem with mapping
physical memory via mmap and the /dev/mem device, and we receive a
"Permission denied" error. Is there an important change in the new version
that prohibits this operation, or is it just a bug in the OS or the app?
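
For anyone trying to reproduce this, a small probe along the following lines
may help show whether it is the open() or the mmap() call that returns the
error. This is a hedged sketch, not the application's code: the function
name, the read-only shared mapping, and the default offset and length are all
assumptions, and it says nothing about why the kernel refuses the request.

```python
import errno
import mmap
import os


def probe_mapping(path="/dev/mem", offset=0, length=mmap.PAGESIZE):
    """Try a read-only shared mapping and report which step fails.

    Returns (step, error_name); both are None when the mapping succeeds.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return "open", errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        try:
            region = mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                               prot=mmap.PROT_READ, offset=offset)
        except (OSError, ValueError) as exc:
            # ValueError carries no errno, so fall back to its repr.
            code = getattr(exc, "errno", None)
            return "mmap", errno.errorcode.get(code, repr(exc))
        region.close()
        return None, None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(probe_mapping())
```

Running it as the same user as the failing application, and then as root,
would show whether the failure depends on file permissions or on something
in the mapping request itself.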

Regards,
Patryk


Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial 
Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | 
Kapital zakladowy 200.000 PLN.



Re: CLOCK_MONOTONIC / CLOCK_UPTIME is not really monotonic between threads

2016-06-03 Thread Pieter de Goeje

On 2016-06-03 at 06:16, Maxim Sobolev wrote:

> 4. If the answer for (3) is yes, then what is the method to disable using
> TSC and use slower but possibly more reliable method?


sysctl kern.timecounter.choice lists the available timers.
sysctl kern.timecounter.hardware selects the timer. Changes take effect 
immediately.
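
As an illustration of what the choice list contains, the sketch below parses
a string in the NAME(quality) form seen earlier in this thread. The helper
names are hypothetical, the sample string is copied from that message rather
than read from a live system, and treating the highest quality value as the
preferred counter is an assumption made here for illustration.

```python
import re


def parse_timecounter_choice(text):
    """Parse 'NAME(quality) NAME(quality) ...' into (name, quality) pairs."""
    return [(name, int(quality))
            for name, quality in re.findall(r"(\S+?)\((-?\d+)\)", text)]


def best_timecounter(text):
    """Return the name with the highest quality, or None if nothing parsed."""
    choices = parse_timecounter_choice(text)
    return max(choices, key=lambda item: item[1])[0] if choices else None


if __name__ == "__main__":
    # Sample taken from the values reported earlier in the thread.
    sample = "TSC-low(1000) ACPI-fast(900) HPET(950) i8254(0)"
    print(parse_timecounter_choice(sample))
    print(best_timecounter(sample))
```

Any of the parsed names could then be written to kern.timecounter.hardware
to compare how each timer behaves.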


- Pieter
