sanitizers broken (was RE: libc/libsys split coming soon)

2024-02-21 Thread Hartmut.Brandt
Hi,

I updated yesterday and now even a minimal program with

cc -fsanitize=address

produces

ld: error: undefined symbol: __elf_aux_vector
>>> referenced by sanitizer_linux_libcdep.cpp:950 
>>> (/usr/src/contrib/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp:950)
>>>   sanitizer_linux_libcdep.o:(__sanitizer::ReExec()) in archive 
>>> /usr/lib/clang/17/lib/freebsd/libclang_rt.asan-x86_64.a
cc: error: linker command failed with exit code 1 (use -v to see invocation)

I think this is caused by the libsys split.

Cheers,
Harti

-Original Message-
From: owner-freebsd-curr...@freebsd.org  On 
Behalf Of Brooks Davis
Sent: Friday, February 2, 2024 11:32 PM
To: curr...@freebsd.org
Subject: libc/libsys split coming soon

TL;DR: The implementation of system calls is moving to a separate library 
(libsys).  No changes are required to existing software (except to ensure that 
libsys is present when building custom disk images).

Code: https://github.com/freebsd/freebsd-src/pull/908

After nearly a decade of intermittent work, I'm about to land a series of 
patches which moves system calls, vdso support, and libc's parsing of the ELF 
auxiliary argument vector into a separate library (libsys).  I plan to do this 
early next week (February 5th).

This change serves three primary purposes:
  1. It's easier to completely replace system call implementations for
     tracing or compartmentalization purposes.
  2. It simplifies the implementation of restrictions on system calls such
     as those implemented by OpenBSD's msyscall(2)
     (https://man.openbsd.org/msyscall.2).
  3. It allows language runtimes to link with libsys for system call
     implementations without requiring libc.

libsys is an auxiliary filter for libc.  This means that for any symbol defined 
by both, the libsys version takes precedence at runtime.  For system call 
implementations, libc contains empty stubs.  For others it contains copies of 
the functions (this could be further refined at a later date).  The statically 
linked libc contains the full implementations so linking libsys is not required.
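
The precedence rule described above can be illustrated with a toy model. This is only a sketch of the lookup order for an auxiliary filter, not rtld code: the two symbol tables, their contents, and the `resolve` helper are invented for illustration.

```python
# Toy model of auxiliary-filter symbol lookup (hypothetical, not rtld code).
# Each "library" is a dict mapping a symbol name to a short description.
LIBSYS = {
    "read": "libsys: system call implementation",
}
LIBC = {
    "read": "libc: empty stub",
    "strlen": "libc: full implementation",
}


def resolve(symbol, filter_lib=LIBC, filtee=LIBSYS):
    """Return the definition a lookup through the filter would pick.

    The symbol must be defined by the filter (libc). If the filtee
    (libsys) also defines it, the filtee's version takes precedence.
    """
    if symbol not in filter_lib:
        raise LookupError(f"undefined symbol: {symbol}")
    return filtee.get(symbol, filter_lib[symbol])


if __name__ == "__main__":
    print(resolve("read"))    # defined by both: the libsys version wins
    print(resolve("strlen"))  # defined only by libc: the libc version is used
```

In this model a symbol that the filter does not define is simply reported as undefined; how the real runtime linker treats such symbols is outside the scope of the sketch.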

Additionally, libthr is now linked with libsys to provide _umtx_op_err().

The overall implementation follows https://reviews.freebsd.org/D14609,
but is redone from scratch as multiple commits to facilitate review and assist 
git's rename detection.

Testing:
  - Boot testing on amd64, aarch64, and riscv
  - make tinderbox (prior version, final run in progress)
  - exp-run: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=276391
  - Kyua tests in poudriere amd64 jails: same 359 failures as with the
    latest freebsdci build

Thanks to Ali Mashtizadeh and Tal Garfinkel for D14609 and many apologies for 
not landing this in a timely manner.  Additional thanks to kib@ for many rounds 
of review, markj@ and kib@ for debugging rtld issues exposed by this patch, and 
antoine@ for exp-runs.

Future work:
  - Purely functional interfaces to system calls (no errno).
    Unfortunately there isn't an obvious way to do this without
    significant (possibly generated) assembly code.
  - Investigate msyscall(2) and pinsyscalls(2).
  - Reduce the size of stubs in libc.  I've erred on the
    side of not touching the copies that end up in libc to keep diff
    size down.  We might want to generate empty stubs instead.

See also:
  - Solaris Linker and Libraries Guide:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter4-4.html

-- Brooks



RE: Is "/usr/bin/sscop" still relevant? (related to ATM)

2020-11-11 Thread Hartmut.Brandt
Yes.

harti

From: Warner Losh 
Sent: Tuesday, November 10, 2020 10:27 PM
To: Brandt, Hartmut 
Cc: mj-mailingl...@gmx.de; FreeBSD Current 
Subject: Re: Is "/usr/bin/sscop" still relevant? (related to ATM)

So both the kernel and userland parts can go away?

./contrib/ngatm/sscop
./sys/modules/netgraph/atm/sscop
./sys/netgraph/atm/sscop
./usr.bin/atm/sscop

Warner

On Tue, Nov 10, 2020 at 1:27 AM hartmut.bra...@dlr.de wrote:
Hi,

this can go away. It is the transport protocol underlying ATM signaling.

harti

-Original Message-
From: owner-freebsd-curr...@freebsd.org On Behalf Of mj-mailingl...@gmx.de
Sent: Monday, November 9, 2020 10:07 PM
To: freebsd-current@freebsd.org
Subject: Is "/usr/bin/sscop" still relevant? (related to ATM)

Is "/usr/bin/sscop" still relevant? The sscop tool implements the Q.2110 
transport protocol, which is used in ATM-Networks.
The NATM framework was removed in April 2017, but sscop depends on netgraph 
(libngatm.so.4), so it seems to be independent from NATM.

The manpage refers to libunimsg(3), which does not exist, but unimsg(3) does.
It also depends on libngatm.so.4, which also does not have a man page.

So, is it still useful? Most of the documents I found about ATM are from the 
early 2000s to the mid-2010s.

--
Martin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to 
"freebsd-current-unsubscr...@freebsd.org"


RE: Is "/usr/bin/sscop" still relevant? (related to ATM)

2020-11-10 Thread Hartmut.Brandt
Hi,

this can go away. It is the transport protocol underlying ATM signaling.

harti

-Original Message-
From: owner-freebsd-curr...@freebsd.org  On 
Behalf Of mj-mailingl...@gmx.de
Sent: Monday, November 9, 2020 10:07 PM
To: freebsd-current@freebsd.org
Subject: Is "/usr/bin/sscop" still relevant? (related to ATM)

Is "/usr/bin/sscop" still relevant? The sscop tool implements the Q.2110 
transport protocol, which is used in ATM-Networks.
The NATM framework was removed in April 2017, but sscop depends on netgraph 
(libngatm.so.4), so it seems to be independent from NATM.
 
The manpage refers to libunimsg(3), which does not exist, but unimsg(3) does.
It also depends on libngatm.so.4, which also does not have a man page.
 
So, is it still useful? Most of the documents I found about ATM are from the 
early 2000s to the mid-2010s.
 
--
Martin


make problem

2018-08-15 Thread Hartmut.Brandt
Hi,

what is wrong with the following Makefile?

FILES=  a.in b.in a b
FILESDIR= /tmp/foo

.include 

.SUFFIXES: .in
.in:
cp $(.IMPSRC) $(.TARGET)

Given that a.in and b.in exist and 'make' has been executed, 'make install' 
gives the following error:

# sudo make install
installing DIRS FILESDIR
install  -d -m 0755 -o root  -g wheel  /tmp/foo
install  -o root  -g wheel -m 444  a.in /tmp/foo/a.in
install  -o root  -g wheel -m 444  b.in /tmp/foo/b.in
install  -o root  -g wheel -m 444  a /tmp/foo/a
cp _FILESINS1_a.in _FILESINS1_a
cp: _FILESINS1_a.in: No such file or directory
*** Error code 1

Stop.


RiscV tinderbox fails

2018-06-29 Thread Hartmut.Brandt
Hi,

is it supposed not to fail? I get:

/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libgcc.a(comparedf2.o): In function `__gedf2':
/usr/src/contrib/compiler-rt/lib/builtins/comparedf2.c:101: multiple definition of `__gedf2'
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libc.a(gedf2.o):/usr/src/lib/libc/softfloat/gedf2.c:18: first defined here
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libgcc.a(comparedf2.o): In function `__eqdf2':
/usr/src/contrib/compiler-rt/lib/builtins/comparedf2.c:127: multiple definition of `__eqdf2'
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libc.a(eqdf2.o):/usr/src/lib/libc/softfloat/eqdf2.c:18: first defined here
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libgcc.a(comparedf2.o): In function `__ltdf2':
/usr/src/contrib/compiler-rt/lib/builtins/comparedf2.c:127: multiple definition of `__ltdf2'
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libc.a(ltdf2.o):/usr/src/lib/libc/softfloat/ltdf2.c:18: first defined here
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libgcc.a(comparedf2.o): In function `__nedf2':
/usr/src/contrib/compiler-rt/lib/builtins/comparedf2.c:127: multiple definition of `__nedf2'
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libc.a(nedf2.o):/usr/src/lib/libc/softfloat/nedf2.c:18: first defined here
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libgcc.a(comparedf2.o): In function `__gtdf2':
/usr/src/contrib/compiler-rt/lib/builtins/comparedf2.c:142: multiple definition of `__gtdf2'
/usr/obj/usr/src/riscv.riscv64sf/tmp/usr/lib/libc.a(gtdf2.o):/usr/src/lib/libc/softfloat/gtdf2.c:18: first defined here
collect2: error: ld returned 1 exit status
*** [nologin.full] Error code 1

make[6]: stopped in /usr/src/usr.sbin/nologin
1 error

Regards,
harti



RE: int128_t and uint128_t typeinfo

2017-02-23 Thread Hartmut.Brandt
Now that appears to work.

Thanks,
harti

-Original Message-
From: Dimitry Andric [mailto:d...@freebsd.org] 
Sent: Wednesday, February 22, 2017 7:49 PM
To: Brandt, Hartmut
Cc: curr...@freebsd.org
Subject: Re: int128_t and uint128_t typeinfo

I had to commit a follow-up fix in r314104: when C++ names are used in the 
version script, they have to be surrounded by an extern "C++" {} block, 
otherwise the symbols end up as locals in the final library, and thus get 
stripped out of the installed version.

-Dimitry

On 22 Feb 2017, at 16:19, hartmut.bra...@dlr.de wrote:
> 
> Looks like they are still not there. I've rebuilt world.
> 
> nm -D -C /usr/lib/libcxxrt.so  | grep 128
> 
> should show me the symbols, right? It does not.
> 
> harti
> 
> -Original Message-
> From: Dimitry Andric [mailto:d...@freebsd.org]
> Sent: Tuesday, February 21, 2017 10:52 PM
> To: Brandt, Hartmut
> Cc: curr...@freebsd.org
> Subject: Re: int128_t and uint128_t typeinfo
> 
> On 21 Feb 2017, at 18:26, Dimitry Andric  wrote:
>> 
>> On 21 Feb 2017, at 13:48, Hartmut Brandt  wrote:
>>> 
>>> it looks like the typeinfo for __int128_t and __uint128_t is missing from 
>>> our dynamically linked libcxxrt.
> ...
>> * We also need to add the typeinfo for __u?int128_t * and 
>> __u?int128_t const *
>> * Maybe these should be under the CXXABI_2.0 version, since that is 
>> where newer libstdc++ places them
>> * Maybe these should be dependent on whether the architecture 
>> supports
>> 128 bit integers at all
>> 
>> I need to think a bit on the above, then I'll commit a fix.
> 
> Okay, can you please try r314061?
> 
> -Dimitry
> 



RE: int128_t and uint128_t typeinfo

2017-02-22 Thread Hartmut.Brandt
Looks like they are still not there. I've rebuilt world.

nm -D -C /usr/lib/libcxxrt.so  | grep 128

should show me the symbols, right? It does not.

harti

-Original Message-
From: Dimitry Andric [mailto:d...@freebsd.org] 
Sent: Tuesday, February 21, 2017 10:52 PM
To: Brandt, Hartmut
Cc: curr...@freebsd.org
Subject: Re: int128_t and uint128_t typeinfo

On 21 Feb 2017, at 18:26, Dimitry Andric  wrote:
> 
> On 21 Feb 2017, at 13:48, Hartmut Brandt  wrote:
>> 
>> it looks like the typeinfo for __int128_t and __uint128_t is missing from 
>> our dynamically linked libcxxrt.
...
> * We also need to add the typeinfo for __u?int128_t * and __u?int128_t  
> const *
> * Maybe these should be under the CXXABI_2.0 version, since that is  
> where newer libstdc++ places them
> * Maybe these should be dependent on whether the architecture supports
>  128 bit integers at all
> 
> I need to think a bit on the above, then I'll commit a fix.

Okay, can you please try r314061?

-Dimitry



asio and kqueue (2nd try) (was: RE: (boost::)asio and kqueue problem)

2016-10-14 Thread Hartmut.Brandt
Hi all,

here is the 2nd try taking into account the comments I received. Since I'm not 
familiar with the locking in the sockets area I ask somebody with that 
knowledge to check it before I commit it.

Thanks,
harti




From: Scott Mitchell [mailto:scott.k.mit...@gmail.com] 
Sent: Friday, October 14, 2016 2:16 AM
To: freebsd-current@freebsd.org
Cc: sepher...@gmail.com; kostik...@gmail.com; Brandt, Hartmut; 
adrian.ch...@gmail.com
Subject: (boost::)asio and kqueue problem

I am not using boost but I have also encountered this unexpected behavior when 
calling listen after kevent. Is there any update on the approach to merge 
filt_soread and filt_solisten?

FYI - MacOS does not have this unexpected behavior. Read events are not 
"missed" if the listen is done after the kevent EVFILT_READ change is 
registered.

Thanks,
-Scott


asio_listen.diff
Description: asio_listen.diff

RE: (boost::)asio and kqueue problem

2016-10-14 Thread Hartmut.Brandt
I have a fix that works and is better and simpler than the previous and will 
try to put it together in the next few days.

harti

From: Scott Mitchell [mailto:scott.k.mit...@gmail.com]
Sent: Friday, October 14, 2016 2:16 AM
To: freebsd-current@freebsd.org
Cc: sepher...@gmail.com; kostik...@gmail.com; Brandt, Hartmut; 
adrian.ch...@gmail.com
Subject: (boost::)asio and kqueue problem

I am not using boost but I have also encountered this unexpected behavior when 
calling listen after kevent. Is there any update on the approach to merge 
filt_soread and filt_solisten?

FYI - MacOS does not have this unexpected behavior. Read events are not 
"missed" if the listen is done after the kevent EVFILT_READ change is 
registered.

Thanks,
-Scott


RE: files disappearing from ls on NFS

2013-05-14 Thread Hartmut.Brandt
Now I've also changed NFS_DIRBLKSIZ to 4k - no change.

harti

-Original Message-
From: Rick Macklem [mailto:rmack...@uoguelph.ca] 
Sent: Tuesday, May 14, 2013 2:50 PM
To: Brandt, Hartmut
Cc: curr...@freebsd.org
Subject: Re: files disappearing from ls on NFS

Hartmut Brandt wrote:
> On Mon, 13 May 2013, Rick Macklem wrote:
> 
> RM>Hartmut Brandt wrote:
> RM>> On Sun, 12 May 2013, Rick Macklem wrote:
> RM>>
> RM>> RM>Hartmut Brandt wrote:
> RM>> RM>> Hi,
> RM>> RM>>
> RM>> RM>> I've updated one of my -current machines this week (previous
> RM>> update
> RM>> RM>> was in
> RM>> RM>> february). Now I see a strange effect (it seems only on NFS
> RM>> mounts):
> RM>> RM>> ls or
> RM>> RM>> even echo * will list only some files (strange enough the
> first
> RM>> files
> RM>> RM>> from
> RM>> RM>> the normal, alphabetically ordered list). If I change
> something
> RM>> in the
> RM>> RM>> directory (delete a file or create a new one) for some time
> the
> RM>> RM>> complete
> RM>> RM>> listing will appear but after some time (seconds to a minute
> or
> RM>> so)
> RM>> RM>> again
> RM>> RM>> only part of the files is listed.
> RM>> RM>>
> RM>> RM>> A ktrace on ls /usr/src/lib/libc/gen shows that
> getdirentries is
> RM>> RM>> called
> RM>> RM>> only once (returning 4096). For a full listing getdirentries
> is
> RM>> called
> RM>> RM>> 5
> RM>> RM>> times with the last returning 0.
> RM>> RM>>
> RM>> RM>> I can still open files that are not listed if I know their
> name,
> RM>> RM>> though.
> RM>> RM>>
> RM>> RM>> The NFS server is a Windows 2008 server with an OpenText NFS
> RM>> Server
> RM>> RM>> which
> RM>> RM>> works without problems to all the other FreeBSD machines.
> RM>> RM>>
> RM>> RM>> So what could that be?
> RM>> RM>>
> RM>> RM>I've attached a patch that might be worth trying. It is a
> "shot in
> RM>> the dark",
> RM>> RM>but brings the new NFS client's readdir closer to the old one
> RM>> (which you
> RM>> RM>mentioned still works ok).
> RM>> RM>
> RM>> RM>Please let me know how it goes, if you have a chance to test
> it,
> RM>> rick
> RM>>
> RM>> Hi Rick,
> RM>>
> RM>> the patch doesn't help.
> RM>>
> RM>> I wrote a small test program, which opens a directory, calls
> RM>> getdents(2)
> RM>> in a loop and dumps that. I figured out, that the return of the
> system
> RM>> call depends on the buffer size I pass to it. The directory has a 
> RM>> block size of 4k according to fstat(2). If I use that, I get some 
> RM>> 300
> of the
> RM>> almost 500 directory entries. If I use 8k, I get just around 200
> and
> RM>> if I
> RM>> use 16k I get a handful. If I dump the buffer in this case I see
> RM>> 0x200
> RM>> bytes filled with directory entries, then a lot of zeros and
> starting
> RM>> from
> RM>> 0x1000 again data. This is of course ignored because of the zeros 
> RM>> before.
> RM>>
> RM>And for this case getdents(2) returned 16K? It is normal for
> getdents(2)
> RM>to return less than requested and when end of dir occurs, it should
> return 0.
> RM>
> RM>But if it returns 16K, there shouldn't be zeroed space in the
> middle of
> RM>it.
> RM>
> RM>And this always occurs or only after you wait a while? (You noted
> in the
> RM>above description that it would be ok for a little while after a
> directory
> RM>change and then would break, which suggests some kind of caching
> problem.)
> 
> Today in the morning everything was fine. After waiting 5 minutes,
> again only partial directories. When I do a read with 8k buffer size,
> getdents(2) returns 8k, but starting from 0x200 until 0x1000 the
> buffer is filled with zeros. The entry just before the zeroes ends
> exactly at 0x200 (that would be the first byte of the next entry) and
> at 0x1000 a new entry starts. The rest of the buffer is fine. The next
> read returns only 4k and seems to be fine - although it contains some
> junk non-zero bytes in the padding.
> 
Directory entries should never cross DIRBLKSIZ boundaries (512 or 0x200), so it 
makes sense that one ends at 0x200 and one starts at 0x1000. What doesn't make 
sense are the 0 bytes in between.
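
To illustrate why the zeroed region hides the later entries, here is a minimal sketch that walks a synthetic directory buffer the way a reader advancing by d_reclen would. It is not the NFS client or libc code. The record header layout (32-bit d_fileno, 16-bit d_reclen, 8-bit d_type, 8-bit d_namlen, then the name) is an assumption, and the helper names `make_entry` and `walk` are made up for the example.

```python
import struct

# Assumed record header: d_fileno (u32), d_reclen (u16), d_type (u8),
# d_namlen (u8), followed by the NUL-terminated name.
HEADER = struct.Struct("<IHBB")
DT_REG = 8


def make_entry(fileno, name, reclen=None):
    """Build one synthetic record, padded with zeros to `reclen` bytes."""
    raw = name.encode()
    size = (HEADER.size + len(raw) + 1 + 3) & ~3  # round up to 4 bytes
    if reclen is None:
        reclen = size
    body = HEADER.pack(fileno, reclen, DT_REG, len(raw)) + raw
    return body.ljust(reclen, b"\0")


def walk(buf):
    """Collect names by hopping from record to record via d_reclen."""
    names = []
    off = 0
    while off + HEADER.size <= len(buf):
        _fileno, reclen, _dtype, namlen = HEADER.unpack_from(buf, off)
        if reclen == 0:
            break  # zeroed header: there is no way to find the next record
        start = off + HEADER.size
        names.append(buf[start:start + namlen].decode())
        off += reclen
    return names


# First 0x200 bytes hold valid entries; the last one is stretched so the
# block ends exactly at 0x200. Then zeros up to 0x1000, then valid data.
first = make_entry(1, "a")
second = make_entry(2, "b", reclen=0x200 - len(first))
hole = b"\0" * (0x1000 - 0x200)
buf = first + second + hole + make_entry(3, "c")

if __name__ == "__main__":
    print(walk(buf))           # -> ['a', 'b']
    print(walk(buf[0x1000:]))  # -> ['c']
```

The entry at 0x1000 is intact, but a walk that starts at offset 0 never reaches it, which matches the symptom of a listing that stops early.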

One difference between the old and new NFS clients, which the patch I sent you 
changed to the way the old one does it, is filling in the last block.
The old NFS client just leaves the block short and depends on n_direofoffset to 
recognize it is the last block with b_resid indicating where it ends.
For the new client (unless you've applied the patch I emailed you), it fills 
the rest of the last block in with "empty directories". This was in the OpenBSD 
code when I did the original NFSv4 stuff and port. I left it in, because I 
thought it might avoid problems if n_direofoffset was ever bogus. That is why 
there might be "different junk" at the end of the directory, but it shouldn't 
matter.

It almost sounds like something else is bzero()ing out part of the buffer cache 
block. Unless the directory has changed, the getdents() after 5 minutes would 
just return the same buffer cache 

RE: files disappearing from ls on NFS

2013-05-14 Thread Hartmut.Brandt
Hi Rick,

sorry for top-posting - this is Outlook :-(

Attached is the system configuration. I have used it more or less unchanged for 
years. The machine is an 8-core AMD64 with 144 GByte of memory.

The nfsstat -m output for the two file systems I'm testing with is:

knopfs01:/OP_UserUnix on /home
nfsv3,tcp,resvport,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=6126856,timeout=120,retrans=2
knopfs01:/op_software on /software
nfsv3,tcp,resvport,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=6126856,timeout=120,retrans=2
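
For checking the effective readdirsize, as suggested earlier in the thread, the option line can be split mechanically. A small sketch follows. The `parse_mount_options` helper is hypothetical and only handles the comma-separated `key=value` form shown in the output above.

```python
def parse_mount_options(line):
    """Split a comma-separated option line into a dict.

    Options without '=' (for example 'tcp') are stored as True;
    all other values are kept as strings.
    """
    opts = {}
    for item in line.strip().split(","):
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


# Option line copied from the nfsstat -m output above.
LINE = (
    "nfsv3,tcp,resvport,hard,cto,lockd,sec=sys,acdirmin=3,acdirmax=60,"
    "acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,"
    "wsize=65536,readdirsize=65536,readahead=1,wcommitsize=6126856,"
    "timeout=120,retrans=2"
)

opts = parse_mount_options(LINE)

if __name__ == "__main__":
    print("readdirsize =", opts["readdirsize"])  # -> readdirsize = 65536
```

With these mounts the readdirsize in effect is 64K, so the server does not appear to be limiting it to 4K.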

I did the tcpdump/wireshark thing and I'm puzzled that I see no readdir 
requests. I see a lookup, followed by getattr, access and fsstat for the 
directory and that's it. It looks like even after hours the data returned by 
getdirentries(2) comes from the cache. I assume that the NFS client uses getattr 
to check whether the directory has changed? If I knew what happens when calling 
getdirentries() I could add some debugging printf()s here and there to figure 
it out...

harti

-Original Message-
From: Rick Macklem [mailto:rmack...@uoguelph.ca] 
Sent: Tuesday, May 14, 2013 2:50 PM
To: Brandt, Hartmut
Cc: curr...@freebsd.org
Subject: Re: files disappearing from ls on NFS

Hartmut Brandt wrote:
> On Mon, 13 May 2013, Rick Macklem wrote:
> 
> RM>Hartmut Brandt wrote:
> RM>> On Sun, 12 May 2013, Rick Macklem wrote:
> RM>>
> RM>> RM>Hartmut Brandt wrote:
> RM>> RM>> Hi,
> RM>> RM>>
> RM>> RM>> I've updated one of my -current machines this week (previous
> RM>> update
> RM>> RM>> was in
> RM>> RM>> february). Now I see a strange effect (it seems only on NFS
> RM>> mounts):
> RM>> RM>> ls or
> RM>> RM>> even echo * will list only some files (strange enough the
> first
> RM>> files
> RM>> RM>> from
> RM>> RM>> the normal, alphabetically ordered list). If I change
> something
> RM>> in the
> RM>> RM>> directory (delete a file or create a new one) for some time
> the
> RM>> RM>> complete
> RM>> RM>> listing will appear but after some time (seconds to a minute
> or
> RM>> so)
> RM>> RM>> again
> RM>> RM>> only part of the files is listed.
> RM>> RM>>
> RM>> RM>> A ktrace on ls /usr/src/lib/libc/gen shows that
> getdirentries is
> RM>> RM>> called
> RM>> RM>> only once (returning 4096). For a full listing getdirentries
> is
> RM>> called
> RM>> RM>> 5
> RM>> RM>> times with the last returning 0.
> RM>> RM>>
> RM>> RM>> I can still open files that are not listed if I know their
> name,
> RM>> RM>> though.
> RM>> RM>>
> RM>> RM>> The NFS server is a Windows 2008 server with an OpenText NFS
> RM>> Server
> RM>> RM>> which
> RM>> RM>> works without problems to all the other FreeBSD machines.
> RM>> RM>>
> RM>> RM>> So what could that be?
> RM>> RM>>
> RM>> RM>I've attached a patch that might be worth trying. It is a
> "shot in
> RM>> the dark",
> RM>> RM>but brings the new NFS client's readdir closer to the old one
> RM>> (which you
> RM>> RM>mentioned still works ok).
> RM>> RM>
> RM>> RM>Please let me know how it goes, if you have a chance to test
> it,
> RM>> rick
> RM>>
> RM>> Hi Rick,
> RM>>
> RM>> the patch doesn't help.
> RM>>
> RM>> I wrote a small test program, which opens a directory, calls
> RM>> getdents(2)
> RM>> in a loop and dumps that. I figured out, that the return of the
> system
> RM>> call depends on the buffer size I pass to it. The directory has a 
> RM>> block size of 4k according to fstat(2). If I use that, I get some 
> RM>> 300
> of the
> RM>> almost 500 directory entries. If I use 8k, I get just around 200
> and
> RM>> if I
> RM>> use 16k I get a handful. If I dump the buffer in this case I see
> RM>> 0x200
> RM>> bytes filled with directory entries, then a lot of zeros and
> starting
> RM>> from
> RM>> 0x1000 again data. This is of course ignored because of the zeros 
> RM>> before.
> RM>>
> RM>And for this case getdents(2) returned 16K? It is normal for
> getdents(2)
> RM>to return less than requested and when end of dir occurs, it should
> return 0.
> RM>
> RM>But if it returns 16K, there shouldn't be zeroed space in the
> middle of
> RM>it.
> RM>
> RM>And this always occurs or only after you wait a while? (You noted
> in the
> RM>above description that it would be ok for a little while after a
> directory
> RM>change and then would break, which suggests some kind of caching
> problem.)
> 
> Today in the morning everything was fine. After waiting 5 minutes,
> again only partial directories. When I do a read with 8k buffer size,
> getdents(2) returns 8k, but starting from 0x200 until 0x1000 the
> buffer is filled with zeros. The entry just before the zeroes ends
> exactly at 0x200 (that would be the first byte of the next entry) and
> at 0x1000 a new entry starts. The rest of the buffer is fine. The next
> read returns only 4k and seems to be

RE: files disappearing from ls on NFS

2013-05-05 Thread Hartmut.Brandt
Hi Rick,

the patch doesn't help. So how can I help to fix that? Of course, I can use the 
work-around with oldnfs, but ...

harti

-Original Message-
From: Rick Macklem [mailto:rmack...@uoguelph.ca] 
Sent: Saturday, May 04, 2013 11:33 PM
To: Brandt, Hartmut
Cc: curr...@freebsd.org; Andrzej Tobola
Subject: Re: files disappearing from ls on NFS

Hartmut Brandt wrote:
> On Fri, 3 May 2013, Rick Macklem wrote:
> 
> RM>Ok, if you succeed in isolating the commit, that would be great.
> 
> Hmm. I'm somewhat stuck. clang from yesterday can't compile clang from 
> a month ago...
> 
> harti
> 
Oh well. You could try this patch (which is the one to fix readdir for union 
mounts), since I can see that VOP_VPTOCNP() will also be broken without it. (I 
can't see how that would break "ls", but it breaks __getcwd() and friends, so 
maybe it can affect "ls" somehow?)

It's a cut/paste under Windows, so I'm afraid the whitespace will be messed up, 
but it's pretty simple to apply by hand.

Index: nfs_clvnops.c
===
--- nfs_clvnops.c (revision 249568)
+++ nfs_clvnops.c (working copy)
@@ -2221,6 +2221,7 @@
      !NFS_TIMESPEC_COMPARE(&np->n_mtime, &vattr.va_mtime)) {
          mtx_unlock(&np->n_mtx);
          NFSINCRGLOBAL(newnfsstats.direofcache_hits);
+         *ap->a_eofflag = 1;
          return (0);
      } else
          mtx_unlock(&np->n_mtx);
@@ -2233,8 +2234,10 @@
  tresid = uio->uio_resid;
  error = ncl_bioread(vp, uio, 0, ap->a_cred);
 
- if (!error && uio->uio_resid == tresid)
+ if (!error && uio->uio_resid == tresid) {
      NFSINCRGLOBAL(newnfsstats.direofcache_misses);
+     *ap->a_eofflag = 1;
+ }
  return (error);
 }

I haven't yet succeeded in reproducing the problem, but will be poking at it 
some more, rick

> RM>
> RM>rick
> RM>
> RM>> harti
> RM>>
> RM>> On Fri, 3 May 2013, Rick Macklem wrote:
> RM>>
> RM>> RM>Hartmut Brandt wrote:
> RM>> RM>> Hi,
> RM>> RM>>
> RM>> RM>> I've updated one of my -current machines this week (previous
> RM>> update
> RM>> RM>> was in
> RM>> RM>> february). Now I see a strange effect (it seems only on NFS
> RM>> mounts):
> RM>> RM>> ls or
> RM>> RM>> even echo * will list only some files (strange enough the
> first
> RM>> files
> RM>> RM>> from
> RM>> RM>> the normal, alphabetically ordered list). If I change
> something
> RM>> in the
> RM>> RM>> directory (delete a file or create a new one) for some time
> the
> RM>> RM>> complete
> RM>> RM>> listing will appear but after some time (seconds to a minute
> or
> RM>> so)
> RM>> RM>> again
> RM>> RM>> only part of the files is listed.
> RM>> RM>>
> RM>> RM>> A ktrace on ls /usr/src/lib/libc/gen shows that
> getdirentries is
> RM>> RM>> called
> RM>> RM>> only once (returning 4096). For a full listing getdirentries
> is
> RM>> called
> RM>> RM>> 5
> RM>> RM>> times with the last returning 0.
> RM>> RM>>
> RM>> RM>> I can still open files that are not listed if I know their
> name,
> RM>> RM>> though.
> RM>> RM>>
> RM>> RM>> The NFS server is a Windows 2008 server with an OpenText NFS
> RM>> Server
> RM>> RM>> which
> RM>> RM>> works without problems to all the other FreeBSD machines.
> RM>> RM>>
> RM>> RM>> So what could that be?
> RM>> RM>>
> RM>> RM>Someone else reported missing files returned via "ls"
> recently,
> RM>> when
> RM>> RM>they used a small readdirsize (below 8K). I haven't yet had a
> RM>> chance to try
> RM>> RM>and reproduce it or do any snooping around.
> RM>> RM>
> RM>> RM>There haven't been any recent changes to readdir in the NFS
> client,
> RM>> RM>except a trivial one that adds a check for vnode type being
> VDIR,
> RM>> RM>so I don't see that it can be a recent NFS change.
> RM>> RM>
> RM>> RM>If you can increase the readdirsize, try that to see if it
> avoids
> RM>> RM>the problem. "nfsstat -m" shows you what the mount options end
> up
> RM>> RM>being after doing the mount. The server might be limiting the
> RM>> readdirsize
> RM>> RM>to 4K, so you should check, even if you specify a large value
> for
> RM>> RM>the mount.
> RM>> RM>
> RM>> RM>rick
> RM>> RM>
> RM>> RM>> Regards,
> RM>> RM>> harti

RE: files disappearing from ls on NFS

2013-05-05 Thread Hartmut.Brandt
Looks like the problem is in the new NFS code - the old code does the right 
thing. I still have to try your patch...

harti

From: Rick Macklem [rmack...@uoguelph.ca]
Sent: Sunday, May 05, 2013 12:49 AM
To: Brandt, Hartmut
Cc: curr...@freebsd.org
Subject: Re: files disappearing from ls on NFS

Hartmut Brandt wrote:
> On Fri, 3 May 2013, Rick Macklem wrote:
>
> RM>Ok, if you succeed in isolating the commit, that would be great.
>
> Hmm. I'm somewhat stuck. clang from yesterday can't compile clang from
> a
> month ago...
>
> harti
>
Oh, and one other thing you can try is switching to the old client
"mount -t oldnfs ...".

rick

> RM>
> RM>rick
> RM>
> RM>> harti
> RM>>
> RM>> On Fri, 3 May 2013, Rick Macklem wrote:
> RM>>
> RM>> RM>Hartmut Brandt wrote:
> RM>> RM>> Hi,
> RM>> RM>>
> RM>> RM>> I've updated one of my -current machines this week (previous
> RM>> update
> RM>> RM>> was in
> RM>> RM>> february). Now I see a strange effect (it seems only on NFS
> RM>> mounts):
> RM>> RM>> ls or
> RM>> RM>> even echo * will list only some files (strange enough the
> first
> RM>> files
> RM>> RM>> from
> RM>> RM>> the normal, alphabetically ordered list). If I change
> something
> RM>> in the
> RM>> RM>> directory (delete a file or create a new one) for some time
> the
> RM>> RM>> complete
> RM>> RM>> listing will appear but after some time (seconds to a minute
> or
> RM>> so)
> RM>> RM>> again
> RM>> RM>> only part of the files is listed.
> RM>> RM>>
> RM>> RM>> A ktrace on ls /usr/src/lib/libc/gen shows that
> getdirentries is
> RM>> RM>> called
> RM>> RM>> only once (returning 4096). For a full listing getdirentries
> is
> RM>> called
> RM>> RM>> 5
> RM>> RM>> times with the last returning 0.
> RM>> RM>>
> RM>> RM>> I can still open files that are not listed if I know their
> name,
> RM>> RM>> though.
> RM>> RM>>
> RM>> RM>> The NFS server is a Windows 2008 server with an OpenText NFS
> RM>> Server
> RM>> RM>> which
> RM>> RM>> works without problems to all the other FreeBSD machines.
> RM>> RM>>
> RM>> RM>> So what could that be?
> RM>> RM>>
> RM>> RM>Someone else reported missing files returned via "ls"
> recently,
> RM>> when
> RM>> RM>they used a small readdirsize (below 8K). I haven't yet had a
> RM>> chance to try
> RM>> RM>and reproduce it or do any snooping around.
> RM>> RM>
> RM>> RM>There haven't been any recent changes to readdir in the NFS
> client,
> RM>> RM>except a trivial one that adds a check for vnode type being
> VDIR,
> RM>> RM>so I don't see that it can be a recent NFS change.
> RM>> RM>
> RM>> RM>If you can increase the readdirsize, try that to see if it
> avoids
> RM>> RM>the problem. "nfsstat -m" shows you what the mount options end
> up
> RM>> RM>being after doing the mount. The server might be limiting the
> RM>> readdirsize
> RM>> RM>to 4K, so you should check, even if you specify a large value
> for
> RM>> RM>the mount.
> RM>> RM>
> RM>> RM>rick
> RM>> RM>
> RM>> RM>> Regards,
> RM>> RM>> harti