Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
On Thu, Apr 25, 2024 at 8:51 PM Rick Macklem  wrote:
>
> On Thu, Apr 25, 2024 at 8:09 PM Konstantin Belousov  wrote:
> >
> > On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> > > Hi,
> > >
> > > This week I have been doing active testing as a part of an IETF
> > > bakeathon for NFSv4. During the week I had a NFSv4 client
> > > crash. On the surface, it is straightforward, in that it called
> > > ncl_doio_directwrite() and the field called b_caller1 was NULL.
> > >
> > > Now, here's the weird part...
> > > ncl_doio_directwrite() should never be called because B_DIRECT
> > > should never be set. (The only place B_DIRECT gets set in the code
> > > is never currently executed.)
> > Do you mean the place in nfs_directio_write()?  And the fact that
> > IO_SYNC is always set.
> Yes.
>
> >
> > >
> > > I have a patch that clears out the "never to be executed" code and
> > > this seems to avoid the crash, since with the patch,
> > > ncl_doio_directwrite() no longer exists.
> > >
> > > What I cannot figure out is how B_DIRECT got set?
> > > I can note that UFS was under heavy load when the client crashed,
> > > but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> > > without b_flags being set to 0.
> >
> > There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
> > set B_DIRECT.  On the other hand, they are not used by nfs client.
> Yes, again.
>
> >
> > What was the overall state of the buffer with the B_DIRECT flag?  Which
> > vnode was it assigned to?
> Unfortunately I was in a hurry and didn't get more info.
> And, since I have never seen this crash before, I doubt I'll be able
> to reproduce it.
Oh, and I will put the cleanup patch on Phabricator. I didn't see the
crash again during a few days of testing with the patch. This makes
sense, since it gets rid of ncl_doio_directwrite().

>
> Thanks, rick



Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
On Thu, Apr 25, 2024 at 8:09 PM Konstantin Belousov  wrote:
>
> On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> > Hi,
> >
> > This week I have been doing active testing as a part of an IETF
> > bakeathon for NFSv4. During the week I had a NFSv4 client
> > crash. On the surface, it is straightforward, in that it called
> > ncl_doio_directwrite() and the field called b_caller1 was NULL.
> >
> > Now, here's the weird part...
> > ncl_doio_directwrite() should never be called because B_DIRECT
> > should never be set. (The only place B_DIRECT gets set in the code
> > is never currently executed.)
> Do you mean the place in nfs_directio_write()?  And the fact that
> IO_SYNC is always set.
Yes.

>
> >
> > I have a patch that clears out the "never to be executed" code and
> > this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
> > no longer exists.
> >
> > What I cannot figure out is how B_DIRECT got set?
> > I can note that UFS was under heavy load when the client crashed,
> > but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> > without b_flags being set to 0.
>
> There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
> set B_DIRECT.  On the other hand, they are not used by nfs client.
Yes, again.

>
> What was the overall state of the buffer with the B_DIRECT flag?  Which
> vnode was it assigned to?
Unfortunately I was in a hurry and didn't get more info.
And, since I have never seen this crash before, I doubt I'll be able
to reproduce it.

Thanks, rick



Re: mysterious setting of B_DIRECT?

2024-04-25 Thread Konstantin Belousov
On Thu, Apr 25, 2024 at 07:49:23PM -0700, Rick Macklem wrote:
> Hi,
> 
> This week I have been doing active testing as a part of an IETF
> bakeathon for NFSv4. During the week I had a NFSv4 client
> crash. On the surface, it is straightforward, in that it called
> ncl_doio_directwrite() and the field called b_caller1 was NULL.
> 
> Now, here's the weird part...
> ncl_doio_directwrite() should never be called because B_DIRECT
> should never be set. (The only place B_DIRECT gets set in the code
> is never currently executed.)
Do you mean the place in nfs_directio_write()?  And the fact that
IO_SYNC is always set.

> 
> I have a patch that clears out the "never to be executed" code and
> this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
> no longer exists.
> 
> What I cannot figure out is how B_DIRECT got set?
> I can note that UFS was under heavy load when the client crashed,
> but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
> without b_flags being set to 0.

There are also vfs_bio_brelse()/vfs_bio_setflags() functions which can
set B_DIRECT.  On the other hand, they are not used by nfs client.

What was the overall state of the buffer with the B_DIRECT flag?  Which
vnode was it assigned to?



mysterious setting of B_DIRECT?

2024-04-25 Thread Rick Macklem
Hi,

This week I have been doing active testing as a part of an IETF
bakeathon for NFSv4. During the week I had a NFSv4 client
crash. On the surface, it is straightforward, in that it called
ncl_doio_directwrite() and the field called b_caller1 was NULL.

Now, here's the weird part...
ncl_doio_directwrite() should never be called because B_DIRECT
should never be set. (The only place B_DIRECT gets set in the code
is never currently executed.)

I have a patch that clears out the "never to be executed" code and
this seems to avoid the crash, since with the patch, ncl_doio_directwrite()
no longer exists.

What I cannot figure out is how B_DIRECT got set?
I can note that UFS was under heavy load when the client crashed,
but I cannot see how a UFS "struct buf" would become a NFS "struct buf"
without b_flags being set to 0.

Anyone have any ideas? rick



Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread Tom Jones
Can you isolate out the extraneous stuff, loop TX and RX on a CP2101 board,
and send bytes through?

I did a bunch of development on an esp8266 board in the last few weeks and
had no issues, but I’ve no idea whether it was the same USB serial chip.

I’ll have a dig around and see if I have something matching.
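
A loopback check along those lines could be sketched roughly as follows. This is only an illustration, assuming the adapter's TX pin is jumpered to its RX pin and that the port object behaves like a pySerial Serial instance opened with a read timeout. The helper name loopback_ok, the chunk size and the payload are made up for this example, and the device path in the usage comment is simply the one from the original report.

```python
import os


def loopback_ok(port, payload=None, chunk=16):
    """Write payload to port in small chunks and check it is echoed back.

    `port` is any object with write(bytes) and read(n) -> bytes, for
    example a pySerial Serial instance opened with a read timeout.
    Hypothetical helper, not part of pySerial or esptool.
    """
    if payload is None:
        payload = os.urandom(256)
    received = bytearray()
    for start in range(0, len(payload), chunk):
        block = payload[start:start + chunk]
        port.write(block)
        # read() may return fewer bytes than requested when the timeout
        # expires; a short read means data was lost on the way back.
        received += port.read(len(block))
    return bytes(received) == payload


# Possible usage with pySerial (third-party), not executed here:
#   import serial
#   with serial.Serial("/dev/cuaU1", 115200, timeout=1) as port:
#       print("loopback ok" if loopback_ok(port) else "loopback FAILED")
```

If this fails with nothing but the jumper wire attached, esptool and the ESP32 are out of the picture and the adapter, cable, host controller or driver is the more likely suspect.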

On Thu, Apr 25, 2024, at 20:17, FreeBSD User wrote:
> [...]

Re: serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread Tomek CEDRO
CP2102 are pretty good ones and never let me down :-)

Is your UART connection to ESP32 working correctly? Can you see the
boot message and whatever happens next in terminal (cu / minicom)? Are
RX TX pins not swapped? Power supply okay?

Are the boards generic devkits or custom hardware? In addition to RX
and TX, the ESP32 needs two control lines, Reset and Boot, that switch
the chip to bootloader / flashing mode. Most USB-to-UART adapters use
the DTR/RTS lines for that. Are you sure these lines are available on
your board and connected to the target correctly? Do you have Reset and
Boot buttons on the board so you could trigger the bootloader by hand
(hold Boot, press and release Reset, the device will be in bootloader
upload mode, then retry esptool flashing)? You can also play with the
buttons with an active terminal attached (e.g. minicom) to see if they
work as expected.
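
The same Reset/Boot dance can be attempted from software when the adapter's modem control lines are wired to the chip. The sketch below is an assumption-laden illustration, not esptool's actual code: it assumes the common devkit auto-reset wiring in which asserting RTS pulls EN (Reset) low and asserting DTR pulls GPIO0 (Boot) low, and the delays are guesses. The function name enter_bootloader is hypothetical; the port only needs writable dtr and rts attributes, as a pySerial Serial object has.

```python
import time


def enter_bootloader(port, sleep=time.sleep):
    """Pulse DTR/RTS to reset an ESP32 into its serial bootloader.

    Assumes the usual devkit wiring: RTS asserted -> EN low (reset),
    DTR asserted -> GPIO0 low (boot select). `port` is any object with
    writable `dtr` and `rts` attributes, e.g. a pySerial Serial instance.
    """
    port.dtr = False   # GPIO0 high
    port.rts = True    # EN low: hold the chip in reset
    sleep(0.1)
    port.dtr = True    # GPIO0 low: select bootloader mode
    port.rts = False   # EN high: release reset, chip samples GPIO0
    sleep(0.05)
    port.dtr = False   # release GPIO0 again
```

If the chip prints its "waiting for download" boot message in the terminal after this sequence, the control lines are wired and working; if nothing changes, the manual button procedure is the thing to fall back on.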

Minicom serial terminal is pretty cool as it allows you to watch UART
behavior on adapter (un)plug. In minicom you can also enable/disable
hardware flow control lines (Ctrl+A O -> Serial Port Setup -> (F)
Hardware Flow Control). You can switch that easily and watch the
target behavior. If this is the problem you may want to use stty (1)
to enable/disable hardware flow control on the port.
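
For scripted tests, the flow-control toggle can also be done with Python's standard termios module instead of stty. This is a minimal sketch: the function names are made up for the example, and it only flips the CRTSCTS bit in the control flags of an already opened tty, leaving every other setting alone.

```python
import termios


def with_hw_flow_control(attrs, enable):
    """Return a copy of a termios attribute list with CRTSCTS set or cleared.

    `attrs` has the layout returned by termios.tcgetattr():
    [iflag, oflag, cflag, lflag, ispeed, ospeed, cc].
    """
    new = list(attrs)
    if enable:
        new[2] |= termios.CRTSCTS
    else:
        new[2] &= ~termios.CRTSCTS
    return new


def set_hw_flow_control(fd, enable):
    """Apply the setting to an open tty file descriptor immediately."""
    attrs = termios.tcgetattr(fd)
    termios.tcsetattr(fd, termios.TCSANOW, with_hw_flow_control(attrs, enable))
```

Calling set_hw_flow_control(fd, False) on the descriptor of the opened serial device before talking to the ESP32 is one way to rule out RTS/CTS handshaking as the cause of the missing replies.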

Can you try with another board? ESP32 has fuses that may permanently
disable and/or mess up some hardware features.

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info



serial/ulscom: response timeout using pySerial/esptool.py

2024-04-25 Thread FreeBSD User
Hello,

Host: 15.0-CURRENT FreeBSD 15.0-CURRENT #36 main-n269703-54c3aa02e926:
Thu Apr 25 18:48:56 CEST 2024 amd64, or 14-STABLE recently compiled
(dmesg/uname not at hand).

Hardware: oldish Z77Pro 4 based Asrock mainboard, a Lenovo T560 notebook,
Fujitsu Esprimo Q5XX (simple desktop, Pentium Gold) or an oldish Fujitsu
Celsius 7XX workstation, 6 core Haswell XEON.

Phenomenon: for a couple of weeks now I have been trying to connect to
several Xtensa ESP32 dev boards (ESP32-WROOM32 with CP2101 or CP2104 UART)
via comms/py-esptool (it doesn't matter whether it is the port's
py39-esptool 4.5 or the latest py-esptool 4.7.0, nor whether it is the pkg
package or self compiled on CURRENT and 14-STABLE; the result is the same
on all hardware platforms).

Attaching the ESP devel module via Micro USB cable (several types from
different vendors tried ...) shows

dmesg:
[...]
ugen0.4:  at usbus0
uslcom0 on uhub3
uslcom0: 
on usbus0
[...]

When trying to connect to the ESP32 via the command shown below (--trace
not issued every time), I get no connection:

[ohartmann]: esptool.py --trace --chip esp32 --baud 115200 --port /dev/cuaU1 flash_id
esptool.py v4.7.0
Loaded custom configuration from /pool/home/ohartmann/esptool.cfg
Serial port /dev/cuaU1
Connecting...TRACE +0.000 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.102 No serial data received.
TRACE +0.052 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.107 No serial data received.
TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.
TRACE +0.107 No serial data received.
TRACE +0.054 command op=0x08 data len=36 wait_response=1 timeout=0.100 data=
07071220  | ... 
  | 
  | 
TRACE +0.000 Write 46 bytes: 
c824 000707122055 | ...$ UUU
  | 
 55c0 | U.


A serial exception error occurred: device reports readiness to read but
returned no data (device disconnected or multiple access on port?)
Note: This error originates from pySerial. It is likely not a problem with
esptool, but with the hardware connection or drivers.
For troubleshooting steps visit:
https://docs.espressif.com/projects/esptool/en/latest/troubleshooting.html
[...]


Whatever baud rate is issued, in most cases, on all tested OS versions and
on almost all hardware platforms except the Fujitsu Celsius 7XX (2015
model), I do not get any connection! And it gets more weird: to avoid
out-of-sync software I recompiled everything via "portmaster -df
comms/py-pyserial comms/py-esptool" and after that, everything was fine,
the connection was made, I got results out of the chip. Seconds later, the
same problems.

I exchanged cables, exchanged the ESP32 model and vendor. Invariants are
14-STABLE, compiled daily, and CURRENT, compiled daily. On my private box
(old Z77 based IvyBridge ASRock crap), a couple of Lenovo T560 running
14-STABLE and several Fujitsu Esprimo Q5XX boxes there is always this weird
error message, but in very rare cases I get a connection.

The only exception: the Fujitsu Celsius 7XX workstation (14-STABLE, last
compiled today at noon). No matter what ESP32, no matter what vendor, no
matter what cable is used: the connection is established at any baud rate
issued, at any time. Not one single failure as shown above in any session
(I checked several tens of times)!

Now I'm out of ideas and I suspect the CP210X uslcom serial driver to have
trouble with most onboard serial chipsets.

Can anyone help me track down this issue? Is there anything I could have missed?

It drives me nuts ...

Thanks in advance,

Oliver

 
-- 
O. Hartmann