Re: Linux and usb device drivers using functionfs

2017-10-31 Thread Greg KH
On Mon, Oct 30, 2017 at 04:51:57PM +, andy_purc...@keysight.com wrote:
> Hello, 
> 
> I have implemented a USB device function using Linux functionfs and now there 
> is a problem being reported. 
> I need to ask this group for advice. 
> 
> The problem is this: 
> 1) device boots 
> 2) some usb transfers happen, all are OK 
> 3) a device app runs to completion (USB quiescent during this time, no USB 
> transfers required) 
> 4) the controlling PC starts a 4 KByte USB transfer to the device, but this 
> transfer does not finish. Only 3 Kbytes are ACK'd by the device.
>  (A USB analyzer shows the host trying to send more, but the device 
> persistently NAK's)
> 
> If step (3) is omitted, everything works fine. It is reliable - 15/15 times 
> it is OK.
> 
> The USB device function is implemented with functionfs and aio. Most of the 
> implementation is in user space.
> An off-the-shelf low level Linux driver is being used. 
> Regression tests show no problems with various sized USB transfers for over 
> 24 hours.
> 
> A colleague has investigated and has asserted user space is not the right way 
> to do things.
> He says:
> 
> "It appeared that running the  was enough to swap the usb code 
> out that it wasn't able to swap back in quick enough to respond to the USB 
> traffic in a timely fashion"  "This is the major drawback to user space 
> drivers as opposed to kernel drivers.  Kernel drivers pages are locked into 
> memory while user space can be swapped out.  There were numerous articles 
> about this, but the best one I found was:
> http://www.makelinux.net/ldd3/chp-2-sect-9 "
> Linux Device Drivers, 3rd Edition, By Jonathan Corbet, Greg 
> Kroah-Hartman, Alessandro Rubini  : February 2005


Are you really using swap in your system, and does the USB application
get swapped out to the rotating media?  Why not just lock it into
memory?  It should not be a lot, as functionfs apps "should" be pretty
small.

> "The pertinent part is:
> o    Response time is slower, because a context switch is required to 
> transfer information or actions between the client and the hardware.
> o    Worse yet, if the driver has been swapped to disk, response time is 
> unacceptably long. Using the mlock system call might help, but usually you'll 
> need to lock many memory pages, because a user-space program depends on a lot 
> of library code. mlock, too, is limited to privileged users.
> Some articles I read stated that the swap could take seconds." 
> 
> 
> QUESTIONS: 
> - Did I make a mistake using user space and functionfs? 
>   (I thought state-of-the-art way to do usb function drivers was to use 
> functionfs...) 

No, it should be ok, what exactly are you failing to respond to?
Application-level requests?  Shouldn't the USB hardware just respond
with NAKs until the data is ready to be sent by your program?

> - Should I add calls to mlock() to try to fix?

Yes, that should be easy to test.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
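
A minimal sketch of the "lock it into memory" test suggested above, assuming
the functionfs application is written in C. The helper name
lock_process_memory() and its tag argument are hypothetical and not from the
thread; mlockall() and its MCL_* flags are the standard Linux interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): lock every page the process
 * has mapped now (MCL_CURRENT) and every page it maps later (MCL_FUTURE).
 * 'tag' only labels the error message. Returns 0 or a negative errno.
 */
static int lock_process_memory(const char *tag)
{
    assert(tag != NULL);

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;
        fprintf(stderr, "%s: mlockall: %s\n", tag, strerror(err));
        return -err;
    }
    return 0;
}
```

The call would normally go near the top of main(), before the endpoint files
are opened. Checking the return value matters: an unprivileged process whose
locked-memory limit is too small typically gets ENOMEM or EPERM, and the
program then runs unlocked.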


Re: Linux and usb device drivers using functionfs

2017-10-31 Thread Felipe Balbi

Hi,

andy_purc...@keysight.com writes:
> I have implemented a USB device function using Linux functionfs and
> now there is a problem being reported.
>
> I need to ask this group for advice. 
>
> The problem is this: 
> 1) device boots 
>
> 2) some usb transfers happen, all are OK 
>
> 3) a device app runs to completion (USB quiescent during this time, no
> USB transfers required)
>
> 4) the controlling PC starts a 4 KByte USB transfer to the device, but
> this transfer does not finish. Only 3 Kbytes are ACK'd by the device.
>
>  (A USB analyzer shows the host trying to send more, but the
>  device persistently NAK's)
>
> If step (3) is omitted, everything works fine. It is reliable - 15/15
> times it is OK.
>
> The USB device function is implemented with functionfs and aio. Most
> of the implementation is in user space.
>
> An off-the-shelf low level Linux driver is being used. 
>
> Regression tests show no problems with various sized USB transfers for
> over 24 hours.

Okay, let's try to figure out what's going on. Are you using dwc3, by
any chance? If you are, can you capture tracepoints of the failing case?

While it could be something on the application side, I want to be sure
the controller is behaving properly.

For details on how to capture tracepoints, see [1] below.

> A colleague has investigated and has asserted user space is not the
> right way to do things.
>
> He says:
>
> "It appeared that running the  was enough to swap the usb
> code out that it wasn't able to swap back in quick enough to respond
> to the USB traffic in a timely fashion"  "This is the major
> drawback to user space drivers as opposed to kernel drivers.  Kernel
> drivers pages are locked into memory while user space can be swapped
> out.  There were numerous articles about this, but the best one I
> found was:
>
> http://www.makelinux.net/ldd3/chp-2-sect-9 "
>
> Linux Device Drivers, 3rd Edition, By Jonathan Corbet,
> Greg Kroah-Hartman, Alessandro Rubini : February 2005
>
> "The pertinent part is:
>
> o    Response time is slower, because a context switch is required to
> transfer information or actions between the client and the hardware.
>
> o    Worse yet, if the driver has been swapped to disk, response time
> is unacceptably long. Using the mlock system call might help, but
> usually you'll need to lock many memory pages, because a user-space
> program depends on a lot of library code. mlock, too, is limited to
> privileged users.
>
> Some articles I read stated that the swap could take seconds."
>
>
> QUESTIONS: 
>
> - Did I make a mistake using user space and functionfs?
>   (I thought state-of-the-art way to do usb function drivers was to
>   use functionfs...)

right, unless you can use some of the in-tree functions, it doesn't make
sense to rely on an ever-changing internal API :-)

> - Should I add calls to mlock() to try to fix?

that's an easy enough test, yes :-)

> Any advice is appreciated. 

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/driver-api/usb/dwc3.rst#n113

-- 
balbi




RE: Linux and usb device drivers using functionfs

2017-10-31 Thread andy_purcell
Hello Felipe,

I am not using DWC3.

I have new information about the problem. 
The context was this: 
  1) device boots 
  2) some usb transfers happen, all are OK 
  3) a device app runs to completion (USB quiescent during this time, no USB 
transfers required) 
  4) the controlling PC starts a 4 KByte USB transfer to the device, but this 
transfer does not finish. Only 3 Kbytes are ACK'd by the device.
   (A USB analyzer shows the host trying to send more, but the device 
persistently NAK's)

  If step (3) is omitted, everything works fine.

The new information is that step (3) consumes a lot of memory and the theory is 
the OS is throwing USB user-space code pages out of RAM and using this RAM for 
NFS files (root file system is NFS mounted). 

Continuing on with this theory:
After the USB related code pages are no longer in RAM, a USB transaction 
happens. 
Now there is a race condition:
1)  The USB transaction complete interrupt and the OS call into user-space 
USB related code (functionfs, aio, etc) 
2)  The Linux paging system trying to page the user-space USB related code 
back into RAM

The theory is that (1) can happen before (2). 

As I understand it, the techniques below may solve the problem: 
* USB user space code calls mlockall()
* Change system swappiness


AP 
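
One way to probe the theory above is to ask the kernel which pages of a
mapping are currently in RAM. The sketch below uses mincore() for that; the
helper name count_resident_pages() is hypothetical, and the result should be
treated as a hint only, since residency can change right after the call and
reporting for file-backed mappings may differ between kernel versions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of [addr, addr + len) are
 * currently resident in RAM. Stores the number of pages examined in
 * *total. Returns the resident count, or -1 on error.
 */
static long count_resident_pages(const void *addr, size_t len, size_t *total)
{
    assert(addr != NULL && len > 0 && total != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* mincore() requires a page-aligned start address. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    size_t span = ((uintptr_t)addr - start) + len;
    size_t npages = (span + (size_t)page - 1) / (size_t)page;

    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;

    long resident = -1;
    if (mincore((void *)start, span, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1; /* bit 0 set: page is in RAM */
    }
    free(vec);
    *total = npages;
    return resident;
}
```

Calling it on a buffer or mapping of interest before and after step (3)
would show whether the resident count actually drops.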





RE: Linux and usb device drivers using functionfs

2017-11-01 Thread Felipe Balbi

Hi Andy,

andy_purc...@keysight.com writes:
> Hello Felipe,
>
> I am not using DWC3.

oh, okay :-)

> I have new information about the problem. 
> The context was this: 
>   1) device boots 
>   2) some usb transfers happen, all are OK 
>   3) a device app runs to completion (USB quiescent during this time, no USB 
> transfers required) 
>   4) the controlling PC starts a 4 KByte USB transfer to the device, but this 
> transfer does not finish. Only 3 Kbytes are ACK'd by the device.
>    (A USB analyzer shows the host trying to send more, but the device 
> persistently NAK's)
>
>   If step (3) is omitted, everything works fine.

What do you mean by "device app runs to completion"? What is the "device
app"? Is it the functionfs application you're talking about? And by
"completion" do you mean that it completely stops running or is it just
sleeping waiting for some input?

> The new information is that step (3) consumes a lot of memory and the
> theory is the OS is throwing USB user-space code pages out of RAM and
> using this RAM for NFS files (root file system is NFS mounted).
>
> Continuing on with this theory:
> After the USB related code pages are no longer in RAM, a USB transaction 
> happens. 
> Now there is a race condition:
> 1)  The USB transaction complete interrupt and the OS call into user-space 
> USB related code (functionfs, aio, etc) 
> 2)  The Linux paging system trying to page the user-space USB related code 
> back into RAM
>
> The theory is that (1) can happen before (2). 

I'm not sure the inversion of 1 and 2 would cause issues. Swapping out
the pages may :-)

Let us know if mlockall() helps.

-- 
balbi




RE: Linux and usb device drivers using functionfs

2017-11-02 Thread andy_purcell
> What do you mean by "device app runs to completion"? What is the "device
> app"? Is it the functionfs application you're talking about? And by
> "completion" do you mean that it completely stops running or is it just
> sleeping waiting for some input?

"device app runs to completion" means an executable on the embedded device 
starts, downloads a 70 Mbyte file from the PC to the device, then exits. It is 
not waiting for input. This device app, when running, consumes much memory and 
some of that memory is associated with root file system files mounted over NFS.

A colleague claims mlockall() does not help, and says it is not a good idea 
anyway.
I am afraid I have screwed up big time by using functionfs and am feeling 
pressure to move things into the Linux kernel.

Andy Purcell



Re: Linux and usb device drivers using functionfs

2017-11-02 Thread Greg KH
On Thu, Nov 02, 2017 at 04:37:51PM +, andy_purc...@keysight.com wrote:
> > What do you mean by "device app runs to completion"? What is the "device
> > app"? Is it the functionfs application you're talking about? And by
> > "completion" do you mean that it completely stops running or is it just
> > sleeping waiting for some input?
> 
> "device app runs to completion" means an executable on the embedded
> device starts, downloads a 70 Mbyte file from the PC to the device,
> then exits. It is not waiting for input. This device app, when
> running, consumes much memory and some of that memory is associated
> with root file system files mounted over NFS.

And where is your swap?  What happens if you just do not have swap at
all?

> A colleague claims mlockall() does not help, and says it is not a good
> idea anyway.

That's not true, have you tried it?

> I am afraid I have screwed up big time by using functionfs and am
> feeling pressure to move things into the Linux kernel.

If you disable swap, or use mlock, what happens?  You never answered
that...

thanks,

greg k-h


RE: Linux and usb device drivers using functionfs

2017-11-02 Thread andy_purcell
Hello, 


> 
> And where is your swap?  What happens if you just do not have swap at all?
Our system has no swap. Running 'top' says 0 total, 0 free, 0 used 

> 
> > A colleague claims mlockall() does not help, and says it is not a good
> > idea anyway.
> 
> That's not true, have you tried it?

Developer says he tried mlockall(). No help... 
Developer then got some input from some Linux experts that we must also modify 
the scheduling priority in order for mlockall() to work.  
Is this true? 


Andy Purcell 
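
The swap figures that 'top' prints are also exposed in /proc/meminfo, so the
"no swap" observation can be checked from inside a program. This is a sketch
only; the helper name meminfo_kb() is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the value (in kB) of one /proc/meminfo
 * field such as "SwapTotal", or -1 if the file or field is missing.
 */
static long meminfo_kb(const char *key)
{
    assert(key != NULL);

    FILE *fp = fopen("/proc/meminfo", "r");
    if (fp == NULL)
        return -1;

    char line[256];
    size_t klen = strlen(key);
    long value = -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Lines look like "SwapTotal:             0 kB". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            if (sscanf(line + klen + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(fp);
    return value;
}
```

For example, meminfo_kb("SwapTotal") returning 0 matches the 'top' output
quoted above, and logging meminfo_kb("Cached") around step (3) would show how
much RAM the page cache takes while the 70 Mbyte download runs.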


Re: Linux and usb device drivers using functionfs

2017-11-02 Thread Greg KH
On Thu, Nov 02, 2017 at 08:43:04PM +, andy_purc...@keysight.com wrote:
> Hello, 
> 
> 
> > 
> > And where is your swap?  What happens if you just do not have swap at all?
> Our system has no swap. Running 'top' says 0 total, 0 free, 0 used 

then your program can not get swapped out, so this whole thing is crazy.

> > > A colleague claims mlockall() does not help, and says it is not a good
> > > idea anyway.
> > 
> > That's not true, have you tried it?
> 
> Developer says he tried mlockall(). No help... 

that doesn't matter if you don't have swap, your program isn't going
anywhere.

> Developer then got some input from some Linux experts that we must
> also modify the scheduling priority in order for mlockall() to work.  
> Is this true? 

What?  No, that's not true at all.

Again, if you do not have swap, and your program does not exit, it's
there in memory just fine.

You really should run 'perf' or something else to get a better
understanding of what is going on...

good luck!

greg k-h


RE: Linux and usb device drivers using functionfs

2017-11-02 Thread andy_purcell
Greg, 

> >
> > >
> > > And where is your swap?  What happens if you just do not have swap at all?
> > Our system has no swap. Running 'top' says 0 total, 0 free, 0 used
> 
> then your program can not get swapped out, so this whole thing is crazy.

Colleague says when there is memory pressure, program code can get paged out of 
RAM. 
The way I read your sentence above, program code stays in RAM.

What is the right answer?
Can program code, with no swap and no calls to mlockall(), get paged out of RAM 
under extreme memory pressure?  


Andy Purcell
Keysight Technologies
900 South Taft
Loveland, Colorado 80537
970-679-5976



RE: Linux and usb device drivers using functionfs

2017-11-03 Thread Felipe Balbi

Hi,

andy_purc...@keysight.com writes:
>> > > And where is your swap?  What happens if you just do not have swap at 
>> > > all?
>> > Our system has no swap. Running 'top' says 0 total, 0 free, 0 used
>> 
>> then your program can not get swapped out, so this whole thing is crazy.
>
> Colleague says when there is memory pressure, program code can get
> paged out of RAM.

Only if there is somewhere to put it. If you're running out of RAM
and have no swap, that would trigger the OOM killer.

> The way I read your sentence above, program code stays in RAM.

yes

> What is the right answer?
>
> Can program code, with no swap and no calls to mlockall(), get paged
> out of RAM under extreme memory pressure?

no, it would get killed by OOM (out-of-memory) killer.

Seems like the problem is elsewhere. Do you have a USB sniffer? I'd say
you need to figure out what's going on on the wire. Which controller
are you using? Which kernel version? Is there anything interesting in
dmesg after the failure?

best

-- 
balbi

