I think I've debugged the issue. It's only present in our locally modified 
version of the client, although the root cause could affect others. I'm writing 
this up in case others have minor modifications to the client, or anyone 
modifies the client in the future:

It was a race condition between some error checking logic and connection 
initialization. If the error occurred before the connection was initialized, 
then the connection would hang. I'm guessing this is related to the `go-sema` 
but I'm not entirely sure.

We added some additional error checking that happens at line:
  
https://github.com/racket/handin/blob/ac08937cc6b1eca8abe3d4d4df59876f95cbea17/handin-client/client-gui.rkt#L353
We simply checked that the current file was saved and raised an error if not:
> (unless filename
>   (report-error "File is not saved. Please save the file and try again."))

This occurs in parallel with initializing the connection:
  
https://github.com/racket/handin/blob/ac08937cc6b1eca8abe3d4d4df59876f95cbea17/handin-client/client-gui.rkt#L345

If the error checking raises an error before the connection is established, it 
seems that the connection logic completely hangs, and the connection can never 
be used.

We can't move the error checking BEFORE the initialization, since 
`report-error` relies on the `comm-cust` variable, which is initialized through 
mutation by `(init-comm)`.
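
To make the ordering constraint concrete, here is a minimal sketch, not the 
actual handin code. It only reuses the names `comm-cust`, `init-comm`, and 
`report-error`; the bodies are my assumption (in particular, that reporting an 
error shuts down the custodian), and the real definitions in `client-gui.rkt` 
do more.

```racket
#lang racket/base

;; Illustrative stand-ins that only mirror the names used above.
(define comm-cust #f)                 ; unset until (init-comm) runs

(define (init-comm)
  (set! comm-cust (make-custodian)))  ; initialization by mutation

;; Assumption: reporting an error tears down whatever the custodian manages.
(define (report-error msg)
  (custodian-shutdown-all comm-cust)
  msg)

;; Before initialization, comm-cust is still #f, so the call is rejected
;; with a contract error instead of reporting anything.
(define failed-early?
  (with-handlers ([exn:fail:contract? (lambda (e) #t)])
    (report-error "File is not saved.")
    #f))

;; After initialization, the same call goes through.
(init-comm)
(define reported (report-error "File is not saved."))
```

In this sketch the early call fails only because `comm-cust` has not been set 
yet, which is why the check cannot simply be hoisted above `(init-comm)`.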

Instead, I've moved the error reporting to happen AFTER the connection has 
definitely been established, right before a user tries to submit. This is a 
shame, since in principle it can happen in parallel with initialization, but I 
can't figure out how to untangle this code enough to do that without risking 
the race condition.
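
The shape of the workaround can be sketched as follows. This is a simplified 
model under my own assumptions, not the patch itself: `conn`, `ready`, 
`init-connection!`, and `submit` are hypothetical names, the handshake is 
simulated with `sleep`, and the error is returned as a symbol rather than going 
through `report-error`.

```racket
#lang racket/base

;; Hypothetical stand-ins; none of these names come from handin-client.
(define conn #f)                    ; set by the initializer thread
(define ready (make-semaphore 0))   ; posted once `conn` is usable

(define (init-connection!)
  (thread
   (lambda ()
     (sleep 0.1)                    ; simulate a slow secure handshake
     (set! conn 'connected)
     (semaphore-post ready))))

;; Validate the submission only after initialization has finished, so an
;; early error can never interrupt a half-built connection.
(define (submit filename)
  (semaphore-wait ready)
  (semaphore-post ready)            ; leave it posted for later submissions
  (cond
    [(not filename) 'file-not-saved]
    [else (list 'submitted filename conn)]))

(init-connection!)
(submit #f)                         ; error path, taken after the connection exists
(submit "hw1.rkt")
```

The cost is the one described above: the check is serialized behind 
initialization even though it does not depend on the connection itself.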

--
William J. Bowman

On Sat, Sep 18, 2021 at 08:28:25PM -0700, 'William J. Bowman' via Racket Users 
wrote:
> I've confirmed it's definitely client side, by redirecting the handin 
> server's address to 127.0.0.1 in /etc/hosts, and listening with `nc -l`. The 
> handin client hangs on "Making secure connection ..." and nc displays nothing 
> at all. After a few restarts, `nc -l` displays a bunch of gibberish that I'm 
> guessing is the handin protocol, and killing `nc` triggers the handin client 
> to report a connection error.
> 
> So it's:
> - handin client side
> - maybe related to openssl
> - nondeterministic
> - when it occurs, it will recur until you restart DrRacket
> - when it doesn't occur, it will not recur until you restart DrRacket
> - affects 8.1 BC
> - affects 8.1 CS
> - affects 8.2.0.2 CS 
> - results in the client failing to send anything to the network
> 
> --
> William J. Bowman
> 
> On Sat, Sep 18, 2021 at 08:05:10PM -0700, 'William J. Bowman' via Racket 
> Users wrote:
> > Since I'm currently experiencing the issue, I've been able to get some 
> > better data. I've managed to reproduce it in 8.2.0.2 CS, which suggests 
> > it's not https://github.com/racket/racket/issues/3804.
> > 
> > Restarting DrRacket twice hasn't helped, nor has resetting my wifi 
> > connection.
> > 
> > After connecting via a browser, I notice a lot of the following in the log 
> > that seem to correlate with my attempts in the browser:
> > > [-|2021-09-18T19:37:45] handin: unknown protocol: #"GET / HTTP/1.1"
> > > ...
> > > [-|2021-09-18T19:37:53] ERROR: ssl-accept/enable-break: accept failed 
> > > (error:1408F09C:SSL routines:ssl3_get_record:http request)
> > 
> > As expected, nothing seems to correlate with my attempts to connect from the 
> > handin plugin.
> > 
> > This makes me suspect the server, but I can't reconcile that with why 
> > there's nothing in the logs.
> > 
> > --
> > William J. Bowman
> > 
> > On Sat, Sep 18, 2021 at 06:59:43PM -0700, 'William J. Bowman' via Racket 
> > Users wrote:
> > > I just tried this, but I can't seem to connect.
> > >   http://cs110.students.cs.ubc.ca:7979/
> > > gives "connection reset", and 
> > >   https://cs110.students.cs.ubc.ca:7979/
> > > gives "secure connection failed".
> > > 
> > > There's no prompt to accept the certificate (which I wouldn't expect, 
> > > because we're using a CA signed certificate through Let's Encrypt, not a 
> > > self-signed certificate).
> > > 
> > > I'm currently experiencing the problem on my own client. I'm not sure if 
> > > that's related; I also couldn't connect from my phone.
> > > 
> > > --
> > > William J. Bowman
> > > 
> > > On Sat, Sep 18, 2021 at 09:24:05PM -0400, Sam Tobin-Hochstadt wrote:
> > > > Have you tried visiting the server with a browser? That should work,
> > > > although you'll have to accept the certificate. It might also indicate
> > > > some aspect of the behavior.
> > > > 
> > > > Sam
> > > > 
> > > > On Sat, Sep 18, 2021, 7:13 PM 'William J. Bowman' via Racket Users <
> > > > racket-users@googlegroups.com> wrote:
> > > > 
> > > > > I need some help debugging an issue with the handin package. The handin
> > > > > plugin (client) displays “Making secure connection to <handin server> …”,
> > > > > and simply hangs. Closing the dialog and trying again never resolves the
> > > > > issue.
> > > > >
> > > > > The only method that seems to resolve the issue, although inconsistently,
> > > > > is restarting DrRacket, opening a new file, and trying to submit that new
> > > > > file. This sometimes, but not always, enables the client to connect. Once
> > > > > it does connect, the issue doesn't seem to recur for some time. The client
> > > > > can make multiple successful submissions, at least until the end of lecture
> > > > > (maybe related to the next time they disconnect/reconnect to the internet).
> > > > >
> > > > > We're running Racket 7.8 on the server and 8.1 BC on the clients. We've seen
> > > > > the issue occur on many operating systems---old and new versions of macOS,
> > > > > Windows 10, and at least one report on Linux.
> > > > >
> > > > > I can't just upgrade the clients to 8.2, since there's a bug in 8.2 that
> > > > > affects rendering inexact numbers in BSL, so I really want some confidence
> > > > > about what the issue is before I start upgrading versions.
> > > > >
> > > > > Anecdotally, the problem seems more common this semester compared to the
> > > > > previous semester, and we upgraded the clients to 8.1 this semester,
> > > > > suggesting the clients are at fault.
> > > > >
> > > > > When this problem occurs, there is nothing in the log on the handin
> > > > > server, suggesting the client did not even manage to initiate the
> > > > > connection to the server. In particular, the server never seems to make it
> > > > > to this log line:
> > > > >
> > > > > https://github.com/racket/handin/blob/ac08937cc6b1eca8abe3d4d4df59876f95cbea17/handin-server/main.rkt#L679
> > > > > This is one of the earliest log lines and before pretty much anything
> > > > > happens, so we're *PRETTY SURE* the client is blocking.
> > > > >
> > > > > Right now, my best guess is that we might be affected by this bug, which
> > > > > causes SSL ports to block incorrectly:
> > > > >   https://github.com/racket/racket/issues/3804
> > > > >
> > > > > If so, it would probably be in the client, unless `(ssl-addresses r)` can
> > > > > block in the same way on the server, since otherwise the above log line
> > > > > would execute.
> > > > >
> > > > > However, if it is the client, I don't have any explanation about why
> > > > > restarting DrRacket would work around the bug, or why it sometimes doesn't
> > > > > work.
> > > > >
> > > > > I'd appreciate any help.
> > > > >
> > > > > --
> > > > > William J. Bowman
> > > > >
> > > > > --
> > > > > You received this message because you are subscribed to the Google 
> > > > > Groups
> > > > > "Racket Users" group.
> > > > > To unsubscribe from this group and stop receiving emails from it, 
> > > > > send an
> > > > > email to racket-users+unsubscr...@googlegroups.com.
> > > > > To view this discussion on the web visit
> > > > > https://groups.google.com/d/msgid/racket-users/YUZyWlsY9CdCDyPu%40williamjbowman.com
> > > > > .
> > > > >
> > > 
> > > -- 
> > > You received this message because you are subscribed to the Google Groups 
> > > "Racket Users" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an 
> > > email to racket-users+unsubscr...@googlegroups.com.
> > > To view this discussion on the web visit 
> > > https://groups.google.com/d/msgid/racket-users/YUaZj9v0Lch0jfMC%40williamjbowman.com.
> > 
> > -- 
> > You received this message because you are subscribed to the Google Groups 
> > "Racket Users" group.
> > To unsubscribe from this group and stop receiving emails from it, send an 
> > email to racket-users+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit 
> > https://groups.google.com/d/msgid/racket-users/YUao5ov6j7JCJHLW%40williamjbowman.com.
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to racket-users+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/racket-users/YUauWYAeXzzk9lU/%40williamjbowman.com.

