On 2023/07/03 09:37:01 Yasuhito FUTATSUKI wrote:
> Hi,
>
> On 2023/07/03 3:17, James McCoy wrote:
> > Hi all,
> >
> > Python 3.6 deprecated the asyncore and asynchat modules in favor of the
> > asyncio module. Per [PEP 594], that means that the modules
> > will be removed in Python 3.12, thus breaking svnpubsub/client.py.
> >
> > Is there someone familiar enough with these Python modules to work on
> > updating svnpubsub? What's our policy on minimum supported Python
> > version for the tools?
>
> Before it, nobody check (and/or report) that the scripts in
> tools/server-side/svnpubsub can run with any version of Python 3.x.
>
> https://cwiki.apache.org/confluence/display/SVN/Subversion%27s+Python+3+Support+Status
>
> It seems that those tools in the section "Not Checked For Python 3 Yet"
> are now unmantained.
>
> > [PEP 594]: https://peps.python.org/pep-0594/#deprecated-modules
> >
> > Cheers,
>
> Cheers,
> --
> Yasuhito FUTATSUKI <[email protected]>
>
I took some time today to try to figure out how it all connects. I made a
minimal client which I believe should be correct:
[[[
def cc(o):
print("commit " + repr(o))
def ec(url, type, o):
print("event on " + repr(url) + ": " + repr(type) + "\n" + repr(o))
def mc(o):
print("metadata " + repr(o))
import client
c=client.MultiClient(["http://pubsub.apache.org:2069"], cc, ec, mc)
c.run_forever()
]]]
Running on Python 3.11.2, I only get the following output:
[[[
event on 'http://pubsub.apache.org:2069': 'connected'
None
event on 'http://pubsub.apache.org:2069': 'closed'
None
event on 'http://pubsub.apache.org:2069': 'connected'
None
event on 'http://pubsub.apache.org:2069': 'closed'
None
...
]]]
'closed' come immediately after 'connected'. Between each pair there is
approximately 30 seconds
I also tries settingup svnwcsub.py/svnwcsub.conf and I got it running but with
the same events as above.
Running curl on the same endpoint print a "stillalive" event every 5 seconds
(with the occational actual event inbetween):
[[[
$ curl http://pubsub.apache.org:2069
{"stillalive": 1732781004.3811345}
{"stillalive": 1732781009.3833125}
]]]
Comparing the svnpubsub and curl network traffic using tcpdump, I noted that
svnpubsub use HTTP/1.0 while curl use HTTP/1.1. I also saw some differences
between the format actual on-the-wire-format by pubsub.apache.org
(Transfer-Encoding: chunked, so each record is a tuple consisting of [length]
and [json] separated and terminated by \r\n) and the expected formt (record
separator \0 and no length field).
Checking the code for pypubsub [1] it seems the \0 can be added depending on
the User-Agent header but it seems to always use Transfer-Encoding: chunked.
I've tested the below changes:
[[[
Index: client.py
===================================================================
--- client.py (revision 1922187)
+++ client.py (working copy)
@@ -97,7 +97,8 @@
self.handle_error()
return
- self.push(('GET %s HTTP/1.0\r\n\r\n' % resource).encode('ascii'))
+ self.push(('GET %s HTTP/1.1\r\nHost: %s\r\n\r\n' %
+ (resource, host + ":" + str(port))).encode('ascii'))
def handle_connect(self):
self.event_callback('connected', None)
@@ -113,12 +114,20 @@
def found_terminator(self):
if self.skipping_headers:
self.skipping_headers = False
- # Each JSON record is terminated by a null character
- self.set_terminator(b'\0')
+ # pypubsub is using Transfer-Encoding: chunked so each message is
+ # 0xLength\r\n{json...}\n\r\n
+ # ^^^^^^^^^^ <- This is the Length
+ # Set terminator to the trailing \n plus the second \r\n to catch the
+ # whole message
+ self.set_terminator(b'\n\r\n')
else:
- record = b"".join(self.ibuffer)
+ record = b"".join(self.ibuffer).decode()
self.ibuffer = []
- self.parser.feed(record.decode())
+ # Cut out the length before parsing the json
+ index = record.find('\r\n')
+ if index:
+ record = record[index+2:]
+ self.parser.feed(record)
def collect_incoming_data(self, data):
# Remember the last time we saw activity
]]]
With this, the client seems to work. I'd appreciate if someone with Python
knowledge can review it before I commit.
The next step will be to rewrite for asyncio, but I'd like to have a functional
client as a baseline.
Cheers,
Daniel
[1] https://github.com/apache/infrastructure-pypubsub/blob/master/pypubsub.py