Alright, people seem to have started to get annoyed, so I believe continuing this experiment would violate my sacred and eternal duty to Treat Agora Right Good.
So, yeah. I'm Greg. Or maybe I'm sending messages on Greg's behalf. See recent CFJs. As most of you accurately surmised, Greg's messages were generated by GPT-2, specifically a version of GPT-2 fine-tuned on Agoran mailing list logs going back to 2014. (The 2014 cutoff is largely arbitrary; I might have been able to go a bit further back, but using much more data ran into resource limitations.)

Greg was implemented using a combination of shell scripts, commands in my shell history, Python scripts, a Google Colaboratory notebook, and me manually copy/pasting messages around. Notably, I manually pasted in and hit send on each message. I did this primarily because figuring out email APIs sounded like a PITA, but also because I wanted to be able to pull the plug in case it said anything horrific. I did, however, do as much as I could to avoid injecting my free will into the process.

I operated off of two rules. First, for each message to the public forum, I would run a Python script that had a 10% chance of invoking GPT-2 to generate a reply, which I would send verbatim. (GPT-2 barfs on overly large input, so I included a failsafe that automatically removed old messages from the thread until the input was small enough to work. Some messages, like the rulesets, were far too big on their own, resulting in the code generating "replies" without any context.) Second, each day after the first (which, it looks like, will just mean "today"), I ran a script that had a 50% chance of generating a brand-new proposal. Had I been aware of CFJ 3790, I might actually have gone to the trouble of having it send the messages automatically after generating them.

I did "intervene" twice. For the registration message, I specifically asked GPT-2 to generate a message to BUS with a subject of "BUS: Registration". In my testing, this had about a 75% chance of producing a somewhat plausible attempt to register.
Unfortunately, I got unlucky: my first "real" attempt to generate a registration message produced something completely random (a proposal, I think), so I generated a second one and sent that. In the other case, I discovered a bug in the large-input failsafe (it turns out GPT-2 can barf either by silently returning the input with no additional output or by throwing an exception; I was only handling the first case), so I fixed the bug and re-ran the generation. In every other case, I mechanically copied messages back and forth, following the plans I had made before sending the first message, without attempting to impose any editorial control.

In my testing, Greg did occasionally borrow other people's signatures, but I didn't expect it to be this common. I considered preventing that by removing signatures from the training data (so it would never learn to include them), but I thought it was rare enough and amusing enough that it wasn't worth removing. In retrospect, I probably should have removed them.

In my testing, I ran into several outputs that were interesting enough to save so I could show them later.
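For the curious, the reply machinery above amounts to a few lines of Python. This is a minimal sketch, not the actual script: the function and variable names are made up, the character limit is a stand-in for GPT-2's real token limit, and the model call is abstracted into a `generate` callback.

```python
import random

MAX_INPUT_CHARS = 4000  # hypothetical stand-in for GPT-2's input limit


def truncate_thread(messages, limit=MAX_INPUT_CHARS):
    """Drop the oldest messages until the concatenated thread fits the limit.

    This mirrors the failsafe described above. Note that a single oversized
    message (e.g. a ruleset) gets dropped too, leaving the "reply" with no
    context at all.
    """
    kept = list(messages)
    while kept and len("\n".join(kept)) > limit:
        kept.pop(0)  # remove the oldest message first
    return kept


def maybe_reply(messages, generate, p=0.1, rng=random):
    """With probability p, invoke the model on the (truncated) thread."""
    if rng.random() >= p:
        return None  # 90% of the time, Greg stays quiet
    context = "\n".join(truncate_thread(messages))
    return generate(context)
```

The daily proposal script is the same idea with p=0.5 and an empty thread.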
Here are links to them:

- A proposal for something vaguely resembling a functional auction mechanic: https://gist.github.com/Gaelan/e7f7d3fc48c1abd08f0afb8049077acb
- A made-up FLR excerpt containing an interesting-sounding royalty mechanic: https://gist.github.com/Gaelan/8d092a17ed9c210685a4f4dd1e622ae2
- Another ruleset excerpt, containing the core rules of an alternate-universe Agora: https://gist.github.com/Gaelan/ee631f9f97b53df8483e342ef36b6618
- A batch of attempts at starting new threads, of varying quality: https://gist.github.com/Gaelan/0c027e3f5b97dab700182aa663401f47
- A fake rule called the "Register of Proposals", which looks like a semi-plausible implementation of proposals in an alternate-universe Agora: https://gist.github.com/Gaelan/0c6853f500799c5190a0a1ef474b098b

I'd be happy to share the model at some point, but it's a bit of a pain (I think it's 1.5 GB), so I'm not sure how best to do that. In the meantime, I'm happy to try out any inputs y'all are curious about. Also, From headers were included in the training data, so I should be able to ask it to generate messages from specific Agorans. That might be fun. Happy to answer any questions, of course.

Gaelan

"Nothing in a democracy is sacred, and nothing in a democracy is sacrosanct. That's not to say it's never been broken, but it's been tracked for quite some time." [GPT-2 put that quote in someone's signature during one of my tests. As far as I can tell, it's not a real quote, but it sounds like a fairly interesting reflection on nomic.]