Amen. I, too, have been annoyed by clients that make assumptions about
':' and do not parse lines as they should. I've had to reverse changes
with csircd after finding clients coring or not parsing things correctly.
- Chris
On Thu, Jan 15, 2004 at 10:34:55AM +1300, Perry Lorier wrote:
>
> This email goes out to client authors and would-be client authors. You
> know who you are. :) Today I want to talk about tokenising IRC,
> specifically how to parse what the IRC server sends you.
>
> First, according to RFC 1459, a line from an IRC server may begin with a
> token starting with a ":". This token is the source of the message. If
> the server does not send this token, you can assume the message is from
> the server itself. For instance:
>
> :foo.undernet.org NOTICE nick :*** Welcome to foo
> and
> NOTICE nick :*** Welcome to foo
>
> are equivalent (assuming the local server is "foo.undernet.org").
> Fortunately, most IRC clients support this.
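
A minimal sketch of that fallback, in C like the parser further down. The name take_source() and its signature are made up for illustration and are not taken from ircu; the caller is assumed to know the name of the server it connected to.

```c
#include <stddef.h>

/* Hypothetical helper: split an optional ":source" token off an IRC line.
 * Modifies `line` in place. Returns a pointer to the command, or NULL if
 * the line ends before a command appears. `*source` is set to the token
 * (without the ':'), or to `local_server` when the line has no such token. */
char *take_source(char *line, const char *local_server, const char **source)
{
    if (*line == ':') {
        *source = ++line;
        while (*line && *line != ' ')
            line++;
        if (!*line)
            return NULL;        /* source only, no command */
        *line++ = '\0';         /* terminate the source token */
    } else {
        *source = local_server; /* no source sent: it is the server */
    }
    while (*line == ' ')
        line++;
    return *line ? line : NULL;
}
```

Fed the two NOTICE lines above with "foo.undernet.org" as local_server, both calls give the same source and the same remaining text.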
>
> However, things like:
>
> 351 nick u2.10.11.06. Foo.Undernet.org :B30AeEFfIKlMopSU
>
> are also valid, but most clients barf on them (which is a pity, since
> omitting the source would save significant server->client bandwidth).
>
> The really big issue, though, is the ":" on the last token.
>
> If you have the string
>
> 999 nick foo blah :blargh narf
>
> it should be tokenised as
>
> source=foo.undernet.org
> command=999
> destination=nick
> arguments = ["foo","blah","blargh narf"]
>
> NOT:
> arguments = ["foo","blah","blargh","narf"]
> or even:
> arguments = ["foo","blah"]
> meat = "blargh narf"
>
> or any other bizarre variations on this that client authors love to
> think up.
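
The rule can be sketched in a few lines of C. split_params() is a hypothetical name, and the sketch assumes the source, command and destination have already been stripped off, so it only sees the text that follows them. Silently ignoring anything past `max` parameters is a simplification of this sketch, not something the protocol asks for.

```c
#include <stddef.h>

/* Hypothetical helper: split the arguments that follow the destination.
 * Modifies `s` in place, stores pointers in `out`, returns the count.
 * An argument starting with ':' is the last one and keeps its spaces.
 * Stops after `max` arguments; anything beyond that is ignored here. */
int split_params(char *s, char **out, int max)
{
    int n = 0;

    while (n < max) {
        while (*s == ' ')
            s++;                /* skip separating spaces */
        if (!*s)
            break;              /* end of line */
        if (*s == ':') {
            out[n++] = s + 1;   /* rest of the line is one argument */
            break;
        }
        out[n++] = s;
        while (*s && *s != ' ')
            s++;
        if (*s)
            *s++ = '\0';        /* terminate this argument */
    }
    return n;
}
```

Called on "foo blah :blargh narf" this yields three arguments, the last being "blargh narf", and the caller never has to care whether the ":" was there.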
>
> In particular:
>
> :[EMAIL PROTECTED] PRIVMSG #narf Hi!
>
> is valid: as you are only sending one word, the ":" is not necessary.
> If you are sending multiple words, then the ":" is necessary.
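
The sending side of the same rule, as a sketch. format_message() is a made-up helper; treating empty text and text that itself starts with ":" as needing the ":" is an assumption that goes beyond what the email says. Always sending the ":" on the last parameter is also fine; it is left off here only to show when it is actually required.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "COMMAND target [:]text" into buf.
 * The ':' is added only when the text would not survive as a single
 * token: it contains a space, is empty, or itself starts with ':'.
 * Returns snprintf's result (the length that was needed). */
int format_message(char *buf, size_t size, const char *command,
                   const char *target, const char *text)
{
    int needs_colon = text[0] == '\0' || text[0] == ':' ||
                      strchr(text, ' ') != NULL;

    return snprintf(buf, size, "%s %s %s%s", command, target,
                    needs_colon ? ":" : "", text);
}
```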
>
> With the 005 numeric, it's important to ignore the "last" parameter as
> it is not a valid token. If you don't keep the words after a ":"
> together, then you can't tell which parameters are real tokens.
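
As a sketch of what that means for 005: once the line has been tokenised properly, the free text is a single final parameter and can simply be skipped. isupport_value() is a hypothetical helper, and the token names in the usage note are only example data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look up one token in the arguments of a 005 reply.
 * parv/parc are the already tokenised arguments after the destination.
 * The final argument is free text, so it is never searched.
 * Returns the text after "KEY=", "" for a bare "KEY", or NULL if absent. */
const char *isupport_value(char **parv, int parc, const char *key)
{
    size_t klen = strlen(key);
    int i;

    for (i = 0; i < parc - 1; i++) {
        if (strncmp(parv[i], key, klen) != 0)
            continue;
        if (parv[i][klen] == '=')
            return parv[i] + klen + 1;
        if (parv[i][klen] == '\0')
            return "";
    }
    return NULL;
}
```

With arguments such as "NICKLEN=12", "WHOX" and a final "are supported by this server", looking up NICKLEN gives "12" and looking up "are" gives NULL, because the last parameter is never treated as a token.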
>
> With other numerics/commands, ircu may decide (on a whim) to send a ":"
> or not based on some arbitrary criteria. Please, please, please don't rely
> on it! Do your parsing correctly, then we can change ircu to be far more
> sane with its placement of ":"s. Currently, every time we've changed
> ":"s it has caused important clients to core when receiving those commands.
>
> There can be up to 15 arguments after the destination in a command.
>
> I've tried to attach some (ugly) C code that parses lines (more or
> less) the same way as ircu does, and I highly recommend you check it out.
> It's not reliable enough to use in an actual program, but it should be
> a good example of how to parse IRC lines. However, attachments don't make
> it through the lists, so it is inline below.
>
> Use like so:
> [EMAIL PROTECTED]:~$ ./irc_parser "330 Target foo :narf bar"
> Source: foo.undernet.org
> Command/Numeric: 330
> Target: Target
> Arg #0: foo
> Arg #1: narf bar
>
> [EMAIL PROTECTED]:~$ ./irc_parser ":bar.undernet.org SPIKE Target foo gack naffle fish"
> Source: bar.undernet.org
> Command/Numeric: SPIKE
> Target: Target
> Arg #0: foo
> Arg #1: gack
> Arg #2: naffle
> Arg #3: fish
>
> ---- irc_parser.c
>
> #include <stdio.h>
>
> char *server_name = "foo.undernet.org";
>
> int do_command(char *source, char *command, char *target, int parc,
>                char **parv)
> {
>     int i;
>
>     printf("Source: %s\n", source);
>     printf("Command/Numeric: %s\n", command);
>     printf("Target: %s\n", target);
>     for (i = 0; i < parc; i++) {
>         printf("Arg #%i: %s\n", i, parv[i]);
>     }
>     return 0;
> }
>
> /* Tokenises line in place and hands the pieces to do_command() */
> int parse(char *line)
> {
>     char *source;
>     char *command;
>     char *target;
>     char *arg[15];
>     int args = 0;
>
>     /* Parse the source */
>     if (*line == ':') {
>         line++;
>         source = line;
>         while (*line != ' ' && *line)
>             line++;
>         if (!*line) {
>             printf("Error: Expected command\n");
>             return 1;
>         }
>         *line = '\0';
>         line++;
>     }
>     else {
>         source = server_name;
>     }
>     /* Skip any spaces */
>     while (*line == ' ') line++;
>     /* Parse the command */
>     command = line;
>     while (*line != ' ' && *line) line++;
>     if (!*line) {
>         printf("Error: Expected Target\n");
>         return 1;
>     }
>     *line = '\0';
>     line++;
>     /* Skip any spaces */
>     while (*line == ' ') line++;
>     /* Parse the target */
>     target = line;
>     while (*line != ' ' && *line) line++;
>     /* Parse the arguments */
>     while (*line && args < 15) {
>         *line = '\0';
>         line++;
>         while (*line == ' ') line++;
>         if (!*line)
>             break; /* trailing spaces, no further argument */
>         if (*line == ':') {
>             /* A leading ':' makes the rest of the line one argument */
>             line++;
>             arg[args++] = line;
>             break;
>         }
>         arg[args++] = line;
>         while (*line && *line != ' ') line++;
>     }
>     return do_command(source, command, target, args, arg);
> }
>
> int main(int argc, char **argv)
> {
>     if (argc < 2) {
>         fprintf(stderr, "usage: %s ircline\n", argv[0]);
>         return 1;
>     }
>     return parse(argv[1]);
> }
>
--
Chris Behrens
Senior Software Architect
XO Communications