Amen.  I, too, have been annoyed by clients that make assumptions about
':' and do not parse lines as they should.  I've had to reverse changes
with csircd after finding clients coring or not parsing things correctly.

- Chris



On Thu, Jan 15, 2004 at 10:34:55AM +1300, Perry Lorier wrote:
> 
> This email goes out to client authors, and would-be client authors.  You
> know who you are. :)  Today I want to talk about tokenising IRC,
> specifically how to parse what the IRC server sends you.
> 
> First, according to RFC 1459, a line from an IRC server may begin with a
> token starting with a ":".  This token is the source of the message.  If
> the server omits this token, you can assume the message is from the
> server you are connected to.  For instance:
> 
> :foo.undernet.org NOTICE nick :*** Welcome to foo
> and
> NOTICE nick :*** Welcome to foo
> 
> are equivalent (assuming the local server is "foo.undernet.org").
> Fortunately, most IRC clients support this.
> 
> However, things like:
> 
> 351 nick u2.10.11.06. Foo.Undernet.org :B30AeEFfIKlMopSU
> 
> are also valid, but most clients barf on them (which is a pity, since
> omitting the source would save significant bandwidth from server to client).
> 
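To make the optional source concrete, here is a minimal sketch in C. The helper name split_source and its fallback parameter are hypothetical, not taken from ircu or from the parser further down; the only assumption is that the caller knows the name of the server it is connected to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: peel an optional ":source" token off the front of
 * line (modified in place).  Stores the source in *source, using fallback
 * when the server omitted it, and returns a pointer to the command, or
 * NULL if the line ends right after the source. */
char *split_source(char *line, char *fallback, char **source)
{
    char *end;

    assert(line != NULL && source != NULL);
    if (*line != ':') {
        /* No source token: the message is from the local server. */
        *source = fallback;
        return line + strspn(line, " ");
    }
    *source = line + 1;              /* skip the ':' itself */
    end = strchr(line, ' ');
    if (end == NULL)
        return NULL;                 /* source with no command after it */
    *end++ = '\0';                   /* terminate the source token */
    return end + strspn(end, " ");   /* skip any extra spaces */
}
```

Both forms shown above, with and without the leading source, come out of this with the same source string, so the rest of the parser does not need to care which one the server sent.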
> The really big issue, though, is the ":" on the last token.
> 
> If you have the string
> 
> 999 nick foo blah :blargh narf
> 
> it should be tokenised as
> 
> source=foo.undernet.org
> command=999
> destination=nick
> arguments = ["foo","blah","blargh narf"]
> 
> NOT:
> arguments = ["foo","blah","blargh","narf"]
> or even:
> arguments = ["foo","blah"]
> meat = "blargh narf"
> 
> or any other bizarre variations on this that client authors love to
> think up.
> 
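The rule above can be sketched as a small in-place splitter. This is only an illustration under stated assumptions: split_params is a hypothetical name, and it expects to be handed the part of the line that follows the command and destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split rest (the text after the destination) into
 * at most max arguments, in place.  An argument starting with ':' runs
 * to the end of the line and keeps its spaces.  Returns the number of
 * arguments stored in parv. */
int split_params(char *rest, char **parv, int max)
{
    int parc = 0;

    assert(rest != NULL && parv != NULL);
    while (parc < max) {
        rest += strspn(rest, " ");      /* skip separators */
        if (*rest == '\0')
            break;                      /* nothing left */
        if (*rest == ':') {
            parv[parc++] = rest + 1;    /* last argument, spaces kept */
            break;
        }
        parv[parc++] = rest;            /* ordinary one-word argument */
        rest += strcspn(rest, " ");     /* find the end of the word */
        if (*rest == '\0')
            break;
        *rest++ = '\0';                 /* terminate the word */
    }
    return parc;
}
```

Given "foo blah :blargh narf" this yields three arguments, the last being "blargh narf". Given "Hi!" it yields one, so a single-word final argument is treated the same whether or not the ":" was sent.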
> In particular:
> 
> :[EMAIL PROTECTED] PRIVMSG #narf Hi!
> 
> is valid: as you are only sending one word, the ":" is not necessary.
> If you are sending multiple words, then the ":" is necessary.
> 
> With the 005 numeric, it's important to ignore the "last" parameter as
> it is human-readable text, not a valid token.  If you don't keep the
> words after a ":" together then you can't tell where the tokens end.
> 
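As a sketch of what that means for a client, assuming the 005 line has already been tokenised into an argument array with the trailing text kept together as its final element: the helper name isupport_value is hypothetical, and the token names used with it are purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: look up a token by name in the arguments of a
 * 005 numeric.  The final argument is skipped because it is free text,
 * not a token.  Returns the text after '=' for NAME=VALUE, "" for a bare
 * NAME, or NULL if the token is absent. */
const char *isupport_value(char **parv, int parc, const char *name)
{
    size_t len;
    int i;

    assert(parv != NULL && name != NULL);
    len = strlen(name);
    for (i = 0; i < parc - 1; i++) {    /* parc - 1: ignore the last one */
        if (strncmp(parv[i], name, len) != 0)
            continue;
        if (parv[i][len] == '=')
            return parv[i] + len + 1;   /* value follows the '=' */
        if (parv[i][len] == '\0')
            return "";                  /* flag-style token, no value */
    }
    return NULL;
}
```

If the tokeniser had split the trailing text into separate words, each of those words would look like a bare token here, which is exactly the confusion described above.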
> With other numerics/commands, ircu may decide (on a whim) to send a ":"
> or not based on some arbitrary criteria, so please please please don't rely
> on it!  Do your parsing correctly, then we can change ircu to be far more
> sane with its placement of ":"s.  Currently every time we've changed
> ":"s it's caused important clients to core when receiving those commands.
> 
> There can be up to 15 arguments after the destination in a command.
> 
> I've tried to attach some (ugly) C code that parses lines (more or
> less) the same way as ircu does, and I highly recommend you check it out.
> It's not reliable enough to use in an actual program, but it should be
> a good example of how to parse IRC lines.  However, it doesn't make it
> through the lists, so it is included inline below.
> 
> Use like so:
> [EMAIL PROTECTED]:~$ ./irc_parser "330 Target foo :narf bar"
> Source: foo.undernet.org
> Command/Numeric: 330
> Target: Target
> Arg #0: foo
> Arg #1: narf bar
> 
> [EMAIL PROTECTED]:~$ ./irc_parser ":bar.undernet.org SPIKE Target foo gack naffle fish"
> Source: bar.undernet.org
> Command/Numeric: SPIKE
> Target: Target
> Arg #0: foo
> Arg #1: gack
> Arg #2: naffle
> Arg #3: fish
> 
> ---- irc_parser.c
> 
> #include <stdio.h>
> 
> char *server_name = "foo.undernet.org";
> 
> int do_command(char *source,char *command, char *target, int parc, char **parv)
> {
>          int i;
>          printf("Source: %s\n",source);
>          printf("Command/Numeric: %s\n",command);
>          printf("Target: %s\n",target);
>          for(i=0;i<parc;i++) {
>                  printf("Arg #%i: %s\n",i,parv[i]);
>          }
>          return 0;
> }
> 
> int parse(char *line)
> {
>          char *source;
>          char *command;
>          char *target;
>          char *arg[15];
>          int args=0;
>          /* Parse the source */
>          if (*line==':') {
>                  line++;
>                  source=line;
>                  while (*line!=' ' && *line)
>                          line++;
>                  if (!*line) {
>                          printf("Error: Expected command\n");
>                          return 1;
>                  }
>                  *line='\0';
>                  line++;
>          }
>          else {
>                  source = server_name;
>          }
>          /* Skip any spaces */
>          while(*line==' ') line++;
>          /* Parse the command */
>          command=line;
>          while(*line!=' ' && *line) line++;
>          if (!*line) {
>                  printf("Error: Expected Target\n");
>                  return 1;
>          }
>          *line='\0';
>          line++;
>          /* Skip any spaces */
>          while(*line==' ') line++;
>          /* Parse the target */
>          target=line;
>          while(*line!=' ' && *line) line++;
>          /* Parse the arguments; a leading ':' marks the final argument,
>           * which runs to the end of the line and may contain spaces */
>          while(*line && args<15) {
>                  *line='\0';
>                  line++;
>                  while(*line==' ') line++;
>                  /* Trailing spaces: don't emit an empty argument */
>                  if (!*line) break;
>                  if (*line == ':') {
>                          line++;
>                          arg[args++]=line;
>                          break;
>                  }
>                  arg[args++]=line;
>                  while(*line && *line!=' ') line++;
>          }
>          return do_command(source,command,target,args,arg);
> }
> 
> int main(int argc,char **argv)
> {
>          if (argc<2) {
>                  fprintf(stderr,"usage: %s ircline\n",argv[0]);
>                  return 1;
>          }
>          return parse(argv[1]);
> }
> 

-- 
Chris Behrens
Senior Software Architect
XO Communications
