A short introduction to the basics of CL-PPCRE, Edi Weitz's portable
Perl-compatible regular expression library for Common Lisp.  It assumes
you're familiar with regular expressions in general.

I haven't used Edi's cl-ppcre package before, so this article was
partly just a learning experience for me.  Its documentation is quite
complete, so I could just say "Go read http://www.weitz.de/cl-ppcre/"
and be done.  :)  All the examples are copied from that page.

Where to get it: 
  - naked: http://weitz.de/files/cl-ppcre.tar.gz
  - Debian: apt-get install cl-ppcre
  - The doc says "There's also a port for Gentoo Linux thanks to
    Matthew Kennedy and a FreeBSD port thanks to Henrik Motakef.
    Installation via asdf-install should as well be possible."

Where to read the docs: http://www.weitz.de/cl-ppcre/

How to load it: 
  - naked: (load "load.lisp") will compile and load everything.
  - Debian: (clc:clc-require :cl-ppcre)

Interesting points: 
  - many of the functions have a flag, sharedp, which tells the
    function that the various substrings it generates can share
    structure with the string matched against.  So you could do
    multiple megabytes of matching, but only actually allocate a few
    displaced arrays.  Nifty.
  - "CL-PPCRE uses a compiler macro and LOAD-TIME-VALUE to make sure
    that the scanner is only built once if the first argument to SCAN,
    SCAN-TO-STRINGS, SPLIT, REGEX-REPLACE, or REGEX-REPLACE-ALL is a
    constant form."  So if the regex is a constant (a literal string,
    say), the scanner is built once at load time rather than on every
    call.  Also very nifty.
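
Here is a rough sketch of both points, not taken from the docs.  The
first form uses the standard ARRAY-DISPLACEMENT function to check that
a :sharedp match really is displaced into the target string.  The
second covers the case where the regex is not a constant form: as far
as I can tell, the documented CREATE-SCANNER function lets you build
the scanner once by hand.  COUNT-MATCHING-LINES is a name I made up
for the example.

  ;; With :sharedp t the match shares storage with the target string.
  (let* ((target "xaaabd")
         (match (cl-ppcre:scan-to-strings "a+" target :sharedp t)))
    (multiple-value-bind (displaced-to offset) (array-displacement match)
      (list match (eq displaced-to target) offset)))
  => ("aaa" T 1)

  ;; Hypothetical helper: REGEX is only known at run time, so build
  ;; the scanner once and reuse it for every line.
  (defun count-matching-lines (regex lines)
    (let ((scanner (cl-ppcre:create-scanner regex)))
      (count-if (lambda (line) (cl-ppcre:scan scanner line)) lines)))

  (count-matching-lines "^a+b" '("aab" "xab" "ab" "ba"))
  => 2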

Basics:

- scan regex target-string &key start end
  => match-start, match-end, reg-starts, reg-ends

Search a string for a regular expression.  Returns the start of the
match, the end of the match, and two arrays denoting the beginnings
and ends of register matches.  On failure returns NIL.

  (cl-ppcre:scan "a*b" "xaaabd")        ; no register matches
  => 1, 5, #(), #()

  (cl-ppcre:scan "(a)*b" "xaaabd")      ; 1 register; it keeps the last "a"
  => 1, 5, #(3), #(4)

  (subseq "xaaabd" 1 5)
  => "aaab"

  (subseq "xaaabd" 3 4)
  => "a"

  (cl-ppcre:scan "(a*)b" "xaaabd")      ; 1 register match, in different place
  => 1, 5, #(1), #(4)

  (subseq "xaaabd" 1 4)
  => "aaa"
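
In real code you would normally pick the four values apart with
MULTIPLE-VALUE-BIND instead of calling SUBSEQ by hand.  A minimal
sketch, where MATCHED-REGISTER is a hypothetical helper and not part
of CL-PPCRE:

  ;; Return the text of register INDEX, or NIL if REGEX does not match
  ;; TARGET or that register did not take part in the match.
  (defun matched-register (regex target index)
    (multiple-value-bind (match-start match-end reg-starts reg-ends)
        (cl-ppcre:scan regex target)
      (declare (ignore match-end))
      (when (and match-start (aref reg-starts index))
        (subseq target (aref reg-starts index) (aref reg-ends index)))))

  (matched-register "(a*)b" "xaaabd" 0)
  => "aaa"

  (matched-register "(a*)b" "xyz" 0)    ; no match at all
  => NIL

  ;; :start restricts the search, but the positions returned still
  ;; index into the whole string.
  (cl-ppcre:scan "a+" "xaaabd" :start 2)
  => 2, 4, #(), #()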

- scan-to-strings regex target-string &key start end sharedp 
  => match, regs

Like SCAN but returns substrings of target-string instead of
positions.

  (cl-ppcre:scan-to-strings "(([^b])*)b" "aaabd")
  => "aaab", #("aaa" "a")
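
The register vector is handy for pulling fields out of a string.  A
small sketch under my own assumptions: PARSE-ISO-DATE is a made-up
helper and the date is just sample input.  The last form shows that a
register which took no part in the match comes back as NIL.

  ;; Hypothetical helper: turn "YYYY-MM-DD" into a list of integers,
  ;; or return NIL if STRING does not have that shape.
  (defun parse-iso-date (string)
    (multiple-value-bind (match regs)
        (cl-ppcre:scan-to-strings "^(\\d{4})-(\\d{2})-(\\d{2})$" string)
      (when match
        (map 'list #'parse-integer regs))))

  (parse-iso-date "2004-05-17")
  => (2004 5 17)

  (parse-iso-date "yesterday")
  => NIL

  (cl-ppcre:scan-to-strings "(a)|(b)" "b")   ; first register unused
  => "b", #(NIL "b")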

- split regex target-string 
    &key start end limit with-registers-p omit-unmatched-p sharedp => list

Matches regex against target-string as often as possible and returns a
list of substrings between the matches.

  (cl-ppcre:split "\\s+" "foo   bar baz frob")
  => ("foo" "bar" "baz" "frob")

  (cl-ppcre:split "\\s+" "foo   bar baz
  frob")
  => ("foo" "bar" "baz" "frob")

  ;; "\\s*" also matches the empty string, so the split happens
  ;; between every pair of characters.
  (cl-ppcre:split "\\s*" "foo   bar baz
  frob")
  => ("f" "o" "o" "b" "a" "r" "b" "a" "z" "f" "r" "o" "b")
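
The keyword arguments are easier to understand with a few more
examples, which, as best I recall, also appear in Edi's documentation.
:limit caps the number of pieces, :with-registers-p puts the register
matches into the result, and :omit-unmatched-p drops the NILs left by
registers that did not match.

  (cl-ppcre:split ":" "a:b:c:d:e:f:g::")      ; trailing empties dropped
  => ("a" "b" "c" "d" "e" "f" "g")

  (cl-ppcre:split ":" "a:b:c:d:e:f:g::" :limit 2)
  => ("a" "b:c:d:e:f:g::")

  (cl-ppcre:split "(,)|(;)" "foo,bar;baz" :with-registers-p t)
  => ("foo" "," NIL "bar" NIL ";" "baz")

  (cl-ppcre:split "(,)|(;)" "foo,bar;baz"
                  :with-registers-p t :omit-unmatched-p t)
  => ("foo" "," "bar" ";" "baz")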

- regex-replace regex target-string replacement
  &key start end preserve-case simple-calls
  => string

Try to match target-string between start and end against regex and
replace the first match with replacement.

  (cl-ppcre:regex-replace "fo+" "foo bar" "frob")
  => "frob bar"
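
A few more forms to show what the replacement string and
:preserve-case can do.  In the replacement, \1, \2 and so on stand for
the register matches.  REGEX-REPLACE-ALL takes the same arguments but
replaces every match.  Only the primary return value is shown; I
believe recent versions also return a second value saying whether
anything matched.

  (cl-ppcre:regex-replace "(\\w+) (\\w+)" "foo bar" "\\2 \\1")
  => "bar foo"

  (cl-ppcre:regex-replace "(?i)fo+" "FOO bar" "frob" :preserve-case t)
  => "FROB bar"

  (cl-ppcre:regex-replace-all "(?i)fo+" "foo Fooo FOOOO bar" "frob"
                              :preserve-case t)
  => "frob Frob FROB bar"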

See the documentation for other functions and other examples.

-- Larry

