Hi All,
Following up my own post:
> Is there any way to parse a create statement (eg create table, create view,
> create trigger) into its components?
At the moment I am resorting to developing regular expressions to do the
parsing. They work, but it seems to be re-inventing the wheel.
For instance, I can parse a create table statement into two strings: table
name, and combined column definitions and table constraints using:
(?xis) # Parse create table
statement into:
^\s*
create\s+table\s+
(\w+)\s* # 1 table name
\([\r\n]*
(.*)\s* # 2 columns and
constraints
\)
\s*
;?\s*
$ # end of text
Then I can separate the column definitions and table constraints using:
(?xis) # Parse columns and
constraints into:
^
(.+?) # 1 column definitions
(?: ,
( # 2 table constraint
definitions
(?<=,) # preceding comma, not
captured
\s*
(?: constraint\s+\w+\s+)?
(?: primary\s+key
| unique
| check
| foreign\s+key
)\b
.*
)*
)?
$
Then parse the column definitions into column name, type, column constraints
using:
(?xis) # Parse table column
definitions into:
(?:^|(?<=,)) # start or leading comma
\s*
(\w+?) # 1 column name
\b\s*
([^,]*?)? # 2 column type
\b\s*
( \s*
(?:constraint\s+\w+\s+)? # 3 column constraints
\b
(?: primary\s+key
| not\s+null
| unique
| check
| default
| collate
| references
)
\b[^,]*
)*
(?:,|$) # trailing comma or end of text
So, something like this:
create table People
( ID integer primary key
, Name text collate nocase
, Family_ID integer
references Family(ID)
)
Becomes:
table name: People
table columns & constraints:
ID integer primary key
, Name text collate nocase
, Family_ID integer
references Family(ID)
My second parser would separate the column definitions and table constraints,
but since there are no table constraints in this sample, I'll skip it.
My third parser divides up the column definitions to give:
Column 1:
name: ID
type: integer
constraints: primary key
Column 2:
name: Name
type: text
constraints: collate nocase
Column 3:
name: Family_ID
type: integer
constraints: references Family(ID)
For more complex column constraints I could then write another regex parser to
separate each constraint.
I have to allow for quoted identifiers (eg "Family ID"), comments and nested
brackets by tokenizing the string and substituting quoted/bracketed/commented
sections with word placeholders before applying the regex.
But, as I said, I'm thinking I'm re-inventing the wheel here. Isn't there a
better way? Or anyone care to comment on my regex, such as contingencies I may
have missed?
Thanks,
Tom
BareFeet
--
Comparison of SQLite GUI tools:
http://www.tandb.com.au/sqlite/compare/?ml
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users