An incremental approach is usually the best way to write a parser.

* A machine definition (name = expression;) creates a named regular expression that can be referenced in other expressions. It generates no states by itself.

* A machine instantiation (name := expression;) generates a state machine from an expression.
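
To make the distinction concrete, here is a minimal sketch of a .rl file with a C host. The machine name imap_sketch and the simplified tag rule are assumptions for illustration only; they are not the real RFC 3501 grammar.

```ragel
#include <stdio.h>
#include <string.h>

%%{
    machine imap_sketch;   # hypothetical machine name

    # Definitions (=): named expressions, reusable in other expressions.
    SP   = ' ';
    CRLF = '\r\n';
    tag  = [A-Za-z0-9]+;   # simplified stand-in, not the RFC 3501 tag rule

    # Instantiation (:=): this is what actually generates states.
    main := tag SP 'OK' CRLF;

    write data;
}%%

int main(void)
{
    const char *input = "a001 OK\r\n";
    const char *p = input;
    const char *pe = input + strlen(input);
    int cs;

    %% write init;
    %% write exec;

    /* The input is accepted if the machine stopped in a final state. */
    puts(cs >= imap_sketch_first_final ? "match" : "no match");
    return 0;
}
```

Something like "ragel -C sketch.rl -o sketch.c" followed by your C compiler should build it (the file names are placeholders). Starting from a small machine like this and adding one rule at a time makes it much easier to see which rule breaks the parse.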

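The |* ... *| syntax defines a scanner: it repeatedly matches the longest of a list of patterns, runs that pattern's action, and starts over. It is described in the manual under scanners.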
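
As a rough illustration of what a scanner does, here is a small sketch, again with a C host. The machine name scan_sketch and the token patterns are made up for the example. A scanner needs the extra variables ts, te and act, and eof should be set so the last token is flushed at the end of input.

```ragel
#include <stdio.h>
#include <string.h>

%%{
    machine scan_sketch;   # hypothetical machine name

    main := |*
        # Each pattern is tried at the current position; the longest
        # match wins and its action runs with ts/te marking the token.
        [A-Za-z]+ => { printf("word: %.*s\n", (int)(te - ts), ts); };
        [0-9]+    => { printf("number: %.*s\n", (int)(te - ts), ts); };
        ' '+;     # no action: spaces are matched and dropped
    *|;

    write data;
}%%

int main(void)
{
    const char *input = "abc 42 de";
    const char *p = input;
    const char *pe = input + strlen(input);
    const char *eof = pe;      /* marks end of input for the scanner */
    const char *ts, *te;       /* token start and end, set by the scanner */
    int cs, act;

    %% write init;
    %% write exec;

    return cs == scan_sketch_error ? 1 : 0;
}
```

A scanner is a good fit for tokenizing. For a structured grammar like the IMAP response rules, a plain instantiation built from definitions (as in mailbox.rl) is the more usual choice.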

-Adrian

On 13-09-14 09:44 AM, Etienne Samson wrote:
Hello ragel-users !

I'm trying to build a C parser for IMAP (RFC3501), but since I'm a complete 
beginner at ragel *and* I want to do it the best way I can think of, I'm having 
a hard time ;-). Please tell me what you think of the approach I'm aiming for, 
if I'm a little heavy-handed or whatever…

So, I'm trying to split parts of the ABNF for IMAP in different ragel machines 
for easy reuse. I already have :
- abnf.rl that contains machine definitions for basic ABNF tokens (ALPHA, BIT, 
…),
- rfc3501.rl which contains basic "common" things between what will become my 
different machines (tag, address, …),
- rfc3501_response.rl which contains stuff relating to server replies, 
(response, response_tagged, …)
- imap_parser.rl that is supposed to be in charge of parsing a server's response into my own 
"message" C structure. This is the only one I'm "write"ing directly.

My previous attempt was to copy/paste the whole ABNF from the RFC, convert it 
to ragel syntax and pray that it works. Luckily, it didn't, and since I ended 
up as the happy owner of a ragel state machine with 3070 transitions, with no 
idea why it fails or where, I'm scaling back and 
switching to divide-and-conquer (the only thing gained is that I can now look 
up a rule in my old file and integrate it pretty quickly after more thorough 
testing).

So, here's a list of the questions I have :

- I feel a little lost about the difference between a machine definition and a 
machine instantiation. It seems to work like C functions, definition = 
prototype and instantiation = actual function ? But even though they're 
different, you can attach actions to both of them. I understand that you can 
use a definition to have a single place to tell ragel what actual syntax to 
parse (example from rfc3501_response, 
'response_untagged = "tag SP resp_cond_state CRLF";'). But I can't use 
instantiations from one of my including files.

- What does "main := |* stuff *|" mean ? I haven't been able to grasp what 
ragel does with it, I've seen no explanation in the user guide, and quite a few examples 
I found use that. In fact, I was thinking it was part of the instantiation syntax until I 
found examples that weren't using that (like mailbox.rl).

Cordialement,
Etienne Samson
--
samson.etie...@gmail.com


_______________________________________________
ragel-users mailing list
ragel-users@complang.org
http://www.complang.org/mailman/listinfo/ragel-users
