This is going to be really difficult to handle with PLY because the tokenizer is built entirely on top of the re module. To ignore a '\' in the middle of a token, you'd have to write every re pattern so that it looks for the '\' and eliminates it. That would be really ugly.
If it's just a matter of ripping '\' out of the program, you might be able to do it using the translate() method of strings (which has an option to delete characters). You could do this before feeding the input to lex.

What happens with '\' in strings? That's where it's going to get really messy.

Cheers,
Dave

On Tue 16/09/08 6:05 AM, Pedro Lopes [EMAIL PROTECTED] sent:
> Hi, I have a difficult(?) problem to solve in PLY. I'm trying to parse a
> little language that allows statements to be broken across several lines
> by "escaping" the newline with \. This wouldn't be unusual, except in
> this case the break is allowed anywhere, even in the middle of a token.
>
> Here is a real example, note how the "chrU42" identifier is split across
> 2 lines:
>
> test4 = chrt34||chrh35||chre36||chrF38||chrO39||chrR40||chrM41||chr\
> U42||chrL43||chrA44||chrP46||chrA47||chrR48||chrS49||chrE50||chrR51\
> &&y>0.03&&y
>
> Now, this would be easy to do by preprocessing the input to PLY with a
> regex, but I would rather do it in the lexer. Problem is, I can't figure
> out how. Ignored characters in the lexer aren't really ignored because
> they still act as token delimiters, so that doesn't work. Ideas?
>
> Pedro
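
For reference, the preprocessing idea from the reply (strip the escaped newlines before the text ever reaches lex) might look roughly like the sketch below. It is only an illustration under some assumptions: the helper name join_continued_lines is made up, the '\' is assumed to sit immediately before the line break, and nothing is done about '\' inside string literals, which is exactly the messy case mentioned above. It uses re.sub rather than translate(), since translate() deletes characters one at a time and so cannot remove only those newlines that follow a '\'.

```python
import re

# A backslash immediately followed by a line break (Unix or Windows style).
# Assumption: the continuation marker is always the last character on the line.
_CONTINUATION = re.compile(r"\\\r?\n")


def join_continued_lines(text):
    """Delete every backslash-newline pair so tokens split across lines are rejoined.

    Hypothetical helper, not part of PLY. It does not know about string
    literals, so a backslash-newline inside a string is removed as well.
    """
    return _CONTINUATION.sub("", text)


# Shortened version of the example from the original question.
source = "test4 = chrM41||chr\\\nU42||chrL43\n"
cleaned = join_continued_lines(source)
print(cleaned)
```

The cleaned text can then be handed to the lexer in the usual way, for example with lexer.input(cleaned). One consequence of this design is that any line numbers the lexer tracks will refer to the joined text, not to the original file.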
