Re: Pegged, a Parsing Expression Grammar (PEG) generator in D
On Tuesday, 13 March 2012 at 05:25:38 UTC, Jay Norwood wrote:
> Admittedly I have not heard of PEGs before, so I'm curious: Is this powerful enough to parse a language such as C?

I've just read a few articles referenced from this page:
http://bford.info/packrat/

The second link was by someone who had done Java 1.5:
http://www.romanredz.se/papers/FI2007.pdf

Also, in the later paper he did a C parser, so I suppose that is the answer ...
http://www.romanredz.se/papers/FI2008.pdf
Re: Enhanced D syntax highlighting for Sublime Text 2
On Wed, 07 Mar 2012 19:59:36 +0100 Alex Rønne Petersen xtzgzo...@gmail.com wrote:
> https://github.com/alexrp/st2-d
> I plan to have it merged into ST2 proper if I can somehow get in touch with the dev(s)...

Thank you. I'm evaluating ST2 and believe that soon we'll buy a license, so having decent support for D is very important for us.

Sincerely,
Gour

--
Just try to learn the truth by approaching a spiritual master. Inquire from him submissively and render service unto him. The self-realized souls can impart knowledge unto you because they have seen the truth.
http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Re: Pegged, From EBNF to PEG
On 12.03.2012 16:43, bls wrote:
> On 03/10/2012 03:28 PM, Philippe Sigaud wrote:
>> Hello, I created a new Github project, Pegged, a Parsing Expression Grammar (PEG) generator in D.
>> https://github.com/PhilippeSigaud/Pegged
>> docs: https://github.com/PhilippeSigaud/Pegged/wiki
>
> Just WOW! Nice to have on your WIKI would be an EBNF to PEG sheet.
>
> Wirth EBNF    Pegged
> A = BC.       A <- B C
> A = B|C.      A <- C / C

Maybe A <- B / C. And even then it's not exactly equivalent if the grammar was ambiguous. Imagine: B <- a, C <- aa

--
Dmitry Olshansky
Re: Pegged, From EBNF to PEG
On 12.03.2012 17:45, bls wrote:
> On 03/13/2012 04:28 AM, Dmitry Olshansky wrote:
>> Maybe A <- B / C. And even then it's not exactly equivalent if the grammar was ambiguous. Imagine: B <- a, C <- aa
>
> PEG is pretty new to me. Can you elaborate a bit?

PEG defines the order of alternatives, pretty much the way a top-down recursive descent parser would parse them. Alternatives are tried from left to right: if the first one fails, it tries the next, and so on. In the example I gave, B is always picked first, so C is never looked at.

A somewhat less artificial example:

Literal <- IntL / FloatL
FloatL  <- [0-9]+(.[0-9]+)?
IntL    <- [0-9]+

If you change it to:

Literal <- FloatL / IntL

then integer literals would get parsed as floating point.

> My mistake.. cleaned up stuff..
>
>                    Pegged        Wirth EBNF
> Sequence           A <- B C      A = BC.
> B or C             A <- B / C    A = B|C.
> Zero or one B      A <- B?       A = [B].
> Zero or more Bs    A <- B*       A = {B}.
> One or more Bs     A <- B+       Not available
>
> PEG description of EBNF
>
> EBNF       <- Production+
> Production <- Identifier '=' Expression '.'
> Expression <- Term ( '|' Term)*
> Term       <- Factor Factor*
> Factor     <- Identifier / Literal / '[' Expression ']' / '{' Expression '}' / '(' Expression ')'
> lowerCase  <- [a-z]
> upperCase  <- [A-Z]
> Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*

Why not: Identifier <- [a-zA-Z]+

> Literal <- (' .+ ') / ('' .+ '')

This needs escaping. A plain '.+' in a pattern asks for trouble 99% of the time.

> Still not sure if this is correct. Especially: Term <- Factor Factor*
>
> Another thing I never really understood is the production order. In other words: why not top down ..
>
> Start:
> lowerCase  <- [a-z]
> upperCase  <- [A-Z]
> Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*
> End:
> EBNF <- Production+
>
> where End is Root..

In fact grammars are usually devised the other way around, e.g.

Start: Program <- ...

Ehm... what is the whole program, exactly? OK, let it be Declaration* for now. What kinds of declarations do we have? ... and so on. Later, grammars get tweaked and extended numerous times.

At any rate, production order has no effect on the grammar; it's still the same. The only thing of importance is which non-terminal is considered final (or start, if you are LL-centric).

> TIA, Bjoern

--
Dmitry Olshansky
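The ordered-choice behaviour described in this message can be sketched in a few lines of Python. This is not Pegged code and does not use its API; `lit`, `rx` and `choice` are hypothetical helper names invented for the illustration, and the dot in FloatL is escaped here on the assumption that a literal decimal point was intended.

```python
import re

def lit(text):
    """Parser for a literal string: returns the end position or None."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse

def rx(pattern):
    """Parser for a regular expression anchored at the current position."""
    compiled = re.compile(pattern)
    def parse(s, pos):
        m = compiled.match(s, pos)
        return m.end() if m else None
    return parse

def choice(*named_alternatives):
    """PEG ordered choice: try alternatives left to right and commit to the
    first one that succeeds. Later alternatives are never consulted."""
    def parse(s, pos):
        for name, alternative in named_alternatives:
            end = alternative(s, pos)
            if end is not None:
                return name, end
        return None
    return parse

# A <- B / C   with   B <- 'a',  C <- 'aa'
A = choice(("B", lit("a")), ("C", lit("aa")))
print(A("aa", 0))             # ('B', 1): B wins, C is never tried

IntL = rx(r"[0-9]+")
FloatL = rx(r"[0-9]+(\.[0-9]+)?")

int_first = choice(("IntL", IntL), ("FloatL", FloatL))
float_first = choice(("FloatL", FloatL), ("IntL", IntL))

print(int_first("3.14", 0))   # ('IntL', 1): stops before the decimal point
print(float_first("3.14", 0)) # ('FloatL', 4)
print(float_first("42", 0))   # ('FloatL', 2): an integer parsed as floating point
```

The last line shows the effect mentioned above: once FloatL comes first and its fractional part is optional, plain integers are claimed by FloatL and IntL becomes unreachable.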
Re: Pegged, From EBNF to PEG
On 13-03-2012 17:17, Dmitry Olshansky wrote:
> On 12.03.2012 17:45, bls wrote:
>> lowerCase  <- [a-z]
>> upperCase  <- [A-Z]
>> Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*
>
> Why not: Identifier <- [a-zA-Z]+

That was an illustrative example from the Pegged docs. But yeah, you should just use a range; reads nicer.

--
- Alex
Re: Pegged, a Parsing Expression Grammar (PEG) generator in D
I am impressed. That's a really nice showcase for the D compile-time features. Can I use PEG to parse languages like Python and Haskell, where indentation matters, without preprocessing? Will you make it work with input ranges of dchar, so that I can easily plug in some preprocessing steps?
Mono-D 0.3.4
Again a couple of fixes and improvements.

[v0.3.4]
- [DDoc launcher] Extended functionality (now delegates and array literals are handled, too)
- [Refactoring] Fixed most of the renaming reference finding and highlighting bugs
- [Settings] Enabled relative include paths for projects (will take the project's dir as base directory) and global configurations (uses the config's bin path as base path)
- [Formatter] Fixed indent problem with pressing newline in block comments
- [Internal] Added instructions for debugging the addin under MonoDevelop

v0.3.3:
- [Settings] Made url for opening manual pages editable, but it's still using dlang.org by default
- [Resolver] Re-fixed structs' default ctor - slightly buggy but working
- [Doc outline] Fixed representation of e.g. private const literals
- [Doc outline] Added special icon for alias declarations
- [Parser] Fixed synchronized parse order issue
- [Parser] Fixed class invariant parsing and modified their representation in the doc outline
- [Building] Small fix when executing stand-alone files
- [Parser] Mixin parse error
- There are text boxes instead of lists for include paths in the option dialogs now

Original Post: http://mono-d.alexanderbothe.com/?p=350
Further issues: https://github.com/aBothe/Mono-D/issues
Re: Pegged, From EBNF to PEG
On Tue, Mar 13, 2012 at 18:05, Alex Rønne Petersen xtzgzo...@gmail.com wrote:
>>> lowerCase  <- [a-z]
>>> upperCase  <- [A-Z]
>>> Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*
>>
>> Why not: Identifier <- [a-zA-Z]+
>
> That was an illustrative example from the Pegged docs. But yeah, you should just use a range; reads nicer.

The docs are for teaching PEG :) (btw, the docs describe C-like identifiers, that's why I chose a longer approach)

It's always this 'tension' between inlining and refactoring. [a-zA-Z]+ is shorter and more readable. But if you decide to extend your grammar to UTF-32, it'd be easier to just change the 'letter' rule.
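The trade-off between an inlined range and a named 'letter' rule can be illustrated with a small Python sketch. Nothing here comes from Pegged: `make_identifier`, `ascii_letter` and `unicode_letter` are hypothetical names, and "wider character set" is approximated with Python's `str.isalpha`.

```python
def make_identifier(letter):
    """Identifier <- letter letter*, built from whichever `letter` rule
    is passed in. Returns the end position of the match, or None."""
    def identifier(s, pos=0):
        end = pos
        while end < len(s) and letter(s[end]):
            end += 1
        return end if end > pos else None
    return identifier

def ascii_letter(ch):
    """letter <- [a-zA-Z]"""
    return ("a" <= ch <= "z") or ("A" <= ch <= "Z")

def unicode_letter(ch):
    """letter <- any alphabetic code point (an assumption for this sketch)."""
    return ch.isalpha()

ascii_identifier = make_identifier(ascii_letter)
unicode_identifier = make_identifier(unicode_letter)

word = "\u00e9t\u00e9"              # "été"
print(ascii_identifier("abc1"))     # 3
print(ascii_identifier(word))       # None: 'é' is not in [a-zA-Z]
print(unicode_identifier(word))     # 3
```

Only the `letter` rule changes between the two parsers; the Identifier rule itself is untouched, which is the point made above about refactoring into named rules.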
Re: Pegged, From EBNF to PEG
On Mon, Mar 12, 2012 at 13:43, bls bizp...@orange.fr wrote:
> Just WOW!

Thanks! Don't be too excited, it's still quite slow as a parser. But that is a fun project :)

> Nice to have on your WIKI would be an EBNF to PEG sheet.
>
> Wirth EBNF    Pegged
> A = BC.       A <- B C
> A = B|C.      A <- C / C
> A = [B].      A <- B?
> A = {B}.      A <- B*

Fact is, I don't know EBNF that much. I basically learned everything I know about parsing or grammars while coding Pegged in February :) I probably made every mistake in the book. Hey, it's a github public wiki, I guess you can create a new page?
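The Wirth EBNF to Pegged sheet discussed in this thread is mechanical enough to sketch as a token-by-token translator. The Python below is only an illustration under stated assumptions: `ebnf_to_peg` is a hypothetical function, identifiers are assumed to be separated by whitespace, and `[B]` and `{B}` come out as the equivalent `(B)?` and `(B)*`. The translation is purely syntactic: as noted earlier in the thread, `|` in EBNF is unordered while `/` in PEG is ordered, so the result is not guaranteed to accept the same language.

```python
import re

# Quoted literals are matched first so that symbols inside quotes survive.
TOKEN = re.compile(r'''"[^"]*"|'[^']*'|[A-Za-z]\w*|[=|.\[\]{}()]''')

# Wirth EBNF symbol -> PEG spelling. Anything not listed is copied as is.
SYMBOLS = {"=": "<-", "|": "/", "[": "(", "]": ")?", "{": "(", "}": ")*"}

def join(tokens):
    """Join tokens with spaces, but keep parentheses tight."""
    out = ""
    for tok in tokens:
        if out and not out.endswith("(") and not tok.startswith(")"):
            out += " "
        out += tok
    return out

def ebnf_to_peg(text):
    """Translate productions terminated by '.' into PEG rules, one per line.
    Tokens after the last '.' are ignored in this sketch."""
    rules, current = [], []
    for tok in TOKEN.findall(text):
        if tok == ".":
            rules.append(join(current))
            current = []
        else:
            current.append(SYMBOLS.get(tok, tok))
    return "\n".join(rules)

print(ebnf_to_peg("A = B C. A = B | C. A = [B]. A = {B}."))
# A <- B C
# A <- B / C
# A <- (B)?
# A <- (B)*
```

A production such as `Expression = Term { '|' Term }.` becomes `Expression <- Term ('|' Term)*`, because the quoted `'|'` is tokenized as a literal and left alone.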