Re: Pegged, a Parsing Expression Grammar (PEG) generator in D

2012-03-13 Thread Jay Norwood

On Tuesday, 13 March 2012 at 05:25:38 UTC, Jay Norwood wrote:






Admittedly I have not heard of PEGs before, so I'm curious: Is 
this powerful enough to parse a language such as C?


I've just read a few articles referenced from this page, and 
the second link was by someone who had done Java 1.5:

http://bford.info/packrat/
http://www.romanredz.se/papers/FI2007.pdf


Also in the later paper he did a C parser, so I suppose that is 
the answer ...


http://www.romanredz.se/papers/FI2008.pdf


Re: Enhanced D syntax highlighting for Sublime Text 2

2012-03-13 Thread Gour
On Wed, 07 Mar 2012 19:59:36 +0100
Alex Rønne Petersen xtzgzo...@gmail.com wrote:

 https://github.com/alexrp/st2-d
 
 I plan to have it merged into ST2 proper if I can somehow get in
 touch with the dev(s)...

Thank you. I'm evaluating ST2 and believe that soon we'll buy a license,
so having decent support for D is very important for us.


Sincerely,
Gour

-- 
Just try to learn the truth by approaching a spiritual master. 
Inquire from him submissively and render service unto him. 
The self-realized souls can impart knowledge unto you because 
they have seen the truth.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810




Re: Pegged, From EBNF to PEG

2012-03-13 Thread Dmitry Olshansky

On 12.03.2012 16:43, bls wrote:

On 03/10/2012 03:28 PM, Philippe Sigaud wrote:

Hello,

I created a new Github project, Pegged, a Parsing Expression Grammar
(PEG) generator in D.

https://github.com/PhilippeSigaud/Pegged

docs: https://github.com/PhilippeSigaud/Pegged/wiki


Just WOW!

Nice to have on your WIKI would be a EBNF to PEG sheet.

Wirth EBNF    Pegged
A = BC.       A <- B C
A = B|C.      A <- C / C


Maybe A <- B / C. And even then it's not exactly equivalent if the 
grammar was ambiguous.

Imagine: B <- a, C <- aa

--
Dmitry Olshansky


Re: Pegged, From EBNF to PEG

2012-03-13 Thread Dmitry Olshansky

On 12.03.2012 17:45, bls wrote:


PEG is pretty new to me. Can you elaborate a bit ?


PEG defines the order of alternatives, pretty much the way a top-down 
recursive descent parser would parse it. Alternatives are tried from 
left to right: if the first one fails, it tries the next, and so on.
In the example I gave, B is always picked first and so C is never 
looked at.


Somewhat less artificial example:
Literal <- IntL / FloatL
FloatL <- [0-9]+ ('.' [0-9]+)?
IntL <- [0-9]+

If you change it to: Literal <- FloatL / IntL then integer literals would 
get parsed as floating point.
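
A minimal Python sketch of the Literal example above, in case it helps to see the ordering effect run. It is not Pegged code: the helper names (int_lit, float_lit, ordered_choice) are made up for illustration, and regular expressions stand in for the two terminal rules.

```python
import re

def int_lit(text, pos):
    """IntL <- [0-9]+ ; returns (kind, end position) or None."""
    m = re.compile(r"[0-9]+").match(text, pos)
    return ("int", m.end()) if m else None

def float_lit(text, pos):
    """FloatL <- [0-9]+ ('.' [0-9]+)? ; the fractional part is optional."""
    m = re.compile(r"[0-9]+(?:\.[0-9]+)?").match(text, pos)
    return ("float", m.end()) if m else None

def ordered_choice(*alternatives):
    """PEG choice: try alternatives left to right, commit to the first success."""
    def parse(text, pos=0):
        for alt in alternatives:
            result = alt(text, pos)
            if result is not None:
                return result
        return None
    return parse

int_first = ordered_choice(int_lit, float_lit)    # Literal <- IntL / FloatL
float_first = ordered_choice(float_lit, int_lit)  # Literal <- FloatL / IntL

print(int_first("1.5"))    # IntL wins and stops after "1"; ".5" is left over
print(float_first("1.5"))  # the whole input is consumed as a float
print(float_first("42"))   # an integer literal comes back tagged as float
```

With IntL first, FloatL is never reached for input that starts with digits; with FloatL first, IntL is never reached at all. Either way the second alternative is dead, which is the difference from an unordered EBNF choice.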






My mistake.. cleaned up stuff..

                  Pegged        Wirth EBNF

Sequence          A <- B C      A = BC.
B or C            A <- B / C    A = B|C.
Zero or one B     A <- B?       A = [B].
Zero or more Bs   A <- B*       A = {B}.
One or more Bs    A <- B+       Not available
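
The table above is mechanical enough to sketch as a small token-level rewriter. This is only an illustration in Python: ebnf_to_peg and its helpers are hypothetical names, it assumes one production per line, whitespace-separated symbols and letters-only identifiers, and it changes syntax only. The caveat from earlier in the thread still applies: PEG choice is ordered, so the result is not guaranteed to accept the same language.

```python
import re

# One token: a quoted literal, an identifier, or a single punctuation character.
TOKEN = re.compile(r"""\s*("[^"]*"|'[^']*'|[A-Za-z]+|.)""")

# Wirth EBNF punctuation and its PEG counterpart, following the table above.
PUNCT = {"=": "<-", "|": "/", "[": "(", "]": ")?", "{": "(", "}": ")*"}

def ebnf_to_peg(rule):
    """Rewrite a single Wirth EBNF production into PEG syntax."""
    peg = ""
    for tok in TOKEN.findall(rule.strip()):
        if tok == ".":  # end-of-production marker, not needed in PEG
            continue
        piece = PUNCT.get(tok, tok)
        # Separate tokens with a space, except just inside parentheses.
        if peg and not peg.endswith("(") and not piece.startswith(")"):
            peg += " "
        peg += piece
    return peg

for rule in ["A = B C.", "A = B | C.", "A = [B].", "A = {B}."]:
    print(rule, "=>", ebnf_to_peg(rule))
```

Quoted literals are kept as single tokens, so a literal "." or "|" inside quotes is passed through untouched.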

PEG description of EBNF

EBNF <- Production+
Production <- Identifier '=' Expression '.'
Expression <- Term ( '|' Term)*
Term <- Factor Factor*
Factor <- Identifier / Literal / '[' Expression ']' / '{' Expression '}'
/ '(' Expression ')'
lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


Why not:
Identifier <- [a-zA-Z]+


Literal <- ("'" .+ "'") / ('"' .+ '"')


This needs escaping. A plain '.+' in a pattern asks for trouble 99% of the time.
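
One way to read that warning, sketched in Python with hand-rolled PEG-style combinators (all names here are made up for the illustration, none of this is Pegged's API): in a PEG, e+ is greedy and never gives characters back, so .+ swallows the closing quote and the rule can never succeed. Guarding the dot with a negative lookahead is one common fix; handling backslash escapes would be a further step and is not shown.

```python
def lit(s):
    """Match the exact string s; return the new position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def any_char(text, pos):
    """PEG '.' : any single character."""
    return pos + 1 if pos < len(text) else None

def seq(*parts):
    """PEG sequence: every part must match, one after another."""
    def parse(text, pos):
        for p in parts:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def one_or_more(p):
    """PEG e+ : greedy, and it never backtracks into what it consumed."""
    def parse(text, pos):
        pos = p(text, pos)
        if pos is None:
            return None
        while True:
            nxt = p(text, pos)
            if nxt is None:
                return pos
            pos = nxt
    return parse

def not_followed_by(p):
    """PEG !e : succeeds, consuming nothing, only when e fails here."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

quote = lit("'")
naive = seq(quote, one_or_more(any_char), quote)  # "'" .+ "'"
guarded = seq(quote, one_or_more(seq(not_followed_by(quote), any_char)), quote)  # "'" (!"'" .)+ "'"

print(naive("'abc' rest", 0))    # .+ runs to end of input, closing quote fails
print(guarded("'abc' rest", 0))  # stops at the closing quote
```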


Still not sure if this is correct. Especially:
Term <- Factor Factor*


Another thing I never really understood is the production order. In
other words: why not top down?
Start:
lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*

End:
EBNF <- Production+

where End is Root.


In fact grammars are usually devised the other way around, e.g.
Start:
 Program <- ...
Ehm... what exactly is the whole program? OK, let it be Declaration* 
for now. What kinds of declarations do we have? ... and so on. Later, 
grammars get tweaked and extended numerous times.


At any rate, production order has no effect on the grammar; it's still 
the same. The only thing of importance is which non-terminal is considered 
final (or start, if you are LL-centric).




TIA, Bjoern



--
Dmitry Olshansky


Re: Pegged, From EBNF to PEG

2012-03-13 Thread Alex Rønne Petersen

On 13-03-2012 17:17, Dmitry Olshansky wrote:

lowerCase <- [a-z]
upperCase <- [A-Z]
Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


Why not:
Identifier <- [a-zA-Z]+


That was an illustrative example from the Pegged docs. But yeah, you 
should just use a range; reads nicer.











--
- Alex


Re: Pegged, a Parsing Expression Grammar (PEG) generator in D

2012-03-13 Thread Tobias Pankrath
I am impressed. That's a really nice showcase for the D compile 
time features.


Can I use PEG to parse languages like Python and Haskell, where 
indentation matters, without preprocessing?


Will you make it work with input ranges of dchar? So that I can 
easily plug in some preprocessing steps?





Mono-D 0.3.4

2012-03-13 Thread alex

Again a couple of fixes & improvements [v0.3.4]

- [DDoc launcher] Extended functionality (now delegates & array 
literals are handled, too)
- [Refactoring] Fixed most of the renaming & reference 
finding/highlighting bugs
- [Settings] Enabled relative include paths for projects (will 
take the project's dir as base directory) & global configurations 
(uses the config's bin path as base path)
- [Formatter] Fixed indent problem with pressing newline in block 
comments
- [Internal] Added instructions for debugging the addin under 
MonoDevelop


v0.3.3:

- [Settings] Made url for opening manual pages editable, but it's 
still using dlang.org by default
- [Resolver] Re-fixed structs' default ctor - slightly buggy but 
working
- [Doc outline] Fixed representation of e.g. private const 
literals

- [Doc outline] Added special icon for alias declarations
- [Parser] Fixed synchronized parse order issue
- [Parser] Fixed class invariant parsing & modified their 
representation in the doc outline

- [Building] Small fix when executing stand-alone files
- [Parser] Mixin parse error
- There are text boxes instead of lists for include paths in the 
option dialogs now


Original Post: http://mono-d.alexanderbothe.com/?p=350
Further issues: https://github.com/aBothe/Mono-D/issues


Re: Pegged, From EBNF to PEG

2012-03-13 Thread Philippe Sigaud
On Tue, Mar 13, 2012 at 18:05, Alex Rønne Petersen xtzgzo...@gmail.com wrote:

 lowerCase <- [a-z]
 upperCase <- [A-Z]
 Identifier <- (lowerCase / upperCase) (lowerCase / upperCase)*


 Why not:
 Identifier <- [a-zA-Z]+


 That was an illustrative example from the Pegged docs. But yeah, you should
 just use a range; reads nicer.

The docs are for teaching PEG :) (btw, the docs describe C-like
identifiers, that's why I chose a longer approach)
It's always this 'tension' between inlining and refactoring.
[a-zA-Z]+ is shorter and more readable. But if you decide to extend
your grammar to UTF-32, it'd be easier to just change the 'letter'
rule.


Re: Pegged, From EBNF to PEG

2012-03-13 Thread Philippe Sigaud
On Mon, Mar 12, 2012 at 13:43, bls bizp...@orange.fr wrote:

 Just WOW!

Thanks! Don't be too excited, it's still quite slow as a parser. But
that is a fun project :)

 Nice to have on your WIKI would be a EBNF to PEG sheet.

 Wirth EBNF      Pegged
 A = BC.         A <- B C
 A = B|C.        A <- C / C
 A = [B].        A <- B?
 A = {B}.        A <- B*

Fact is, I don't know EBNF that much. I basically learned everything I
know about parsing or grammars while coding Pegged in February :) I
probably made every mistake in the book.

Hey, it's a github public wiki, I guess you can create a new page?