Re: [PEG] The computational power of PEGs

Nicolas Laurent Wed, 27 Feb 2019 14:45:55 -0800

I'm going to nitpick a bit, but you seems to imply one of my pet peeves
arguments against PEG =)


(Apologies if that's not the case, it's an interesting argument to make
regardless.)

> This means that if one is not careful when programming PEGs, one might
end up
> with a grammar which accepts certain pieces of syntax that one would
rather
> not have, and be completely unaware of it.

What this does not say, but — I suspect — convey to meany people, is that
PEG is
quite error-prone and hard to use.

One way to see this, let's a take a very close statement, about programming
in
general:

    If one is not careful when programming, one might end up with a program
that
    does something that one would rather not have it do, and be completely
    unaware of it.

Or about CFGs:

    If one is not careful when programming CFGs, one might end up with a
grammar
    which accepts certain pieces of syntax that one would rather not have,
and
    be completely unaware of it.

Both statements are certainly true. I have written erroneous programs and
erroneous CFGs. What is implied (and true!) is that errors are possible.

Hence making your own statement seems to imply that PEG are quite more
error-prone than CFGs. Which I don't agree with.

I have an argument from experience, which is that, in practice, I do make
errors
when writing both PEGs and CFGs, but not significantly more in one or the
other,
and mostly those aren't errors due to "prefix capture" (to use
Redziejowski's
name for it).

But okay, that's just my own experience.

Another argument against is that in many ways, ordered choice is a dual to
ambiguity. Both preclude one anothers. Both are statically computationally
intractable, but can be heuristically detected with similar techniques (*)

(*) see "Modular Syntax Demands Verification" by Sylvain Schmitz
http://www.i3s.unice.fr/~mh/RR/2006/RR-06.32-S.SCHMITZ.pdf

So quite similarly, it's not always "obvious" when your grammar has
ambiguity,
and the net result is still that the parser might blow in your hands at
runtime.

Again, from experience, being somewhat careful is sufficient to avoid both
problems (prefix capture and ambiguity). And to qualify the qualitative, I
mean
"somewhat careful" as in "being somewhat careful is sufficient to avoid most
memory leaks when coding in C". People are still going to screw up, but
we're
not talking about inscrutable black magic.

Anyway, I felt like making this little argument. People are prone to be
*quite*
uncharitable towards PEG (*) so it pays to watch these kinds of claims.

(*) Fun example:
https://groups.google.com/forum/#!topic/marpa-parser/8EEq92TjR4E
However, please don't weigh in unless you have something important to say,
the
last thing I want is to mob Mr Kegler.

Something semi-related that gave me pause:

> Does the PEG which I just programmed do what I want it to do?

How would you go about showing that for CFGs? Isn't "what I want it to do" a
feature of the programmer's brain?

About that project, interesting, I'd certainly be interested to hear more
about
it when it comes out.

Cheers,

Nicolas LAURENT


On Tue, 26 Feb 2019 at 12:44, Bruno Loff <bruno.l...@gmail.com> wrote:

> Hey Nicolas,
>
> Our paper is purely theoretical. Our goal, you see, was to come up with a
> context-free language without PEGs. Instead we got some understanding why
> proving such a separation should be pretty hard to do.
>
> It does bear the following consequence for practice (i.e. for actual use
> of PEGs). The PEGs which we built show that PEGs can be used in rather
> surprising ways, that PEGs are actually much more expressive than they may
> appear at first. This effectively makes certain questions about the
> behavior of PEGs very difficult to answer. Questions like "does the PEG
> which I just programmed do what I want it to do?"
>
> Such questions will be very difficult to answer in a fully general
> setting... For example, I am guessing that there is an analogue of Rice's
> theorem that could be proven for PEGs.
>
> This means that if one is not careful when programming PEGs, one might end
> up with a grammar which accepts certain pieces of syntax that one would
> rather not have, and be completely unaware of it. Of course this is also a
> problem in LR/LL/whatever grammars, but in the case of PEGs we formally
> show that the behavior is as hard to predict as that of any Turing machine.
>
> Practically speaking, I think that this calls for the elaboration of
> certain "programming principles" for PEGs. I mean, a set of guidelines that
> would protect people from getting into corner cases.
>
>
> We also have an idea for a practical programming project, and may procure
> funding and/or students for it within the next few years (we will keep it
> under wraps until then). An announcement will be made at some point.
>
>
>
>
>
> On Mon, 25 Feb 2019 at 17:55, Nicolas Laurent <
> nicolas.laur...@uclouvain.be> wrote:
>
>> Hey Bruno,
>>
>> This is mighty interesting and I'll be sure to give it a read as soon as
>> I get the time.
>>
>> Have you already thought about the practical applications of your finding
>> in the field of parsing?
>>
>> Cheers,
>>
>> Nicolas LAURENT
>>
>>
>> On Mon, 25 Feb 2019 at 17:36, Bruno Loff <bruno.l...@gmail.com> wrote:
>>
>>> Hello list members,
>>>
>>> I am writing to announce a new paper, entitled "The computational power
>>> of parsing expression grammars". It appears on ArXiv at the address:
>>> https://arxiv.org/abs/1902.08272
>>>
>>> Any feedback will be most welcome.
>>>
>>> Here is an abstract for the paper:
>>>
>>> We propose a new computational model, the scaffolding automaton, and
>>> prove that it exactly characterises the computational power of parsing
>>> expression grammars (PEGs). Using this characterisation we show that:
>>>
>>> * PEGs have unexpected power and semantics. We present several PEGs with
>>> surprising behaviour, and languages which, unexpectedly, have PEGs,
>>> including a PEG for the language of palindromes whose length is a power of
>>> two, and a PEG for a ``counting language''.
>>>
>>> * We show that PEGs are computationally ``universal'', in the following
>>> sense:  take any computable function f:{0,1}^* -> {0,1}^*; then there
>>> exists a computable function g: :{0,1}^* -> N such that the set
>>> { f(x) #^{g(x)} x | x \in {0,1}^* }
>>> has a PEG (where #^{g(x)} means g(x)-many separator symbols `#'). This
>>> result may be used to construct a PEG language which is complete for P
>>> under logspace reductions.
>>>
>>> * We show that there can be no pumping lemma for PEGs. There is no total
>>> computable function A with the following property: for every PEG G, there
>>> exists n_0 such that for every string x \in L(G) of size |x| \ge n_0, the
>>> output y = A(G, x) is in L(G) and has |y| > |x|.
>>>
>>> * We show that PEGs are strongly non real-time for Turing machines:
>>> There exists a language with a PEG, such that neither it nor its reverse
>>> can be recognised by any multi-tape online Turing machine which is allowed
>>> to do only o(n/\log n) steps after reading each input symbol.
>>> _______________________________________________
>>> PEG mailing list
>>> PEG@lists.csail.mit.edu
>>> https://lists.csail.mit.edu/mailman/listinfo/peg
>>>
>> _______________________________________________
>> PEG mailing list
>> PEG@lists.csail.mit.edu
>> https://lists.csail.mit.edu/mailman/listinfo/peg
>>
>

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg

Re: [PEG] The computational power of PEGs

Reply via email to