"Chris Mellon" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > On Nov 7, 2007 5:15 PM, Just Another Victim of the Ambient Morality > <[EMAIL PROTECTED]> wrote: >> >> "Chris Mellon" <[EMAIL PROTECTED]> wrote in message >> news:[EMAIL PROTECTED] >> > On Nov 7, 2007 3:15 PM, Just Another Victim of the Ambient Morality >> > <[EMAIL PROTECTED]> wrote: >> > >> >> > In short, it hasn't really evovled into a user-friendly package >> >> > yet. >> >> >> >> Thank you. >> >> How is it that I seem to be the only one in the market for a >> >> correct >> >> parser? Earley has a runtine of O(n^3) in the worst case and O(n^2) >> >> typically. I have trouble believing that everyone else in the world >> >> has >> >> such intense run-time requirements that they're willing to forego >> >> correctness. Why can't I find a pyparsing-esque library with this >> >> implementation? I'm tempted to roll my own except that it's a fairly >> >> complicated algorithm and I don't really understand how it's any more >> >> efficient than the naive approach... >> > >> > You have an unusual definition of correctness. Many people would say >> > that an ambiguous grammar is a bug, not something to support. >> >> I don't think I do. > > There are an enormous variety of parsing tools, and it's the subject > of much research. And in all those tools, not one meets your > definition of correctness? You don't think that might make it unusual?
It doesn't appear to be common, I'll grant you that! However, there is some research. For instance, the Earley parser appears to be what I want (in conjunction with a parse tree builder). A CYK parser would probably do, too. The algorithms are out there yet no one has chosen to use any of them. At the same time, there are several LALR parsers. Why did anyone need to write the second one after the first one was written?! In fact, in a sense, my problem is solved. There exists a solution to my problem. It's just that no one has implemented that solution. I guess you're right in that it really does appear to be an unusual problem but I don't understand how... >> Besides, you assume too much... >> First off, we've already established that there are unambiguous >> grammars >> for which pyparsing will fail to parse. One might consider that a bug in >> pyparsing... > > You might. Or you might not, since it's well known that there are lots > of types of parsers that can't parse all possible grammars, but that > doesn't make those parsers useless. No one said they were useless. I only said that a correct parser is useful. Many people in this thread seem to disagree and I find this incredible... >> Secondly, I get the impression you want to consider ambiguous >> grammars, >> in some sense, "wrong." They are not. > > Sure they are, at least in many contexts. I understand that you want > support for them, but it's by far more common to want one and only one > set of results from parsing a particular document. Okay, in some contexts, an ambiguous grammar may be considered erroneous. However, in many other contexts, it's merely a fact of life. How is it that there are no tools to address this? If nothing else, pyparsing throws the same error it does when there is no valid parsing of the string. Having no solution and having several solutions are not the same thing... >>Even if they were, if you are >> parsing something for which you are not the creator and that something >> employs an ambiguous grammar, what choice do you have? > > You either disambiguate, or you don't accept ambiguous input. The > third option seems to be what you want, which is to find all possible > solutions and return all of them (and wouldn't this be NP-hard in the > general case?) but that's not a satisfactory result in most > applications. What do you mean by "disambiguate?" Do you mean disambiguate the grammar? One of the conditions of the problem is that you have no control over the grammar, so that's really not an option. Also, an implicit condition of solving a problem is that the problem be... solved, so not accepting the input is not an option, either. While there are many applications that can't deal with multiple solutions, surely there are some? Again, perhaps you can pick one solution over the other through the context of the parse results? That's kind of hard to do if your parser refuses to return any results... Just because a grammar is ambiguous doesn't mean the input string is. It can easily be the case that most or all the input you're expecting will, in practice, only produce one correct parse tree. In this case, it would be useful for your parser to return a correct solution, even at random! Finally, who cares if something is NP-Hard? Okay, in some situations, you'd care. In many others, you don't. For instance, suppose your input length has an upper bound? Unless that's a really high bound or a really complex grammar, its runtime is not likely relevant... >> Furthermore, given a >> set of possible parsings, you might be able to decide which one you >> favour >> given the context of what was parsed! There's a plethora of applications >> for parsing ambiguous grammars yet there are no tools for doing so? > > If you can do this, isn't this really a sign that your grammar is > context sensitive? I don't think so. I'm using the word "context" colloquially, not grammatically. If you know what kind of output you're expecting and only one of several parsings produces that kind of output, then you know which one to run with. That doesn't mean the grammar is context sensitive; just that the data is... As an aside, do you know what tools are available for parsing context senstive grammars? >> > In fact, I often use pyparsing precisely in order to disambiguate >> > (according to specific rules, which are embodied by the parser) >> > ambiguous input, like bizarre hand-entered datetime value. >> >> What do you mean? How do you use pyparsing to disambiguate: >> >> 01-01-08 > > The same way a human would - given an ambiguous date such as this, I > (arbitrarily) decided what it would mean. The alternative is to refuse > the input. You use pyparsing to "arbitrarily decide what it would mean?" You said you "often use pyparsing" to "disambiguate ambiguous input" and I was wondering what you meant by this. On one hand, it's not exactly a fair question since that date string is not grammatically ambiguous. On the other hand, you're the one who brought up "bizarre hand-entered datetime value," and I was only trying to give an example of what you meant. So, what do you mean? -- http://mail.python.org/mailman/listinfo/python-list