On Jul 15, 2019, at 18:44, Nam Nguyen <bits...@gmail.com> wrote:
> I have implemented a tiny (~200 SLOCs) package at 
> https://gitlab.com/nam-nguyen/parser_compynator that demonstrates something 
> like this is possible. There are several examples for you to have a feel of 
> it, as well as some early benchmark numbers to consider. This is far smaller 
> than any of the Python parsing libraries I have looked at, yet more universal 
> than many of them. I hope that it would convert the skeptics ;).

For at least some of your use cases, I don’t think it’s a problem that it’s 70x 
slower than the custom parsers you’d be replacing. How often do you need to 
parse a million URLs in your inner loop? Also, if function composition really 
is the performance hurdle, couldn’t you optimize it away relatively simply, 
just by building an explicit tree (expression-template style) and walking the 
tree in a __call__ method, rather than building an implicit tree of nested 
calls? (And that could be optimized further if needed, e.g. by turning the tree 
walk into a simple virtual machine where all of the fundamental operations are 
inlined into the loop, and maybe even accelerating that with C code.)
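
To make the explicit-tree idea concrete, here is a rough sketch. It is not based on your package's actual API: Node, Lit, Seq, Alt and run are all names I made up, and I'm assuming the "set of (value, remainder) pairs" result style from your readme. The operators only build tree nodes, and a single interpreter function does the walking when the tree is called.

```python
from dataclasses import dataclass


class Node:
    """Hypothetical base class: operators build tree nodes, not closures."""

    def __add__(self, other):
        return Seq(self, other)

    def __or__(self, other):
        return Alt(self, other)

    def __call__(self, text):
        # The whole parse is one walk over the explicit tree.
        return run(self, text)


@dataclass(frozen=True)
class Lit(Node):
    s: str


@dataclass(frozen=True)
class Seq(Node):
    left: Node
    right: Node


@dataclass(frozen=True)
class Alt(Node):
    left: Node
    right: Node


def run(node, text):
    """Interpret the tree; return a set of (matched, remainder) pairs."""
    if isinstance(node, Lit):
        if text.startswith(node.s):
            return {(node.s, text[len(node.s):])}
        return set()
    if isinstance(node, Alt):
        return run(node.left, text) | run(node.right, text)
    if isinstance(node, Seq):
        return {(a + b, rest2)
                for a, rest1 in run(node.left, text)
                for b, rest2 in run(node.right, rest1)}
    raise TypeError(f"unknown node type: {type(node).__name__}")


scheme = Lit("http") + (Lit("s") | Lit("")) + Lit("://")
print(scheme("https://x"))  # {('https://', 'x')}
print(scheme("ftp://x"))    # set()
```

Because the grammar is now plain data, run could later be replaced by a loop over a flattened instruction list without touching any parser definitions.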

But I do think it’s a problem that there seems to be no way to usefully 
indicate failure to the caller, and I’m not sure that could be fixed as easily. 
Invalid inputs in your readme examples don’t fail, they successfully return an 
empty set. There also doesn’t seem to be any way to trigger a hard fail rather 
than a backtrack. So I’m not sure how a real urlparse replacement could do the 
things the current one does, like raising a ValueError on 
https://abc.d[ef.ghi/ complaining that the netloc looks like an invalid IPv6 
address. (Maybe you could define a function that raises a ValueError and attach it 
as a where somewhere in the parser tree? But even if that works, wouldn’t you 
get a meaningless exception that doesn’t have any information about where in 
the source text or where in the parse tree it came from or why it was raised, 
and, as your readme says, a stack trace full of garbage?) Can you add failure 
handling without breaking the “~200 LOC and easy to read” feature of the 
library, and without breaking the “easy to read once you grok parser 
combinators” feature of the parsers built with it?
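
For what it's worth, here is a minimal sketch of the kind of failure handling I mean. It is my own toy, not your library: it is deterministic (one result or None) rather than set-valued to keep it short, and ParseError, lit, many1, seq, alt and expect are all hypothetical names. Returning None is a soft failure that lets the caller backtrack; expect turns a mismatch into a hard failure that records the offset in the source text.

```python
class ParseError(ValueError):
    """Hard failure carrying the offset where parsing gave up."""

    def __init__(self, message, text, pos):
        super().__init__(f"{message} at offset {pos}: {text[pos:pos + 10]!r}")
        self.pos = pos


def lit(s):
    def parse(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None  # soft failure: the caller may try something else
    return parse


def many1(chars):
    def parse(text, pos):
        end = pos
        while end < len(text) and text[end] in chars:
            end += 1
        return (text[pos:end], end) if end > pos else None
    return parse


def seq(*parsers):
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def expect(parser, message):
    """Past this point a mismatch is an error, not a reason to backtrack."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            raise ParseError(message, text, pos)
        return result
    return parse


HEX = "0123456789abcdefABCDEF:"
NAME = "abcdefghijklmnopqrstuvwxyz0123456789.-"
ipv6 = seq(lit("["),
           expect(many1(HEX), "invalid IPv6 address"),
           expect(lit("]"), "unclosed '['"))
host = alt(ipv6, many1(NAME))

print(host("example.org", 0))  # ('example.org', 11)
print(host("[::1]", 0))        # (['[', '::1', ']'], 5)
```

With this, host("[ef.ghi", 0) raises a ParseError (a ValueError subclass) whose pos is 3, instead of quietly falling through to the next alternative. Whether something like expect can be fitted onto a set-returning design without hurting readability is exactly what I'm unsure about.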
Python-ideas mailing list -- python-ideas@python.org