## Parsing natural language(s) with and without Context-Free Grammars
https://en.wikipedia.org/wiki/Parsing
https://en.wikipedia.org/wiki/Natural_language_processing
https://en.wikipedia.org/wiki/Context-free_grammar#Undecidable_problems (*)

### NLTK

https://en.wikipedia.org/wiki/Natural_Language_Toolkit

"3.5 Useful Applications of Regular Expressions"
http://www.nltk.org/book/ch03.html

> Consolidate your understanding of regular expression patterns and
> substitutions using nltk.re_show(p, s) which annotates the string s to
> show every place where pattern p was matched, and nltk.app.nemo() which
> provides a graphical interface for exploring regular expressions. For
> more practice, try some of the exercises on regular expressions at the
> end of this chapter.

https://github.com/nltk/nltk/blob/develop/nltk/test/tokenize.doctest
https://github.com/nltk/nltk/blob/develop/nltk/test/grammar.doctest
https://github.com/nltk/nltk/blob/develop/nltk/test/unit/test_tokenize.py
https://github.com/nltk/nltk/blob/develop/nltk/test/unit/test_stem.py

## Other tools for natural language

https://www.google.com/search?q=site%3Agithub.com+inurl%3Aawesome+spacy+nltk

On Sun, Sep 20, 2020, 11:38 AM Wes Turner <wes.tur...@gmail.com> wrote:

> Tests for parsers / [regex] pattern matchers in the CPython standard
> library:
>
> https://github.com/python/cpython/blob/master/Lib/test/test_fstring.py
>
> https://github.com/python/cpython/blob/master/Lib/test/re_tests.py
> https://github.com/python/cpython/blob/master/Lib/test/test_re.py
>
> https://github.com/python/cpython/blob/master/Lib/test/test_ast.py
> https://github.com/python/cpython/blob/master/Lib/test/test_unparse.py
>
> https://github.com/python/cpython/blob/master/Lib/test/test_grammar.py
> https://github.com/python/cpython/blob/master/Lib/test/test_tokenize.py
>
> https://github.com/python/cpython/blob/master/Lib/test/test_shlex.py
> https://github.com/python/cpython/blob/master/Lib/test/test_optparse.py
> https://github.com/python/cpython/blob/master/Lib/test/test_argparse.py
>
> Tests for other parsers / pattern matchers written in Python:
>
> https://bitbucket.org/mrabarnett/mrab-regex/src/hg/regex_3/test_regex.py
> https://github.com/r1chardj0n3s/parse/blob/master/test_parse.py
>
> https://github.com/pyparsing/pyparsing/blob/master/tests/test_simple_unit.py
> https://github.com/jszheng/py3antlr4book
>
> https://github.com/dateutil/dateutil/blob/master/dateutil/test/test_parser.py
> https://github.com/arrow-py/arrow/blob/master/tests/test_parser.py
>
> On Sun, Sep 20, 2020, 5:25 AM Stephen J. Turnbull <turnbull.stephen...@u.tsukuba.ac.jp> wrote:
>
>> Greg Ewing writes:
>>  > On 20/09/20 7:45 am, Christopher Barker wrote:
>>  > > In [4]: from parse import parse
>>  > > In [5]: parse("{x}{y}{z}", a_string)
>>  > > Out[5]: <Result () {'x': '2', 'y': '3', 'z': '4567'}>
>>  > >
>>  > > In [6]: parse("{x:d}{y:d}{z:d}", a_string)
>>  > > Out[6]: <Result () {'x': 2345, 'y': 6, 'z': 7}>
>>  > >
>>  > > So that's interesting -- different level of "greediness" for strings
>>  > > than integers
>>  >
>>  > Hmmm, that seems really unintuitive. I think a better result would
>>  > be a parse error -- "I was told to expect three things, but I only
>>  > found one."
>>
>> Are you sure that shouldn't be "I was told to expect three things, but
>> I found six?" ;-)
>>
>> And why not parse a_string using the "grammar" "{x}{y}{z}" as {'x':
>> 2345, 'y': 6, 'z': 7}? That's perfectly valid *interpreting the
>> 'grammar' as a format string*, and therefore might very well be
>> expected. Of course there's probably a rule in parse that {x} is an
>> abbreviation for {x:s}.
>>
>> Regexps are hard for people to interpret, but they're well-defined and
>> one *can* learn them. If we're going to go beyond regexps in the
>> stdlib (and I'm certainly in favor of that!), let's have a parser that
>> uses a grammar notation that is rarely ambiguous in the way that
>> format strings *usually* are, and when there is ambiguity, demands
>> that the programmer explicitly disambiguate rather than "guessing" in
>> some arbitrary way.
>> _______________________________________________
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-ideas@python.org/message/EYIPHOLUPERDXC6A756HXRK3KQU565O3/
>> Code of Conduct: http://python.org/psf/codeofconduct/
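
A rough sketch of the ambiguity in the quoted parse() session, using only the stdlib `re` module rather than `parse` itself. Assumptions on my part: `a_string` is "234567" (inferred from the quoted outputs, never stated in the thread), and the three patterns are hand-written stand-ins for the format strings, not necessarily what `parse` generates internally. `count_splits` and `strict_split` are hypothetical helpers, only there to illustrate the "make the programmer disambiguate" idea.

```python
import re
from math import comb

# Assumption: inferred from Out[5]/Out[6] in the quoted session.
a_string = "234567"

# Two lazy groups, then a greedy one: early fields take as little as possible.
lazy = re.fullmatch(r"(.+?)(.+?)(.+)", a_string)

# Greedy digit groups: the first field takes as much as it can
# while still leaving one digit each for the other two.
greedy = re.fullmatch(r"(\d+)(\d+)(\d+)", a_string)

# Explicit widths: only one way to match, so nothing is guessed.
fixed = re.fullmatch(r"(\d{2})(\d{2})(\d{2})", a_string)

print(lazy.groups())    # ('2', '3', '4567')
print(greedy.groups())  # ('2345', '6', '7')
print(fixed.groups())   # ('23', '45', '67')


def count_splits(text, fields):
    """Count the ways to cut text into `fields` non-empty contiguous pieces."""
    if fields < 1 or len(text) < fields:
        return 0
    # Choose fields - 1 cut points among the len(text) - 1 gaps.
    return comb(len(text) - 1, fields - 1)


def strict_split(text, fields):
    """Split text into `fields` pieces, refusing to guess when ambiguous."""
    ways = count_splits(text, fields)
    if ways != 1:
        raise ValueError(
            f"{ways} ways to split {text!r} into {fields} fields; "
            "add separators or explicit widths"
        )
    # Exactly one way means a single field or one character per field.
    return [text] if fields == 1 else list(text)


print(count_splits(a_string, 3))  # 10
```

With these helpers, `strict_split("234", 3)` returns `['2', '3', '4']`, while `strict_split(a_string, 3)` raises `ValueError` because there are 10 candidate splits. The lazy and greedy patterns each silently pick one of those 10.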
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/4X3CUGWUYXKX3LWDMDI6P7UKFK2E3AYD/
Code of Conduct: http://python.org/psf/codeofconduct/