This is an interesting challenge you have. However, this list is for proposing ideas for changes in the Python language itself, in particular the CPython reference implementation.
Python-list or some discussion site dealing with machine learning or natural language processing would be appropriate for the task you are trying to figure out. I suspect that third party libraries contain the data structures you need, but I cannot recommend anything specific from my experience.

On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions <stephie.ma...@gmail.com> wrote:

> I am in desperate need of a dict similar structure that allows sets and/or
> dicts as keys *and* values. My application is NLP conceptual plagiarism
> detection. Dealing with infinite grammars communicating illogical
> concepts. Would be even better if keys could nest the same data structure,
> e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as
> key(s).
>
> In order to detect conceptual plagiarism, I need to populate a data
> structure with if/then equivalents as a decision tree. But my equivalents
> have potentially infinite ways of arranging them syntactically *and*
> semantically.
>
> A dict having keys with identical set values treats each key as a distinct
> element. I am dealing with semantics or elemental equivalents and many
> different statements treated as equivalent statements involving if/then
> (key/value) or a implies b, where a and/or b can be an element or an
> if/then as an element. Modeling the syntactic equivalences of such claims
> is paramount, and in order to do that, I need the data structure.
>
> Hello, I am Stephanie. I have never contributed to any open source. I am
> about intermediate at python and I am a self-directed learner/hobbyist. I
> am trying to prove with my code that a particular very famous high profile
> pop debate intellectual is plagiarizing Anders Breivik. I can show it via
> observation, but his dishonesty is dispersed among many different
> talks/lectures. I am dealing with a large number of speaking hours as
> transcripts containing breadcrumbs that are very difficult for a human to
> piece together as having come from the manifesto which is 1515 pages and
> about half copied from other sources. The concepts stolen are
> rearrangements and reorganizations of the same identical claims and themes.
> He occasionally uses literal string plagiarism but not very much at once.
> He is very good at elaboration which makes it even more difficult.
>
> Thank you, for your time,
> Stephanie
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
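
One standard-library starting point, offered as a rough sketch and not as a recommendation for the whole task: frozenset is hashable, so it can be used as a dict key or as a member of another set, and it nests. The helper name freeze below is hypothetical (it is not part of Python), and the sketch assumes every leaf value is already hashable (str, int, tuple and so on). A dict is turned into a frozenset of (key, value) pairs, which is one possible encoding among several.

```python
def freeze(obj):
    """Recursively convert sets and dicts into hashable frozensets.

    Hypothetical helper for illustration; leaves are assumed hashable.
    """
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # A dict becomes a frozenset of (key, value) pairs, so the order
        # in which the pairs were written no longer matters.
        return frozenset((freeze(k), freeze(v)) for k, v in obj.items())
    return obj


# "a implies b" stored as key -> value; either side may be a set or an
# if/then mapping.
rules = {}
rules[freeze({"a", "b"})] = freeze({"c"})
rules[freeze({"if": "x", "then": "y"})] = freeze({"z"})

# Differently ordered but equal arrangements reach the same entry.
assert rules[freeze({"b", "a"})] == frozenset({"c"})
assert rules[freeze({"then": "y", "if": "x"})] == frozenset({"z"})
```

This only makes structurally equal keys collide. Deciding that two differently worded claims are semantically equivalent is a separate problem that this sketch does not address.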