Re: [Python-ideas] New Data Structure - Non Well-Founded Dict
And in case it wasn't clear, "python-list" is here: python-l...@python.org

Please try posting the same question there instead.

Cheers,
Cameron Simpson

On 17Mar2019 16:23, David Mertz wrote:
> This is an interesting challenge you have. However, this list is for proposing ideas for changes in the Python language itself, in particular the CPython reference implementation. Python-list or some discussion site dealing with machine learning or natural language processing would be appropriate for the task you are trying to figure out.
>
> I suspect that third party libraries contain the data structures you need, but I cannot recommend anything specific from my experience.
>
> On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions wrote:
> > I am in desperate need of a dict-like structure that allows sets and/or dicts as keys *and* values. My application is NLP conceptual plagiarism detection. Dealing with infinite grammars communicating illogical concepts. Would be even better if keys could nest the same data structure, e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as key(s).
> >
> > In order to detect conceptual plagiarism, I need to populate a data structure with if/then equivalents as a decision tree. But my equivalents have potentially infinite ways of arranging them syntactically *and* semantically.
> >
> > A dict having keys with identical set values treats each key as a distinct element. I am dealing with semantics or elemental equivalents and many different statements treated as equivalent statements involving if/then (key/value) or a implies b, where a and/or b can be an element or an if/then as an element. Modeling the syntactic equivalences of such claims is paramount, and in order to do that, I need the data structure.
> >
> > Hello, I am Stephanie. I have never contributed to any open source. I am about intermediate at python and I am a self-directed learner/hobbyist. I am trying to prove with my code that a particular very famous high profile pop debate intellectual is plagiarizing Anders Breivik. I can show it via observation, but his dishonesty is dispersed among many different talks/lectures. I am dealing with a large number of speaking hours as transcripts containing breadcrumbs that are very difficult for a human to piece together as having come from the manifesto which is 1515 pages and about half copied from other sources. The concepts stolen are rearrangements and reorganizations of the same identical claims and themes. He occasionally uses literal string plagiarism but not very much at once. He is very good at elaboration which makes it even more difficult.
> >
> > Thank you for your time,
> > Stephanie

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
Re: [Python-ideas] Code version evolver
On Mon, Mar 18, 2019 at 9:34 AM Steven D'Aprano wrote:
>
> On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote:
> [...]
> > Yes, it will. Can you determine whether some code does this? Can you
> > recognize what kind of object is on the left of a percent sign?
> > Remember, it quite possibly won't be a literal.
>
> I don't understand whether your question is asking if Francis
> *personally* can do this, or if it is possible in principle.
>
> If the latter, then inferring the type of expressions is precisely the
> sort of thing that mypy (and others) do.

Kinda somewhere between. Francis keeps saying "oh, just make a source code rewriter", and I'm trying to point out that (1) that is NOT an easy thing to do - sure, there are easy cases, but there are also some extremely hard ones; and (2) even if it could magically be made to work, it would still have (and cause) problems.

ChrisA
Re: [Python-ideas] Code version evolver
On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote:

[...]
> Yes, it will. Can you determine whether some code does this? Can you
> recognize what kind of object is on the left of a percent sign?
> Remember, it quite possibly won't be a literal.

I don't understand whether your question is asking if Francis *personally* can do this, or if it is possible in principle.

If the latter, then inferring the type of expressions is precisely the sort of thing that mypy (and others) do.

--
Steven
Re: [Python-ideas] New Data Structure - Non Well-Founded Dict
This is an interesting challenge you have. However, this list is for proposing ideas for changes in the Python language itself, in particular the CPython reference implementation. Python-list or some discussion site dealing with machine learning or natural language processing would be appropriate for the task you are trying to figure out.

I suspect that third party libraries contain the data structures you need, but I cannot recommend anything specific from my experience.

On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions wrote:

> I am in desperate need of a dict-like structure that allows sets and/or
> dicts as keys *and* values. My application is NLP conceptual plagiarism
> detection. Dealing with infinite grammars communicating illogical
> concepts. Would be even better if keys could nest the same data structure,
> e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as
> key(s).
>
> In order to detect conceptual plagiarism, I need to populate a data
> structure with if/then equivalents as a decision tree. But my equivalents
> have potentially infinite ways of arranging them syntactically *and*
> semantically.
>
> A dict having keys with identical set values treats each key as a distinct
> element. I am dealing with semantics or elemental equivalents and many
> different statements treated as equivalent statements involving if/then
> (key/value) or a implies b, where a and/or b can be an element or an
> if/then as an element. Modeling the syntactic equivalences of such claims
> is paramount, and in order to do that, I need the data structure.
>
> Hello, I am Stephanie. I have never contributed to any open source. I am
> about intermediate at python and I am a self-directed learner/hobbyist. I
> am trying to prove with my code that a particular very famous high profile
> pop debate intellectual is plagiarizing Anders Breivik. I can show it via
> observation, but his dishonesty is dispersed among many different
> talks/lectures. I am dealing with a large number of speaking hours as
> transcripts containing breadcrumbs that are very difficult for a human to
> piece together as having come from the manifesto which is 1515 pages and
> about half copied from other sources. The concepts stolen are
> rearrangements and reorganizations of the same identical claims and themes.
> He occasionally uses literal string plagiarism but not very much at once.
> He is very good at elaboration which makes it even more difficult.
>
> Thank you for your time,
> Stephanie
Re: [Python-ideas] Code version evolver
francismb writes:
> On 3/15/19 4:54 AM, Stephen J. Turnbull wrote:
> > What 2to3 does is to handle a lot of automatic conversions, such as
> > flipping the identifiers from str to bytes and unicode to str. It was
> > necessary to have some such tool because of the very large amount of
> > such menial work needed to change a 2 code base to a 3 code base. But
> > even so, there were things that 2to3 couldn't do, and it often exposed
> > bugs or very poor practice (decode applied to unicode objects, encode
> > applied to bytes) that had to be reworked by the developer anyway.
>
> Very interesting from the 2/3 transition experience point of view. But
> that's still not the past, IMHO that will be after 2020,... around 2025 :-)

Yeah, I did a lightning talk on that about 3 years ago (whenever the 2020 Olympics was awarded to Tokyo, which was pretty much simultaneous with the start of the EOL-for-Python-2 clock -- the basic fantasy was that "Python 2 is the common official business-oriented language of the Tokyo Olympics and Paralympics", and the punch line was "no gold for programmers, just job security").

But so what? My point is that 2to3 development itself is the past. I don't think anybody's working on it at all now. The question you asked is "we have 2to3, why not 3.Xto3.Y?" and my answer is "here's why 2to3 was worth the effort, 3.X upgrades are quite different and it's not worth it".

> Could one also say that under the line it *improved* the code?
> (by exposing bugs, bad practices) could be a first step to just
> *flag* those behaviors/changes ?

Probably not. So many lines of code need to be changed to go from 2 to 3 that most likely the first release after conversion is a pile of dungbeetles. Remember, some Python 2 code uses str as more or less opaque bytes, other code uses it as "I don't need no stinkin' Unicode" text (works fine for monolingual environments with 8-bit encodings, after all). So 2to3 doesn't even do a great job for 'str' vs 'unicode' vs 'bytes'. No automatic conversion could do more than a 50% job for most medium-size projects, and every line of code changed has some probability of introducing a bug. If there were a lot of bugs to start with, that probability goes up -- and a lot of lines change, implying a lot of *new* bugs. It's hard for a syntax-based tool to find enough old bugs to keep up with the proliferation of new ones.

You really should have given up on this by now. It's not that it's a bad idea: 2to3 wasn't just a good idea, it was a necessary idea in its context. But the analogy for within-3 upgrades doesn't hold, and it's not hard to see why it doesn't once you have the basic facts (conservative policy toward backwards compatibility, even across major versions).

I could be wrong, but I don't think there's much for you to learn by prolonging the thread. Unless you actually code an upgrade tool yourself -- then you'll learn a *ton*. That's not my idea of fun, though. :-)

Steve
[Python-ideas] New Data Structure - Non Well-Founded Dict
I am in desperate need of a dict-like structure that allows sets and/or dicts as keys *and* values. My application is NLP conceptual plagiarism detection. Dealing with infinite grammars communicating illogical concepts. Would be even better if keys could nest the same data structure, e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as key(s).

In order to detect conceptual plagiarism, I need to populate a data structure with if/then equivalents as a decision tree. But my equivalents have potentially infinite ways of arranging them syntactically *and* semantically.

A dict having keys with identical set values treats each key as a distinct element. I am dealing with semantics or elemental equivalents and many different statements treated as equivalent statements involving if/then (key/value) or a implies b, where a and/or b can be an element or an if/then as an element. Modeling the syntactic equivalences of such claims is paramount, and in order to do that, I need the data structure.

Hello, I am Stephanie. I have never contributed to any open source. I am about intermediate at python and I am a self-directed learner/hobbyist. I am trying to prove with my code that a particular very famous high profile pop debate intellectual is plagiarizing Anders Breivik. I can show it via observation, but his dishonesty is dispersed among many different talks/lectures. I am dealing with a large number of speaking hours as transcripts containing breadcrumbs that are very difficult for a human to piece together as having come from the manifesto which is 1515 pages and about half copied from other sources. The concepts stolen are rearrangements and reorganizations of the same identical claims and themes. He occasionally uses literal string plagiarism but not very much at once. He is very good at elaboration which makes it even more difficult.

Thank you for your time,
Stephanie
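
For nesting of finite depth, something close to the structure described above may already be reachable with the built-in frozenset, which is hashable and so can serve as a dict key. What follows is only a minimal sketch under that assumption: the helper name freeze, the "dict" tag, and the sample data are all hypothetical, and it does not handle genuinely non-well-founded (self-containing) structures, since an immutable frozenset cannot contain itself.

```python
def freeze(obj):
    """Recursively turn sets and dicts into hashable, order-insensitive values.

    Hypothetical helper for illustration; leaves are assumed to be hashable
    already (str, int, tuple, ...).
    """
    if isinstance(obj, dict):
        # Tag frozen dicts so they cannot collide with a frozen set of pairs.
        return ("dict", frozenset((freeze(k), freeze(v)) for k, v in obj.items()))
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    return obj


implications = {}

# "if {rain, cold} then {snow}", with the antecedent written in two orders.
implications[freeze({"rain", "cold"})] = freeze({"snow"})
implications[freeze({"cold", "rain"})] = freeze({"snow"})

# An if/then used as an element: "if (a implies b) then c".
implications[freeze({"a": "b"})] = "c"

# The two orderings of the antecedent landed on the same key.
assert len(implications) == 2
```

Because equal sets freeze to equal keys, syntactic rearrangements of the same elements collapse into one entry; semantic equivalence would still need its own normalisation step before freezing.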
Re: [Python-ideas] Code version evolver
On 3/15/19 11:09 PM, Chris Angelico wrote:
> And, are you going to run this function on every single code snippet
> before you try it?

If just trying, maybe not. But yes, if I care to know where the applicability limits are (interpreter versions) before integrating it. IMHO it's not good practice to integrate a snippet of code without knowing it, so I would use that function (or maybe the service :-) ) if the possibility existed.

Regards,
--francis
Re: [Python-ideas] Code version evolver
On Mon, Mar 18, 2019 at 1:09 AM francismb wrote:
>
> On 3/15/19 11:09 PM, Chris Angelico wrote:
> > Python 3.5 introduced the modulo operator for bytes objects. How are
> > you going to write a function that determines whether or not a piece
> > of code depends on this?
>
> I'm not sure I understand the question. Isn't *a piece of code* that
> does a modulo operation on a bytes type object *at least* 3.5 Python
> code, or code that needs *at least* that version to run?
>

Yes, it will. Can you determine whether some code does this? Can you recognize what kind of object is on the left of a percent sign? Remember, it quite possibly won't be a literal.

ChrisA
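
To make the literal vs. non-literal distinction concrete, here is a minimal sketch of the easy case only, using the standard ast module. It assumes Python 3.8 or later (where bytes literals parse as ast.Constant); the function name literal_bytes_modulo and both sample snippets are made up for illustration.

```python
import ast


def literal_bytes_modulo(source):
    """Return the line numbers where a bytes *literal* is the left operand of %."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.BinOp)
            and isinstance(node.op, ast.Mod)
            and isinstance(node.left, ast.Constant)
            and isinstance(node.left.value, bytes)
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Easy case: the left operand is visibly a bytes literal.
easy = 'x = b"%d bytes" % 10\n'

# Hard case: `template` could be str, bytes, int, or anything defining __mod__.
hard = "def fmt(template, n):\n    return template % n\n"

print(literal_bytes_modulo(easy))  # [1]
print(literal_bytes_modulo(hard))  # []
```

A purely syntactic check finds the first snippet and is silent on the second, even though fmt(b"%d", 1) would need 3.5 at run time; closing that gap requires type inference rather than pattern matching on the syntax tree.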
Re: [Python-ideas] Code version evolver
On 3/15/19 11:09 PM, Chris Angelico wrote:
> Python 3.5 introduced the modulo operator for bytes objects. How are
> you going to write a function that determines whether or not a piece
> of code depends on this?

I'm not sure I understand the question. Isn't *a piece of code* that does a modulo operation on a bytes type object *at least* 3.5 Python code, or code that needs *at least* that version to run?

Thanks,
--francis
Re: [Python-ideas] Left arrow and right arrow operators
On 3/15/19 9:02 PM, francismb wrote:
> And the operator is the function.

exactly, function application/call
Re: [Python-ideas] Left arrow and right arrow operators
Hi Nick,

On 3/12/19 3:57 PM, Nick Timkovich wrote:
> The onus is on you
> to positively demonstrate you require both directions, not him to
> negatively demonstrate it's never required.

From Calvin I just wanted to have some examples where he sees a use for swapping operands (nothing to be demonstrated :-) ). But I really just wanted to talk about some *visual asymmetric form* that could be used as an operator for potentially asymmetric operations, and thought that the arrow could be one of these. So you're correct, one should discuss which *form* could potentially be more widely accepted/work. The debate on this is going on in the thread "Why operators are useful".

Regards,
--francis