Re: [Python-ideas] New Data Structure - Non Well-Founded Dict

2019-03-17 Thread Cameron Simpson

And in case it wasn't clear, "python-list" is here:

 python-l...@python.org

Please try posting the same question there instead.

Cheers,
Cameron Simpson 

On 17Mar2019 16:23, David Mertz  wrote:

This is an interesting challenge you have. However, this list is for
proposing ideas for changes in the Python language itself, in particular
the CPython reference implementation.

Python-list or some discussion site dealing with machine learning or
natural language processing would be appropriate for the task you are
trying to figure out. I suspect that third party libraries contain the data
structures you need, but I cannot recommend anything specific from my
experience.

On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions 
wrote:


I am in desperate need of a dict-like structure that allows sets and/or
dicts as keys *and* values. My application is NLP conceptual plagiarism
detection. Dealing with infinite grammars communicating illogical
concepts. Would be even better if keys could nest the same data structure,
e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as
key(s).

In order to detect conceptual plagiarism, I need to populate a data
structure with if/then equivalents as a decision tree. But my equivalents
have potentially infinite ways of arranging them syntactically *and*
semantically.

A dict having keys with identical set values treats each key as a distinct
element. I am dealing with semantics or elemental equivalents and many
different statements treated as equivalent statements involving if/then
(key/value) or a implies b, where a and/or b can be an element or an
if/then as an element. Modeling the syntactic equivalences of such claims
is paramount, and in order to do that, I need the data structure.

Hello, I am Stephanie. I have never contributed to any open source. I am
about intermediate at python and I am a self-directed learner/hobbyist. I
am trying to prove with my code that a particular very famous high profile
pop debate intellectual is plagiarizing Anders Breivik. I can show it via
observation, but his dishonesty is dispersed among many different
talks/lectures. I am dealing with a large number of speaking hours as
transcripts containing breadcrumbs that are very difficult for a human to
piece together as having come from the manifesto which is 1515 pages and
about half copied from other sources. The concepts stolen are
rearrangements and reorganizations of the same identical claims and themes.
He occasionally uses literal string plagiarism but not very much at once.
He is very good at elaboration which makes it even more difficult.

Thank you, for your time,
Stephanie

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Code version evolver

2019-03-17 Thread Chris Angelico
On Mon, Mar 18, 2019 at 9:34 AM Steven D'Aprano  wrote:
>
> On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote:
> [...]
> > Yes, it will. Can you determine whether some code does this? Can you
> > recognize what kind of object is on the left of a percent sign?
> > Remember, it quite possibly won't be a literal.
>
> I don't understand whether your question is asking if Francis
> *personally* can do this, or if it is possible in principle.
>
> If the latter, then inferring the type of expressions is precisely the
> sort of thing that mypy (and others) do.

Kinda somewhere between. Francis keeps saying "oh, just make a source
code rewriter", and I'm trying to point out that (1) that is NOT an
easy thing to do - sure, there are easy cases, but there are also some
extremely hard ones; and (2) even if it could magically be made to
work, it would still have (and cause) problems.

ChrisA


Re: [Python-ideas] Code version evolver

2019-03-17 Thread Steven D'Aprano
On Mon, Mar 18, 2019 at 01:13:29AM +1100, Chris Angelico wrote:
[...]
> Yes, it will. Can you determine whether some code does this? Can you
> recognize what kind of object is on the left of a percent sign?
> Remember, it quite possibly won't be a literal.

I don't understand whether your question is asking if Francis 
*personally* can do this, or if it is possible in principle.

If the latter, then inferring the type of expressions is precisely the
sort of thing that mypy (and others) do.



-- 
Steven


Re: [Python-ideas] New Data Structure - Non Well-Founded Dict

2019-03-17 Thread David Mertz
This is an interesting challenge you have. However, this list is for
proposing ideas for changes in the Python language itself, in particular
the CPython reference implementation.

Python-list or some discussion site dealing with machine learning or
natural language processing would be appropriate for the task you are
trying to figure out. I suspect that third party libraries contain the data
structures you need, but I cannot recommend anything specific from my
experience.

On Sun, Mar 17, 2019, 12:39 PM Savant Of Illusions 
wrote:

> I am in desperate need of a dict-like structure that allows sets and/or
> dicts as keys *and* values. My application is NLP conceptual plagiarism
> detection. Dealing with infinite grammars communicating illogical
> concepts. Would be even better if keys could nest the same data structure,
> e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as
> key(s).
>
> In order to detect conceptual plagiarism, I need to populate a data
> structure with if/then equivalents as a decision tree. But my equivalents
> have potentially infinite ways of arranging them syntactically *and*
> semantically.
>
> A dict having keys with identical set values treats each key as a distinct
> element. I am dealing with semantics or elemental equivalents and many
> different statements treated as equivalent statements involving if/then
> (key/value) or a implies b, where a and/or b can be an element or an
> if/then as an element. Modeling the syntactic equivalences of such claims
> is paramount, and in order to do that, I need the data structure.
>
> Hello, I am Stephanie. I have never contributed to any open source. I am
> about intermediate at python and I am a self-directed learner/hobbyist. I
> am trying to prove with my code that a particular very famous high profile
> pop debate intellectual is plagiarizing Anders Breivik. I can show it via
> observation, but his dishonesty is dispersed among many different
> talks/lectures. I am dealing with a large number of speaking hours as
> transcripts containing breadcrumbs that are very difficult for a human to
> piece together as having come from the manifesto which is 1515 pages and
> about half copied from other sources. The concepts stolen are
> rearrangements and reorganizations of the same identical claims and themes.
> He occasionally uses literal string plagiarism but not very much at once.
> He is very good at elaboration which makes it even more difficult.
>
> Thank you, for your time,
> Stephanie


Re: [Python-ideas] Code version evolver

2019-03-17 Thread Stephen J. Turnbull
francismb writes:
 > On 3/15/19 4:54 AM, Stephen J. Turnbull wrote:
 > > What 2to3 does is to handle a lot of automatic conversions, such as
 > > flipping the identifiers from str to bytes and unicode to str.  It was
 > > necessary to have some such tool because of the very large amount of
 > > such menial work needed to change a 2 code base to a 3 code base.  But
 > > even so, there were things that 2to3 couldn't do, and it often exposed
 > > bugs or very poor practice (decode applied to unicode objects, encode
 > > applied to bytes) that had to be reworked by the developer anyway.

 > Very interesting from the 2/3 transition experience point of view. But
 > that's still not the past; IMHO that will be after 2020, ... around 2025 :-)

Yeah, I did a lightning talk on that about 3 years ago (whenever the
2020 Olympics was awarded to Tokyo, which was pretty much simultaneous
with the start of the EOL-for-Python-2 clock -- the basic fantasy was
that "Python 2 is the common official business-oriented language of
the Tokyo Olympics and Paralympics", and the punch line was "no gold
for programmers, just job security").

But so what?  My point is that 2to3 development itself is the past.  I
don't think anybody's working on it at all now.  The question you
asked is "we have 2to3, why not 3.Xto3.Y?" and my answer is "here's
why 2to3 was worth the effort, 3.X upgrades are quite different and
it's not worth it".

 > Could one also say that, on balance, it *improved* the code
 > (by exposing bugs and bad practices)? Could a first step be to just
 > *flag* those behaviors/changes?

Probably not.  So many lines of code need to be changed to go from 2
to 3 that most likely the first release after conversion is a pile of
dungbeetles.  Remember, some Python 2 code uses str as more or less
opaque bytes, other code uses it as "I don't need no stinkin' Unicode"
text (works fine for monolingual environments with 8-bit encodings,
after all).  So it doesn't even do a great job for 'str' vs 'unicode'
vs 'bytes'.  No automatic conversion could do more than a 50% job for
most medium-size projects, and every line of code changed has some
probability of introducing a bug.  If there were a lot of bugs to
start with, that probability goes up -- and a lot of lines change,
implying a lot of *new* bugs.  It's hard for a syntax-based tool to
find enough old bugs to keep up with the proliferation of new ones.
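
As a small illustration of the str/bytes split discussed above (a
sketch in Python 3 only; the variable names are made up for this
example), text and its encoded form are distinct types, and the
"wrong direction" methods that Python 2 tolerated no longer exist:

```python
# Python 3 sketch: str is text, bytes is its encoded form.
text = "caf\u00e9"                  # four code points
data = text.encode("utf-8")         # five bytes: the accented letter takes two
restored = data.decode("utf-8")     # round-trips back to the original text

# In Python 2, calling .encode() on a byte string or .decode() on a
# unicode object was accepted and often hid a bug. Python 3 removed both.
bytes_can_encode = hasattr(data, "encode")   # False in Python 3
str_can_decode = hasattr(text, "decode")     # False in Python 3
```

A converter can rename identifiers mechanically, but deciding whether a
given Python 2 `str` was meant as text or as opaque bytes still takes a
human reading the code.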

You really should have given up on this by now.  It's not that it's a
bad idea: 2to3 wasn't just a good idea, it was a necessary idea in its
context.  But the analogy for within-3 upgrades doesn't hold, and it's
not hard to see why it doesn't once you have the basic facts
(conservative policy toward backwards compatibility, even across major
versions).  I could be wrong, but I don't think there's much for you
to learn by prolonging the thread.  Unless you actually code an
upgrade tool yourself -- then you'll learn a *ton*.  That's not my
idea of fun, though. :-)

Steve



[Python-ideas] New Data Structure - Non Well-Founded Dict

2019-03-17 Thread Savant Of Illusions
I am in desperate need of a dict-like structure that allows sets and/or
dicts as keys *and* values. My application is NLP conceptual plagiarism
detection. Dealing with infinite grammars communicating illogical
concepts. Would be even better if keys could nest the same data structure,
e.g. set(s) or dict(s) in set(s) or dict(s) of the set(s) or dict(s) as
key(s).

In order to detect conceptual plagiarism, I need to populate a data
structure with if/then equivalents as a decision tree. But my equivalents
have potentially infinite ways of arranging them syntactically *and*
semantically.

A dict having keys with identical set values treats each key as a distinct
element. I am dealing with semantics or elemental equivalents and many
different statements treated as equivalent statements involving if/then
(key/value) or a implies b, where a and/or b can be an element or an
if/then as an element. Modeling the syntactic equivalences of such claims
is paramount, and in order to do that, I need the data structure.
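
For reference, part of this can be approximated with existing
built-ins. The sketch below is only an illustration under stated
assumptions: `frozenset` is the standard hashable set type, while
`FrozenDict` is a hypothetical helper class written for this example
(it is not in the standard library). It handles finitely nested keys
only, so it does not cover structures that contain themselves.

```python
from collections.abc import Mapping


class FrozenDict(Mapping):
    """Hypothetical immutable, hashable mapping (illustration only)."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Requires every value to be hashable as well as every key.
        if self._hash is None:
            self._hash = hash(frozenset(self._data.items()))
        return self._hash

    def __repr__(self):
        return f"FrozenDict({self._data!r})"


# "a implies b" stored as a one-entry mapping from a set to a set.
a_implies_b = FrozenDict({frozenset({"a"}): frozenset({"b"})})

# An if/then used as the *key* of another if/then: (a -> b) -> c.
rules = {a_implies_b: frozenset({"c"})}

# An equal structure built separately finds the same entry.
same = FrozenDict({frozenset({"a"}): frozenset({"b"})})
consequent = rules[same]
```

Because equality and hashing are structural here, two keys with the
same contents collapse into one entry, which is the opposite of the
"each key is distinct" behaviour described above. Semantic equivalence
between differently worded claims would still need a separate
normalisation step before anything is used as a key.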

Hello, I am Stephanie. I have never contributed to any open source. I am
about intermediate at python and I am a self-directed learner/hobbyist. I
am trying to prove with my code that a particular very famous high profile
pop debate intellectual is plagiarizing Anders Breivik. I can show it via
observation, but his dishonesty is dispersed among many different
talks/lectures. I am dealing with a large number of speaking hours as
transcripts containing breadcrumbs that are very difficult for a human to
piece together as having come from the manifesto which is 1515 pages and
about half copied from other sources. The concepts stolen are
rearrangements and reorganizations of the same identical claims and themes.
He occasionally uses literal string plagiarism but not very much at once.
He is very good at elaboration which makes it even more difficult.

Thank you, for your time,
Stephanie


Re: [Python-ideas] Code version evolver

2019-03-17 Thread francismb
On 3/15/19 11:09 PM, Chris Angelico wrote:
> And, are you going to run this function on every single code snippet
> before you try it?
If just trying, maybe not. But yes, if I care to know where the
applicability limits are (interpreter versions) before integrating it.

IMHO I don't think it's a good practice to integrate a snippet of code
without knowing it, so I would use that function (or maybe the service
:-) ) if the possibility existed.

Regards,
--francis


Re: [Python-ideas] Code version evolver

2019-03-17 Thread Chris Angelico
On Mon, Mar 18, 2019 at 1:09 AM francismb  wrote:
>
> On 3/15/19 11:09 PM, Chris Angelico wrote:
> > Python 3.5 introduced the modulo operator for bytes objects. How are
> > you going to write a function that determines whether or not a piece
> > of code depends on this?
> I'm not sure I understand the question. Isn't *a piece of code* that
> does a modulo operation on a bytes type object *at least* 3.5 Python
> code? Or doesn't it need *at least* that version to run?
>

Yes, it will. Can you determine whether some code does this? Can you
recognize what kind of object is on the left of a percent sign?
Remember, it quite possibly won't be a literal.

ChrisA
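
To make the difficulty concrete, here is a minimal sketch using the
standard `ast` module (assuming Python 3.8 or later, where literals
parse to `ast.Constant`). The function name `percent_left_operands` is
made up for this example. It reports what can be known syntactically
about the left operand of each `%`:

```python
import ast


def percent_left_operands(source):
    """Return sorted (line, kind) pairs for every binary % in source."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod):
            left = node.left
            if isinstance(left, ast.Constant) and isinstance(left.value, bytes):
                kind = "bytes literal"
            elif isinstance(left, ast.Constant) and isinstance(left.value, str):
                kind = "str literal"
            else:
                # A name, call, attribute, ...: the syntax alone cannot say
                # whether this is bytes, str, int, or a user-defined type.
                kind = "unknown"
            results.append((node.lineno, kind))
    return sorted(results)


sample = '''
a = b"%d items" % 3
b = "%d items" % 3
def f(template, n):
    return template % n
'''

found = percent_left_operands(sample)
# found == [(2, 'bytes literal'), (3, 'str literal'), (5, 'unknown')]
```

The literal cases are easy. The last one is the common case in real
code, and resolving it needs type inference or runtime information
rather than a syntax walk. This sketch also ignores `%=`.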


Re: [Python-ideas] Code version evolver

2019-03-17 Thread francismb
On 3/15/19 11:09 PM, Chris Angelico wrote:
> Python 3.5 introduced the modulo operator for bytes objects. How are
> you going to write a function that determines whether or not a piece
> of code depends on this?
I'm not sure I understand the question. Isn't *a piece of code* that
does a modulo operation on a bytes type object *at least* 3.5 Python
code? Or doesn't it need *at least* that version to run?

Thanks,
--francis


Re: [Python-ideas] Left arrow and right arrow operators

2019-03-17 Thread francismb
On 3/15/19 9:02 PM, francismb wrote:
> And the operator is the function.

Exactly, function application/call.


Re: [Python-ideas] Left arrow and right arrow operators

2019-03-17 Thread francismb
Hi Nick,

On 3/12/19 3:57 PM, Nick Timkovich wrote:
> The onus is on you
> to positively demonstrate you require both directions, not him to
> negatively demonstrate it's never required.
From Calvin I just wanted to have some examples where he sees a use for
swapping operands (nothing to be demonstrated :-) ).

But I really just wanted to talk about some *visually asymmetric form*
that could be used as an operator for potentially asymmetric operations,
and thought that the arrow could be one of these. So you're correct: one
should discuss which *form* could potentially be more widely accepted or
work. The debate on this is going on in the thread "Why operators are
useful".

Regards,
--francis