Re: [Tutor] Questions about the formatting of docstrings

2018-07-26 Thread Steven D'Aprano
On Thu, Jul 26, 2018 at 11:34:11PM -0500, boB Stepp wrote:

> I am near the end of reading "Documenting Python Code:  A Complete
> Guide" by James Mertz, found at
> https://realpython.com/documenting-python-code/  This has led me to a
> few questions:
> 
> (1) The author claims that reStructuredText is the official Python
> documentation standard.  Is this true?  If yes, is this something I
> should be doing for my own projects?

Yes, it is true. If you write documentation for the Python standard 
library, it is supposed to be in ReST. Docstrings you read in 
the interactive interpreter often aren't, but the documentation you read 
on the web page has all been automatically generated from ReST text 
files.

As for your own projects... *shrug*. Are you planning to automatically 
build richly formatted PDF and HTML files from plain text documentation? 
If not, I wouldn't worry too much.

On the other hand, if your documentation will benefit from things 
like:

    Headings
    ========

    * lists of items
    * with bullets

    Definition
        a concise explanation of the meaning of a word

then you're probably already using something close to ReST.


> (2) How would type hints work with this reStructuredText formatting?

Before Python 3 introduced syntax for type hints:

    def func(arg: int) -> float:
        ...

there were a number of de facto conventions for coding that information 
into the function doc string. Being plain text, the human reader can 
simply read it, but being a standard format, people can write tools to 
extract that information and process it.

Well I say standard format, but in fact there were a number of slightly 
different competing formats.


> In part of the author's reStructuredText example he has:
> 
> [...]
> :param file_loc:  The file location of the spreadsheet
> :type file_loc:  str
> [...]

As far as I remember, the ":name:" field-list syntax is standard ReST, 
but giving the ":param" and ":type" fields a meaning is a convention of 
the Sphinx documentation tool. I don't know exactly what it does 
with that information, but being a known format, any tool can parse the 
docstring, extract the parameters and their types, and generate 
documentation, do type checking (either statically or dynamically) or 
whatever else you want to do.
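
As a rough sketch of what such a tool might do (this is not how Sphinx 
itself works; the extract_fields helper and the load_sheet function are 
made up for illustration), a regular expression is enough to pull the 
":param" and ":type" fields out of a docstring:

```python
import re

# Matches lines such as ":param file_loc: ..." or ":type file_loc: str".
FIELD_RE = re.compile(r"^\s*:(param|type)\s+(\w+):\s*(.*)$", re.MULTILINE)


def extract_fields(docstring):
    """Return {name: {"param": description, "type": type_name}}.

    Hypothetical helper, written only for this example.
    """
    fields = {}
    for kind, name, text in FIELD_RE.findall(docstring or ""):
        fields.setdefault(name, {})[kind] = text.strip()
    return fields


def load_sheet(file_loc):
    """Load a spreadsheet (hypothetical example function).

    :param file_loc: The file location of the spreadsheet
    :type file_loc: str
    """


print(extract_fields(load_sheet.__doc__))
```

A real tool would need to cope with multi-line descriptions and the 
other competing docstring formats, which this sketch ignores.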


> It seems to me that if type hinting is being used, then the ":type"
> info is redundant, so I wonder if special provision is made for
> avoiding this redundancy when using type hinting?

No. You can use one, or the other, or both, or neither, whatever takes 
your fancy.

I expect that as Python 2 fades away, it will eventually become common 
practice to document types using a type hint rather than in the 
docstring and people will simply stop writing things like ":type 
file_loc: str" in favour of using def func(file_loc: str).
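
A minimal sketch of that annotated style, assuming a made-up load_sheet 
function: the docstring keeps the ":param" description, the type moves 
into the signature, and the standard library's typing.get_type_hints 
reads it back without any docstring parsing.

```python
from typing import get_type_hints


def load_sheet(file_loc: str) -> list:
    """Load a spreadsheet and return its rows (hypothetical example).

    :param file_loc: The file location of the spreadsheet
    """
    return []  # placeholder body; a real function would read the file


# The types live in the signature, so tools can inspect them directly.
hints = get_type_hints(load_sheet)
print(hints)
```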


-- 
Steve
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] Questions about the formatting of docstrings

2018-07-26 Thread boB Stepp
I am near the end of reading "Documenting Python Code:  A Complete
Guide" by James Mertz, found at
https://realpython.com/documenting-python-code/  This has led me to a
few questions:

(1) The author claims that reStructuredText is the official Python
documentation standard.  Is this true?  If yes, is this something I
should be doing for my own projects?

(2) How would type hints work with this reStructuredText formatting?
In part of the author's reStructuredText example he has:

[...]
:param file_loc:  The file location of the spreadsheet
:type file_loc:  str
[...]

It seems to me that if type hinting is being used, then the ":type"
info is redundant, so I wonder if special provision is made for
avoiding this redundancy when using type hinting?

TIA!
-- 
boB


Re: [Tutor] How can I find a group of characters in a list of strings?

2018-07-26 Thread Matt Ruffalo
On 2018-07-25 20:23, Mats Wichmann wrote:
> On 07/25/2018 05:50 PM, Jim wrote:
>> Linux mint 18 and python 3.6
>>
>> I have a list of strings that contains slightly more than a million
>> items. Each item is a string of 8 capital letters like so:
>>
>> ['MIBMMCCO', 'YOWHHOY', ...]
>>
>> I need to check and see if the letters 'OFHCMLIP' are one of the items
>> in the list but there is no way to tell in what order the letters will
>> appear. So I can't just search for the string 'OFHCMLIP'. I just need to
>> locate any strings that are made up of those letters no matter their order.
>>
>> I suppose I could loop over the list and loop over each item using a
>> bunch of if statements exiting the inner loop as soon as I find a letter
>> is not in the string, but there must be a better way.
>>
>> I'd appreciate hearing about a better way to attack this.
> It's possible that the size of the biglist and the length of the key has
> enough performance impacts that a quicky (untested because I don't have
> your data) solution is unworkable for performance reasons.  But a quicky
> might be to take these two steps:
>
> 1. generate a list of the permutations of the target
> 2. check if any member of the target-permutation-list is in the biglist.
>
> Python sets are a nice way to check membership.
>
> from itertools import permutations
> permlist = [''.join(p) for p in permutations('MIBMMCCO', 8)]
>
> if not set(permlist).isdisjoint(biglist):
>     print("Found a permutation of MIBMMCCO")
>

I would *strongly* recommend against keeping a list of all permutations
of the query string. Though there are only 8! = 40320 permutations of 8
characters, anything with factorial runtime should be used only as a
last resort.
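
To put numbers on that, here is a quick check (my own sketch, not part
of the quoted suggestion). Because 'MIBMMCCO' repeats M three times and
C twice, many of the 40320 orderings are duplicates of one another:

```python
from itertools import permutations
from math import factorial

target = 'MIBMMCCO'

# Every ordering of the 8 positions, counting repeats separately.
total = factorial(len(target))

# Orderings that actually differ as strings; duplicates collapse in the set.
distinct = len(set(permutations(target)))

print(total, distinct)
```

The count is manageable at 8 letters, but it grows factorially with the
length of the query.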

This could pretty effectively be solved by considering each string in
the list as a set of characters for query purposes, and keeping a set of
those, making membership testing constant-time. Note that the inner sets
will have to be frozensets because normal sets aren't hashable.

For example:

"""
In [1]: strings = ['MIBMMCCO', 'YOWHHOY']

In [2]: query = 'OFHCMLIP'

In [3]: search_db = {frozenset(s) for s in strings}

In [4]: frozenset(query) in search_db
Out[4]: False

In [5]: frozenset('MMCOCBIM') in search_db # permutation of first string
Out[5]: True
"""

MMR...


Re: [Tutor] How can I find a group of characters in a list of strings?

2018-07-26 Thread Steven D'Aprano
On Wed, Jul 25, 2018 at 05:29:27PM -0700, Martin A. Brown wrote:

> If I only had to do this once, over only a million items (given 
> today's CPU power), so I'd probably do something like the below 
> using sets. 

The problem with sets is that they collapse multiple instances of 
characters to a single one, so that 'ABC' will match 'ABBBCC'. There's 
no indication that is what is required.
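
One way to keep the repeat counts, sketched here on the assumption that 
an exact rearrangement is what's wanted, is to use the sorted letters of 
each string as the lookup key instead of a set of its characters. The 
key function and the sample data below are only for illustration:

```python
strings = ['MIBMMCCO', 'YOWHHOY']


def key(s):
    """Canonical form: letters in sorted order, repeats preserved."""
    return ''.join(sorted(s))


# Build the lookup once; each membership test is then a hash lookup.
search_db = {key(s) for s in strings}

exact_match = key('MMCOCBIM') in search_db   # rearrangement of 'MIBMMCCO'
wrong_counts = key('MIBCO') in search_db     # same letters, fewer repeats
set_collision = frozenset('MIBCO') == frozenset('MIBMMCCO')

print(exact_match, wrong_counts, set_collision)
```

collections.Counter would express the same idea, but Counter objects are 
not hashable, so the sorted string is the simpler key for a set.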


-- 
Steve