Thanks for your insights, will take them aboard, and I'll update here when an alpha is ready
SAMUEL MARKS Sydney Medical School | Westmead Institute for Medical Research | https://linkedin.com/in/samuelmarks Director | Offscale.io | Sydney Scientific Pty Ltd THE UNIVERSITY OF SYDNEY On Tue, 26 Jan 2021, 1:39 am Mike Bayer, <mike...@zzzcomputing.com> wrote: > > > On Mon, Jan 25, 2021, at 1:14 AM, Samuel Marks wrote: > > Good clarification. Didn't know about those PEPs. > > So you're thinking that for runtime discoverability to use `doc` first, > and fallback to `comment`? > > My idea with this project is to translate between: > ↔ SQL > ↔ Python (public served API, models, tests) > ↔ Rust (public served API, models, tests) > ↔ TypeScript (web) > ↔ Swift (iOS) > ↔ Java/Kotlin (Android) > > So naturally every layer needs to have sufficient information to recreate > the semantics for every other layer. SQL has the strongest types, for the > rest a small new syntax will need to be created, e.g., to specify that this > property is a PK of VARCHAR(20). With the exception of this however, > everything else should be standard, and it should be easy for any developer > to jump in and develop following best-practices in their chosen language(s) > and framework(s). > > > I don't totally know what you're targeting as far as those environments, > as in, will those environments need to be tweaked to work with your tool > specifically or not, I would expect that they definitely would be for > non-trivial environments. Assumptions like "SQL has the strongest types" > you will find do not always hold up as folks will be doing things like > putting JSON or UUID values into VARCHAR on the database, backends like > Oracle that have no "boolean" datatype will have an integer in the > database, etc. > > Certainly if you're reading a Python SQLAlchemy model for comments, having > the users use "doc" would be the most direct means of being able to > discover a comment for that attribute. But people will have to know to > use this to be compatible with your tool, for example a model like this: > https://github.com/openstack/neutron/blob/master/neutron/db/models/agent.py#L21 > uses "#" comments, you can't read those, etc. > > within the realm of "create a web API for any model", there's a lot of > work done on that already so you'd be in a crowded space, I'd look around > at what's been done in that area to see what bases should be covered. > > > > > > > > > Any further tips here would be appreciated. > > Samuel Marks > Charity <https://sydneyscientific.org> | consultancy <https://offscale.io> > | open-source <https://github.com/offscale> | LinkedIn > <https://linkedin.com/in/samuelmarks> > > > On Mon, Jan 25, 2021 at 2:14 PM Mike Bayer <mike...@zzzcomputing.com> > wrote: > > > > > On Sun, Jan 24, 2021, at 7:53 PM, Samuel Marks wrote: > > Again, my goal isn't related to Sphinx (although generating nice > documentation is of course, a nice-to-have). > > One advantage of having the generated SQL code be commented is that I > could write parsers that go from SQL files to SQL alchemy models, complete > with docs. > > > Right now people use sqlacodegen for this: > https://pypi.org/project/sqlacodegen/ although it doesn't go from a > plain SQL file first, the SQL would have to be run into a database first > such as SQLite. > > > > The issue with having loose strings in the middle of a class like you've > done is that there is no built-in semantics, and it'll break all existing > linters. > > > Docstrings beneath attributes are known as "attribute docstrings" and are > explicitly mentioned in PEP-257 > https://www.python.org/dev/peps/pep-0257/#what-is-a-docstring : > > "String literals occurring elsewhere in Python code may also act as > documentation. They are not recognized by the Python bytecode compiler and > are not accessible as runtime object attributes (i.e. not assigned to > __doc__), but two types of extra docstrings may be extracted by software > tools: > > 1. String literals occurring immediately after a simple assignment at > the top level of a module, class, or __init__ method are called > "attribute docstrings". > 2. String literals occurring immediately after another docstring are > called "additional docstrings". > > " > > Pep-257 led to Pep 258 > https://www.python.org/dev/peps/pep-0258/#attribute-docstrings, which was > rejected because it didn't become part of Python, but docutils is the > standard tool used for documenting Python and is used to generate Python's > own documentation: > > "A string literal immediately following an assignment statement is > interpreted by the docstring extraction machinery as the docstring of the > target of the assignment statement, under the following conditions <etc>" > > They are the only technique that is usable in all cases since the > attribute may refer to a value such as ``None`` that's a singleton that > does not have a ``__doc__`` attribute. All Python linters I'm familiar > with accept this style of docstring, formatters such as Black will format > them, and this is what I use in all my projects including SQLAlchemy > itself, since if you have something like "my_constant = 1" as your > attribute, that's your only option really other than using a pound sign > comment, which seems to actually be more common from my googling around but > also is not runtime-discoverable. > > There is of course the disadvantage that the bytecode doesn't have access > to them but this is a limitation of Python itself that most projects I'm > familiar with have learned to live with. > > To the extent that people talk about "how should we document attributes?" > using a string literal below the value is usually what you'll find, I > googled a bit and found this styleguide for a major observatory for > example: > https://developer.lsst.io/python/numpydoc.html#py-docstring-attribute-constants-structure > . > > > > > > Sure, I could extend the linters and traverse the body of the class, > inferring out the semantics. But that would be incredibly non-standard. I'm > trying to generate code that could be considered *the standard*. > > > "attribute docstrings" are the standard as discussed in PEP-257, and there > is no competing standard of any kind that I'm aware of. > > > > > So comment, doc, or an ivar/cvar [Sphinx treats these as the same: > https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#info-field-lists] > is what I'll generate to/from. I can generate them all, but that would be > hard for a human to maintain. The idea with the generated code is that it > needs to be human maintainable, as well as machine maintainable. > > > if you insist upon having runtime discoverability then you would use the > "doc" parameter of Column and make use of a base declarative class that > would copy out "doc" into "comment" on the Column objects, but from my end > I would never call this a "standard". There's no standard here > unfortunately. If I had to pick one I'd want string literals underneath > attributes and then Python would include a means of collecting these in a > similar way that PEP-526 variable annotations are collected. > > The lack of runtime discoverability for attribute docstrings is an > "unsolved" problem in Python itself, and we can see as evidenced by > Python's very own "class declarative" system called dataclasses described > at pep-557 https://www.python.org/dev/peps/pep-0557/ has absolutely > nothing to say about how to inline-document declared attributes nor does > the actual field() class ( > https://docs.python.org/3/library/dataclasses.html#dataclasses.field) > have any notion of a "doc" parameter, though as always, you can put your > own docstrings if you wanted into the "metadata" collection, where once > again there's no "standard" here of any kind. > > This is definitely a bit of a problem in Python and I welcome efforts for > a standard means of runtime discoverability of attribute assignments, but > as a wise leader once said, it's "beyond my pay grade" as far as SQLAlchemy > is concerned, anything we propose such as "Column(.. doc)" is just an API > feature that projects may use if they wish for runtime discoverability, but > in my own experience runtime discoverability is not necessary since the two > purposes I have for docstrings are a. reading them in source code and b. > generating docs with tools that know how to read them. > > > > > > (all my code generators go both ways, so you can edit the generated [cli] > code and generate [class] code from it, and edit the [class] code and > generate [cli] code from it) > > Samuel Marks > Charity <https://sydneyscientific.org> | consultancy <https://offscale.io> > | open-source <https://github.com/offscale> | LinkedIn > <https://linkedin.com/in/samuelmarks> > > > On Mon, Jan 25, 2021 at 10:59 AM Mike Bayer <mike...@zzzcomputing.com> > wrote: > > > > > On Sun, Jan 24, 2021, at 5:20 PM, Samuel Marks wrote: > > Dear Mike, > > My tool works at the AST level, and converts between: > > - different docstring formats; > - having types in docstring or explicitly annotated; > - argparse parser augmenting function, class [plain old python class], > methods/functions > > The next step is to add support for SQLalchemy models, routes, and tests. > As you saw from my example code above, the duplication in SQLalchemy is > intense. > > Columns can be documented in the docstring, and/or on a column itself with > `comment` and/or `doc`. > > So if I'm going to generate these SQLalchemy models, and generate classes > &etc. from these SQLalchemy models, then I'll need a clean, consistent way > of documenting each model. > > What is that way? > > > the "comment" field applies to the DDL rendered to the database and is > separate from the docstring that would be present in the ORM model. So > there is not currently any means for these to be "unified" because they are > two different concerns. > > Within the docstrings, I note the use of ":cvar:" which IIUC is a "class > variable", SQLAlchemy ORM models do make use of the class variables at this > level but they represent SQL expressions so terms like ":cvar K: backend > engine, e.g., `np` or `tf`. Defaults to np" don't necessarily make sense > unless they are documented as "instance variables". Then I'm not really > sure from a Sphinx pov why one would have both ":cvar:" in the top level > docstring as well as per-attribute docstrings, which is what Column(... > doc="doc") does. > > My understanding of Sphinx is that it has a feature that extracts > docstrings from source code in order to associate class attribute level > docstrings. so if I wanted ORM models that were documented, I'd want them > to look like this: > > class Model(Base): > """ > Acquire from the official tensorflow_datasets model zoo, or the > ophthalmology focussed ml-prepare library > > """ > > __tablename__ = "model" > > dataset_name = Column(String, primary_key=True, default="mnist") > """name of dataset""" > > tfds_dir = Column(String, default="~/tensorflow_datasets") > "directory to look for models in" > > K = Column(String, default="np") > "backend engine, e.g. np or tf" > > that is, like a standard Python class, nothing special used. if that > isn't working for docs tools then the issue has to be fixed at that level. > > I'm not actually sure why Column() has a "doc" keyword given that Sphinx > should be able to scan these from the source. The "doc" keyword assigns > the given docstring to the "__doc__" attribute of the descriptor but IMO > this should not be necessary, unless Sphinx is still buggy in this regard. > We added that parameter many years ago and it may have been perhaps to work > around limitations in Sphinx, not really sure. > > > > > > > > > > Samuel Marks > Charity <https://sydneyscientific.org> | consultancy <https://offscale.io> > | open-source <https://github.com/offscale> | LinkedIn > <https://linkedin.com/in/samuelmarks> > > > On Mon, Jan 25, 2021 at 6:39 AM Mike Bayer <mike...@zzzcomputing.com> > wrote: > > > hey there, sorry I hadn't responded to this. > > is your tool reformatting Python code? I don't see anything "wrong" with > it other than the code looks kind of verbose. This would be a matter of > personal preference but if it were me I'd want each attribute to have a > string description listed out only once in the source code so that it may > be edited directly. then as far as how it appears in Sphinx and/or DDL > there would be transparent extensions that make that happen. > > > > On Sun, Jan 24, 2021, at 12:14 AM, Samuel Marks wrote: > > Would be great to have some insight here. If I'm going to start generating > to/fro SQLalchemy models, then I need to get the column descriptions right > > Samuel Marks > Charity <https://sydneyscientific.org> | consultancy <https://offscale.io> > | open-source <https://github.com/offscale> | LinkedIn > <https://linkedin.com/in/samuelmarks> > > > On Tue, Jul 28, 2020 at 5:57 PM Samuel Marks <sam...@offscale.io> wrote: > > I have created a little tool—at the AST level—to translate between > docstrings, methods, classes, and argparse. > https://github.com/SamuelMarks/doctrans > > Now looking at adding SQLalchemy support. > > Using the mock > <https://github.com/SamuelMarks/doctrans/tree/f35963b/doctrans/tests/mocks> > I've been using throughout, does this look like the 'right' kind of > SQLalchemy code? > > *class *Model(Base): > > *"""* > * Acquire from the official tensorflow_datasets model zoo, or the > ophthalmology focussed ml-prepare library* > > * :cvar dataset_name: name of dataset. Defaults to mnist* > * :cvar tfds_dir: directory to look for models in. Defaults to > ~/tensorflow_datasets* > * :cvar K: backend engine, e.g., `np` or `tf`. Defaults to np* > * :cvar as_numpy: Convert to numpy ndarrays* > * :cvar data_loader_kwargs: pass this as arguments to data_loader function* > * """* __tablename__ = > *'model'* > dataset_name = Column(String, primary_key=*True*, default=*'mnist'*, > comment=*'name of dataset'*, doc=*'name of > dataset'*) > tfds_dir = Column(String, default=*'~/tensorflow_datasets'*, > comment=*'directory to look for models in'*, > doc=*'directory to look for models in'*) > K = Column(String, default=*'np'*, > comment=*'backend engine, e.g., `np` or `tf`'*, doc=*'backend > engine, e.g., `np` or `tf`'*) > as_numpy = Column(Boolean, > comment=*'Convert to numpy ndarrays'*, doc=*'Convert to > numpy ndarrays'*) > data_loader_kwargs = Column(*'data_loader_kwargs'*, JSON, > comment=*'pass this as arguments to > data_loader function'*, > doc=*'pass this as arguments to data_loader > function'*) > > > *# _return_type = 'Train and tests dataset splits. Defaults to (np.empty(0), > np.empty(0))'* > *def *__repr__(self): > > *"""* > * :returns: String representation of constructed object* > * :rtype: ```str```* > * """* *return **'<Model(dataset_name={self[dataset_name]!r},' > *\ > *' tfds_dir={self[tfds_dir]!r},' *\ > *' K={self[K]!r},' *\ > *' as_numpy={self[as_numpy]!r},' *\ > *' data_loader_kwargs={self[data_loader_kwargs]!r}' *\ > *')>'*.format(self=self) > > > If not, what should it look like? > > Thanks for your suggestions > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/xZAh5zPswM0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/6576d789-d088-4e68-a7f7-a17b5c96a810o%40googlegroups.com > <https://groups.google.com/d/msgid/sqlalchemy/6576d789-d088-4e68-a7f7-a17b5c96a810o%40googlegroups.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkTFRjQpTiNwM%2BMSX3dw95KGUhX-ATCpNbb_YRhZRM%2B5Rw%40mail.gmail.com > <https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkTFRjQpTiNwM%2BMSX3dw95KGUhX-ATCpNbb_YRhZRM%2B5Rw%40mail.gmail.com?utm_medium=email&utm_source=footer> > . > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/xZAh5zPswM0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/ee539bc2-67a9-4831-99e9-5e65bc527d84%40www.fastmail.com > <https://groups.google.com/d/msgid/sqlalchemy/ee539bc2-67a9-4831-99e9-5e65bc527d84%40www.fastmail.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkSXTvSFscgTVMqGx6oVGrwL3WtvhDebGXEAMqA1rBt9kQ%40mail.gmail.com > <https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkSXTvSFscgTVMqGx6oVGrwL3WtvhDebGXEAMqA1rBt9kQ%40mail.gmail.com?utm_medium=email&utm_source=footer> > . > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/xZAh5zPswM0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/ec302847-868a-40bd-83ea-34a27dfcda0e%40www.fastmail.com > <https://groups.google.com/d/msgid/sqlalchemy/ec302847-868a-40bd-83ea-34a27dfcda0e%40www.fastmail.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkTUnyP%2Br9CjECu7cozR4mWb5oJfEMJsR0dkygKcN-bmpQ%40mail.gmail.com > <https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkTUnyP%2Br9CjECu7cozR4mWb5oJfEMJsR0dkygKcN-bmpQ%40mail.gmail.com?utm_medium=email&utm_source=footer> > . > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/xZAh5zPswM0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/925cbdca-2fa0-443d-811e-ba59e1c231ca%40www.fastmail.com > <https://groups.google.com/d/msgid/sqlalchemy/925cbdca-2fa0-443d-811e-ba59e1c231ca%40www.fastmail.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkRebudw%3DFGKZ5DLJd6Nyjc_EyGr14sCgG%3DyrZbqLvQ4kA%40mail.gmail.com > <https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkRebudw%3DFGKZ5DLJd6Nyjc_EyGr14sCgG%3DyrZbqLvQ4kA%40mail.gmail.com?utm_medium=email&utm_source=footer> > . > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to a topic in the > Google Groups "sqlalchemy" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/sqlalchemy/xZAh5zPswM0/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/d1acbb3c-901a-4123-adf4-dd50e1c8c5a4%40www.fastmail.com > <https://groups.google.com/d/msgid/sqlalchemy/d1acbb3c-901a-4123-adf4-dd50e1c8c5a4%40www.fastmail.com?utm_medium=email&utm_source=footer> > . > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/CAGOFhkQ2krbbvVXCrvY5EjQUvs-xZXh%2BhpjiD1JqPw_itait5w%40mail.gmail.com.