Re: [Fenics] Generation of docstring module

Anders Logg Tue, 07 Sep 2010 07:13:41 -0700

On Tue, Sep 07, 2010 at 03:50:11PM +0200, Kristian Ølgaard wrote:
> On 7 September 2010 15:02, Anders Logg <[email protected]> wrote:
> > On Tue, Sep 07, 2010 at 02:56:47PM +0200, Kristian Ølgaard wrote:
> >> On 7 September 2010 12:37, Anders Logg <[email protected]> wrote:
> >> > On Tue, Sep 07, 2010 at 12:20:09PM +0200, Kristian Ølgaard wrote:
> >> >> On 7 September 2010 11:04, Anders Logg <[email protected]> wrote:
> >> >> > On Mon, Sep 06, 2010 at 05:56:13PM +0200, Kristian Ølgaard wrote:
> >> >> >> On 6 September 2010 17:24, Johan Hake <[email protected]> wrote:
> >> >> >> > On Monday September 6 2010 08:13:44 Anders Logg wrote:
> >> >> >> >> On Mon, Sep 06, 2010 at 08:08:10AM -0700, Johan Hake wrote:
> >> >> >> >> > On Monday September 6 2010 05:47:27 Anders Logg wrote:
> >> >> >> >> > > On Mon, Sep 06, 2010 at 12:19:03PM +0200, Kristian Ølgaard 
> >> >> >> >> > > wrote:
> >> >> >> >> > > > > Do we have any functionality in place for handling 
> >> >> >> >> > > > > documentation
> >> >> >> >> > > > > that should be automatically generated from the C++ 
> >> >> >> >> > > > > interface and
> >> >> >> >> > > > > documentation that needs to be added later?
> >> >> >> >> > > >
> >> >> >> >> > > > No, not really.
> >> >> >> >> > >
> >> >> >> >> > > ok.
> >> >> >> >> > >
> >> >> >> >> > > > > I assume that the documentation we write in the C++ 
> >> >> >> >> > > > > header files
> >> >> >> >> > > > > (like Mesh.h) will be the same that appears in Python 
> >> >> >> >> > > > > using
> >> >> >> >> > > > > help(Mesh)?
> >> >> >> >> > > >
> >> >> >> >> > > > Yes and no, the problem is that for instance overloaded 
> >> >> >> >> > > > methods will
> >> >> >> >> > > > only show the last docstring.
> >> >> >> >> > > > So, the Mesh.__init__.__doc__ will just contain the 
> >> >> >> >> > > > Mesh(std::str
> >> >> >> >> > > > file_name) docstring.
> >> >> >> >> > >
> >> >> >> >> > > It would not be difficult to make the documentation 
> >> >> >> >> > > extraction script
> >> >> >> >> > > we have (in fenics-doc) generate the docstrings module and 
> >> >> >> >> > > just
> >> >> >> >> > > concatenate all constructor documentation. We are already 
> >> >> >> >> > > doing the
> >> >> >> >> > > parsing so spitting out class Foo: """ etc would be easy. 
> >> >> >> >> > > Perhaps that
> >> >> >> >> > > is an option.
> >> >> >> >> >
> >> >> >> >> > There might be other overloaded methods too. We might try to 
> >> >> >> >> > settle on a
> >> >> >> >> > format for these methods, or make this part of the 1% we need 
> >> >> >> >> > to handle
> >> >> >> >> > ourselves.
> >> >> >> >>
> >> >> >> >> ok. Should also be fairly easy to handle.
> >> >> >> >
> >> >> >> > Ok.
> >> >> >> >
> >> >> >> >> > > > > But in some special cases, we may want to go in and handle
> >> >> >> >> > > > > documentation for special cases where the Python 
> >> >> >> >> > > > > documentation
> >> >> >> >> > > > > needs to be different from the C++ documentation. So 
> >> >> >> >> > > > > there should
> >> >> >> >> > > > > be two different sources for the documentation: one that 
> >> >> >> >> > > > > is
> >> >> >> >> > > > > generated automatically from the C++ header files, and 
> >> >> >> >> > > > > one that
> >> >> >> >> > > > > overwrites or adds documentation for special cases. Is 
> >> >> >> >> > > > > that the
> >> >> >> >> > > > > plan?
> >> >> >> >> > > >
> >> >> >> >> > > > The plan is currently to write the docstrings by hand for 
> >> >> >> >> > > > the entire
> >> >> >> >> > > > dolfin module. One of the reasons is that we rename/ignore
> >> >> >> >> > > > functions/classes in the *.i files, and if we try to 
> >> >> >> >> > > > automate the
> >> >> >> >> > > > docstring generation I think we should make it fully 
> >> >> >> >> > > > automatic not
> >> >> >> >> > > > just part of it.
> >> >> >> >> > >
> >> >> >> >> > > If we can make it 99% automatic and have an extra file with 
> >> >> >> >> > > special
> >> >> >> >> > > cases I think that would be ok.
> >> >> >> >> >
> >> >> >> >> > Agree.
> >> >> >>
> >> >> >> Yes, but we'll need some automated testing to make sure that the 1%
> >> >> >> does not go out of sync with the code.
> >> >> >> Most likely the 1% can't be handled because it is relatively 
> >> >> >> important
> >> >> >> (definitions in *.i files etc.).
> >> >> >
> >> >> > I imagine that "1%" will be the same as the "1%" that we have special
> >> >> > treatment for in the SWIG files anyway, so it makes sense those need
> >> >> > special treatment.
> >> >>
> >> >> I think that we can automate that last 1% too.
> >> >>
> >> >> > So the idea would be:
> >> >> >
> >> >> >  1. Document the C++ code in the C++ header files
> >> >> >  2. Document the extra Python code in the Python files (?)
> >> >> >  3. Document the extra SWIG stuff in a special file
> >> >>
> >> >> All Python docstrings should be located where the code is.
> >> >> In the Python layer (like dolfin/fem.py), or in the extended methods
> >> >> in the *.i files for the dolfin/cpp.py module.
> >> >>
> >> >> We then need to figure out how to change the syntax/name correctly
> >> >> such that std::vector, double* etc. are mapped to the correct Python
> >> >> arguments/return values, and how to handle the *example* code.
> >> >>
> >> >> >> >> > > > Also, we will need to change the syntax in all *example* 
> >> >> >> >> > > > code of the
> >> >> >> >> > > > docstrings. Maybe it can be done, but I'll need to give it 
> >> >> >> >> > > > some more
> >> >> >> >> > > > careful thought. We've already changed the approach a few 
> >> >> >> >> > > > times now,
> >> >> >> >> > > > so I'd really like the next try to be close to our final 
> >> >> >> >> > > > implementation.
> >> >> >> >> > >
> >> >> >> >> > > I agree. :-)
> >> >> >> >> > >
> >> >> >> >> > > > > Another thing to discuss is the possibility of using 
> >> >> >> >> > > > > Doxygen to
> >> >> >> >> > > > > extract the documentation. We currently have our own 
> >> >> >> >> > > > > script since
> >> >> >> >> > > > > (I assume) Doxygen does not have a C++ --> reST 
> >> >> >> >> > > > > converter. Is that
> >> >> >> >> > > > > correct?
> >> >> >> >> > > >
> >> >> >> >> > > > I don't think Doxygen has any such converter, but there 
> >> >> >> >> > > > exists a
> >> >> >> >> > > > project http://github.com/michaeljones/breathe
> >> >> >> >> > > > which makes it possible to use xml output from Doxygen in 
> >> >> >> >> > > > much the
> >> >> >> >> > > > same way as we use autodoc for the Python module. I had a 
> >> >> >> >> > > > quick go at
> >> >> >> >> > > > it but didn't like the result. No links on the index pages 
> >> >> >> >> > > > to
> >> >> >> >> > > > function etc. So what we do now is better, but perhaps it 
> >> >> >> >> > > > would be a
> >> >> >> >> > > > good idea to use Doxygen to extract the docstrings for all 
> >> >> >> >> > > > classes
> >> >> >> >> > > > and functions, I tried parsing the xml output in the
> >> >> >> >> > > > test/verify_cpp_documentation.py script and it should be relatively
> >> >> >> >> > > > simple to get the docstrings since these are stored as 
> >> >> >> >> > > > attributes of
> >> >> >> >> > > > classes/functions.
> >> >> >> >> > >
> >> >> >> >> > > Perhaps an idea would be to use Doxygen for parsing and then 
> >> >> >> >> > > have our
> >> >> >> >> > > own script that works with the XML output from Doxygen?
> >> >> >> >> >
> >> >> >> >> > I did not know we already used Doxygen to extract information 
> >> >> >> >> > about
> >> >> >> >> > class structure from the headers.
> >> >> >> >>
> >> >> >> >> I thought it was you who implemented the Doxygen documentation 
> >> >> >> >> extraction?
> >> >> >> >
> >> >> >> > Duh... I mean that I did not know we used it in fenics_doc, in
> >> >> >> > verify_cpp_documentation.py.
> >> >> >>
> >> >> >> We don't. I wrote this script to be able to test the documentation in
> >> >> >> *.rst files against dolfin.
> >> >> >> Basically, I parse all files and keep track of the classes/functions
> >> >> >> which are defined in dolfin and try to match those up against the
> >> >> >> definitions in the documentation (and vice versa) to catch
> >> >> >> missing/obsolete documentation.
> >> >> >>
> >> >> >> >> > What are the differences between using the XML from Doxygen to 
> >> >> >> >> > also
> >> >> >> >> > extract the documentation, and the approach we use today?
> >> >> >> >>
> >> >> >> >> Pros (of using Doxygen):
> >> >> >> >>
> >> >> >> >>   - Doxygen is developed by people that presumably are very good 
> >> >> >> >> at
> >> >> >> >>     extracting docs from C++ code
> >> >> >> >>
> >> >> >> >>   - Doxygen might handle some corner cases we can't handle?
> >> >> >>
> >> >> >> Definitely, and we don't have to maintain it.
> >> >> >
> >> >> > We would need to maintain the script that extracts data from the
> >> >> > Doxygen-generated XML files.
> >> >> >
> >> >> >> >> Cons (of using Doxygen):
> >> >> >> >>
> >> >> >> >>   - Another dependency
> >> >> >> >
> >> >> >> > Which we already have.
> >> >> >> >
> >> >> >> >>   - We still need to write a script to parse the XML
> >> >> >> >
> >> >> >> > We should be able to use the xml parser in docstringgenerator.py.
> >> >> >> >
> >> >> >> >>   - The parsing of /// stuff from C++ code is very simple
> >> >> >> >
> >> >> >> > Yes, and this might be just fine. But if it grows we might 
> >> >> >> > consider using
> >> >> >> > Doxygen.
> >> >> >>
> >> >> >> But some cases are not handled correctly already (nested classes 
> >> >> >> etc.)
> >> >> >> so I vote for Doxygen.
> >> >> >
> >> >> > Not that I'm insisting on not using Doxygen, but isn't it quite rare
> >> >> > that we use nested classes? I think we decided at some point that we
> >> >> > wanted to avoid it for some other reason. I don't remember which but
> >> >> > it might have been a SWIG problem.
> >> >>
> >> >> Look at 
> >> >> http://www.fenics.org/newdoc/programmers-reference/cpp/function/Function.html
> >> >> as a user I would be confused by LocalScratch and GatherScratch.
> >> >
> >> > Those can be easily fixed by letting the script stop parsing when it
> >> > finds "private:".
> >>
> >> OK, and if we are sure that no other nested classes are present in
> >> DOLFIN I guess things should be fine.
> >>
> >> >> The documentation here is also rather confusing, yes we can fix it,
> >> >> but similar cases will arise in the future.
> >> >>
> >> >> http://www.fenics.org/newdoc/programmers-reference/cpp/mesh/MeshPrimitive.html
> >> >
> >> > That looks strange because Andre has used an arbitrary mix of "//" and
> >> > "///" in his comments. Don't blame my script for that. :-)
> >>
> >> Alright alright, I'll never question the almighty
> >> generate_cpp_documentation.py script again. :)
> >
> > Sounds good. ;-)
> >
> >> In light of the above and the Doxygen line break issue, maybe it's
> >> best to use your script as a first try?
> >> We just need to break it up in parsing (intermediate representation),
> >> modifying (C++ and Python syntax) and writing stages (dump in
> >> respective folders in the documentation) and settle on the
> >> intermediate representation such that we can easily switch to a
> >> Doxygen parser in case we decide to.
> >
> > Sounds like a compiler to me. :-)
>
> Yup.
>
> > And since I anticipated your comment, it is already broken up into two
> > different stages:
> >
> >  generate_documentation (should maybe be extract_documentation)
> >  write_documentation
>
> Nice.
>
> > The intermediate representation is a simple Python list with class
> > names, signatures, comments etc. I'm sure it can be improved and
> > simplified.
>
> We will most likely need to refine it w.r.t. information, but I don't
> think that we can simplify it, most likely it will become a little
> more complex.
>
> BTW, shouldn't the extract_documentation() part be in DOLFIN since we
> intend to use it to generate the docstrings.i file?
>
> Then write_cpp_documentation() and write_python_documentation() are
> part of fenics-doc, but they'll import extract_documentation() from
> DOLFIN. Otherwise we'll end up with redundant code.


Yes, that sounds like a good idea.

Perhaps we should have a module 'documentation' as part of DOLFIN:

  from dolfin import documentation

  doc = documentation.extract_documentation(DOLFIN_DIR=...)

The extract_documentation function would look for the DOLFIN_DIR
environment variable and if it is not set, it would need the argument
to be supplied.
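
A rough sketch of that lookup, just to make the idea concrete. The helper
name resolve_dolfin_dir and its arguments are made up for illustration, and
letting an explicit argument win over the environment variable when both
are given is an assumption on my part:

```python
import os


def resolve_dolfin_dir(dolfin_dir=None, environ=None):
    """Return the DOLFIN source directory (hypothetical helper).

    An explicit argument is used if given (assumption); otherwise the
    DOLFIN_DIR environment variable is consulted.
    """
    # Allow a custom mapping so the lookup can be exercised without
    # touching the real process environment.
    environ = os.environ if environ is None else environ
    path = dolfin_dir or environ.get("DOLFIN_DIR")
    if path is None:
        raise ValueError("Set DOLFIN_DIR or pass the directory explicitly")
    return path
```

extract_documentation could call something like this first and then walk
the header files below the returned directory.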

If we make it a part of DOLFIN, I have a feeling it will be more
robust since it just needs to extract the documentation, while other
scripts are responsible for generating various kinds of output.
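
To illustrate the split between extraction and output, here is a minimal
sketch. It is not the actual generate_cpp_documentation.py code: the tuple
layout of the intermediate representation, the function bodies and the
sample header are all assumptions. It collects "///" comments, attaches
them to the declaration that follows, and stops at "private:" as suggested
earlier in the thread (a simplification that only suits one class per
header):

```python
def extract_documentation(header_text):
    """Return a list of (class_name, signature, comment) tuples.

    This list is the assumed intermediate representation.
    """
    entries = []
    comment_lines = []
    class_name = None
    for raw in header_text.splitlines():
        line = raw.strip()
        if line.startswith("private:"):
            break  # nothing below this point is documented
        if line.startswith("///"):
            comment_lines.append(line[3:].strip())
            continue
        if line.startswith("class "):
            class_name = line.split()[1].rstrip(":{")
        if line and comment_lines:
            # A declaration directly preceded by '///' comments
            signature = line.rstrip("{;").strip()
            entries.append((class_name, signature, " ".join(comment_lines)))
        # Plain '//' comments, blank lines and code all reset the buffer
        comment_lines = []
    return entries


def write_documentation(entries):
    """Render the entries as plain text, one block per documented item."""
    blocks = []
    for class_name, signature, comment in entries:
        blocks.append(f"{signature}\n    {comment}")
    return "\n\n".join(blocks)


# Made-up header used only to exercise the two stages
HEADER = """
/// A mesh of simplices.
class Mesh
{
public:
  /// Create an empty mesh.
  Mesh();
  // internal note, not documentation
  /// Return the number of vertices.
  uint num_vertices() const;
private:
  /// Scratch data.
  class Scratch;
};
"""

print(write_documentation(extract_documentation(HEADER)))
```

Since the writers only see the list of tuples, a Doxygen-based extractor
that produces the same list could be swapped in later without touching them.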

--
Anders

