Hi everyone,

Do you have any opinions on the structure of the builds for the tutorials & notebooks? I've found close to 200 scripts and notebooks that need attention.
The current state of the notebook builds:

* MXNet: md --> html + ipynb
* The Straight Dope: ipynb --> html

Many of the notebooks are broken*, and testing doesn't seem to happen as part of the build. However, we could implement notebook testing as part of the build<https://blog.thedataincubator.com/2016/06/testing-jupyter-notebooks/> or even as a regular nightly process.

*Broken: a urllib bug (Python 3), the six module needing an upgrade (Python 2), AttributeError: 'NoneType' object has no attribute 'readlines', and so on.

Considerations:

* Python 2 vs. Python 3 support: just 3, or both?
* Default context: testing the nbconvert tool for errors suggests we should make ctx.cpu() the default context, so testing can be automated on non-GPU machines, Docker/CI setups, etc.
* Reviewing ipynb files on GitHub is a pain.
* Source of truth: md or ipynb?
* Inclusion: how do we make it easy and clear how to contribute to and edit tutorials?
* Scripts/readmes: can we index these and surface them on the HTML version of the site by translating each readme.md to index.html during the build? (This already sort of happens and probably just needs a configuration step.)

Suggestions:

* Nightly build: look for the latest md or ipynb file and translate it, so a newer md produces an ipynb and a newer ipynb produces an md.
* QA: if the latest ipynb fails testing, roll back and create a ticket (or whatever notification makes sense).
* Conformity: have MXNet and The Straight Dope use the same process.
* Documentation of scripts: every folder should have a readme; if one doesn't, flag it. (This should help clean up the undocumented scripts.)

Cheers,
Aaron
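For the QA step, one way to automate the check (in the spirit of the post linked above) is to execute each notebook with `jupyter nbconvert --to notebook --execute` and then scan the resulting JSON for error outputs. A minimal sketch, assuming the executed notebook has already been loaded with `json.load`; the function name `notebook_errors` is mine, not an existing API:

```python
def notebook_errors(nb_json):
    """Collect the exception names from error outputs in an executed
    notebook (nbformat v4 dict, as loaded by json.load)."""
    errors = []
    for cell in nb_json.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # only code cells carry outputs
        for out in cell.get("outputs", []):
            if out.get("output_type") == "error":
                errors.append(out.get("ename", "Error"))
    return errors
```

A non-empty return value (e.g. ["AttributeError"] for the readlines failure above) would fail the build and trigger the ticket/rollback step.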
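And a sketch of the "latest file wins" decision for the nightly build: pick whichever of the md/ipynb pair changed most recently and regenerate the other from it. `newer_source` is a hypothetical helper name, and the actual md <--> ipynb conversion (e.g. via nbconvert or notedown) is left out:

```python
import os

def newer_source(md_path, ipynb_path):
    """Return the file to treat as the source of truth for this nightly
    run, based on modification time (hypothetical helper)."""
    md_t = os.path.getmtime(md_path) if os.path.exists(md_path) else None
    nb_t = os.path.getmtime(ipynb_path) if os.path.exists(ipynb_path) else None
    if md_t is None and nb_t is None:
        return None  # nothing to convert
    if nb_t is None or (md_t is not None and md_t >= nb_t):
        return md_path  # newer md -> regenerate the ipynb
    return ipynb_path  # newer ipynb -> regenerate the md
```

Ties go to the md here; whichever direction we pick as the default should probably match whatever we decide the source of truth is.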
