(Subject change to focus on testing)
Hi all,
First off... what GDAL has with autotest, travis-ci and coverity
is awesome!
Thoughts / discussion more than welcome!
For my production work, I'm not able to use the autotest Python
code because of its non-unittest architecture. So... I started
creating Python unittest and C++ gunit based tests (autotest2). I
use autotest2 in Google's internal continuous integration system
in our main code base. I'm using Google's build system, and I've
got nothing started for running the C++ tests outside of Google.
Apologies for not even getting out at least samples of autotest2
for folks to inspect and comment on. My intention is to put what
I have in a git repo and then to start discussions as to what (if
anything) the GDAL community wants to do with autotest2. I was
hoping to get a lot more coverage and get GDAL 2.x.x support, but
that will have to come later. It's only 14K lines at this point
(optimistically 2-3% done), but it has been a huge help for me,
especially when upgrading versions of GDAL and catching bugs
in support libs & development toolchains.
The tests are more focused on test isolation than autotest is.
This allows for a lot more parallelism in testing, e.g. it's fair
game to run all tests at the same time on the same machine.
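To make the isolation point concrete, here is a minimal Python unittest sketch, not actual autotest2 code. It assumes isolation means each test works only in its own scratch directory; the names IsolatedTestCase, scratch_path, WriteReadTest, and run_isolation_sketch are made up for illustration.

```python
import os
import shutil
import tempfile
import unittest


class IsolatedTestCase(unittest.TestCase):
    """Hypothetical base class: every test gets a private scratch dir."""

    def setUp(self):
        # A unique directory per test, so concurrent runs never share paths.
        self.tmpdir = tempfile.mkdtemp(prefix='autotest2_sketch_')
        self.addCleanup(shutil.rmtree, self.tmpdir, ignore_errors=True)

    def scratch_path(self, name):
        return os.path.join(self.tmpdir, name)


class WriteReadTest(IsolatedTestCase):

    def test_round_trip(self):
        path = self.scratch_path('out.txt')
        with open(path, 'w') as f:
            f.write('hello')
        with open(path) as f:
            self.assertEqual(f.read(), 'hello')

    def test_starts_empty(self):
        # No state leaks in from other tests or other processes.
        self.assertEqual(os.listdir(self.tmpdir), [])


def run_isolation_sketch():
    """Run the sketch suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(WriteReadTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because no test touches a shared path, several copies of this suite can run at once on one machine without interfering with each other.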
Line counts for autotest2:
find . -name \*.py | xargs wc -l | tail -1
10684 total
find . -name \*.cc -o -name \*.h | xargs wc -l | tail -1
3734 total
Whereas GDAL's autotest is 204K lines:
find . -name \*.py | xargs wc -l | tail -1
193994 total
find . -name \*.c\* -o -name \*.h | xargs wc -l | tail -1
10471 total
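For anyone wanting to reproduce these counts without the shell, a rough Python equivalent follows. count_lines is a hypothetical helper, not part of autotest or autotest2. It counts newline bytes like wc -l does, and it matches on exact file suffixes, so it is slightly stricter than the \*.c\* glob above. It also sidesteps the case where xargs splits a long file list across several wc runs, in which case tail -1 shows only the last batch's total.

```python
import os


def count_lines(root, suffixes):
    """Sum newline counts of files under root whose names end in suffixes.

    suffixes is a tuple of strings such as ('.py',) or ('.cc', '.h').
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                with open(os.path.join(dirpath, name), 'rb') as f:
                    # Same definition of a "line" as wc -l: newline bytes.
                    total += f.read().count(b'\n')
    return total
```

For example, count_lines('.', ('.py',)) corresponds to the first command, and count_lines('.', ('.cc', '.h')) to the second.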
Here are some samples:
C++ tests use C++11, gunit, google logging, and gflags (hoping
for C++14 soon, e.g. make_unique):
- autotest2/cpp/port/cpl_conv_test.cc
<https://gist.github.com/schwehr/13137d826763763fb031> (Yes, this
is massively boring code)
- autotest2/cpp/ogr/ogrpoint_test.cc
<https://gist.github.com/schwehr/c8ee86a6f6a1c1cc043b>
- autotest2/cpp/ogr/ogrsf_frmts/geojson/geojson_test.cc
<https://gist.github.com/schwehr/bc44b91a37cd621212c4>
Python pretty much follows the autotest layout, but with util
files in the same directory. Assumes Python 2.7 or >= 3.4 (have
not tried py3 yet).
- autotest2/gcore/gcore_util.py
<https://gist.github.com/schwehr/c143927ca25d03a10265>
- autotest2/gdrivers/gdrivers_util.py
<https://gist.github.com/schwehr/dd75f73cedf8f7b5357e>
- autotest2/gdrivers/tiff_read_test.py
<https://gist.github.com/schwehr/a35b2bc8a7956ef1f620> (I'm
leaning towards moving driver tests in gcore to gdrivers)
- autotest2/ogr/geojson_test.py
<https://gist.github.com/schwehr/6cbdc3482055d2237ad2>
Probably should move python code to also match the C++ tree. e.g.
tiff_read_test.py -> autotest2/py/frmts/gtiff/tiff_read_test.py
I'm (mostly) following Google's style guides. Public versions
here:
- C++
<https://google-styleguide.googlecode.com/svn/trunk/cppguide.html>
- Python
<https://google-styleguide.googlecode.com/svn/trunk/pyguide.html>
All C++ should be formatted with "clang-format --style=Google"
What does autotest2 not do?
Would like to eventually do (unsorted):
- Test error handling on a range of corrupt data sources
- Fuzz testing, ASAN/MSAN/TSAN/Valgrind/Heap checks (I've done
some MSAN & heap checkers by hand)
- Performance testing - time and memory usage
- Test the C API at the C level
- Test platforms other than Linux (MS Win*, Mac OSX, Android,
iOS, other embedded oses, BSD*, Solaris, HPUX, etc)
- Have more detailed language binding tests for java, ruby, perl, php
- Coverage checking
- Test parallel processing and multithreading
- Test networking (I need to think through isolation)
- Test multiple configurations (e.g. all drivers and features
enabled vs. a minimal build).
- Check which system calls are used by each driver for read and
for write
- Check i18n support.
- Check distribution packaging
- Validate that the given build options result in the expected
available features
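As one illustration of the first item (error handling on corrupt data sources), here is a sketch using a toy reader rather than any GDAL driver. Everything in it is hypothetical: the SKCH format, read_record, CorruptDataError, corrupted_variants, and run_corrupt_sketch are invented for the example. It only shows the pattern: start from a known-good buffer, generate truncations and byte flips, and require that each variant either fails with the one expected exception or still parses sanely.

```python
import struct
import unittest

MAGIC = b'SKCH'  # Made-up 4-byte magic for the toy format.


class CorruptDataError(Exception):
    """Raised by the toy reader for any malformed input."""


def read_record(data):
    """Toy reader: 4-byte magic, little-endian uint32 length, payload."""
    if len(data) < 8 or data[:4] != MAGIC:
        raise CorruptDataError('bad header')
    (length,) = struct.unpack('<I', data[4:8])
    payload = data[8:]
    if length != len(payload):
        raise CorruptDataError('length mismatch')
    return payload


def corrupted_variants(good):
    """Yield every truncation and every single-byte flip of good."""
    for end in range(len(good)):
        yield good[:end]
    for i in range(len(good)):
        flipped = bytearray(good)
        flipped[i] ^= 0xFF
        yield bytes(flipped)


class CorruptInputTest(unittest.TestCase):

    def test_corrupt_inputs_fail_cleanly(self):
        good = MAGIC + struct.pack('<I', 3) + b'abc'
        self.assertEqual(read_record(good), b'abc')
        for bad in corrupted_variants(good):
            try:
                result = read_record(bad)
            except CorruptDataError:
                continue  # Clean, expected failure.
            # Only payload damage may still parse; the header must be intact.
            self.assertEqual(bad[:8], good[:8])
            self.assertEqual(len(result), 3)


def run_corrupt_sketch():
    """Run the sketch suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(CorruptInputTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Any other exception type escaping from the reader makes the test error out, which is the signal this kind of test is after.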
Probably out of scope:
- Test for support from older platforms & C++ older than C++11
- Actual sandbox checks
- Test other bindings to GDAL's C or C++ API such as Fiona & Shapely
- Integration tests (e.g. GRASS, QGIS, mapserv, GeoDjango, etc).
- ABI compatibility checks
- Older versions of dependent libs e.g. netcdf/hdf4/5, kakadu,
openjpeg, etc.
-kurt
Engineer at Google
On Sat, Sep 5, 2015 at 7:48 AM, Dmitry Baryshnikov
<bishop....@gmail.com> wrote:
Hi Even,
05.09.2015 17:10, Even Rouault wrote:
Dmitry,
During the code sprint at FOSS4G 2015 (Korea, Seoul) I plan to start
refactoring CMake for GDAL (everybody is welcome:
http://2015.foss4g.org/programme/code-sprint/). This is a good
starting point to try to realize an idea to restructure the source
tree (combine drivers on some principle - raster/vector/raster+vector).
I dug through the mailing list, but didn't find the discussion
started by Even about this.
Regarding unified drivers, it was mentioned a bit in
https://trac.osgeo.org/gdal/wiki/rfc46_gdal_ogr_unification .
Basically the PCIDSK drivers have been merged in frmts/pcidsk, the
PDF ones in frmts/pdf. And the raster side of GPKG has been added to
the existing ogr/ogrsf_frmts/geopackage.
Potential changes on the tree structure were left out in the
"Potential changes that are *NOT* included in this RFC" paragraph.
I plan to experiment with this and if I get good results, an RFC
will be written.
Also, we have a new type of driver - network. So, how would it be
best to organise the sources? This could cover not only drivers, but
the whole source tree. What should the ideal GDAL source tree look
like?
Also I plan:
1. Move all internal libraries (zlib, libtiff, libjpeg, etc.) to
separate GitHub repos to use the CMake ExternalProject feature.
Just to give some context: the point for the internal libraries was
to have a no-brainer way of building GDAL without any prerequisite.
- internal zlib is identical to its upstream v1.2.3 AFAIK
- internal libtiff: most files are identical to libtiff CVS, but a
few ones (tiffconf.h, tif_config.h) have been modified for
integration with GDAL CPL, and tif_vsi.c is GDAL specific (I/O
implementation) + a build time hack for TIFF JPEG 12 bit support
- internal libjpeg is mostly upstream libjpeg v6b + one patch.
There's also the build hack for libjpeg12
I only plan to move these internal libraries into separate repos,
not to link the official ones. So this only gives a more structured
source tree.
2. Remove any other building systems
That sounds ambitious. Given the complexity and maturity of our
current build systems, I guess this would take some time to have
CMake catch up.
Yes, certainly. But anyhow the current CMake branch is not fully
consistent with the current build system, so this has to be done.
3. Try CTest for testing
What do you think it will bring w.r.t. our current testing system?
Do we want to be dependent on a particular build system for our
tests?
Regarding testing, I've somehow understood Kurt had mentioned plans
for a "gdalautotest2"
This is only a subject for experiments. Let's try CTest and see
if it fits.
Regarding all the above, I assume you mean in a fork of yours ?
Yes. All experiments will be on a forked GDAL in a separate branch.
As for me, the ideal structure should look like:
+ apps
+ autotests
+ bindings
+ core
  + port
  + ogr
  + gcore
gnm core would go here too ?
Yes
+ cmake
+ data
+ docs
  + doxygen
  + readme
+ drivers
  + raster
  + vector
  + network
  + combined
+ CMakeLists.txt
+ LICENSE
So, at the root of the source tree we will have only 8 folders and
2 files.
Is the churn in the tree structure worth the effort ? Be aware that
there are a number of interdependencies between drivers, so this
might require fixing a number of source files. What advantages do
you see in a new structure ?
1. Easier for a novice to understand the source tree.
2. More useful for CMake macros. In the current release there are
a lot of hardcoded things. Macros give more flexibility.
3. Easier to add new checks needed by individual drivers.
4. More configurable (easy to select whether dependent libraries
are the ones installed in the OS or should be loaded via
ExternalProject), more dependency checks.
5. Maybe use CPack in the future to create distros.
I'm afraid that if you want to change multiple things at a time
(build system, testing mechanisms, tree structure), you will never
manage to get a working result. Incremental approaches when feasible
are less risky (but admittedly involve potentially a larger
cumulated effort).
Yes, you may be right. But it seems to me that the current CMake
version is more complicated than it needs to be. If I can improve
it, it'll solve a lot of problems; if not - OK, this will only be an
unsuccessful experiment.
Even
I do not insist; maybe it's a crazy idea. But since there was the
discussion of unification, it seemed to me that this is worth
trying while improving the CMake build system.
Best regards,
Dmitry