As a scientist, I teach my students that doing science requires working with open source software, because only then are workflows fully transparent and reproducible by other scientists without prohibitive license costs. Currently, working with large amounts of earth observation (EO) or climate model data typically requires downloading these data tile by tile, stitching them together, and then iterating over all of them. Array databases may simplify this substantially: after ingesting the tiles, they can work directly on the whole data set as a multi-dimensional array ("data cube"). Computations on these arrays are typically embarrassingly parallel, and scale up with the number of cores in a cluster.
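
To make the tile-versus-data-cube point concrete, here is a minimal sketch in Python with NumPy. It is not rasdaman code and says nothing about how any array database works internally. The tiles are random stand-ins for downloaded files, all function names are hypothetical, and threads on one machine stand in for the cores of a cluster.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def make_tile(row, col, size=4):
    """Hypothetical stand-in for one downloaded tile: a (time, y, x) array."""
    rng = np.random.default_rng(seed=row * 10 + col)
    return rng.random((3, size, size))


def stitch(tiles):
    """Stitch a 2-D grid of (time, y, x) tiles into one data cube."""
    rows = [np.concatenate(row, axis=2) for row in tiles]  # join along x
    return np.concatenate(rows, axis=1)  # join along y


def temporal_mean(block):
    """Per-pixel mean over time; each pixel is independent of all others."""
    return block.mean(axis=0)


def parallel_temporal_mean(cube, n_chunks=4, workers=4):
    """Split the cube along y, reduce each chunk independently, reassemble."""
    chunks = np.array_split(cube, n_chunks, axis=1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(temporal_mean, chunks))
    return np.concatenate(parts, axis=0)


# A 2 x 2 grid of tiles becomes one (time, y, x) cube.
tiles = [[make_tile(r, c) for c in range(2)] for r in range(2)]
cube = stitch(tiles)
result = parallel_temporal_mean(cube)
```

Because no chunk needs data from any other chunk, the work can be divided over as many workers as there are chunks, which is what "embarrassingly parallel" means here.
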
Rasdaman is an array database that comes in two flavours: the open source community edition (CE) and the commercial enterprise edition (EE). The differences between the two are clear [1]. When I want to use rasdaman CE (open source) for scalable image analysis, I get stuck waiting for one core to finish everything [1]. This is not going to solve any problems related to computing on large data, and is not scalable. The bold claim that rasdaman.org opens with ("This worldwide leading array analytics engine distinguishes itself by its flexibility, performance, and scalability") is not true for the CE that is advertised. This has been mentioned in the past on mailing lists [2,3], but the typical answer from Peter Baumann diverts into other arguments. Also, the benchmark graph (a photo from an AGU poster) [4] that Peter sent this week [5] must refer to the enterprise edition, since Spark and Hive both scale, but rasdaman CE does not [3].

I assume that in the discussions on this list, ONLY the open source community edition is considered, compared, and discussed as a potential future OSGeo project. OSGeo supports the needs of the open source geospatial community [6]. Given

* the bold claims and continuing confusion about whether, and which, rasdaman is scalable,
* the need for OSGeo to give good advice to prospective users about technologies that do scale EO data analysis,
* the current (unmet!) needs of scientists for good, open source software for such analysis, and
* the potential conflict of interest of its creator [7],

I wonder whether OSGeo should recommend rasdaman CE to the open source geospatial community.

[1] http://rasdaman.org/wiki/Features
[2] https://lists.osgeo.org/pipermail/incubator/2014-October/002540.html
[3] https://groups.google.com/forum/#!topic/rasdaman-users/66XL3tmDDQI
[4] https://lists.osgeo.org/pipermail/discuss/attachments/20160515/49200cd4/attachment.jpg
[5] https://lists.osgeo.org/pipermail/discuss/2016-May/016099.html
[6] http://www.osgeo.org/content/faq/foundation_faq.html
[7] https://lists.osgeo.org/pipermail/discuss/2016-May/016045.html

--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
Journal of Statistical Software: http://www.jstatsoft.org/
Computers & Geosciences: http://elsevier.com/locate/cageo/
Spatial Statistics Society: http://www.spatialstatistics.info
_______________________________________________ Discuss mailing list Discuss@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/discuss