Antoine Queric created SDAP-307:
-----------------------------------
Summary: nexustiles get_tiles_by_polygon() &
get_tiles_by_polygon_sorttimeasc() gets data out of the requested polygon from
Solr
Key: SDAP-307
URL: https://issues.apache.org/jira/browse/SDAP-307
Project: Apache Science Data Analytics Platform
Issue Type: Bug
Reporter: Antoine Queric
While developing ElasticsearchProxy as a metadata store, we noticed subtle
differences in the mean & std values between Solr & Elasticsearch for the same
dataset ingested with the same parameters (number of tiles desired, parameter
to read, ...).
Our query targets the following polygon:
`POLYGON ((-10 -20, 10 -20, 10 20, -10 20, -10 -20))`
As is done in SolrProxy, we use the *INTERSECTS* relation in the Elasticsearch
geo_shape query.
However, Elasticsearch returns the following result (truncated to a single
latitude range):
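For reference, a minimal sketch of what such a query body might look like, built as a plain Python dict. The helper name `build_geo_shape_query` and the field name `geo` are hypothetical and not taken from the ElasticsearchProxy code. The polygon is expressed as GeoJSON with the same ring as the WKT above.

```python
import json

# Ring of the query polygon, as (lon, lat) pairs, closed on the first point.
RING = [(-10, -20), (10, -20), (10, 20), (-10, 20), (-10, -20)]


def build_geo_shape_query(field, ring, relation="intersects"):
    """Build an Elasticsearch geo_shape query body for a single-ring polygon.

    `field` is the name of the geo_shape field in the index (assumed here).
    """
    return {
        "query": {
            "geo_shape": {
                field: {
                    "shape": {
                        "type": "Polygon",
                        # GeoJSON polygons are a list of rings of [lon, lat].
                        "coordinates": [[list(point) for point in ring]],
                    },
                    "relation": relation,
                }
            }
        }
    }


query = build_geo_shape_query("geo", RING)
print(json.dumps(query, indent=2))
```

The resulting dict can be passed as the search body to whichever Elasticsearch client is in use.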
```
'time:0:1,lat:572:616,*lon:1700:1800*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010_val', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.45, *min_lon=-9.95, max_lon=-0.05*),
'min_time': datetime.datetime(2015, 11, 1, 0, 0, tzinfo=<UTC>), 'max_time':
datetime.datetime(2015, 11, 1, 0, 0, tzinfo=<UTC>), 'tile_stats':
TileStats(min=19.630005, max=21.889984, mean=20.769224, count=4400),
'latitudes': 'None', 'longitudes': 'None', 'times': 'None', 'data': 'None',
'meta_data': 'None'}
'time:0:1,lat:572:616,*lon:1800:1900*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010_val', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.45, *min_lon=0.05, max_lon=9.95*), 'min_time':
datetime.datetime(2015, 11, 1, 0, 0, tzinfo=<UTC>), 'max_time':
datetime.datetime(2015, 11, 1, 0, 0, tzinfo=<UTC>), 'tile_stats':
TileStats(min=17.25, max=20.149994, mean=18.791216, count=4400), 'latitudes':
'None', 'longitudes': 'None', 'times': 'None', 'data': 'None', 'meta_data':
'None'}
```
while Solr returns the following:
```
'time:0:1,lat:572:616,*lon:1600:1700*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.450000762939453,
{color:#FF0000}*min_lon=-19.950000762939453,
max_lon=-10.050000190734863*{color}), 'min_time': datetime.datetime(2015, 11,
1, 0, 0, tzinfo=<UTC>), 'max_time': datetime.datetime(2015, 11, 1, 0, 0,
tzinfo=<UTC>), 'tile_stats': TileStats(min=21.329986572265625,
max=23.529998779296875, mean=22.397216796875, count=4400), 'latitudes': 'None',
'longitudes': 'None', 'times': 'None', 'data': 'None', 'meta_data': 'None'}
'time:0:1,lat:572:616,*lon:1700:1800*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.450000762939453, *min_lon=-9.949999809265137,
max_lon=-0.05000000074505806*), 'min_time': datetime.datetime(2015, 11, 1, 0,
0, tzinfo=<UTC>), 'max_time': datetime.datetime(2015, 11, 1, 0, 0,
tzinfo=<UTC>), 'tile_stats': TileStats(min=19.6300048828125,
max=21.889984130859375, mean=20.769224166870117, count=4400), 'latitudes':
'None', 'longitudes': 'None', 'times': 'None', 'data': 'None', 'meta_data':
'None'}
'time:0:1,lat:572:616,*lon:1800:1900*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.450000762939453, *min_lon=0.05000000074505806,
max_lon=9.949999809265137*), 'min_time': datetime.datetime(2015, 11, 1, 0, 0,
tzinfo=<UTC>), 'max_time': datetime.datetime(2015, 11, 1, 0, 0, tzinfo=<UTC>),
'tile_stats': TileStats(min=17.25, max=20.149993896484375,
mean=18.791215896606445, count=4400), 'latitudes': 'None', 'longitudes':
'None', 'times': 'None', 'data': 'None', 'meta_data': 'None'}
'time:0:1,lat:572:616,*lon:1900:2000*', 'dataset':
'IFR-L4-SSTfnd-ODYSSEA-GLOB_010', 'granule':
'20151101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-GLOB_010-v2.0-fv1.0.nc', 'bbox':
BBox(min_lat=-22.75, max_lat=-18.450000762939453,
{color:#FF0000}*min_lon=10.050000190734863,
max_lon=19.950000762939453*{color}), 'min_time': datetime.datetime(2015, 11, 1,
0, 0, tzinfo=<UTC>), 'max_time': datetime.datetime(2015, 11, 1, 0, 0,
tzinfo=<UTC>), 'tile_stats': TileStats(min=13.30999755859375,
max=18.42999267578125, mean=16.47122573852539, count=1476), 'latitudes':
'None', 'longitudes': 'None', 'times': 'None', 'data': 'None', 'meta_data':
'None'}
```
It seems that Solr returns tiles that do not intersect the BBox provided to
the endpoint, even though they are very close to it. Is this behavior
expected?
Should ElasticsearchProxy follow the same principle and return tiles that are
close to, but not inside, the requested polygon?
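The observation can be checked independently of either backend with a plain bounding-box overlap test. The sketch below is only an illustration: the `Box` tuple and `intersects` helper are hypothetical names, and the tile bounds are copied from the Solr output above.

```python
from collections import namedtuple

Box = namedtuple("Box", "min_lat max_lat min_lon max_lon")

# Bounding box of POLYGON ((-10 -20, 10 -20, 10 20, -10 20, -10 -20)).
QUERY = Box(min_lat=-20.0, max_lat=20.0, min_lon=-10.0, max_lon=10.0)

# Tile bounds as returned by Solr, keyed by the lon index range of the tile.
SOLR_TILES = {
    "lon:1600:1700": Box(-22.75, -18.450000762939453,
                         -19.950000762939453, -10.050000190734863),
    "lon:1700:1800": Box(-22.75, -18.450000762939453,
                         -9.949999809265137, -0.05000000074505806),
    "lon:1800:1900": Box(-22.75, -18.450000762939453,
                         0.05000000074505806, 9.949999809265137),
    "lon:1900:2000": Box(-22.75, -18.450000762939453,
                         10.050000190734863, 19.950000762939453),
}


def intersects(a, b):
    """Return True if two axis-aligned boxes overlap or touch."""
    return (a.min_lon <= b.max_lon and a.max_lon >= b.min_lon
            and a.min_lat <= b.max_lat and a.max_lat >= b.min_lat)


results = {name: intersects(tile, QUERY) for name, tile in SOLR_TILES.items()}
for name, hit in results.items():
    print(name, "intersects" if hit else "outside")
```

With these values, the `lon:1600:1700` and `lon:1900:2000` tiles fall outside the query box by about 0.05 degrees of longitude, while the two middle tiles intersect it, which matches the tiles returned by Elasticsearch.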
--
This message was sent by Atlassian Jira
(v8.3.4#803005)