skorper opened a new pull request, #235:
URL: https://github.com/apache/incubator-sdap-nexus/pull/235
Fixed a bug where satellite-to-satellite queries fail when using an L2 dataset.
The secondary satellite tile masks were being combined incorrectly, causing
the entire secondary tile to be masked. In a NumPy masked array, `True` means
the value is invalid, so a `logical_or` against an entirely masked array
results in an entirely masked array. The issue cropped up when running
sat-to-sat matchup with VIIRS as the secondary dataset: VIIRS contains lots of
null values for some variables, and in many cases an entire variable in the
tile is masked, which in turn masked the whole tile.
Simplifying the problem, our old logic was like this:
```python
>>> import numpy as np
>>> a = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 1, 0])
>>> b = np.ma.masked_array([5.0, 6.0, 7.0, 8.0], mask=[1, 1, 1, 1])
>>> np.logical_or(a, b)
masked_array(data=[--, --, --, --],
             mask=[ True,  True,  True,  True],
       fill_value=1e+20,
            dtype=bool)
```
where `True` means drop the value and `False` means keep the value. This is
not what we want! We want the inverse logic, where a masked array "or'd"
against an entirely masked array yields the first array's valid points.
Our new logic is like this:
```python
>>> np.logical_not(np.logical_and(a.mask, b.mask))
array([ True,  True, False,  True])
```
where `True` means keep the value and `False` means drop the value.
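To make the difference concrete, here is a minimal sketch that puts the two rules side by side on the same `a` and `b`. It models the old behaviour as an OR of the two masks, which is what the resulting mask in the first example amounts to. The names `old_keep` and `new_keep` are illustrative only and do not come from the codebase.

```python
import numpy as np

a = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 1, 0])
b = np.ma.masked_array([5.0, 6.0, 7.0, 8.0], mask=[1, 1, 1, 1])

# getmaskarray always returns a full boolean array, even when the
# mask is stored as the scalar np.ma.nomask.
mask_a = np.ma.getmaskarray(a)
mask_b = np.ma.getmaskarray(b)

# Old rule (modelled): drop a point if it is masked in either variable.
old_keep = np.logical_not(np.logical_or(mask_a, mask_b))

# New rule: drop a point only if it is masked in both variables.
new_keep = np.logical_not(np.logical_and(mask_a, mask_b))

print(old_keep.tolist())                  # [False, False, False, False]
print(new_keep.tolist())                  # [True, True, False, True]
print(a[new_keep].compressed().tolist())  # [1.0, 2.0, 4.0]
```

With the old rule the fully masked `b` wipes out every point; with the new rule the three valid points of `a` survive.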
In addition to the above, made a few small changes:
1. Only query the insitu API for the schema if `parameter_s` is provided
2. Retrieve tiles one-by-one rather than all at once when finding/retrieving
data for secondary tiles.
3. If no secondary tiles are found (in sat to sat matchup), handle
gracefully and return `[]` rather than letting an error get raised
4. Fixed bug where only the first two variables are considered in the tile
mask computed in `get_indices`
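For change (4), the idea of considering every variable rather than only the first two can be sketched as a reduction over all of the masks. This is a hypothetical helper, not the actual `get_indices` code; `combined_keep_mask` and the variable names `sst`, `wind`, and `chl` are made up for illustration.

```python
import numpy as np

def combined_keep_mask(variables):
    """Return True where at least one variable has a valid value.

    `variables` is a list of same-shaped masked arrays. Hypothetical
    helper for illustration; not the real `get_indices` implementation.
    """
    masks = [np.ma.getmaskarray(v) for v in variables]
    # A point is dropped only when every variable is masked there.
    all_masked = np.logical_and.reduce(masks)
    return np.logical_not(all_masked)

sst = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[0, 0, 1, 1])
wind = np.ma.masked_array([5.0, 6.0, 7.0, 8.0], mask=[1, 1, 1, 1])
chl = np.ma.masked_array([9.0, 9.5, 9.7, 9.9], mask=[1, 1, 1, 0])

print(combined_keep_mask([sst, wind]).tolist())       # [True, True, False, False]
print(combined_keep_mask([sst, wind, chl]).tolist())  # [True, True, False, True]
```

In this toy data the last point is valid only in the third variable, so looking at just the first two variables drops it.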
Tested like so:
- Tested Shawn's `ASCATB-L2-Coastal` -> `VIIRS_NPP-2018_Heatwave` query
locally. It works!
- Manually ran Riley's regression tests -- all passed. NOTE: Only the sat to
sat tests were run for now. See below.
Please note the following needs to be done before this PR is approved/merged:
1. Run full regression test suite
- This is not currently possible because the insitu API is down.
2. Run benchmarks for (2) above, to ensure retrieving sat tiles
one-by-one is faster than retrieving them all at once.
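One possible shape for that benchmark is a small timing harness like the sketch below. The two fetch functions are hypothetical stand-ins that do no I/O, so the numbers they produce mean nothing; they would need to be replaced with the real tile retrieval calls.

```python
import time

def time_call(fn, *args, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical stand-ins for the real tile retrieval calls.
def fetch_all_at_once(tile_ids):
    return [{"id": t} for t in tile_ids]

def fetch_one_by_one(tile_ids):
    tiles = []
    for t in tile_ids:
        tiles.extend(fetch_all_at_once([t]))
    return tiles

tile_ids = list(range(1000))
batch_s = time_call(fetch_all_at_once, tile_ids)
single_s = time_call(fetch_one_by_one, tile_ids)
print(f"all at once: {batch_s:.6f}s, one by one: {single_s:.6f}s")
```

Taking the best of several runs reduces noise from other processes; peak memory would need to be measured separately if that is the motivation for retrieving tiles one-by-one.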