On Tue, 10 Mar 2020 10:29:38 GMT, Frederic Thevenet 
<github.com+7450507+ftheve...@openjdk.org> wrote:

>> ### 14-internal
>> 
>> --------
>> |      | 1024 | 2048 | 3072 | 4096 | 5120 | 6144 | 7168 | 8192 | 9216 |
>> |---|---|---|---|---|---|---|---|---|---|
>> | 1024 | 5.740508 | 9.337537 | 13.489849 | 17.611105 | 38.898909 | 48.165735 | 53.596876 | 49.449740 | 66.032570 |
>> | 2048 | 9.845097 | 17.799415 | 26.109529 | 34.607728 | 79.345622 | 94.082500 | 107.777644 | 100.901349 | 135.826890 |
>> | 3072 | 14.654498 | 26.183649 | 39.781191 | 51.871491 | 113.010307 | 143.613631 | 184.883820 | 167.076202 | 200.852633 |
>> | 4096 | 18.706278 | 36.115871 | 51.477296 | 68.457649 | 156.240888 | 186.159272 | 222.876505 | 237.387683 | 290.125942 |
>> | 5120 | 50.566276 | 106.465632 | 140.506406 | 161.687151 | 203.644875 | 237.260330 | 279.108632 | 311.002566 | 371.704115 |
>> | 6144 | 53.501341 | 106.726656 | 160.191733 | 216.969484 | 264.996201 | 287.375425 | 335.294473 | 365.035267 | 419.995978 |
>> | 7168 | 66.422026 | 110.882355 | 187.978455 | 239.014528 | 308.817056 | 335.838550 | 394.270828 | 445.987300 | 506.974069 |
>> | 8192 | 60.315442 | 108.770069 | 164.424088 | 205.330331 | 305.201833 | 343.846336 | 392.867668 | 454.540147 | 503.808112 |
>> | 9216 | 71.070811 | 132.708328 | 188.411172 | 256.130225 | 320.028449 | 400.748559 | 471.542252 | 595.355103 | 589.240851 |
>> ![14-internal](https://user-images.githubusercontent.com/7450507/76303535-31fc0980-62c2-11ea-8b65-c8e104dcb042.png)
>
> ### 15-internal:
> 
> --------
> |      | 1024 | 2048 | 3072 | 4096 | 5120 | 6144 | 7168 | 8192 | 9216 |
> |---|---|---|---|---|---|---|---|---|---|
> | 1024 | 5.381051 | 9.261115 | 14.033219 | 20.608201 | 26.159817 | 33.599632 | 36.669261 | 43.042338 | 46.086088 |
> | 2048 | 9.752862 | 17.698869 | 27.004541 | 38.437578 | 52.297443 | 60.757880 | 68.101838 | 80.162117 | 93.852856 |
> | 3072 | 15.564961 | 27.304138 | 40.255866 | 56.636476 | 80.472402 | 86.346635 | 105.154089 | 121.048263 | 130.458981 |
> | 4096 | 19.436113 | 35.556343 | 53.277865 | 71.623899 | 95.814932 | 122.543003 | 136.833771 | 160.199834 | 178.356125 |
> | 5120 | 27.246498 | 65.875784 | 73.171492 | 103.380029 | 126.486761 | 147.666102 | 165.833885 | 199.005331 | 220.659671 |
> | 6144 | 31.843301 | 62.101937 | 93.646729 | 125.531512 | 150.914608 | 175.553034 | 209.835003 | 241.114596 | 253.512648 |
> | 7168 | 40.507918 | 70.843435 | 101.075064 | 137.284040 | 165.808501 | 197.015259 | 254.286955 | 304.928104 | 299.992601 |
> | 8192 | 43.206941 | 80.290957 | 121.946965 | 157.016439 | 193.509481 | 243.514969 | 268.151933 | 359.562281 | 352.102850 |
> | 9216 | 49.529493 | 90.895186 | 149.422784 | 179.512616 | 217.260338 | 267.610592 | 309.706685 | 354.950852 | 383.275751 |
> ![15-internal](https://user-images.githubusercontent.com/7450507/76303560-3a544480-62c2-11ea-9860-030a0110a9fe.png)

I've uploaded 3 sets of results, from 3 different implementations:

1. **14-ea+9** is the implementation merged into openjfx14 following #68; it
only uses tiling if the original implementation fails; on Windows that would
typically be when the snapshot dimensions are larger than 8192 pixels.

2. **14-internal** uses the same tiling implementation as the above, but
starts using tiling as soon as snapshot dimensions are larger than
`PrismSettings.maxTextureSize`, i.e. typically 4096 pixels. NB: This
implementation was never merged into openJFX; results are only provided for
comparison's sake.

3. **15-internal** uses the tiling implementation proposed in the PR. Compared
to the ones above, it attempts to align pixel formats to avoid the cost of
transformation from one format to another (e.g. ByteBGRA to IntARGB), and it
tries to divide up the final snapshot into tiles of the same dimensions to
prevent creating a new GPU surface for every tile.
**Please note that these results are for the best possible scenario with regard
to the above optimizations; in the worst case scenario (a user-provided image
  with a different pixel format and no way to divide the snapshot into
  equal-size tiles), the performance is the same as that of implementation 2.**
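
To illustrate the "tiles of the same dimensions" idea in item 3, here is a minimal sketch. It is not the code from the PR: the class `TileSizing`, the method `evenTileSize`, and the `minSize` lower bound are hypothetical names and parameters chosen for illustration. It looks for the largest tile size at or below the texture limit that divides the snapshot dimension exactly, and falls back to the limit itself (leaving a smaller last tile) when no such size exists.

```java
final class TileSizing {
    /**
     * Hypothetical helper: returns the largest size in [minSize, maxSize] that
     * divides `dimension` exactly. If the dimension already fits in one tile it
     * is returned unchanged; if no exact divisor exists, maxSize is returned,
     * which means the last tile would be smaller than the others.
     */
    static int evenTileSize(int dimension, int minSize, int maxSize) {
        if (dimension <= maxSize) {
            return dimension; // no tiling needed along this axis
        }
        for (int size = maxSize; size >= minSize; size--) {
            if (dimension % size == 0) {
                return size; // all tiles along this axis share this size
            }
        }
        return maxSize; // worst case: unequal tiles
    }

    public static void main(String[] args) {
        // 4096 is the typical clamped texture size mentioned above;
        // 2048 is an arbitrary lower bound assumed for this sketch.
        System.out.println(evenTileSize(9216, 2048, 4096)); // prints 3072
        System.out.println(evenTileSize(5120, 2048, 4096)); // prints 2560
    }
}
```

With equal-size tiles, a single intermediate surface of that size could in principle be reused for every tile, which is the saving described in item 3.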

My conclusions from the above results are twofold:

- Tiling has a cost in terms of performance, but as far as I can tell it
  remains the only practical way to work around the underlying issue (i.e.
  taking snapshots larger than the supported texture size), and the
  optimizations proposed in this PR arguably do mitigate that cost, even if
  only in some of the cases.

- The original implementation, which is preserved in 14-ea+9, ignores texture
  clamping to 4096, which means it doesn't have to resort to tiling until the
  target snapshot size is >8192 on d3d or 16384 on es2.
  If this is incorrect behaviour on the part of the original implementation
  (which I'm led to believe is the case), it gives it an unfair advantage in
  this benchmark, i.e. it is faster because it does things it shouldn't do. If
  however it turns out it is actually safe to ignore clamping when taking
  snapshots, then it would make sense to do so in the implementation proposed
  by this PR as well.
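
The difference in when each implementation starts tiling can be sketched as follows. This is only an illustration of the thresholds quoted in this message, not code from openJFX: the class, method, and constant names are hypothetical, and the values (4096 for the clamped size, 8192 for d3d, 16384 for es2) are the typical figures given above rather than values read from a real pipeline.

```java
final class TilingThreshold {
    // Assumed typical values, taken from the discussion above.
    static final int CLAMPED_MAX_TEXTURE_SIZE = 4096; // typical PrismSettings.maxTextureSize
    static final int D3D_LIMIT = 8192;                // unclamped limit quoted for d3d
    static final int ES2_LIMIT = 16384;               // unclamped limit quoted for es2

    /** True when either snapshot dimension exceeds the given limit. */
    static boolean needsTiling(int width, int height, int limit) {
        return width > limit || height > limit;
    }

    public static void main(String[] args) {
        int width = 5120, height = 1024;
        // Honouring the clamp (14-internal, 15-internal): tiling starts here.
        System.out.println(needsTiling(width, height, CLAMPED_MAX_TEXTURE_SIZE)); // prints true
        // Ignoring the clamp on d3d (14-ea+9): still a single pass.
        System.out.println(needsTiling(width, height, D3D_LIMIT)); // prints false
    }
}
```

This matches the jump visible in the 14-internal table between the 4096 and 5120 columns, where tiling first kicks in.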

-------------

PR: https://git.openjdk.java.net/jfx/pull/112
