> This code is not just there for prefetching. It is an example of
> using software pipelining:
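As a rough illustration of the head / tail-head / tail loop structure that software pipelining produces, here is a minimal sketch in C. The stage functions and their arithmetic are hypothetical and chosen only to keep the example small; the real NEON code interleaves individual instructions rather than function calls, so this shows the loop shape only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-stage per-element operation (illustrative only):
 * "head" starts work on an element, "tail" finishes it. */
uint32_t stage_head(uint8_t src) { return (uint32_t)src * 3u; }
uint8_t stage_tail(uint32_t partial) { return (uint8_t)((partial + 1u) >> 1); }

/* Plain loop: head and tail of the same element run back to back,
 * so the tail always waits for the head it depends on. */
void process_plain(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = stage_tail(stage_head(src[i]));
}

/* Software-pipelined loop: the tail of element i-1 shares an iteration
 * with the head of element i. The two are independent of each other
 * inside the loop body, so they can be interleaved. */
void process_pipelined(uint8_t *dst, const uint8_t *src, size_t n)
{
    if (n == 0)
        return;

    uint32_t in_flight = stage_head(src[0]);   /* prologue: head only */

    for (size_t i = 1; i < n; i++) {           /* steady state: tail_head */
        uint32_t next = stage_head(src[i]);    /* head of element i */
        dst[i - 1] = stage_tail(in_flight);    /* tail of element i-1 */
        in_flight = next;
    }

    dst[n - 1] = stage_tail(in_flight);        /* epilogue: tail only */
}
```

Both functions write the same output for the same input; only the order in which the work is issued differs, which is also why the pipelined form is harder to read and maintain.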

OK, I understand.
But the code is very hard to maintain... I've run into too many register conflicts.
# q2 and d2 were used in the same sequence. That cannot work in aarch64-neon.

Anyway, I'll try to remove the unnecessary register copies as you've suggested.
After that, I'll also try to write benchmarks that compare
* advanced vs. none
* L1 / L2 / L3 (Cortex-A53 has no L3), keep / strm
to find the better configuration.

But that will only be a result for Cortex-A53 (which is what you and I have).
Can anyone test on other (expensive :) aarch64 environments?
(Cortex-Axx, Apple Ax, NVidia Denver, etc, etc...)

On 5 April 2016 at 16:53, Siarhei Siamashka <siarhei.siamas...@gmail.com> wrote:
> On Sun, 3 Apr 2016 20:17:45 +0900
> Mizuki Asakura <ed6e1...@gmail.com> wrote:
>
>> > The 'advanced' prefetch type is implemented by having some branchless ARM
>> > code
>>
>> If the prefetch code assumes it is "branch-less", that cannot be done in aarch64
>> since aarch64 doesn't support conditional arithmetic such as subge, subges.
>>
>> If so, we could / should remove all prefetch-related code because it
>> might cause a performance regression (by branching) rather than any
>> benefit from prefetching.
>
> Yes, I'm fine and actually in favour of removing the prefetch related
> AArch64 code (assuming that it does not do anything good for us).
> Something similar happened to the pixman x86 prefetch code in the past:
>     https://lists.freedesktop.org/archives/pixman/2010-June/000231.html
>
> But I'm going to run some additional higher level benchmarks to be
> sure.
>
>> And also, we could remove all "tail-head" optimizations that are only
>> there to make heavy use of prefetching.
>
> This code is not just there for prefetching. It is an example of
> using software pipelining:
>
>     https://cgit.freedesktop.org/pixman/tree/pixman/pixman-arm-neon-asm.S?id=pixman-0.34.0#n191
>     https://en.wikipedia.org/wiki/Software_pipelining
>
>> The "tail-head" code is very complicated, hard to understand and hard to
>> maintain.
>> If we could remove this code, the asm code could be slimmer and
>> easier to maintain.
>
> If we were favouring ease of maintenance over performance, then we
> would have used intrinsics instead of assembly in the first place.
>
>> Of course, the modification shouldn't be applied to the original
>> aarch32-neon code. It may cause
>> a performance regression on some architectures.
>> But for aarch64, would it be a change worth considering?
>
> Well, just to make sure that there is no misunderstanding between
> us. I would like to keep and do AArch64 conversion for all the
> parts of code, which are well optimized and not planned to be
> replaced in the near future. And I suggested not to bother with
> the 'pixman-arm-neon-asm-bilinear.S' file, because this code
> is not the best way to do the job and it had to be eventually
> replaced with iterators:
>     https://lists.freedesktop.org/archives/pixman/2013-September/002889.html
>     https://lists.freedesktop.org/archives/pixman/2013-September/002892.html
>
> Now it looks like Ben Avison has NEON patches for doing
> separable bilinear scaling, so this makes the
> 'pixman-arm-neon-asm-bilinear.S' file really obsolete.
>
> The nearest scaling and rgb565 format support code is still useful.
> The bilinear scaling code from pixman-arm-neon-asm.S is useful too, at
> least the 'pixman_scaled_bilinear_scanline_8888_8888_SRC_asm_neon' and
> 'pixman_scaled_bilinear_scanline_8888_0565_SRC_asm_neon' functions.
> But 'pixman_scaled_bilinear_scanline_0565_0565_SRC_asm_neon' and
> 'pixman_scaled_bilinear_scanline_0565_x888_SRC_asm_neon' are bad.
>
> Anyway, your first patch was already usable. I only see that just a
> few minor tweaks are needed and it will be good enough for pushing
> to git. But if I'm mistaken and something is actually difficult,
> then you don't need to spend too much time on it. Thanks.
>
> --
> Best regards,
> Siarhei Siamashka
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/pixman