Re: CONFIG_PPC_VAS depends on 64k pages...?

2021-01-16 Thread Carlos Eduardo de Paula
Hi all, any news on this matter?

Can a patch be submitted for evaluation?

Thanks,
Carlos

On Wed, Dec 2, 2020 at 4:19 AM Will Springer wrote:

> On Tuesday, December 1, 2020 5:16:51 AM PST Bulent Abali wrote:
> > I don't know anything about VAS page size requirements in the kernel.  I
> > checked the user compression library and saw that we do a sysconf to
> > get the page size; so the library should be immune to page size by
> > design. But it wouldn't surprise me if a 64KB constant is inadvertently
> > hardcoded somewhere else in the library. Giving a heads-up to Tulio and
> > Raphael, who are the owners of the GitHub repo.
> >
> > https://github.com/libnxz/power-gzip/blob/master/lib/nx_zlib.c#L922
> >
> > If we got this wrong in the library, it might manifest itself as an error
> > message of the sort "excessive page faults". The library must touch
> > pages ahead of time to make them present in memory; occasional page faults
> > are acceptable, and it will retry.
>
> Hm, good to know. As I said, I haven't noticed any problems so far over a
> few different days of testing. My change is now in the Void Linux kernel
> package, and it is working for others as well (including the Void maintainer
> Daniel/q66, whom I CC'd initially).
>
> >
> > Bulent
> >
> >
> >
> >
> > From:"Sukadev Bhattiprolu" 
> > To:"Christophe Leroy" 
> > Cc:"Will Springer" ,
> > linuxppc-dev@lists.ozlabs.org, dan...@octaforge.org, Bulent
> > Abali/Watson/IBM@IBM, ha...@linux.ibm.com Date:12/01/2020 12:53
> > AM
> > Subject:Re: CONFIG_PPC_VAS depends on 64k pages...?
> >
> > Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> > > Hi,
> > >
> > > On 19/11/2020 at 11:58, Will Springer wrote:
> > > > I learned about the POWER9 gzip accelerator a few months ago when
> > > > the
> > > > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > > > dictates that VAS depends on a 64k page size, which is problematic
> > > > as I
> > > > run Void Linux, which uses a 4k-page kernel.
> > > >
> > > > Some early poking by others indicated there wasn't an obvious page
> > > > size
> > > > dependency in the code, and suggested I try modifying the config to
> > > > switch it on. I did so, but was stopped by a minor complaint of an
> > > > "unexpected DT configuration" by the VAS code. I wasn't equipped to
> > > > figure out exactly what this meant, even after finding the
> > > > offending condition, so after writing a very drawn-out forum post
> > > > asking for help, I dropped the subject.
> > > >
> > > > Fast forward to today, when I was reminded of the whole thing again,
> > > > and decided to debug a bit further. Apparently the VAS platform
> > > > device (derived from the DT node) has 5 resources on my 4k kernel,
> > > > instead of 4 (which evidently works for others who have had success
> > > > on 64k kernels). I have no idea what this means in practice (I
> > > > don't know how to introspect it), but after making a tiny patch[1],
> > > > everything came up smoothly and I was doing blazing-fast gzip
> > > > (de)compression in no time.
> > > >
> > > > Everything seems to work fine on 4k pages. So, what's up? Are there
> > > > pitfalls lurking around that I've yet to stumble over? More
> > > > reasonably,
> > > > I'm curious as to why the feature supposedly depends on 64k pages,
> > > > or if there's anything else I should be concerned about.
> >
> > Will,
> >
> > The reason I put in that config check is that we were only able to
> > test with 64K pages at that point.
> >
> > It is interesting that it is working for you. The following code in skiboot,
> > https://github.com/open-power/skiboot/blob/master/hw/vas.c, should
> > restrict it to 64K pages. IIRC there is also a corresponding setting in
> > some NX registers that would need to be configured to allow 4K pages.
>
> Huh, that is interesting indeed. As far as the kernel code goes, the only
> thing specific to 64k pages I could find was in [1], where
> VAS_XLATE_LPCR_PAGE_SIZE is set. There is also NX_PAGE_SIZE in
> drivers/crypto/nx/nx.h, which is set to 4096, but I don't know if that's
> related to the kernel page size at all. Without a better idea of the code
> base, I didn't examine more thoroughly.
>
> [1]:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/powerpc/platforms/powernv/vas-window.c#n293

Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-12-02 Thread Will Springer
On Tuesday, December 1, 2020 5:16:51 AM PST Bulent Abali wrote:
> I don't know anything about VAS page size requirements in the kernel.  I
> checked the user compression library and saw that we do a sysconf to
> get the page size; so the library should be immune to page size by
> design. But it wouldn't surprise me if a 64KB constant is inadvertently
> hardcoded somewhere else in the library. Giving a heads-up to Tulio and
> Raphael, who are the owners of the GitHub repo.
> 
> https://github.com/libnxz/power-gzip/blob/master/lib/nx_zlib.c#L922
> 
> If we got this wrong in the library, it might manifest itself as an error
> message of the sort "excessive page faults". The library must touch
> pages ahead of time to make them present in memory; occasional page faults
> are acceptable, and it will retry.

Hm, good to know. As I said, I haven't noticed any problems so far over a
few different days of testing. My change is now in the Void Linux kernel
package, and it is working for others as well (including the Void maintainer
Daniel/q66, whom I CC'd initially).

> 
> Bulent
> 
> 
> 
> 
> From:"Sukadev Bhattiprolu" 
> To:"Christophe Leroy" 
> Cc:"Will Springer" ,
> linuxppc-dev@lists.ozlabs.org, dan...@octaforge.org, Bulent
> Abali/Watson/IBM@IBM, ha...@linux.ibm.com Date:12/01/2020 12:53
> AM
> Subject:Re: CONFIG_PPC_VAS depends on 64k pages...?
> 
> Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> > Hi,
> > 
> > On 19/11/2020 at 11:58, Will Springer wrote:
> > > I learned about the POWER9 gzip accelerator a few months ago when
> > > the
> > > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > > dictates that VAS depends on a 64k page size, which is problematic
> > > as I
> > > run Void Linux, which uses a 4k-page kernel.
> > > 
> > > Some early poking by others indicated there wasn't an obvious page
> > > size
> > > dependency in the code, and suggested I try modifying the config to
> > > switch it on. I did so, but was stopped by a minor complaint of an
> > > "unexpected DT configuration" by the VAS code. I wasn't equipped to
> > > figure out exactly what this meant, even after finding the
> > > offending condition, so after writing a very drawn-out forum post
> > > asking for help, I dropped the subject.
> > > 
> > > Fast forward to today, when I was reminded of the whole thing again,
> > > and decided to debug a bit further. Apparently the VAS platform
> > > device (derived from the DT node) has 5 resources on my 4k kernel,
> > > instead of 4 (which evidently works for others who have had success
> > > on 64k kernels). I have no idea what this means in practice (I
> > > don't know how to introspect it), but after making a tiny patch[1],
> > > everything came up smoothly and I was doing blazing-fast gzip
> > > (de)compression in no time.
> > > 
> > > Everything seems to work fine on 4k pages. So, what's up? Are there
> > > pitfalls lurking around that I've yet to stumble over? More
> > > reasonably,
> > > I'm curious as to why the feature supposedly depends on 64k pages,
> > > or if there's anything else I should be concerned about.
> 
> Will,
> 
> The reason I put in that config check is that we were only able to
> test with 64K pages at that point.
>
> It is interesting that it is working for you. The following code in skiboot,
> https://github.com/open-power/skiboot/blob/master/hw/vas.c, should
> restrict it to 64K pages. IIRC there is also a corresponding setting in
> some NX registers that would need to be configured to allow 4K pages.

Huh, that is interesting indeed. As far as the kernel code goes, the only
thing specific to 64k pages I could find was in [1], where
VAS_XLATE_LPCR_PAGE_SIZE is set. There is also NX_PAGE_SIZE in
drivers/crypto/nx/nx.h, which is set to 4096, but I don't know if that's
related to the kernel page size at all. Without a better idea of the code
base, I didn't examine more thoroughly.

[1]: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/powerpc/platforms/powernv/vas-window.c#n293
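For what it's worth, here is a purely illustrative sketch of the kind of dependency
being discussed: picking the window translation page-size encoding from the kernel's
base page shift instead of assuming 64k. The helper name and the numeric encodings
below are placeholders, not values taken from vas-window.c:

#include <stdint.h>

/* Illustrative only: map the kernel base page shift to a translation
 * page-size encoding rather than hardcoding the 64k case. The numeric
 * encodings are placeholders, not the real VAS register values. */
static uint64_t xlate_page_size_encoding(unsigned int page_shift)
{
	switch (page_shift) {
	case 12:		/* 4k pages */
		return 0x1;	/* placeholder encoding */
	case 16:		/* 64k pages */
		return 0x5;	/* placeholder encoding */
	default:
		return 0x5;	/* unknown shift: fall back to the 64k encoding */
	}
}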

> static int init_north_ctl(struct proc_chip *chip)
> {
> uint64_t val = 0ULL;
> 
> val = SETFIELD(VAS_64K_MODE_MASK, val, true);
> val = SETFIELD(VAS_ACCEPT_PASTE_MASK, val, true);
> val = SETFIELD(VAS_ENABLE_WC_MMIO_BAR, val, true);
> val = SETFIELD(VAS_ENABLE_UWC_MMIO_BAR, val, true);
> val = SETFIELD(VAS_ENABLE_RMA_MMIO_BAR, val, true);
> 
> return vas_scom_write(chip, VAS_MISC_N_CTL, val);
> }

Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-12-01 Thread Carlos Eduardo de Paula
On Tue, Dec 1, 2020 at 2:54 AM Sukadev Bhattiprolu wrote:

>
> Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> > Hi,
> >
> > On 19/11/2020 at 11:58, Will Springer wrote:
> > > I learned about the POWER9 gzip accelerator a few months ago when the
> > > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > > dictates that VAS depends on a 64k page size, which is problematic as I
> > > run Void Linux, which uses a 4k-page kernel.
> > >
> > > Some early poking by others indicated there wasn't an obvious page size
> > > dependency in the code, and suggested I try modifying the config to
> > > switch it on. I did so, but was stopped by a minor complaint of an
> > > "unexpected DT configuration" by the VAS code. I wasn't equipped to
> > > figure out exactly what this meant, even after finding the offending
> > > condition, so after writing a very drawn-out forum post asking for help,
> > > I dropped the subject.
> > >
> > > Fast forward to today, when I was reminded of the whole thing again, and
> > > decided to debug a bit further. Apparently the VAS platform device
> > > (derived from the DT node) has 5 resources on my 4k kernel, instead of 4
> > > (which evidently works for others who have had success on 64k kernels).
> > > I have no idea what this means in practice (I don't know how to
> > > introspect it), but after making a tiny patch[1], everything came up
> > > smoothly and I was doing blazing-fast gzip (de)compression in no time.
> > >
> > > Everything seems to work fine on 4k pages. So, what's up? Are there
> > > pitfalls lurking around that I've yet to stumble over? More reasonably,
> > > I'm curious as to why the feature supposedly depends on 64k pages, or if
> > > there's anything else I should be concerned about.
>
> Will,
>
> The reason I put in that config check is that we were only able to
> test with 64K pages at that point.
>
> It is interesting that it is working for you. The following code in skiboot,
> https://github.com/open-power/skiboot/blob/master/hw/vas.c, should restrict
> it to 64K pages. IIRC there is also a corresponding setting in some NX
> registers that would need to be configured to allow 4K pages.
>
>
> static int init_north_ctl(struct proc_chip *chip)
> {
> uint64_t val = 0ULL;
>
> val = SETFIELD(VAS_64K_MODE_MASK, val, true);
> val = SETFIELD(VAS_ACCEPT_PASTE_MASK, val, true);
> val = SETFIELD(VAS_ENABLE_WC_MMIO_BAR, val, true);
> val = SETFIELD(VAS_ENABLE_UWC_MMIO_BAR, val, true);
> val = SETFIELD(VAS_ENABLE_RMA_MMIO_BAR, val, true);
>
> return vas_scom_write(chip, VAS_MISC_N_CTL, val);
> }
>
> I am copying Bulent Abali and Haren Myneni who have been working with
> VAS/NX for their thoughts/experience.
>
> > >
> >
> > Maybe ask Sukadev, who did the implementation and is maintaining it?
> >
> > > I do have to say I'm quite satisfied with the results of the NX
> > > accelerator, though. Being able to shuffle data to a RaptorCS box over
> > > gigE and get compressed data back faster than most software gzip could
> > > ever hope to achieve is no small feat, let alone the instantaneous
> > > results locally. :)
> > >
> > > Cheers,
> > > Will Springer [she/her]
> > >
> > > [1]:
> > > https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch
> > >
> >
> >
> > Christophe
>

Hi all, I'd like to report that with Will's patch, I'm using NX-Gzip
perfectly on Linux 5.9.10 built with 4K pages, with no changes to the
firmware, on a Raptor Computing Blackbird workstation.

I'm using the Debian 10 distro.

Ref. https://twitter.com/carlosedp/status/1328424799216021511

Carlos


-- 

Carlos Eduardo de Paula
m...@carlosedp.com
http://carlosedp.com
https://twitter.com/carlosedp
https://www.linkedin.com/in/carlosedp/



Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-12-01 Thread Bulent Abali
I don't know anything about VAS page size requirements in the kernel.  I 
checked the user compression library and saw that we do a sysconf to get 
the page size; so the library should be immune to page size by design.
But it wouldn't surprise me if a 64KB constant is inadvertently hardcoded 
somewhere else in the library. Giving a heads-up to Tulio and Raphael, who
are the owners of the GitHub repo.

https://github.com/libnxz/power-gzip/blob/master/lib/nx_zlib.c#L922

If we got this wrong in the library, it might manifest itself as an error
message of the sort "excessive page faults". The library must touch pages
ahead of time to make them present in memory; occasional page faults are
acceptable, and it will retry.
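To illustrate the approach described above, here is a minimal sketch of querying
the page size at runtime with sysconf() and touching pages ahead of time; the
function names and the 64KB fallback are assumptions for illustration, not code
taken from the libnxz library:

#include <stddef.h>
#include <unistd.h>

/* Query the page size at runtime instead of hardcoding 64KB. */
static size_t get_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);
	return sz > 0 ? (size_t)sz : 65536;	/* fall back to 64KB */
}

/* Touch every page of a buffer so it is resident before the accelerator
 * uses it; an occasional fault afterwards is still handled by a retry. */
static void touch_pages_ahead(volatile char *buf, size_t len)
{
	size_t page = get_page_size();
	size_t i;

	for (i = 0; i < len; i += page)
		buf[i] += 0;		/* fault the page in */
	if (len)
		buf[len - 1] += 0;	/* make sure the last page is covered */
}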


Bulent




From:   "Sukadev Bhattiprolu" 
To: "Christophe Leroy" 
Cc: "Will Springer" , 
linuxppc-dev@lists.ozlabs.org, dan...@octaforge.org, Bulent 
Abali/Watson/IBM@IBM, ha...@linux.ibm.com
Date:   12/01/2020 12:53 AM
Subject:    Re: CONFIG_PPC_VAS depends on 64k pages...?




Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> Hi,
> 
> On 19/11/2020 at 11:58, Will Springer wrote:
> > I learned about the POWER9 gzip accelerator a few months ago when the
> > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > dictates that VAS depends on a 64k page size, which is problematic as I
> > run Void Linux, which uses a 4k-page kernel.
> > 
> > Some early poking by others indicated there wasn't an obvious page size
> > dependency in the code, and suggested I try modifying the config to
> > switch it on. I did so, but was stopped by a minor complaint of an
> > "unexpected DT configuration" by the VAS code. I wasn't equipped to
> > figure out exactly what this meant, even after finding the offending
> > condition, so after writing a very drawn-out forum post asking for help,
> > I dropped the subject.
> > 
> > Fast forward to today, when I was reminded of the whole thing again, and
> > decided to debug a bit further. Apparently the VAS platform device
> > (derived from the DT node) has 5 resources on my 4k kernel, instead of 4
> > (which evidently works for others who have had success on 64k kernels).
> > I have no idea what this means in practice (I don't know how to
> > introspect it), but after making a tiny patch[1], everything came up
> > smoothly and I was doing blazing-fast gzip (de)compression in no time.
> > 
> > Everything seems to work fine on 4k pages. So, what's up? Are there
> > pitfalls lurking around that I've yet to stumble over? More reasonably,
> > I'm curious as to why the feature supposedly depends on 64k pages, or if
> > there's anything else I should be concerned about.

Will,

The reason I put in that config check is that we were only able to
test with 64K pages at that point.

It is interesting that it is working for you. The following code in skiboot,
https://github.com/open-power/skiboot/blob/master/hw/vas.c, should restrict
it to 64K pages. IIRC there is also a corresponding setting in some NX
registers that would need to be configured to allow 4K pages.


static int init_north_ctl(struct proc_chip *chip)
{
uint64_t val = 0ULL;

val = SETFIELD(VAS_64K_MODE_MASK, val, true);
val = SETFIELD(VAS_ACCEPT_PASTE_MASK, val, true);
val = SETFIELD(VAS_ENABLE_WC_MMIO_BAR, val, true);
val = SETFIELD(VAS_ENABLE_UWC_MMIO_BAR, val, true);
val = SETFIELD(VAS_ENABLE_RMA_MMIO_BAR, val, true);

return vas_scom_write(chip, VAS_MISC_N_CTL, val);
}

I am copying Bulent Abali and Haren Myneni who have been working with
VAS/NX for their thoughts/experience.

> > 
> 
> Maybe ask Sukadev, who did the implementation and is maintaining it?
> 
> > I do have to say I'm quite satisfied with the results of the NX
> > accelerator, though. Being able to shuffle data to a RaptorCS box over
> > gigE and get compressed data back faster than most software gzip could
> > ever hope to achieve is no small feat, let alone the instantaneous
> > results locally. :)
> > 
> > Cheers,
> > Will Springer [she/her]
> > 
> > [1]: https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch
> > 
> 
> 
> Christophe






Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-11-30 Thread Sukadev Bhattiprolu


Christophe Leroy [christophe.le...@csgroup.eu] wrote:
> Hi,
> 
> On 19/11/2020 at 11:58, Will Springer wrote:
> > I learned about the POWER9 gzip accelerator a few months ago when the
> > support hit upstream Linux 5.8. However, for some reason the Kconfig
> > dictates that VAS depends on a 64k page size, which is problematic as I
> > run Void Linux, which uses a 4k-page kernel.
> > 
> > Some early poking by others indicated there wasn't an obvious page size
> > dependency in the code, and suggested I try modifying the config to switch
> > it on. I did so, but was stopped by a minor complaint of an "unexpected DT
> > configuration" by the VAS code. I wasn't equipped to figure out exactly what
> > this meant, even after finding the offending condition, so after writing a
> > very drawn-out forum post asking for help, I dropped the subject.
> > 
> > Fast forward to today, when I was reminded of the whole thing again, and
> > decided to debug a bit further. Apparently the VAS platform device
> > (derived from the DT node) has 5 resources on my 4k kernel, instead of 4
> > (which evidently works for others who have had success on 64k kernels). I
> > have no idea what this means in practice (I don't know how to introspect
> > it), but after making a tiny patch[1], everything came up smoothly and I
> > was doing blazing-fast gzip (de)compression in no time.
> > 
> > Everything seems to work fine on 4k pages. So, what's up? Are there
> > pitfalls lurking around that I've yet to stumble over? More reasonably,
> > I'm curious as to why the feature supposedly depends on 64k pages, or if
> > there's anything else I should be concerned about.

Will,

The reason I put in that config check is that we were only able to
test with 64K pages at that point.

It is interesting that it is working for you. The following code in skiboot,
https://github.com/open-power/skiboot/blob/master/hw/vas.c, should restrict
it to 64K pages. IIRC there is also a corresponding setting in some NX
registers that would need to be configured to allow 4K pages.


static int init_north_ctl(struct proc_chip *chip)
{
uint64_t val = 0ULL;

val = SETFIELD(VAS_64K_MODE_MASK, val, true);
val = SETFIELD(VAS_ACCEPT_PASTE_MASK, val, true);
val = SETFIELD(VAS_ENABLE_WC_MMIO_BAR, val, true);
val = SETFIELD(VAS_ENABLE_UWC_MMIO_BAR, val, true);
val = SETFIELD(VAS_ENABLE_RMA_MMIO_BAR, val, true);

return vas_scom_write(chip, VAS_MISC_N_CTL, val);
}

I am copying Bulent Abali and Haren Myneni who have been working with
VAS/NX for their thoughts/experience.

> > 
> 
> Maybe ask Sukadev, who did the implementation and is maintaining it?
> 
> > I do have to say I'm quite satisfied with the results of the NX
> > accelerator, though. Being able to shuffle data to a RaptorCS box over gigE
> > and get compressed data back faster than most software gzip could ever
> > hope to achieve is no small feat, let alone the instantaneous results 
> > locally.
> > :)
> > 
> > Cheers,
> > Will Springer [she/her]
> > 
> > [1]: 
> > https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch
> > 
> 
> 
> Christophe


Re: CONFIG_PPC_VAS depends on 64k pages...?

2020-11-19 Thread Christophe Leroy

Hi,

On 19/11/2020 at 11:58, Will Springer wrote:

I learned about the POWER9 gzip accelerator a few months ago when the
support hit upstream Linux 5.8. However, for some reason the Kconfig
dictates that VAS depends on a 64k page size, which is problematic as I
run Void Linux, which uses a 4k-page kernel.

Some early poking by others indicated there wasn't an obvious page size
dependency in the code, and suggested I try modifying the config to switch
it on. I did so, but was stopped by a minor complaint of an "unexpected DT
configuration" by the VAS code. I wasn't equipped to figure out exactly what
this meant, even after finding the offending condition, so after writing a
very drawn-out forum post asking for help, I dropped the subject.

Fast forward to today, when I was reminded of the whole thing again, and
decided to debug a bit further. Apparently the VAS platform device
(derived from the DT node) has 5 resources on my 4k kernel, instead of 4
(which evidently works for others who have had success on 64k kernels). I
have no idea what this means in practice (I don't know how to introspect
it), but after making a tiny patch[1], everything came up smoothly and I
was doing blazing-fast gzip (de)compression in no time.

Everything seems to work fine on 4k pages. So, what's up? Are there
pitfalls lurking around that I've yet to stumble over? More reasonably,
I'm curious as to why the feature supposedly depends on 64k pages, or if
there's anything else I should be concerned about.



Maybe ask Sukadev, who did the implementation and is maintaining it?


I do have to say I'm quite satisfied with the results of the NX
accelerator, though. Being able to shuffle data to a RaptorCS box over gigE
and get compressed data back faster than most software gzip could ever
hope to achieve is no small feat, let alone the instantaneous results locally.
:)

Cheers,
Will Springer [she/her]

[1]: 
https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch




Christophe


CONFIG_PPC_VAS depends on 64k pages...?

2020-11-19 Thread Will Springer
I learned about the POWER9 gzip accelerator a few months ago when the 
support hit upstream Linux 5.8. However, for some reason the Kconfig 
dictates that VAS depends on a 64k page size, which is problematic as I 
run Void Linux, which uses a 4k-page kernel.

Some early poking by others indicated there wasn't an obvious page size 
dependency in the code, and suggested I try modifying the config to switch 
it on. I did so, but was stopped by a minor complaint of an "unexpected DT 
configuration" by the VAS code. I wasn't equipped to figure out exactly what 
this meant, even after finding the offending condition, so after writing a 
very drawn-out forum post asking for help, I dropped the subject.

Fast forward to today, when I was reminded of the whole thing again, and 
decided to debug a bit further. Apparently the VAS platform device 
(derived from the DT node) has 5 resources on my 4k kernel, instead of 4 
(which evidently works for others who have had success on 64k kernels). I 
have no idea what this means in practice (I don't know how to introspect 
it), but after making a tiny patch[1], everything came up smoothly and I 
was doing blazing-fast gzip (de)compression in no time.
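
As an aside, here is a hypothetical sketch of the sort of device-tree sanity
check that can produce the "unexpected DT configuration" error, and that a
patch like [1] would relax so a 5-resource node is still accepted; the function
name, message text, and expected resource count are assumptions for
illustration, not the actual powernv VAS code:

#include <linux/errno.h>
#include <linux/platform_device.h>
#include <linux/printk.h>

/* Hypothetical: the tested 64k configurations exposed 4 resources, so a
 * strict "!= 4" check would reject the 5-resource layout seen on a 4k
 * kernel. Relaxing it to a minimum count accepts both layouts. */
static int vas_check_dt_resources(struct platform_device *pdev)
{
	if (pdev->num_resources < 4) {
		pr_err("VAS: unexpected DT configuration (%u resources)\n",
		       pdev->num_resources);
		return -ENODEV;
	}
	return 0;
}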

Everything seems to work fine on 4k pages. So, what's up? Are there 
pitfalls lurking around that I've yet to stumble over? More reasonably, 
I'm curious as to why the feature supposedly depends on 64k pages, or if 
there's anything else I should be concerned about.

I do have to say I'm quite satisfied with the results of the NX 
accelerator, though. Being able to shuffle data to a RaptorCS box over gigE 
and get compressed data back faster than most software gzip could ever
hope to achieve is no small feat, let alone the instantaneous results locally.
:)

Cheers,
Will Springer [she/her]

[1]: 
https://github.com/Skirmisher/void-packages/blob/vas-4k-pages/srcpkgs/linux5.9/patches/ppc-vas-on-4k.patch