Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies

2008-02-12 Thread Gabriel Paubert
On Tue, Feb 12, 2008 at 12:49:05PM +0100, Gabriel Paubert wrote:
> On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote:
> > 
> > On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote:
> > > - couple of fixes and preparatory patches
> > > 
> > > - rework of PowerMac media-bay support ([un]register IDE devices instead 
> > > of
> > >   [un]registering IDE interface) [ it is the main reason for spamming PPC 
> > > ML ]
> > 
> > Interesting... I was thinking about doing a full remove of the device at
> > a higher level instead but I suppose what you propose is easier.
> 
> Well, I have serious problem on a Pegasos which appeared some time
> between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of 
> 
> hdb: empty DMA table?
> 
> I'm trying to bisect it right now.

Argh, the first bisect point ended up with timeouts on hdb...

Flagged as bad, to try to see when the problems started, but 
I suspect that there are several.

Gabriel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies

2008-02-12 Thread Gabriel Paubert
On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote:
> 
> On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote:
> > - couple of fixes and preparatory patches
> > 
> > - rework of PowerMac media-bay support ([un]register IDE devices instead of
> >   [un]registering IDE interface) [ it is the main reason for spamming PPC 
> > ML ]
> 
> Interesting... I was thinking about doing a full remove of the device at
> a higher level instead but I suppose what you propose is easier.

Well, I have serious problem on a Pegasos which appeared some time
between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of 

hdb: empty DMA table?

I'm trying to bisect it right now.

Gabriel


> 
> I'll have a look & test next week hopefully.
> 
> Ben.
> 
> 
> ___
> Linuxppc-dev mailing list
> [EMAIL PROTECTED]
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies

2008-02-13 Thread Gabriel Paubert
On Tue, Feb 12, 2008 at 01:30:03PM +0100, Gabriel Paubert wrote:
> On Tue, Feb 12, 2008 at 12:49:05PM +0100, Gabriel Paubert wrote:
> > On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote:
> > > 
> > > On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote:
> > > > - couple of fixes and preparatory patches
> > > > 
> > > > - rework of PowerMac media-bay support ([un]register IDE devices 
> > > > instead of
> > > >   [un]registering IDE interface) [ it is the main reason for spamming 
> > > > PPC ML ]
> > > 
> > > Interesting... I was thinking about doing a full remove of the device at
> > > a higher level instead but I suppose what you propose is easier.
> > 
> > Well, I have serious problem on a Pegasos which appeared some time
> > between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of 
> > 
> > hdb: empty DMA table?
> > 
> > I'm trying to bisect it right now.
> 
> Argh, the first bisect point ended up with timeouts on hdb...
> 
> Flagged as bad, to try to see when the problems started, but 
> I suspect that there are several.

Well, it's worse than this. After 7 or 8 bisections (all flagged
bad) the symptoms changed: the system boots up to the point where
it does not recognize LVM volumes. So there are at least 3 bugs:
1) the timeout on interrupts (only seen on hdb)
2) the empty DMA table messages (seen on hda and hdb)
3) the inability to see logical volumes ("pvs" does not find them,
  they are back when rebooting into 2.6.24). This is the apparently
  the earliest introduced bug (no problems with the disks with that
  version).

I am away from that machine until next Tuesday. IIRC the last git
bisect was something like 160 revisions left.

Gabriel

> 
>   Gabriel
> ___
> Linuxppc-dev mailing list
> [EMAIL PROTECTED]
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel Real Time Clock (RTC) Support for I2C Devices

2001-04-18 Thread Gabriel Paubert



On Wed, 18 Apr 2001, Grant Erickson wrote:

> >From the looks of drivers/char/rtc.c it would appear that this kernel
> driver only supports bus-attached RTCs such as the mentioned MC146818. Is
> this correct?

I think so. 

> 
> What is the correct access method / kernel tie-in for supporting such an
> I2C-based RTC device using the "standard" interfaces?

Adding a new kind of clock to the kernel and setting the correct pointers
in ppc_md seems the right (if not necessarily simple) solution to your
problem.  

I wonder how you calibrate the decrementer frequency on this machine, I2C
is too slow to get any precision, unless you accept to wait for one minute
or so at boot. I still have problems of reproducibility of clock frequency
measurements with bus attached RTC (on machines on which the RTC is the
only moderately precise timing source and its interrupt line is
unfortunately not connected).

> 
> My hope is to use 'hwclock' from util-linux w/o modification. Is this
> reasonable?

No.

Regards,
Gabriel.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: high-res-timers start code.

2001-04-24 Thread Gabriel Paubert



On Mon, 23 Apr 2001, george anzinger wrote:

> "Robert H. de Vries" wrote:
> > 
> > On Monday 23 April 2001 19:45, you wrote:
> > 
> > > By the way, is the user land stuff the same for all "arch"s?
> > 
> > Not if you plan to handle the CPU cycle counter in user space. That is at
> > least what I would propose.
> 
> Just got interesting, lets let the world look in.
> 
> What did you have in mind here?  I suspect that on some archs the cycle
> counter is not available to user code.  I know that on parisc it is
> optionally available (kernel can set a bit to make it available), but by
> it self it is only good for intervals.  You need to peg some value to a
> CLOCK to use it to get timeofday, for instance.

On Intel there is a also bit to disable unprivileged RDTSC, IIRC. On PPC
the timebase is always available (but the old 601 needs spacial casing: it
uses different registers and does not count in binary :-().

> On the other hand, if there is an area of memory that both users and
> system can read but only system can write, one might put the soft clock
> there.  This would allow gettimeofday (with the cycle counter) to work
> without a system call.  To the best of my knowledge the system does not
> have such an area as yet.
> 
> comments?

Well, there may be work in this area, since x86-64 will not enter kernel
mode for gettimeofday() if I understand correctly what Andrea said. Linus
hinted once at exporting (kernel) code to user space.

Some data also will also need to be accessible but as long as you don't
guarantee compatibility on data layout, only AFAIU on interface for these
calls (it was not clear to me if it would be a fixed address forever or
dynamic linking with kernel exported symbols), it's not a problem.

Of course it will SIGSEGV instead of returning -EFAULT but this is a good
thing IMHO, nobody checks for -EFAULT from gettimeofday(). I think
that system calls should rather force SIGSEGV than return -EFAULT anyway,
to make syscalls indistinguishable from pure library calls. 

Regards,
Gabriel.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: missing mxcsr initialization

2000-10-20 Thread Gabriel Paubert



On Fri, 20 Oct 2000, Andrea Arcangeli wrote:

> Many thanks to Doug and Gabriel for very useful explanations about this FPU
> stuff. I suggest Gabriel to submit his way faster and more correct tag word
> conversion function to Linus for 2.4.x.

Here it a first shot, twd_i387_to_fxsr is guaranteed to not add any
new bug (since there are only 65536 cases, I wrote a short test program
to check that the results are the same). 

For the other way around, I only corrected the most glaring error: that
the sign never affects the tag. Some obscure corner cases may still be
wrong, but at least now negative zero and positive infinity and NaNs are
handled correctly.  Performance could also be improved by handling
separately the common case of an empty FP stack.

Now the question: is it worth bloating the code to exactly return the tag
values of a 387 (I actually find that the Intel doc is confusing, what 
is the state of the tag word after an fnsave when MMX is in use, 0x
or 0x, not counting the bugs in this area) ? 

My feeling is that the only important information in the tag field is
empty/not empty. Does any code (debuggers, etc...) actually rely on the
tags for anything else ?

Patch relative to 2.4.0-test9.

Gabriel

= arch/i386/kernel/i387.c 1.1 vs edited =
--- 1.1/arch/i386/kernel/i387.c Tue Jun 27 20:14:20 2000
+++ edited/arch/i386/kernel/i387.c  Fri Oct 20 17:56:49 2000
@@ -79,16 +79,16 @@
 
 static inline unsigned short twd_i387_to_fxsr( unsigned short twd )
 {
-   unsigned short ret = 0;
-   int i;
-
-   for ( i = 0 ; i < 8 ; i++ ) {
-   if ( (twd & 0x3) != 0x3 ) {
-   ret |= (1 << i);
-   }
-   twd = twd >> 2;
-   }
-   return ret;
+   unsigned int tmp; /* to avoid 16 bit prefixes in the code */
+ 
+   /* Transform each pair of bits into 01 (valid) or 00 (empty) */
+tmp = ~twd;
+tmp = (tmp | (tmp>>1)) & 0x; /* 0V0V0V0V0V0V0V0V */
+/* and move the valid bits to the lower byte. */
+tmp = (tmp | (tmp >> 1)) & 0x; /* 00VV00VV00VV00VV */
+tmp = (tmp | (tmp >> 2)) & 0x0f0f; /*  */
+tmp = (tmp | (tmp >> 4)) & 0x00ff; /*  */
+return tmp;
 }
 
 static inline unsigned long twd_fxsr_to_i387( struct i387_fxsave_struct *fxsave )
@@ -105,8 +105,8 @@
if ( twd & 0x1 ) {
st = (struct _fpxreg *) FPREG_ADDR( fxsave, i );
 
-   switch ( st->exponent ) {
-   case 0x:
+   switch ( st->exponent & 0x7fff ) {
+   case 0x7fff:
tag = 2;/* Special */
break;
case 0x:


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: [PATCH] quiet non-x86 option ROM warnings

2005-02-18 Thread Gabriel Paubert
On Thu, Feb 17, 2005 at 05:56:03PM -0500, Jon Smirl wrote:
> On Fri, 18 Feb 2005 09:45:50 +1100, Benjamin Herrenschmidt
> <[EMAIL PROTECTED]> wrote:
> 
> > Can't the size be obtained like any other BAR ?
> 
> yes, but cards that don't fully decode their ROM address space can
> waste memory in copy_rom. For example I have a card around here that
> reports a BAR address space of 128K and has a 2K ROM in it. You only
> want to copy the 2K, not the 128K.

Indeed, but they normally repeat by powers of 2, ignoring
high order address bits. Is it that hard to detect?

For example if it declares 128k, compare the two halves, reduce
to 64k if equal. Lather, rinse, repeat.

It's equivalent to reading the BAR declared size twice in 
the worst case, so it's not that bad performance-wise.

That would only be in the case of an unknown signature
in the first bytes, otherwise the third byte gives you
the size IIUC.

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.11-rc5

2005-02-24 Thread Gabriel Paubert
On Thu, Feb 24, 2005 at 04:57:56PM +0200, M.Baris Demiray wrote:
> 
> Steven Rostedt wrote:
> 
> >[...]
> >
> >Anyone know if the linux-2.6.11-rc5.tar.bz2 has all the changes in it?
> >The tarball rc5 is smaller than rc4. Was there a lot taken out?
> 
> Take a look at the diffview of 2.6.11-rc4-rc5 incremental patch.
> 
> 
> 5599 files changed, 166396 insertions(+), 293627 deletions(-)
> 

Wow! Over 100 thousand lines removed! 
If true, color me impressed, and kudos 
to the developers. 

Small is beautiful, I really like this :-)

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] nfs: Fix mismatch between encode_dent_fn and filldir_t

2007-01-17 Thread Gabriel Paubert
The 5th parameter of filldir_t function type used by vfs_readdir
was changed from ino_t to u64 in October. Unfortunately the patch 
missed some files in fs/nfsd where functions pointers of type 
encode_dent_fn are passed around and finally cast to filldir_t.

The effect is only visible when an NFS server is run on a 32 bit
big-endian machine (it would have been visible on all 32 bit
architectures if the 6th parameter had been used). The results
are interesting: all files have an inode of 0 (unique you say?) 
from getdents(2) and even ls(1) does not find any files.

Signed-off-by: Gabriel Paubert <[EMAIL PROTECTED]>

--

P.S.: this shows that function pointer casts are evil, but I did not 
find a simpler way to fix this.

diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 277df40..20c5f4a 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -991,14 +991,14 @@ encode_entry(struct readdir_cd *ccd, const char *name,
 
 int
 nfs3svc_encode_entry(struct readdir_cd *cd, const char *name,
-int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
return encode_entry(cd, name, namlen, offset, ino, d_type, 0);
 }
 
 int
 nfs3svc_encode_entry_plus(struct readdir_cd *cd, const char *name,
- int namlen, loff_t offset, ino_t ino, unsigned int 
d_type)
+ int namlen, loff_t offset, u64 ino, unsigned int 
d_type)
 {
return encode_entry(cd, name, namlen, offset, ino, d_type, 1);
 }
diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
index f5243f9..244406e 100644
--- a/fs/nfsd/nfsxdr.c
+++ b/fs/nfsd/nfsxdr.c
@@ -463,7 +463,7 @@ nfssvc_encode_statfsres(struct svc_rqst *rqstp, __be32 *p,
 
 int
 nfssvc_encode_entry(struct readdir_cd *ccd, const char *name,
-   int namlen, loff_t offset, ino_t ino, unsigned int d_type)
+   int namlen, loff_t offset, u64 ino, unsigned int d_type)
 {
struct nfsd_readdirres *cd = container_of(ccd, struct nfsd_readdirres, 
common);
__be32  *p = cd->buffer;
diff --git a/include/linux/nfsd/nfsd.h b/include/linux/nfsd/nfsd.h
index 0727774..3ba5141 100644
--- a/include/linux/nfsd/nfsd.h
+++ b/include/linux/nfsd/nfsd.h
@@ -53,7 +53,7 @@ struct readdir_cd {
__be32  err;/* 0, nfserr, or nfserr_eof */
 };
 typedef int(*encode_dent_fn)(struct readdir_cd *, const char *,
-   int, loff_t, ino_t, unsigned 
int);
+   int, loff_t, u64, unsigned int);
 typedef int (*nfsd_dirop_t)(struct inode *, struct dentry *, int, int);
 
 extern struct svc_program  nfsd_program;
diff --git a/include/linux/nfsd/xdr.h b/include/linux/nfsd/xdr.h
index 877192d..328520c 100644
--- a/include/linux/nfsd/xdr.h
+++ b/include/linux/nfsd/xdr.h
@@ -166,7 +166,7 @@ int nfssvc_encode_statfsres(struct svc_rqst *, __be32 *, 
struct nfsd_statfsres *
 int nfssvc_encode_readdirres(struct svc_rqst *, __be32 *, struct 
nfsd_readdirres *);
 
 int nfssvc_encode_entry(struct readdir_cd *, const char *name,
-   int namlen, loff_t offset, ino_t ino, unsigned 
int);
+   int namlen, loff_t offset, u64 ino, unsigned 
int);
 
 int nfssvc_release_fhandle(struct svc_rqst *, __be32 *, struct nfsd_fhandle *);
 
diff --git a/include/linux/nfsd/xdr3.h b/include/linux/nfsd/xdr3.h
index 7996386..bc8b171 100644
--- a/include/linux/nfsd/xdr3.h
+++ b/include/linux/nfsd/xdr3.h
@@ -332,10 +332,10 @@ int nfs3svc_release_fhandle(struct svc_rqst *, __be32 *,
 int nfs3svc_release_fhandle2(struct svc_rqst *, __be32 *,
struct nfsd3_fhandle_pair *);
 int nfs3svc_encode_entry(struct readdir_cd *, const char *name,
-   int namlen, loff_t offset, ino_t ino,
+   int namlen, loff_t offset, u64 ino,
unsigned int);
 int nfs3svc_encode_entry_plus(struct readdir_cd *, const char *name,
-   int namlen, loff_t offset, ino_t ino,
+   int namlen, loff_t offset, u64 ino,
unsigned int);
 /* Helper functions for NFSv3 ACL code */
 __be32 *nfs3svc_encode_post_op_attr(struct svc_rqst *rqstp, __be32 *p,
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/10] local_t : powerpc

2007-01-24 Thread Gabriel Paubert
On Wed, Jan 24, 2007 at 08:08:12PM +1100, Paul Mackerras wrote:
> Mathieu Desnoyers writes:
> 
> > +static __inline__ int local_dec_if_positive(local_t *l)
> > +{
> > +   int t;
> > +
> > +   __asm__ __volatile__(
> > +"1:lwarx   %0,0,%1 # local_dec_if_positive\n\
> > +   addic.  %0,%0,-1\n\
> > +   blt-2f\n"
> > +   PPC405_ERR77(0,%1)
> > +"  stwcx.  %0,0,%1\n\
> > +   bne-1b"
> 
> This has the same bugs that we fixed recently in atomic_dec_if_positive;
> first, on 64-bit machines, the lwarx will zero-extend the word loaded
> from memory, and so the result of the addic will be negative only if
> the word was originally 0.  Secondly, even on 32-bit machines,
> 0x8000 will be considered positive since decrementing it gives
> 0x7fff, which is positive.
> 
> > +/* Use these for per-cpu local_t variables: on some archs they are
> > + * much more efficient than these naive implementations.  Note they take
> > + * a variable, not an address.
> > + *
> > + * This could be done better if we moved the per cpu data directly
> > + * after GS.
> > + */
> 
> What's "GS"?  Does this comment really apply on powerpc?
> 

1) It's an (application visible) i386/x86_64 segment register used
   to make memory addressing more confusing.
2) Because of 1), obviously not ;-)

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2.6.21-rt2] PowerPC: decrementer clockevent driver

2007-07-10 Thread Gabriel Paubert
On Fri, May 18, 2007 at 05:52:45PM +0100, Matt Sealey wrote:
> Kumar Gala wrote:
> > 
> > On May 18, 2007, at 9:48 AM, Thomas Gleixner wrote:
> > 
> >> On Fri, 2007-05-18 at 15:28 +0100, Matt Sealey wrote:
> >>>
> >>> I think both the MPC52xx GPT0-7 and the SLT0-1 fulfil this fairly
> >>> easily.
> >>
> >> There is some basic work for MPC5200 available:
> >>
> >> http://www.pengutronix.de/oselas/bsp/phytec/index_en.html#phyCORE-MPC5200B-tiny
> >>
> > 
> > I asked this earlier, but figured you might have a better insight.  Is
> > their value in having 'drivers' for more than one clock source?  I'd say
> > most (of not all) the PPC SoCs have timers on the system side that we
> > could provide drivers for, I'm just not sure if that does anything for
> > anyone.
> 
> As I asked after, I'm also very intrigued as to what is going to end
> up using these timers, but likewise, not much use writing a driver if
> everyone can use the extremely high resolution decrementer all at
> once..
> 
> As I said before too, at least Intel has decided there is a great need
> for up to 256 high resolution timer sources on a system, but since this
> is a fairly new concept to Linux (and hrtimers and dynticks too) it
> only seems to be used in the case of i8254/RTC emulation, mostly on
> x86-64.
> 
> I'm looking at it now and finding "users" of hrtimers is looking very
> thin on the ground. Maybe it's justified on the basis that more is
> better, and having support is preferable to not having it (even if
> nobody really uses it) but it seems the entire gamut of timing
> possibility in Linux can be handled through a simple, and single,
> high resolution timer and a queue of events..
> 
> So do we need some more? :D
> 

I don't think so. I really wonder what other capabilities the
other clock sources bring, except more convoluted and complex
code in the end.

There may be one exception to it: scheduled wakeup with 
very deep sleep in which you can't even afford the power
dissipated by the processor in the modes in with the
timebase/decrementer keeps running (which is fairly low
to start with). But this would be a special wakeup path 
where timebase and decrementer are reinitialized and 
become again the main timer source.

Otherwise the decrementer can provide essentially arbitrary 
high resolution (i.e., comparable to the time of an external 
bus access) on a fairly large number of independent timers. 
This scales with the number of processors since there is one 
decrementer per core (is it per thread on SMT chips? 
I don't know).


A few months ago, I implemented here for out internal needs
the equivalent of HRT (I called them FGT for fine-grained timers)
for our VME machines still running a 2.2 kernel. We have boards
who should have generated regular interrupts at programmable
rate interpreting GPS timestamp signals (IRIG B). Unfortunately
the signals were sometimes corrupt and the interrupts were no
more regularly spaced. Now the IRIGB signal is only used to feed
NTP which acts as a low pass filter with an extremely small
cut-off frequency (1mHz or less) and the regular interrupts
are generated from the decrementer in parallel with the standard
ticks. The code is not really polished and would not scale to
a large number of timers but it took me about one day and a half 
to write and debug it. I can even monitor it from a /proc file:

[EMAIL PROTECTED] /root]# cat /proc/fgtimer
 Calls   Lost Name
  60865181158 System timer tick
  77889762  0 itFast (128Hz)

There are a few lost ticks for the main system timer because 
some operations mask interrupts for too long during early
boot, especially with root on NFS which is the case here,
they never happen after the root file system is mounted.

This is also a somewhat special 2.2 kernel, it essentially has
the timekeeping code I submitted for 2.4 and which is
still the core of PPC timekeeping.


Regards,
Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Section mismatch warnings (was Re: [PATCH] early_pfn_to_nid needs to be __meminit)

2007-05-09 Thread Gabriel Paubert
On Thu, May 10, 2007 at 02:25:52AM +1000, Stephen Rothwell wrote:
> since it is referenced by memmap_init_zone (which is __meminit) via the
> early_pfn_in_nid macro when CONFIG_NODES_SPAN_OTHER_NODES is set (which
> basically means PowerPC 64).
> 
> This removes a section mismatch warning in those circumstances.

Speaking of this, I just tried to compile an official (Linus' git tree) 
kernel for my old PMac G4 and I get a lot of section mismatch warnings at 
the end of the compilation:

  LD  vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map
  MODPOST vmlinux
WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to 
.init.data:.got2 from dt_string_start (offset 0x8)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to 
.init.data:.got2 from dt_string_end (offset 0xc)
...
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to 
.init.data:.got2 from dt_struct_end (offset 0x20c)
WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to 
.init.data:.got2 from disp_BAT (offset 0x234)
WARNING: "primary_pteg_full" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "next_slot" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: "htab_hash_searches" [arch/powerpc/mm/built-in] is COMMON symbol
WARNING: arch/powerpc/mm/built-in.o - Section mismatch: reference to 
.init.text:early_get_page from .text between 'pte_alloc_one_kernel' (at offset 
0xf50) and 'v_mapped_by_bats'
WARNING: arch/powerpc/platforms/built-in.o - Section mismatch: reference to 
.init.data:.got2 from  (offset 0x0)
...
WARNING: arch/powerpc/platforms/built-in.o - Section mismatch: reference to 
.init.text:pmac_pcibios_after_init from .machine.desc after 'mach_powermac' (at 
offset 0xc4)
WARNING: mm/built-in.o - Section mismatch: reference to 
.init.text:set_up_list3s from .text between 'kmem_cache_create' (at
offset 0x1de88) and 'kmem_cache_shrink'
WARNING: mm/built-in.o - Section mismatch: reference to 
.init.text:set_up_list3s from .text between 'kmem_cache_create' (at
offset 0x1dec0) and 'kmem_cache_shrink'

I find these 50 or so warnings so scary that I've not yet tried 
to boot the kernel. Note that this is a non-modular kernel.

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/5] firmware: annotate thou shalt not request fw on init or probe

2016-08-24 Thread Gabriel Paubert
On Tue, Aug 23, 2016 at 05:45:04PM -0700, mcg...@kernel.org wrote:

[snip]
> ---
>  Documentation/firmware_class/README|  20 
>  drivers/base/Kconfig   |   2 +-
>  .../request_firmware-avoid-init-probe-init.cocci   | 130 
> +
>  3 files changed, 151 insertions(+), 1 deletion(-)
>  create mode 100644 
> scripts/coccinelle/api/request_firmware-avoid-init-probe-init.cocci
> 
> diff --git a/Documentation/firmware_class/README 
> b/Documentation/firmware_class/README
> index cafdca8b3b15..056d1cb9d365 100644
> --- a/Documentation/firmware_class/README
> +++ b/Documentation/firmware_class/README
> @@ -93,6 +93,26 @@
> user contexts to request firmware asynchronously, but can't be called
> in atomic contexts.
>  
> +Requirements:
> +=
> +
> +You should avoid at all costs requesting firmware on both init and probe 
> paths
> +of your device driver. Reason for this is the complexity needed to ensure a
> +firmware will be available for a driver early in boot through different
> +build configurations. Consider built-in drivers needing firmware early, or
> +consider a driver assuming it will only get firmware after pivot_root().
> +
> +Drivers that really need firmware early should use stuff the firmware in

Minor grammatical nit: s/use//

> +initramfs or consider using CONFIG_EXTRA_FIRMWARE. Using initramfs is much
> +more portable to more distributions as not all distributions wish to enable
> +CONFIG_EXTRA_FIRMWARE. Should a driver require the firmware being built-in
> +it should depend on CONFIG_EXTRA_FIRMWARE. There is no current annotation for
> +requiring a firmware on initramfs.
> +
> +If you're a maintainer you can help police this with:
> +
> +$ export 
> COCCI=scripts/coccinelle/api/request_firmware-avoid-init-probe-init.cocci
> +$ make coccicheck MODE=report
>  
>   about in-kernel persistence:
>   ---


Re: [PATCH][V2] crypto/nx: fix spelling mistake: "availavle" -> "available"

2017-11-14 Thread Gabriel Paubert
On Tue, Nov 14, 2017 at 02:32:17PM +, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in pr_err error message text. Also
> fix spelling mistake in proceeding comment.

s/proceeding/preceding/ ?

> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/crypto/nx/nx-842-powernv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/crypto/nx/nx-842-powernv.c 
> b/drivers/crypto/nx/nx-842-powernv.c
> index f2246a5abcf6..1e87637c412d 100644
> --- a/drivers/crypto/nx/nx-842-powernv.c
> +++ b/drivers/crypto/nx/nx-842-powernv.c
> @@ -743,8 +743,8 @@ static int nx842_open_percpu_txwins(void)
>   }
>  
>   if (!per_cpu(cpu_txwin, i)) {
> - /* shoudn't happen, Each chip will have NX engine */
> - pr_err("NX engine is not availavle for CPU %d\n", i);
> + /* shouldn't happen, Each chip will have NX engine */
> + pr_err("NX engine is not available for CPU %d\n", i);
>   return -EINVAL;
>   }
>   }
> -- 
> 2.14.1


Re: [PATCH v2] powerpc/mpc5xxx: Avoid dereferencing potentially freed memory

2015-10-16 Thread Gabriel Paubert
On Fri, Oct 16, 2015 at 08:20:13AM +0200, Christophe JAILLET wrote:
> Le 15/10/2015 08:36, Michael Ellerman a écrit :
> >On Thu, 2015-10-15 at 07:56 +0200, Christophe JAILLET wrote:
> >>Use 'of_property_read_u32()' instead of 'of_get_property()'+pointer
> >>dereference in order to avoid access to potentially freed memory.
> >>
> >>Use 'of_get_next_parent()' to simplify the while() loop and avoid the
> >>need of a temp variable.
> >>
> >>Signed-off-by: Christophe JAILLET 
> >>---
> >>v2: Use of_property_read_u32 instead of of_get_property+pointer dereference
> >>*** Untested ***
> >Thanks.
> >
> >Can someone with an mpc5xxx test this?
> >
> >cheers
> >
> 
> Hi,
> I don't think it is an issue, but while looking at another similar
> patch, I noticed that the proposed patch adds a call to
> be32_to_cpup() (within of_property_read_u32).
> Apparently, powerPC is a BE architecture, so this call should be a no-op.

Sadly no more. 32 bit is BE only, but 64 bit can be either BEtter or
LEsser.

Gabriel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] macintosh: Add module license to ans-lcd

2018-01-29 Thread Gabriel Paubert
On Mon, Jan 29, 2018 at 01:33:08PM -0600, Larry Finger wrote:
> In kernel 4.15, the modprobe step on my PowerBook G5 started complaining that

PowerBook G5? Really, could you send a pic! :-)

> there was no module license for ans-lcd.
> 
> Signed-off-by: Larry Finger 
> ---
>  drivers/macintosh/ans-lcd.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/macintosh/ans-lcd.c b/drivers/macintosh/ans-lcd.c
> index 1de81d922d8a..c8e078b911c7 100644
> --- a/drivers/macintosh/ans-lcd.c
> +++ b/drivers/macintosh/ans-lcd.c
> @@ -201,3 +201,4 @@ anslcd_exit(void)
>  
>  module_init(anslcd_init);
>  module_exit(anslcd_exit);
> +MODULE_LICENSE("GPL v2");
> -- 
> 2.16.1
> 


Re: [PATCH] powerpc/traps: Declare unrecoverable_exception() as __noreturn

2021-02-11 Thread Gabriel Paubert
On Thu, Feb 11, 2021 at 06:34:43AM +, Christophe Leroy wrote:
> unrecoverable_exception() is never expected to return, most callers
> have an infiniteloop in case it returns.
> 
> Ensure it really never returns by terminating it with a BUG(), and
> declare it __no_return.
> 
> It always GCC to really simplify functions calling it. In the exemple below,

s/always/allows ?

(Otherwise I can't parse it.)

> it avoids the stack frame in the likely fast path and avoids code duplication
> for the exit.

Indeed, nice code generation improvement.

> 
> With this patch:
> 
>   0348 :
>348:   81 43 00 84 lwz r10,132(r3)
>34c:   71 48 00 02 andi.   r8,r10,2
>350:   41 82 00 2c beq 37c 
>354:   71 4a 40 00 andi.   r10,r10,16384
>358:   40 82 00 20 bne 378 
>35c:   80 62 00 70 lwz r3,112(r2)
>360:   74 63 00 01 andis.  r3,r3,1
>364:   40 82 00 28 bne 38c 
>368:   7d 40 00 a6 mfmsr   r10
>36c:   7c 11 13 a6 mtspr   81,r0
>370:   7c 12 13 a6 mtspr   82,r0
>374:   4e 80 00 20 blr
>378:   48 00 00 00 b   378 

Infinite loop (seems to be on test of MSR_PR)?

Gabriel

>37c:   94 21 ff f0 stwur1,-16(r1)
>380:   7c 08 02 a6 mflrr0
>384:   90 01 00 14 stw r0,20(r1)
>388:   48 00 00 01 bl  388 
>   388: R_PPC_REL24unrecoverable_exception
>38c:   38 e2 00 70 addir7,r2,112
>390:   3d 00 00 01 lis r8,1
>394:   7c c0 38 28 lwarx   r6,0,r7
>398:   7c c6 40 78 andcr6,r6,r8
>39c:   7c c0 39 2d stwcx.  r6,0,r7
>3a0:   40 a2 ff f4 bne 394 
>3a4:   38 60 00 01 li  r3,1
>3a8:   4b ff ff c0 b   368 
> 
> Without this patch:
> 
>   0348 :
>348:   94 21 ff f0 stwur1,-16(r1)
>34c:   93 e1 00 0c stw r31,12(r1)
>350:   7c 7f 1b 78 mr  r31,r3
>354:   81 23 00 84 lwz r9,132(r3)
>358:   71 2a 00 02 andi.   r10,r9,2
>35c:   41 82 00 34 beq 390 
>360:   71 29 40 00 andi.   r9,r9,16384
>364:   40 82 00 28 bne 38c 
>368:   80 62 00 70 lwz r3,112(r2)
>36c:   74 63 00 01 andis.  r3,r3,1
>370:   40 82 00 3c bne 3ac 
>374:   7d 20 00 a6 mfmsr   r9
>378:   7c 11 13 a6 mtspr   81,r0
>37c:   7c 12 13 a6 mtspr   82,r0
>380:   83 e1 00 0c lwz r31,12(r1)
>384:   38 21 00 10 addir1,r1,16
>388:   4e 80 00 20 blr
>38c:   48 00 00 00 b   38c 
>390:   7c 08 02 a6 mflrr0
>394:   90 01 00 14 stw r0,20(r1)
>398:   48 00 00 01 bl  398 
>   398: R_PPC_REL24unrecoverable_exception
>39c:   80 01 00 14 lwz r0,20(r1)
>3a0:   81 3f 00 84 lwz r9,132(r31)
>3a4:   7c 08 03 a6 mtlrr0
>3a8:   4b ff ff b8 b   360 
>3ac:   39 02 00 70 addir8,r2,112
>3b0:   3d 40 00 01 lis r10,1
>3b4:   7c e0 40 28 lwarx   r7,0,r8
>3b8:   7c e7 50 78 andcr7,r7,r10
>3bc:   7c e0 41 2d stwcx.  r7,0,r8
>3c0:   40 a2 ff f4 bne 3b4 
>3c4:   38 60 00 01 li  r3,1
>3c8:   4b ff ff ac b   374 
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/interrupt.h | 2 +-
>  arch/powerpc/kernel/interrupt.c  | 1 -
>  arch/powerpc/kernel/traps.c  | 2 ++
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/interrupt.h 
> b/arch/powerpc/include/asm/interrupt.h
> index dcff30e3919b..fa8bfb91f8df 100644
> --- a/arch/powerpc/include/asm/interrupt.h
> +++ b/arch/powerpc/include/asm/interrupt.h
> @@ -411,7 +411,7 @@ DECLARE_INTERRUPT_HANDLER(altivec_assist_exception);
>  DECLARE_INTERRUPT_HANDLER(CacheLockingException);
>  DECLARE_INTERRUPT_HANDLER(SPEFloatingPointException);
>  DECLARE_INTERRUPT_HANDLER(SPEFloatingPointRoundException);
> -DECLARE_INTERRUPT_HANDLER(unrecoverable_exception);
> +DECLARE_INTERRUPT_HANDLER(unrecoverable_exception) __noreturn;
>  DECLARE_INTERRUPT_HANDLER(WatchdogException);
>  DECLARE_INTERRUPT_HANDLER(kernel_bad_stack);
>  
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index eca3be36c18c..7e7106641ca9 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -440,7 +440,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct 
> pt_regs *regs, unsigned
>   return ret;
>  }
>  
> -void unrecoverable_exception(struct pt_regs *regs);
>  void preempt_schedule_irq(void);
>  
>  notrace unsigned long interrupt_exit

Re: [PATCH 3/3] perf: Use 64-bit value when comparing sample_regs

2014-03-06 Thread Gabriel Paubert
On Thu, Mar 06, 2014 at 09:44:47AM +, David Laight wrote:
> From: Sukadev Bhattiprolu
> > When checking whether a bit representing a register is set in
> > sample_regs, a 64-bit mask, use 64-bit value (1LL).
> > 
> > Signed-off-by: Sukadev Bhattiprolu 
> > ---
> >  tools/perf/util/unwind.c |4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/perf/util/unwind.c b/tools/perf/util/unwind.c
> > index 742f23b..2b888c6 100644
> > --- a/tools/perf/util/unwind.c
> > +++ b/tools/perf/util/unwind.c
> > @@ -396,11 +396,11 @@ static int reg_value(unw_word_t *valp, struct 
> > regs_dump *regs, int id,
> >  {
> > int i, idx = 0;
> > 
> > -   if (!(sample_regs & (1 << id)))
> > +   if (!(sample_regs & (1LL << id)))
> > return -EINVAL;
> > 
> > for (i = 0; i < id; i++) {
> > -   if (sample_regs & (1 << i))
> > +   if (sample_regs & (1LL << i))
> > idx++;
> > }
> 
> There are much faster ways to count the number of set bits, especially
> if you might need to check a significant number of bits.
> There might even be a function defined somewhere to do it.

Indeed, look for Hamming weight (hweight family of functions)
in asm/hweight.h and what is included from there.

Besides that, many modern processors also have a machine instruction
to perform this task. In the processor manuals the instruction is 
described as population count and the mnemonic starts with "popcnt"
on x86 and ppc.

Gabriel

> Basically you just add up the bits, for 16 bit it would be:
>   val = (val & 0x) + (val >> 1) & 0x;
>   val = (val & 0x) + (val >> 2) & 0x;
>   val = (val & 0x0f0f) + (val >> 4) & 0x0f0f;
>   val = (val & 0x00ff) + (val >> 8) & 0x00ff;
> As the size of the work increases the improvement is more significant.
> (Some of the later masking can probably be proven unnecessary.)
> 
>   David
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] powerpc: add ioremap_wt

2014-02-03 Thread Gabriel Paubert
On Mon, Feb 03, 2014 at 08:16:49AM +0100, Michael Moese wrote:
> Allow for IO memory to be mapped cacheable for performing
> PCI read bursts.
> 
> Signed-off-by: Michael Moese 
> ---
>  arch/powerpc/include/asm/io.h | 3 +++
>  arch/powerpc/mm/pgtable_32.c  | 8 
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> index 45698d5..9591fff 100644
> --- a/arch/powerpc/include/asm/io.h
> +++ b/arch/powerpc/include/asm/io.h
> @@ -631,6 +631,8 @@ static inline void iosync(void)
>   *
>   * * ioremap_wc enables write combining
>   *
> + * * ioremap_wc enables write thru

Typo: _wc -> _wt

Looks fine in principle, but there is a significant difference with wc
on x86, where read accesses always go to the bus (no read caching).

Gabriel

> + *
>   * * iounmap undoes such a mapping and can be hooked
>   *
>   * * __ioremap_at (and the pending __iounmap_at) are low level functions to
> @@ -652,6 +654,7 @@ extern void __iomem *ioremap(phys_addr_t address, 
> unsigned long size);
>  extern void __iomem *ioremap_prot(phys_addr_t address, unsigned long size,
> unsigned long flags);
>  extern void __iomem *ioremap_wc(phys_addr_t address, unsigned long size);
> +extern void __iomem *ioremap_wt(phys_addr_t address, unsigned long size);
>  #define ioremap_nocache(addr, size)  ioremap((addr), (size))
>  
>  extern void iounmap(volatile void __iomem *addr);
> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
> index 51f8795..9ab0a54 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -141,6 +141,14 @@ ioremap_wc(phys_addr_t addr, unsigned long size)
>  EXPORT_SYMBOL(ioremap_wc);
>  
>  void __iomem *
> +ioremap_wt(phys_addr_t addr, unsigned long size)
> +{
> + return __ioremap_caller(addr, size, _PAGE_WRITETHRU,
> + __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL(ioremap_wt);
> +
> +void __iomem *
>  ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags)
>  {
>   /* writeable implies dirty for kernel addresses */
> -- 
> 1.8.5.3
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: deb-pkg: Add support for powerpc little endian

2014-09-05 Thread Gabriel Paubert
On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote:
> The Debian powerpc little endian architecture is called ppc64le.  This

Huh? ppc64le or ppc64el?

> is the default architecture used by Ubuntu for powerpc.
> 
> The below checks the kernel config to see if we are compiling little
> endian and sets the Debian arch appropriately.
> 
> Signed-off-by: Michael Neuling 
> 
> diff --git a/scripts/package/builddeb b/scripts/package/builddeb
> index 35d5a58..6f4a1af 100644
> --- a/scripts/package/builddeb
> +++ b/scripts/package/builddeb
> @@ -37,7 +37,7 @@ create_package() {
>   s390*)
>   debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x 
> || true) ;;
>   ppc*)
> - debarch=powerpc ;;
> + debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo 
> ppc64el || echo powerpc) ;;
>   parisc*)
>   debarch=hppa ;;
>   mips*)


Gabriel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] powerpc32: use stmw/lmw for non volatile registers save/restore

2016-05-23 Thread Gabriel Paubert
On Mon, May 23, 2016 at 10:46:36AM +0200, Christophe Leroy wrote:
> lmw/stmw have a 1 cycle (2 cycles for lmw on some ppc) in addition
> and implies serialising, however it reduces the amount of instructions
> hence the amount of instruction fetch compared to the equivalent
> operation with several lzw/stw. It means less pressure on cache and

Minor typo, s/lzw/lwz/.

> less fetching delays on slow memory.
> When we transfer 20 registers, it is worth it.
> gcc uses stmw/lmw at function entry/exit to save/restore non
> volatile register, so lets also do it that way.
> 
> On powerpc64, we can't use lmw/stmw as it only handles 32 bits, so
> we move longjmp() and setjmp() from misc.S to misc_64.S, and we
> write a 32 bits version in misc_32.S using stmw/lmw
> 
> Signed-off-by: Christophe Leroy 
> ---
> The patch goes on top of "powerpc: inline current_stack_pointer()" or
> requires trivial manual merge in arch/powerpc/kernel/misc.S
> 
>  arch/powerpc/include/asm/ppc_asm.h |  6 ++--
>  arch/powerpc/kernel/misc.S | 61 
> --
>  arch/powerpc/kernel/misc_32.S  | 22 ++
>  arch/powerpc/kernel/misc_64.S  | 61 
> ++
>  4 files changed, 85 insertions(+), 65 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/ppc_asm.h 
> b/arch/powerpc/include/asm/ppc_asm.h
> index 2b31632..e29b649 100644
> --- a/arch/powerpc/include/asm/ppc_asm.h
> +++ b/arch/powerpc/include/asm/ppc_asm.h
> @@ -82,10 +82,8 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
>  #else
>  #define SAVE_GPR(n, base)stw n,GPR0+4*(n)(base)
>  #define REST_GPR(n, base)lwz n,GPR0+4*(n)(base)
> -#define SAVE_NVGPRS(base)SAVE_GPR(13, base); SAVE_8GPRS(14, base); \
> - SAVE_10GPRS(22, base)
> -#define REST_NVGPRS(base)REST_GPR(13, base); REST_8GPRS(14, base); \
> - REST_10GPRS(22, base)
> +#define SAVE_NVGPRS(base)stmw13, GPR0+4*13(base)
> +#define REST_NVGPRS(base)lmw 13, GPR0+4*13(base)
>  #endif
>  
>  #define SAVE_2GPRS(n, base)  SAVE_GPR(n, base); SAVE_GPR(n+1, base)
> diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
> index 7ce26d4..9de71d8 100644
> --- a/arch/powerpc/kernel/misc.S
> +++ b/arch/powerpc/kernel/misc.S
> @@ -53,64 +53,3 @@ _GLOBAL(add_reloc_offset)
>  
>   .align  3
>  2:   PPC_LONG 1b
> -
> -_GLOBAL(setjmp)
> - mflrr0
> - PPC_STL r0,0(r3)
> - PPC_STL r1,SZL(r3)
> - PPC_STL r2,2*SZL(r3)
> - mfcrr0
> - PPC_STL r0,3*SZL(r3)
> - PPC_STL r13,4*SZL(r3)
> - PPC_STL r14,5*SZL(r3)
> - PPC_STL r15,6*SZL(r3)
> - PPC_STL r16,7*SZL(r3)
> - PPC_STL r17,8*SZL(r3)
> - PPC_STL r18,9*SZL(r3)
> - PPC_STL r19,10*SZL(r3)
> - PPC_STL r20,11*SZL(r3)
> - PPC_STL r21,12*SZL(r3)
> - PPC_STL r22,13*SZL(r3)
> - PPC_STL r23,14*SZL(r3)
> - PPC_STL r24,15*SZL(r3)
> - PPC_STL r25,16*SZL(r3)
> - PPC_STL r26,17*SZL(r3)
> - PPC_STL r27,18*SZL(r3)
> - PPC_STL r28,19*SZL(r3)
> - PPC_STL r29,20*SZL(r3)
> - PPC_STL r30,21*SZL(r3)
> - PPC_STL r31,22*SZL(r3)
> - li  r3,0
> - blr
> -
> -_GLOBAL(longjmp)
> - PPC_LCMPI r4,0
> - bne 1f
> - li  r4,1
> -1:   PPC_LL  r13,4*SZL(r3)
> - PPC_LL  r14,5*SZL(r3)
> - PPC_LL  r15,6*SZL(r3)
> - PPC_LL  r16,7*SZL(r3)
> - PPC_LL  r17,8*SZL(r3)
> - PPC_LL  r18,9*SZL(r3)
> - PPC_LL  r19,10*SZL(r3)
> - PPC_LL  r20,11*SZL(r3)
> - PPC_LL  r21,12*SZL(r3)
> - PPC_LL  r22,13*SZL(r3)
> - PPC_LL  r23,14*SZL(r3)
> - PPC_LL  r24,15*SZL(r3)
> - PPC_LL  r25,16*SZL(r3)
> - PPC_LL  r26,17*SZL(r3)
> - PPC_LL  r27,18*SZL(r3)
> - PPC_LL  r28,19*SZL(r3)
> - PPC_LL  r29,20*SZL(r3)
> - PPC_LL  r30,21*SZL(r3)
> - PPC_LL  r31,22*SZL(r3)
> - PPC_LL  r0,3*SZL(r3)
> - mtcrf   0x38,r0
> - PPC_LL  r0,0(r3)
> - PPC_LL  r1,SZL(r3)
> - PPC_LL  r2,2*SZL(r3)
> - mtlrr0
> - mr  r3,r4
> - blr
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index d9c912b..de419e9 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -1086,3 +1086,25 @@ relocate_new_kernel_end:
>  relocate_new_kernel_size:
>   .long relocate_new_kernel_end - relocate_new_kernel
>  #endif
> +
> +_GLOBAL(setjmp)
> + mflrr0
> + li  r3, 0
> + stw r0, 0(r3)

Huh? Explicitly writing to address 0? Has this code been test run at
least once?

At least move the li r3,0 to just before the blr.

Gabriel

> + stw r1, 4(r3)
> + stw r2, 8(r3)
> + mfcrr12
> + stmwr12, 12(r3)
> + blr
> +
> +_GLOBAL(longjmp)
> + lwz r0, 0(r3)
> + lwz r1, 4(r3)
> + lwz r2, 8(r3)
> + lmw r12, 12(r3)
> + mtcrf   0x38, r12
> + mtlrr0
> + mr. r3, r4
> + bnelr
> + li  r3, 1
> + blr
> diff --git a/arch/p

Re: [PATCH] powerpc: inline current_stack_pointer()

2016-05-23 Thread Gabriel Paubert
On Mon, May 23, 2016 at 10:46:02AM +0200, Christophe Leroy wrote:
> current_stack_pointeur() is a single instruction function. it
> It is not worth breaking the execution flow with a bl/blr for a
> single instruction

Are you sure that the result is always the same? 

Calling an external function prevents the compiler from considering the
caller of of current_stack_pointer a leaf function, which certainly 
does not help the compiler, but in a leaf function the compiler is free 
not to establish a new frame.

If the compiler decides to establishes a new frame (typically with 
"stwu r1,-frame_size(r1)"), *r1 is the previous stack pointer, or
the caller's stack pointer, or the current function frame pointer if
I remember correctly the ABI definitions. 

However, if the compiler decides that it can get away without a frame
for the function, *r1 is the stack pointer of the caller's caller.

Depending on the application, this may or may not be important.

By the way, isn't there a GCC builtin which can perform this task,
for example builtin_frame_address()?

Gabriel
> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/reg.h  | 7 ++-
>  arch/powerpc/kernel/misc.S  | 4 
>  arch/powerpc/kernel/ppc_ksyms.c | 2 --
>  3 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index c1e82e9..7ce6777 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1301,7 +1301,12 @@ static inline unsigned long mfvtb (void)
>  
>  #define proc_trap()  asm volatile("trap")
>  
> -extern unsigned long current_stack_pointer(void);
> +static inline unsigned long current_stack_pointer(void)
> +{
> + register unsigned long *ptr asm("r1");
> +
> + return *ptr;
> +}
>  
>  extern unsigned long scom970_read(unsigned int address);
>  extern void scom970_write(unsigned int address, unsigned long value);
> diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
> index 0d43219..7ce26d4 100644
> --- a/arch/powerpc/kernel/misc.S
> +++ b/arch/powerpc/kernel/misc.S
> @@ -114,7 +114,3 @@ _GLOBAL(longjmp)
>   mtlrr0
>   mr  r3,r4
>   blr
> -
> -_GLOBAL(current_stack_pointer)
> - PPC_LL  r3,0(r1)
> - blr
> diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
> index 9f01e28..eb5c5dc 100644
> --- a/arch/powerpc/kernel/ppc_ksyms.c
> +++ b/arch/powerpc/kernel/ppc_ksyms.c
> @@ -33,5 +33,3 @@ EXPORT_SYMBOL(store_vr_state);
>  #ifdef CONFIG_EPAPR_PARAVIRT
>  EXPORT_SYMBOL(epapr_hypercall_start);
>  #endif
> -
> -EXPORT_SYMBOL(current_stack_pointer);
> -- 
> 2.1.0
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: powerpc: Discard ffs() function and use builtin_ffs instead

2016-05-13 Thread Gabriel Paubert
On Fri, May 13, 2016 at 04:16:57PM +1000, Michael Ellerman wrote:
> On Thu, 2016-12-05 at 15:32:22 UTC, Christophe Leroy wrote:
> > With the ffs() function as defined in arch/powerpc/include/asm/bitops.h
> > GCC will not optimise the code in case of constant parameter, as shown
> > by the small exemple below.
> > 
> > int ffs_test(void)
> > {
> > return 4 << ffs(31);
> > }
> > 
> > c0012334 :
> > c0012334:   39 20 00 01 li  r9,1
> > c0012338:   38 60 00 04 li  r3,4
> > c001233c:   7d 29 00 34 cntlzw  r9,r9
> > c0012340:   21 29 00 20 subfic  r9,r9,32
> > c0012344:   7c 63 48 30 slw r3,r3,r9
> > c0012348:   4e 80 00 20 blr
> > 
> > With this patch, the same function will compile as follows:
> > 
> > c0012334 :
> > c0012334:   38 60 00 08 li  r3,8
> > c0012338:   4e 80 00 20 blr
> 
> 
> But what code does it generate when it's not a constant?
> 
> And which gcc version first added the builtin version?

It already existed in gcc-2.95, which you do not want to use to compile
anything today but I have in a corner of a chroot environment to maintain
~1997 vintage embedded stuff, running a 2.2.12 kernel!

Hopefully this clears up your concerns :-)

Cheers,
Gabriel


Re: [RFC] Design for flag bit outputs from asms

2015-05-05 Thread Gabriel Paubert
On Mon, May 04, 2015 at 12:33:38PM -0700, Richard Henderson wrote:
[snipped]
> (3) Note that ppc is both easier and more complicated.
> 
>   There we have 8 4-bit registers, although most of the integer
>   non-comparisons only write to CR0.  And the vector non-comparisons
>   only write to CR1, though of course that's of less interest in the
>   context of kernel code.

Actually vector (Altivec) write to CR6. Standard FPU optionally write to
CR1, but the written value does not exactly depend on the result of the last
instruction; it is an instead an accrued exception status.

> 
>   For the purposes of cr0, the same scheme could certainly work, although
>   the hook would not insert a hard register use, but rather a pseudo to
>   be allocated to cr0 (constaint "x").

Yes, but we might also want to leave the choice of a cr register to the 
compiler.

> 
>   That said, it's my understanding that "dot insns", setting cr0 are
>   expensive in current processor generations.  

Not that much if I understand properly power7.md and power8.md: 
no (P7) or one (P8) additional clock for common instructions 
(add/sub and logical), but nothing else, so they are likely a win. 

Shift/rotate/sign extensions seem to have more decoding restrictions: 
the recording ("dot") forms are "cracked" and use 2 integer units.

>   There's also a lot less
>   of the x86-style "operate and set a flag based on something useful".
> 

But there is at least an important one, which I occasionally wished I had: 
the conditional stores. 

The overflow bit might also be useful, not really 
for the kernel, but for applications (and mfxer is slow).

Regards,
Gabriel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100

2016-03-09 Thread Gabriel Paubert
On Wed, Mar 09, 2016 at 12:38:18AM -0600, Scott Wood wrote:
> On Tue, Mar 08, 2016 at 08:59:12AM +0100, Alessio Igor Bogani wrote:
> > The mtmsr() function hangs during restart. Make reboot works on
> > MVME5100 removing that function call.
> > ---
> >  arch/powerpc/platforms/embedded6xx/mvme5100.c | 2 --
> >  1 file changed, 2 deletions(-)
> 
> Missing signoff
> 
> Do you know why MSR_IP was there to begin with? 

Huh? The patch sets MSR_IP for reset but it is cleared while running Linux.
I don't have any MVME5100, however I do have MVME2400/2700 with the same
bridge (Hawk), so I can say that the address space layout is quite standard: 
memory at 0, ROM at the high end of the 32-bit address space. However the
reset method is quite different (no external register set on the Hawk).

> Does this board have a switch that determines whether boot vectors 
> are high or low (I remember some 83xx boards that did), in which 
> case is this fixing one config by breaking another?

For the switch, no AFAICT. And the code is MVME5100 specific so I
suspect that it is very unlikely to break other boards.

Very likely the source of the problem is that the restart address is remapped 
(ioremap) but never accessed while the kernel is running (the only access to
*restart is in the reboot routine) so we take a DSI exception to fill the hash 
table when attempting to reboot. 

It would be enough to move the setting of MSR_IP until after triggering the 
restart, but this performs a hard reset of the CPU, which will set MSR_IP 
anyway (granted that the CPU will probably set MSR_IP way before the reset 
signal comes in).

One way to check this hypothesis would be to introduce a write of 0 to
the restart address before setting MSR_IP.

This said restart is declared as u_char *, so the cast in the out_8 
register access is useless and ugly.

Gabriel

P.S.: my MVME24xx/26x/27xx do not run such a modern kernel. 


Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100

2016-03-10 Thread Gabriel Paubert
On Wed, Mar 09, 2016 at 03:26:21PM -0600, Scott Wood wrote:
> On Wed, 2016-03-09 at 11:28 +0100, Gabriel Paubert wrote:
> > On Wed, Mar 09, 2016 at 12:38:18AM -0600, Scott Wood wrote:
> > > On Tue, Mar 08, 2016 at 08:59:12AM +0100, Alessio Igor Bogani wrote:
> > > > The mtmsr() function hangs during restart. Make reboot works on
> > > > MVME5100 removing that function call.
> > > > ---
> > > >  arch/powerpc/platforms/embedded6xx/mvme5100.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > 
> > > Missing signoff
> > > 
> > > Do you know why MSR_IP was there to begin with? 
> > 
> > Huh? The patch sets MSR_IP for reset but it is cleared while running Linux.
> > I don't have any MVME5100, however I do have MVME2400/2700 with the same
> > bridge (Hawk), so I can say that the address space layout is quite standard:
> > memory at 0, ROM at the high end of the 32-bit address space. However the
> > reset method is quite different (no external register set on the Hawk).
> 
> I meant why the line of code was there, not why the bit was ever set.  It
> seems odd to me that the pre-reset value of MSR would matter at all --
> shouldn't it get reset along with the rest of the CPU, as you later suggest? 

I don't know, it may create a soft reset. I have the board documentation
and it is not clear (at least not 100% clear to me) whether this reset 
drives the HRESET of SRESET pin, although HRESET is more likely.

>  And if it does matter, then why are Alessio and whoever wrote the code
> (assuming it wasn't an untested copy-and-paste from somewhere else) seeing
> different results regarding whether IP needs to be set or unset?

The different results are perfectly explained by taking a DSI exception
on accessing *restart (the only access to this address in the whole
code, so the hash table has to be filled at this time). With MSR_IP set, 
DSI exception goes to 0xfff00300, and then probably hangs.

> 
> > > Does this board have a switch that determines whether boot vectors 
> > > are high or low (I remember some 83xx boards that did), in which 
> > > case is this fixing one config by breaking another?
> > 
> > For the switch, no AFAICT. And the code is MVME5100 specific so I
> > suspect that it is very unlikely to break other boards.
> >
> > Very likely the source of the problem is that the restart address is
> > remapped 
> > (ioremap) but never accessed while the kernel is running (the only access to
> > *restart is in the reboot routine) so we take a DSI exception to fill the
> > hash 
> > table when attempting to reboot. 
> >
> > It would be enough to move the setting of MSR_IP until after triggering the 
> > restart, but this performs a hard reset of the CPU, which will set MSR_IP 
> > anyway (granted that the CPU will probably set MSR_IP way before the reset 
> > signal comes in).
> 
> How can you perform anything after the reset is triggered?  There might be
> some lag before it takes effect, but depending on that seems hackish and
> unreliable, and why would it make a difference?

It would make a difference if this reset is connected to SRESET, which
does not affect MSR_IP.

> 
> > One way to check this hypothesis would be to introduce a write of 0 to
> > the restart address before setting MSR_IP.
> 
> I'm not familiar with the register interface for this board logic, but what
> would that demonstrate?

Writing a 0 would not trigger the reset, but ensure that the hash table
is filled, then you can change MSR_IP and perform the real reset, and be
sure that you'll end up executing the reset in FLASH, not in RAM.

The alternative solution it to put code at 0x100 that
sets MSR_IP and branches to 0xfff00100.

> 
> > This said restart is declared as u_char *, so the cast in the out_8 
> > register access is useless and ugly.
> 
> Yes, and restart should be __iomem.

Indeed.

Gabriel


Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C

2019-09-03 Thread Gabriel Paubert
On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote:
> > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit :
> > >(Why are they separate though?  It could just be one loop var).
> > 
> > Yes it could just be a single loop var, but in that case it would have 
> > to be reset at the start of the second loop, which means we would have 
> > to pass 'addr' for resetting the loop anyway,
> 
> Right, I noticed that after hitting send, as usual.
> 
> > so I opted to do it 
> > outside the inline asm by using to separate loop vars set to their 
> > starting value outside the inline asm.
> 
> The thing is, the way it is written now, it will get separate registers
> for each loop (with proper earlyclobbers added).  Not that that really
> matters of course, it just feels wrong :-)

After "mtmsr %3", it is always possible to copy %0 to %3 and use it as
an address register for the second loop. One register less to allocate
for the compiler. Constraints of course have to be adjusted.

Gabriel
> 
> 
> Segher


Re: [PATCH v2 13/29] arch: add split IPC system calls where needed

2019-01-18 Thread Gabriel Paubert
On Fri, Jan 18, 2019 at 05:18:19PM +0100, Arnd Bergmann wrote:
> The IPC system call handling is highly inconsistent across architectures,
> some use sys_ipc, some use separate calls, and some use both.  We also
> have some architectures that require passing IPC_64 in the flags, and
> others that set it implicitly.
> 
> For the additon of a y2083 safe semtimedop() system call, I chose to only

It's not critical, but there are two typos in that line: 
additon -> addition
2083 -> 2038

Gabriel

> support the separate entry points, but that requires first supporting
> the regular ones with their own syscall numbers.
> 
> The IPC_64 is now implied by the new semctl/shmctl/msgctl system
> calls even on the architectures that require passing it with the ipc()
> multiplexer.
> 
> I'm not adding the new semtimedop() or semop() on 32-bit architectures,
> those will get implemented using the new semtimedop_time64() version
> that gets added along with the other time64 calls.
> Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().
> 
> Signed-off-by: Arnd Bergmann 


Re: [PATCH v13 00/10] powerpc: Switch to CONFIG_THREAD_INFO_IN_TASK

2019-01-24 Thread Gabriel Paubert
On Thu, Jan 24, 2019 at 04:58:41PM +0100, Christophe Leroy wrote:
> 
> 
> Le 24/01/2019 à 16:01, Christophe Leroy a écrit :
> > 
> > 
> > Le 24/01/2019 à 10:43, Christophe Leroy a écrit :
> > > 
> > > 
> > > On 01/24/2019 01:06 AM, Michael Ellerman wrote:
> > > > Christophe Leroy  writes:
> > > > > Le 12/01/2019 à 10:55, Christophe Leroy a écrit :
> > > > > > The purpose of this serie is to activate
> > > > > > CONFIG_THREAD_INFO_IN_TASK which
> > > > > > moves the thread_info into task_struct.
> > > > > > 
> > > > > > Moving thread_info into task_struct has the following advantages:
> > > > > > - It protects thread_info from corruption in the case of stack
> > > > > > overflows.
> > > > > > - Its address is harder to determine if stack addresses are
> > > > > > leaked, making a number of attacks more difficult.
> > > > > 
> > > > > I ran null_syscall and context_switch benchmark selftests
> > > > > and the result
> > > > > is surprising. There is slight degradation in context_switch and a
> > > > > significant one on null_syscall:
> > > > > 
> > > > > Without the serie:
> > > > > 
> > > > > ~# chrt -f 98 ./context_switch --no-altivec --no-vector --no-fp
> > > > > 55542
> > > > > 55562
> > > > > 55564
> > > > > 55562
> > > > > 55568
> > > > > ...
> > > > > 
> > > > > ~# ./null_syscall
> > > > >  2546.71 ns 336.17 cycles
> > > > > 
> > > > > 
> > > > > With the serie:
> > > > > 
> > > > > ~# chrt -f 98 ./context_switch --no-altivec --no-vector --no-fp
> > > > > 55138
> > > > > 55142
> > > > > 55152
> > > > > 55144
> > > > > 55142
> > > > > 
> > > > > ~# ./null_syscall
> > > > >  3479.54 ns 459.30 cycles
> > > > > 
> > > > > So 0,8% less context switches per second and 37% more time
> > > > > for one syscall ?
> > > > > 
> > > > > Any idea ?
> > > > 
> > > > What platform is that on?
> > > 
> > > It is on the 8xx
> 
> On the 83xx, I have a slight improvment:
> 
> Without the serie:
> 
> root@vgoippro:~# ./null_syscall
> 921.44 ns 307.15 cycles
> 
> With the serie:
> 
> root@vgoippro:~# ./null_syscall
> 918.78 ns 306.26 cycles
> 

The 8xx has very low cache associativity, something like 2, right?

In this case it may be access patterns which change the number of
cache line transfers when you move things around. 

Try to move things around in main(), for example allocate a variable of
~1kB on the stack in the function that performs the null_syscalls (use 
the variable before and after the loop, to avoid clever compiler
optimizations).

Gabriel


> Christophe
> 
> > > 
> > > > 
> > > > On 64-bit we have to turn one mtmsrd into two and that's obviously a
> > > > slow down. But I don't see that you've done anything similar in 32-bit
> > > > code.
> > > > 
> > > > I assume it's patch 8 that causes the slow down?
> > > 
> > > I have not digged into it yet, but why patch 8 ?
> > > 
> > 
> > The increase of null_syscall duration happens with patch 5 when we
> > activate CONFIG_THREAD_INFO_IN_TASK.
> > 


Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2

2016-08-10 Thread Gabriel Paubert
On Fri, Aug 05, 2016 at 01:28:02PM +0200, Christophe Leroy wrote:
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/misc_32.S | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index e025230..e18055c 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -578,9 +578,8 @@ _GLOBAL(__bswapdi2)
>   rlwimi  r9,r4,24,0,7
>   rlwimi  r10,r3,24,0,7
>   rlwimi  r9,r4,24,16,23
> - rlwimi  r10,r3,24,16,23
> + rlwimi  r4,r3,24,16,23
>   mr  r3,r9
> - mr  r4,r10
>   blr
>  

Hmmm, are you sure that it works? rlwimi is a bit special since the
first operand is both an input and an output of the instruction.


In other words, did you test the code?

Gabriel


Re: [PATCH] powerpc/8xx: fix single_step debug

2016-08-18 Thread Gabriel Paubert
On Thu, Aug 18, 2016 at 11:44:20AM +0200, Christophe Leroy wrote:
> SPRN_ICR must be read for clearing the internal freeze signal which
> is asserted by the single step exception, otherwise the timebase and
> decrementer remain freezed

Minor nit: s/freezed/frozen/

If the timebase and decrementer are frozen even for a few cycles, this
probably upsets timekeeping. I consider this a completely stupid design
decision, and maybe I'm not alone.

Gabriel

> 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/reg_8xx.h | 1 +
>  arch/powerpc/kernel/traps.c| 8 
>  2 files changed, 9 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/reg_8xx.h 
> b/arch/powerpc/include/asm/reg_8xx.h
> index feaf641..6dae71f 100644
> --- a/arch/powerpc/include/asm/reg_8xx.h
> +++ b/arch/powerpc/include/asm/reg_8xx.h
> @@ -17,6 +17,7 @@
>  #define SPRN_DC_DAT  570 /* Read-only data register */
>  
>  /* Misc Debug */
> +#define SPRN_ICR 148
>  #define SPRN_DPDR630
>  #define SPRN_MI_CAM  816
>  #define SPRN_MI_RAM0 817
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 2cb5892..0f1f0ce 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -400,8 +400,16 @@ static inline int check_io_access(struct pt_regs *regs)
>  #define REASON_TRAP  0x2
>  
>  #define single_stepping(regs)((regs)->msr & MSR_SE)
> +#ifdef CONFIG_PPC_8xx
> +static inline void clear_single_step(struct pt_regs *regs)
> +{
> + regs->msr &= ~MSR_SE;
> + mfspr(SPRN_ICR);
> +}
> +#else
>  #define clear_single_step(regs)  ((regs)->msr &= ~MSR_SE)
>  #endif
> +#endif
>  
>  #if defined(CONFIG_4xx)
>  int machine_check_4xx(struct pt_regs *regs)
> -- 
> 2.1.0


Re: [PATCH] powerpc/8xx: fix single_step debug

2016-08-18 Thread Gabriel Paubert
On Thu, Aug 18, 2016 at 12:13:21PM +0200, Christophe Leroy wrote:
> 
> 
> Le 18/08/2016 à 11:58, Gabriel Paubert a écrit :
> >On Thu, Aug 18, 2016 at 11:44:20AM +0200, Christophe Leroy wrote:
> >>SPRN_ICR must be read for clearing the internal freeze signal which
> >>is asserted by the single step exception, otherwise the timebase and
> >>decrementer remain freezed
> >
> >Minor nit: s/freezed/frozen/
> >
> >If the timebase and decrementer are frozen even for a few cycles, this
> >probably upsets timekeeping. I consider this a completely stupid design
> >decision, and maybe I'm not alone.
> >
> >Gabriel
> 
> We could also unset TBF bit (TimeBase Freeze enable) in TBSCR
> register (today it is set in
> arch/powerpc/platforms/8xx/m8xx_setup.c) but then it would impact
> debug done with an external BDM system which expects the decrementer
> and TB frozen when it freezes the execution.

Ok, I believe that systematically setting it is a mistake, but then I'm
always a bit nervous about screwing up timekeeping (it certainly is
always a very bad idea when you are driving telescopes).

Gabriel


Re: Re:Re: [PATCH] powerpc: Fix a bug in __div64_32 if divisor is zero

2020-08-22 Thread Gabriel Paubert
On Sun, Aug 23, 2020 at 12:54:33AM +0800, Guohua Zhong wrote:
> >In generic version in lib/math/div64.c, there is no checking of 'base' 
> >either.
> >Do we really want to add this check in the powerpc version only ?
> 
> >The only user of __div64_32() is do_div() in 
> >include/asm-generic/div64.h. Wouldn't it be better to do the check there ?
> 
> >Christophe
> 
> Yet, I have noticed that there is no checking of 'base' in these functions.
> But I am not sure how to check is better.As we know that the result is 
> undefined when divisor is zero. It maybe good to print error and dump stack.
>  Let the process to know that the divisor is zero by sending SIGFPE. 
> 
> diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
> index a3b98c86f077..161c656ee3ee 100644
> --- a/include/asm-generic/div64.h
> +++ b/include/asm-generic/div64.h
> @@ -43,6 +43,11 @@
>  # define do_div(n,base) ({ \
> uint32_t __base = (base);   \
> uint32_t __rem; \
> + if (unlikely(base == 0)) {  \
> + pr_err("do_div base=%d\n",base);\
> + dump_stack();   \
> + force_sig(SIGFPE);  \
> + }  
> 

I suspect this will generate a strong reaction. SIGFPE is for user space
instruction attempting a division by zero. A division by zero in the
kernel is a kernel bug, period, and you don't want to kill a user
process for this reason.

If it happens in an interrupt, the context of the kernel may not even be
related to the current process.

Many other architectures (x86 for example) already trigger an exception
on a division by zero but the handler will find that the exception
happened in kernel context and generate an Oops, not raise a signal in a
(possibly innocent) userland process.

Gabriel

> Then it also needto add this checking in functions of
> div64_s64(), div64_u64(), div64_u64_rem(), div_s64_rem and div_u64_rem () 
> in include/linux/math64.h
> 
> + if (unlikely(divisor == 0)) {
> + pr_err("%s divisor=0\n",__func__);
> + dump_stack();
> + force_sig(SIGFPE);
> + }
> 
> Guohua
> 
> >>lwz r5,0(r3)# get the dividend into r5/r6
> >>lwz r6,4(r3)
> >>cmplw   r5,r4
> >>@@ -52,6 +55,7 @@ __div64_32:
> >>  4:stw r7,0(r3)# return the quotient in *r3
> >>stw r8,4(r3)
> >>mr  r3,r6   # return the remainder in r3
> >>+5: # return if divisor r4 is zero
> >>blr
> >>  
> >>  /*
> >>diff --git a/arch/powerpc/lib/div64.S b/arch/powerpc/lib/div64.S
> >>index 3d5426e7dcc4..1cc9bcabf678 100644
> >>--- a/arch/powerpc/lib/div64.S
> >>+++ b/arch/powerpc/lib/div64.S
> >>@@ -13,6 +13,9 @@
> >>  #include 
> >>  
> >>  _GLOBAL(__div64_32)
> >>+   li  r9,0
> >>+   cmplw   r4,r9   # check if divisor r4 is zero
> >>+   beq 5f  # jump to label 5 if r4(divisor) is zero
> >>lwz r5,0(r3)# get the dividend into r5/r6
> >>lwz r6,4(r3)
> >>cmplw   r5,r4
> >>@@ -52,4 +55,5 @@ _GLOBAL(__div64_32)
> >>  4:stw r7,0(r3)# return the quotient in *r3
> >>stw r8,4(r3)
> >>mr  r3,r6   # return the remainder in r3
> >>+5: # return if divisor r4 is zero
> >>blr
> >>
> 
 



Re: [PATCH v2 2/5] powerpc: Allow 4224 bytes of stack expansion for the signal frame

2020-07-27 Thread Gabriel Paubert
On Fri, Jul 24, 2020 at 07:25:25PM +1000, Michael Ellerman wrote:
> We have powerpc specific logic in our page fault handling to decide if
> an access to an unmapped address below the stack pointer should expand
> the stack VMA.
> 
> The code was originally added in 2004 "ported from 2.4". The rough
> logic is that the stack is allowed to grow to 1MB with no extra
> checking. Over 1MB the access must be within 2048 bytes of the stack
> pointer, or be from a user instruction that updates the stack pointer.
> 
> The 2048 byte allowance below the stack pointer is there to cover the
> 288 byte "red zone" as well as the "about 1.5kB" needed by the signal
> delivery code.
> 
> Unfortunately since then the signal frame has expanded, and is now
> 4224 bytes on 64-bit kernels with transactional memory enabled.

Are there really users of transactional memory in the wild? 

Just asking because Power10 removes TM, and Power9 has had some issues
with it AFAICT.

Getting rid of it (if possible) would result in smaller signal frames,
with simpler signal delivery code (probably slightly faster also).

Gabriel
 



Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface

2017-12-19 Thread Gabriel Paubert
On Mon, Dec 18, 2017 at 03:15:51PM -0800, Ram Pai wrote:
> On Mon, Dec 18, 2017 at 02:28:14PM -0800, Dave Hansen wrote:
> > On 12/18/2017 02:18 PM, Ram Pai wrote:
> > > b) minimum number of keys available to the application.
> > >   if libraries consumes a few, they could provide a library
> > >   interface to the application informing the number available to
> > >   the application.  The library interface can leverage (b) to
> > >   provide the information.
> > 
> > OK, let's see a real user of this including a few libraries.  Then we'll
> > put it in the kernel.
> > 
> > > c) types of disable-rights supported by keys.
> > >   Helps the application to determine the types of disable-features
> > >   available. This is helpful, otherwise the app has to 
> > >   make pkey_alloc() call with the corresponding parameter set
> > >   and see if it suceeds or fails. Painful from an application
> > >   point of view, in my opinion.
> > 
> > Again, let's see a real-world use of this.  How does it look?  How does
> > an app "fall back" if it can't set a restriction the way it wants to?
> > 
> > Are we *sure* that such an interface makes sense?  For instance, will it
> > be possible for some keys to be execute-disable while other are only
> > write-disable?
> 
> Can it be on x86?
> 
> its not possible on ppc. the keys can *all* be  the-same-attributes-disable 
> all the
> time.
> 
> However you are right. Its conceivable that some arch could provide a
> feature where it can be x-attribute-disable for key 'a' and
> y-attribute-disable for key 'b'.  But than its a bit of a headache
> for an application to consume such a feature.
> 
> Ben: I recall you requesting this feature.  Thoughts?
> 
> > 
> > > I think on x86 you look for some hardware registers to determine
> > > which hardware features are enabled by the kernel.
> > 
> > No, we use CPUID.  It's a part of the ISA that's designed for
> > enumerating CPU and (sometimes) OS support for CPU features.
> > 
> > > We do not have generic support for something like that on ppc.  The
> > > kernel looks at the device tree to determine what hardware features
> > > are available. But does not have mechanism to tell the hardware to
> > > track which of its features are currently enabled/used by the
> > > kernel; atleast not for the memory-key feature.
> > 
> > Bummer.  You're missing out.
> > 
> > But, you could still do this with a syscall.  "Hey, kernel, do you
> > support this feature?"
> 
> or do powerpc specific sysfs interface.
> or a debugfs interface.

getauxval(3) ?

With AT_HWCAP or HWCAP2 as parameter already gives information about
features supported by the hardware and the kernel.

Taking one bit to expose the availability of protection keys to
applications does not look impossible.

Do I miss something obvious?

Gabriel


Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2

2016-08-11 Thread Gabriel Paubert
On Wed, Aug 10, 2016 at 12:18:15PM +0200, Christophe Leroy wrote:
> 
> 
> Le 10/08/2016 à 10:56, Gabriel Paubert a écrit :
> >On Fri, Aug 05, 2016 at 01:28:02PM +0200, Christophe Leroy wrote:
> >>Signed-off-by: Christophe Leroy 
> >>---
> >> arch/powerpc/kernel/misc_32.S | 3 +--
> >> 1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >>diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> >>index e025230..e18055c 100644
> >>--- a/arch/powerpc/kernel/misc_32.S
> >>+++ b/arch/powerpc/kernel/misc_32.S
> >>@@ -578,9 +578,8 @@ _GLOBAL(__bswapdi2)
> >>rlwimi  r9,r4,24,0,7
> >>rlwimi  r10,r3,24,0,7
> >>rlwimi  r9,r4,24,16,23
> >>-   rlwimi  r10,r3,24,16,23
> >>+   rlwimi  r4,r3,24,16,23
> >>mr  r3,r9
> >>-   mr  r4,r10
> >>blr
> >>
> >
> >Hmmm, are you sure that it works? rlwimi is a bit special since the
> >first operand is both an input and an output of the instruction.
> >
> >
> 
> Oops, you are right ...

I just found this: 

http://hardwarebug.org/2010/01/14/beware-the-builtins/

the bswapdi2 suggested sequence only needs a single mr instruction, the 
other one is absorbed in a rotlwi.

The scheduling looks poor, but it seems impossible to interleave the
operations between the two halves without adding another instructions,
and the routine is 8 instructions long, which happens to be exactly a
cache line on most 32 bit processors.

On the other hand gcc did at the time a very poor job (quite an
understatement) at bswapdi when compiling for 64 bit processors 
(see the example).

But what do modern compilers generate for bswapdi these days? Do they
still call the library or not?

After all, bswapdi on 32 bit processors only takes 6 instructions if the
input and output registers don't overlap.

Gabriel



Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2

2016-08-12 Thread Gabriel Paubert
On Thu, Aug 11, 2016 at 05:11:19PM -0500, Segher Boessenkool wrote:
> On Thu, Aug 11, 2016 at 11:34:37PM +0200, Gabriel Paubert wrote:
> > On the other hand gcc did at the time a very poor job (quite an
> > understatement) at bswapdi when compiling for 64 bit processors 
> > (see the example).
> > 
> > But what do modern compilers generate for bswapdi these days? Do they
> > still call the library or not?
> 
> Nope.

Great, could then these functions be removed from misc_32.S, or are
compilers that use libcalls still supported for kernel builds?

> 
> > After all, bswapdi on 32 bit processors only takes 6 instructions if the
> > input and output registers don't overlap.
> 
> For this testcase:
> ===
> typedef unsigned long long u64;
> u64 bs(u64 x) { return __builtin_bswap64(x); }
> ===
> 
> we get with -m32:
> ===
> bs:
>   mr 9,3
>   rotlwi 3,4,24
>   rlwimi 3,4,8,8,15
>   rlwimi 3,4,8,24,31
>   rotlwi 4,9,24
>   rlwimi 4,9,8,8,15
>   rlwimi 4,9,8,24,31
>   blr

In this case the compiler is constrained by the fact that the input and
ouput registers are the same. When inlined with other things it can
probably perform better scheduling and interleaving of operations.


> ===
> 
> and with -m64:
> ===
> .L.bs:
>   srdi 10,3,32
>   mr 9,3
>   rotlwi 3,3,24
>   rotlwi 8,10,24
>   rlwimi 3,9,8,8,15
>   rlwimi 8,10,8,8,15
>   rlwimi 3,9,8,24,31
>   rlwimi 8,10,8,24,31
>   sldi 3,3,32
>   or 3,3,8
>   blr
> ===
> 

As demonstrated here where the two halves of the 64 bit quantity
are byte swapped in an interleaved fashion. Not perfect (I think
that with proper ordering the last 2 instructions could be replaced
by a rldimi), but reasonable.

> Neither as tight as possible, but neither horrible either.
> 

Indeed.

Gabriel