Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies
On Tue, Feb 12, 2008 at 12:49:05PM +0100, Gabriel Paubert wrote: > On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote: > > > > On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote: > > > - couple of fixes and preparatory patches > > > > > > - rework of PowerMac media-bay support ([un]register IDE devices instead > > > of > > > [un]registering IDE interface) [ it is the main reason for spamming PPC > > > ML ] > > > > Interesting... I was thinking about doing a full remove of the device at > > a higher level instead but I suppose what you propose is easier. > > Well, I have serious problem on a Pegasos which appeared some time > between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of > > hdb: empty DMA table? > > I'm trying to bisect it right now. Argh, the first bisect point ended up with timeouts on hdb... Flagged as bad, to try to see when the problems started, but I suspect that there are several. Gabriel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies
On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote: > > On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote: > > - couple of fixes and preparatory patches > > > > - rework of PowerMac media-bay support ([un]register IDE devices instead of > > [un]registering IDE interface) [ it is the main reason for spamming PPC > > ML ] > > Interesting... I was thinking about doing a full remove of the device at > a higher level instead but I suppose what you propose is easier. Well, I have serious problem on a Pegasos which appeared some time between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of hdb: empty DMA table? I'm trying to bisect it right now. Gabriel > > I'll have a look & test next week hopefully. > > Ben. > > > ___ > Linuxppc-dev mailing list > [EMAIL PROTECTED] > https://ozlabs.org/mailman/listinfo/linuxppc-dev -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 00/18] ide: warm-plug support for IDE devices and other goodies
On Tue, Feb 12, 2008 at 01:30:03PM +0100, Gabriel Paubert wrote: > On Tue, Feb 12, 2008 at 12:49:05PM +0100, Gabriel Paubert wrote: > > On Fri, Feb 08, 2008 at 07:40:43PM +1100, Benjamin Herrenschmidt wrote: > > > > > > On Fri, 2008-02-08 at 01:44 +0100, Bartlomiej Zolnierkiewicz wrote: > > > > - couple of fixes and preparatory patches > > > > > > > > - rework of PowerMac media-bay support ([un]register IDE devices > > > > instead of > > > > [un]registering IDE interface) [ it is the main reason for spamming > > > > PPC ML ] > > > > > > Interesting... I was thinking about doing a full remove of the device at > > > a higher level instead but I suppose what you propose is easier. > > > > Well, I have serious problem on a Pegasos which appeared some time > > between 2.6.24 and 2.6.25-rc1: at boot I get an infinite string of > > > > hdb: empty DMA table? > > > > I'm trying to bisect it right now. > > Argh, the first bisect point ended up with timeouts on hdb... > > Flagged as bad, to try to see when the problems started, but > I suspect that there are several. Well, it's worse than this. After 7 or 8 bisections (all flagged bad) the symptoms changed: the system boots up to the point where it does not recognize LVM volumes. So there are at least 3 bugs: 1) the timeout on interrupts (only seen on hdb) 2) the empty DMA table messages (seen on hda and hdb) 3) the inability to see logical volumes ("pvs" does not find them, they are back when rebooting into 2.6.24). This is the apparently the earliest introduced bug (no problems with the disks with that version). I am away from that machine until next Tuesday. IIRC the last git bisect was something like 160 revisions left. Gabriel > > Gabriel > ___ > Linuxppc-dev mailing list > [EMAIL PROTECTED] > https://ozlabs.org/mailman/listinfo/linuxppc-dev -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel Real Time Clock (RTC) Support for I2C Devices
On Wed, 18 Apr 2001, Grant Erickson wrote: > >From the looks of drivers/char/rtc.c it would appear that this kernel > driver only supports bus-attached RTCs such as the mentioned MC146818. Is > this correct? I think so. > > What is the correct access method / kernel tie-in for supporting such an > I2C-based RTC device using the "standard" interfaces? Adding a new kind of clock to the kernel and setting the correct pointers in ppc_md seems the right (if not necessarily simple) solution to your problem. I wonder how you calibrate the decrementer frequency on this machine, I2C is too slow to get any precision, unless you accept to wait for one minute or so at boot. I still have problems of reproducibility of clock frequency measurements with bus attached RTC (on machines on which the RTC is the only moderately precise timing source and its interrupt line is unfortunately not connected). > > My hope is to use 'hwclock' from util-linux w/o modification. Is this > reasonable? No. Regards, Gabriel. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: high-res-timers start code.
On Mon, 23 Apr 2001, george anzinger wrote: > "Robert H. de Vries" wrote: > > > > On Monday 23 April 2001 19:45, you wrote: > > > > > By the way, is the user land stuff the same for all "arch"s? > > > > Not if you plan to handle the CPU cycle counter in user space. That is at > > least what I would propose. > > Just got interesting, lets let the world look in. > > What did you have in mind here? I suspect that on some archs the cycle > counter is not available to user code. I know that on parisc it is > optionally available (kernel can set a bit to make it available), but by > it self it is only good for intervals. You need to peg some value to a > CLOCK to use it to get timeofday, for instance. On Intel there is a also bit to disable unprivileged RDTSC, IIRC. On PPC the timebase is always available (but the old 601 needs spacial casing: it uses different registers and does not count in binary :-(). > On the other hand, if there is an area of memory that both users and > system can read but only system can write, one might put the soft clock > there. This would allow gettimeofday (with the cycle counter) to work > without a system call. To the best of my knowledge the system does not > have such an area as yet. > > comments? Well, there may be work in this area, since x86-64 will not enter kernel mode for gettimeofday() if I understand correctly what Andrea said. Linus hinted once at exporting (kernel) code to user space. Some data also will also need to be accessible but as long as you don't guarantee compatibility on data layout, only AFAIU on interface for these calls (it was not clear to me if it would be a fixed address forever or dynamic linking with kernel exported symbols), it's not a problem. Of course it will SIGSEGV instead of returning -EFAULT but this is a good thing IMHO, nobody checks for -EFAULT from gettimeofday(). I think that system calls should rather force SIGSEGV than return -EFAULT anyway, to make syscalls indistinguishable from pure library calls. Regards, Gabriel. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: missing mxcsr initialization
On Fri, 20 Oct 2000, Andrea Arcangeli wrote: > Many thanks to Doug and Gabriel for very useful explanations about this FPU > stuff. I suggest Gabriel to submit his way faster and more correct tag word > conversion function to Linus for 2.4.x. Here it a first shot, twd_i387_to_fxsr is guaranteed to not add any new bug (since there are only 65536 cases, I wrote a short test program to check that the results are the same). For the other way around, I only corrected the most glaring error: that the sign never affects the tag. Some obscure corner cases may still be wrong, but at least now negative zero and positive infinity and NaNs are handled correctly. Performance could also be improved by handling separately the common case of an empty FP stack. Now the question: is it worth bloating the code to exactly return the tag values of a 387 (I actually find that the Intel doc is confusing, what is the state of the tag word after an fnsave when MMX is in use, 0x or 0x, not counting the bugs in this area) ? My feeling is that the only important information in the tag field is empty/not empty. Does any code (debuggers, etc...) actually rely on the tags for anything else ? Patch relative to 2.4.0-test9. Gabriel = arch/i386/kernel/i387.c 1.1 vs edited = --- 1.1/arch/i386/kernel/i387.c Tue Jun 27 20:14:20 2000 +++ edited/arch/i386/kernel/i387.c Fri Oct 20 17:56:49 2000 @@ -79,16 +79,16 @@ static inline unsigned short twd_i387_to_fxsr( unsigned short twd ) { - unsigned short ret = 0; - int i; - - for ( i = 0 ; i < 8 ; i++ ) { - if ( (twd & 0x3) != 0x3 ) { - ret |= (1 << i); - } - twd = twd >> 2; - } - return ret; + unsigned int tmp; /* to avoid 16 bit prefixes in the code */ + + /* Transform each pair of bits into 01 (valid) or 00 (empty) */ +tmp = ~twd; +tmp = (tmp | (tmp>>1)) & 0x; /* 0V0V0V0V0V0V0V0V */ +/* and move the valid bits to the lower byte. */ +tmp = (tmp | (tmp >> 1)) & 0x; /* 00VV00VV00VV00VV */ +tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* */ +tmp = (tmp | (tmp >> 4)) & 0x00ff; /* */ +return tmp; } static inline unsigned long twd_fxsr_to_i387( struct i387_fxsave_struct *fxsave ) @@ -105,8 +105,8 @@ if ( twd & 0x1 ) { st = (struct _fpxreg *) FPREG_ADDR( fxsave, i ); - switch ( st->exponent ) { - case 0x: + switch ( st->exponent & 0x7fff ) { + case 0x7fff: tag = 2;/* Special */ break; case 0x: - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] quiet non-x86 option ROM warnings
On Thu, Feb 17, 2005 at 05:56:03PM -0500, Jon Smirl wrote: > On Fri, 18 Feb 2005 09:45:50 +1100, Benjamin Herrenschmidt > <[EMAIL PROTECTED]> wrote: > > > Can't the size be obtained like any other BAR ? > > yes, but cards that don't fully decode their ROM address space can > waste memory in copy_rom. For example I have a card around here that > reports a BAR address space of 128K and has a 2K ROM in it. You only > want to copy the 2K, not the 128K. Indeed, but they normally repeat by powers of 2, ignoring high order address bits. Is it that hard to detect? For example if it declares 128k, compare the two halves, reduce to 64k if equal. Lather, rinse, repeat. It's equivalent to reading the BAR declared size twice in the worst case, so it's not that bad performance-wise. That would only be in the case of an unknown signature in the first bytes, otherwise the third byte gives you the size IIUC. Gabriel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.11-rc5
On Thu, Feb 24, 2005 at 04:57:56PM +0200, M.Baris Demiray wrote: > > Steven Rostedt wrote: > > >[...] > > > >Anyone know if the linux-2.6.11-rc5.tar.bz2 has all the changes in it? > >The tarball rc5 is smaller than rc4. Was there a lot taken out? > > Take a look at the diffview of 2.6.11-rc4-rc5 incremental patch. > > > 5599 files changed, 166396 insertions(+), 293627 deletions(-) > Wow! Over 100 thousand lines removed! If true, color me impressed, and kudos to the developers. Small is beautiful, I really like this :-) Gabriel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] nfs: Fix mismatch between encode_dent_fn and filldir_t
The 5th parameter of filldir_t function type used by vfs_readdir was changed from ino_t to u64 in October. Unfortunately the patch missed some files in fs/nfsd where functions pointers of type encode_dent_fn are passed around and finally cast to filldir_t. The effect is only visible when an NFS server is run on a 32 bit big-endian machine (it would have been visible on all 32 bit architectures if the 6th parameter had been used). The results are interesting: all files have an inode of 0 (unique you say?) from getdents(2) and even ls(1) does not find any files. Signed-off-by: Gabriel Paubert <[EMAIL PROTECTED]> -- P.S.: this shows that function pointer casts are evil, but I did not find a simpler way to fix this. diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 277df40..20c5f4a 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -991,14 +991,14 @@ encode_entry(struct readdir_cd *ccd, const char *name, int nfs3svc_encode_entry(struct readdir_cd *cd, const char *name, -int namlen, loff_t offset, ino_t ino, unsigned int d_type) +int namlen, loff_t offset, u64 ino, unsigned int d_type) { return encode_entry(cd, name, namlen, offset, ino, d_type, 0); } int nfs3svc_encode_entry_plus(struct readdir_cd *cd, const char *name, - int namlen, loff_t offset, ino_t ino, unsigned int d_type) + int namlen, loff_t offset, u64 ino, unsigned int d_type) { return encode_entry(cd, name, namlen, offset, ino, d_type, 1); } diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c index f5243f9..244406e 100644 --- a/fs/nfsd/nfsxdr.c +++ b/fs/nfsd/nfsxdr.c @@ -463,7 +463,7 @@ nfssvc_encode_statfsres(struct svc_rqst *rqstp, __be32 *p, int nfssvc_encode_entry(struct readdir_cd *ccd, const char *name, - int namlen, loff_t offset, ino_t ino, unsigned int d_type) + int namlen, loff_t offset, u64 ino, unsigned int d_type) { struct nfsd_readdirres *cd = container_of(ccd, struct nfsd_readdirres, common); __be32 *p = cd->buffer; diff --git a/include/linux/nfsd/nfsd.h b/include/linux/nfsd/nfsd.h index 0727774..3ba5141 100644 --- a/include/linux/nfsd/nfsd.h +++ b/include/linux/nfsd/nfsd.h @@ -53,7 +53,7 @@ struct readdir_cd { __be32 err;/* 0, nfserr, or nfserr_eof */ }; typedef int(*encode_dent_fn)(struct readdir_cd *, const char *, - int, loff_t, ino_t, unsigned int); + int, loff_t, u64, unsigned int); typedef int (*nfsd_dirop_t)(struct inode *, struct dentry *, int, int); extern struct svc_program nfsd_program; diff --git a/include/linux/nfsd/xdr.h b/include/linux/nfsd/xdr.h index 877192d..328520c 100644 --- a/include/linux/nfsd/xdr.h +++ b/include/linux/nfsd/xdr.h @@ -166,7 +166,7 @@ int nfssvc_encode_statfsres(struct svc_rqst *, __be32 *, struct nfsd_statfsres * int nfssvc_encode_readdirres(struct svc_rqst *, __be32 *, struct nfsd_readdirres *); int nfssvc_encode_entry(struct readdir_cd *, const char *name, - int namlen, loff_t offset, ino_t ino, unsigned int); + int namlen, loff_t offset, u64 ino, unsigned int); int nfssvc_release_fhandle(struct svc_rqst *, __be32 *, struct nfsd_fhandle *); diff --git a/include/linux/nfsd/xdr3.h b/include/linux/nfsd/xdr3.h index 7996386..bc8b171 100644 --- a/include/linux/nfsd/xdr3.h +++ b/include/linux/nfsd/xdr3.h @@ -332,10 +332,10 @@ int nfs3svc_release_fhandle(struct svc_rqst *, __be32 *, int nfs3svc_release_fhandle2(struct svc_rqst *, __be32 *, struct nfsd3_fhandle_pair *); int nfs3svc_encode_entry(struct readdir_cd *, const char *name, - int namlen, loff_t offset, ino_t ino, + int namlen, loff_t offset, u64 ino, unsigned int); int nfs3svc_encode_entry_plus(struct readdir_cd *, const char *name, - int namlen, loff_t offset, ino_t ino, + int namlen, loff_t offset, u64 ino, unsigned int); /* Helper functions for NFSv3 ACL code */ __be32 *nfs3svc_encode_post_op_attr(struct svc_rqst *rqstp, __be32 *p, - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/10] local_t : powerpc
On Wed, Jan 24, 2007 at 08:08:12PM +1100, Paul Mackerras wrote: > Mathieu Desnoyers writes: > > > +static __inline__ int local_dec_if_positive(local_t *l) > > +{ > > + int t; > > + > > + __asm__ __volatile__( > > +"1:lwarx %0,0,%1 # local_dec_if_positive\n\ > > + addic. %0,%0,-1\n\ > > + blt-2f\n" > > + PPC405_ERR77(0,%1) > > +" stwcx. %0,0,%1\n\ > > + bne-1b" > > This has the same bugs that we fixed recently in atomic_dec_if_positive; > first, on 64-bit machines, the lwarx will zero-extend the word loaded > from memory, and so the result of the addic will be negative only if > the word was originally 0. Secondly, even on 32-bit machines, > 0x8000 will be considered positive since decrementing it gives > 0x7fff, which is positive. > > > +/* Use these for per-cpu local_t variables: on some archs they are > > + * much more efficient than these naive implementations. Note they take > > + * a variable, not an address. > > + * > > + * This could be done better if we moved the per cpu data directly > > + * after GS. > > + */ > > What's "GS"? Does this comment really apply on powerpc? > 1) It's an (application visible) i386/x86_64 segment register used to make memory addressing more confusing. 2) Because of 1), obviously not ;-) Gabriel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.21-rt2] PowerPC: decrementer clockevent driver
On Fri, May 18, 2007 at 05:52:45PM +0100, Matt Sealey wrote: > Kumar Gala wrote: > > > > On May 18, 2007, at 9:48 AM, Thomas Gleixner wrote: > > > >> On Fri, 2007-05-18 at 15:28 +0100, Matt Sealey wrote: > >>> > >>> I think both the MPC52xx GPT0-7 and the SLT0-1 fulfil this fairly > >>> easily. > >> > >> There is some basic work for MPC5200 available: > >> > >> http://www.pengutronix.de/oselas/bsp/phytec/index_en.html#phyCORE-MPC5200B-tiny > >> > > > > I asked this earlier, but figured you might have a better insight. Is > > their value in having 'drivers' for more than one clock source? I'd say > > most (of not all) the PPC SoCs have timers on the system side that we > > could provide drivers for, I'm just not sure if that does anything for > > anyone. > > As I asked after, I'm also very intrigued as to what is going to end > up using these timers, but likewise, not much use writing a driver if > everyone can use the extremely high resolution decrementer all at > once.. > > As I said before too, at least Intel has decided there is a great need > for up to 256 high resolution timer sources on a system, but since this > is a fairly new concept to Linux (and hrtimers and dynticks too) it > only seems to be used in the case of i8254/RTC emulation, mostly on > x86-64. > > I'm looking at it now and finding "users" of hrtimers is looking very > thin on the ground. Maybe it's justified on the basis that more is > better, and having support is preferable to not having it (even if > nobody really uses it) but it seems the entire gamut of timing > possibility in Linux can be handled through a simple, and single, > high resolution timer and a queue of events.. > > So do we need some more? :D > I don't think so. I really wonder what other capabilities the other clock sources bring, except more convoluted and complex code in the end. There may be one exception to it: scheduled wakeup with very deep sleep in which you can't even afford the power dissipated by the processor in the modes in with the timebase/decrementer keeps running (which is fairly low to start with). But this would be a special wakeup path where timebase and decrementer are reinitialized and become again the main timer source. Otherwise the decrementer can provide essentially arbitrary high resolution (i.e., comparable to the time of an external bus access) on a fairly large number of independent timers. This scales with the number of processors since there is one decrementer per core (is it per thread on SMT chips? I don't know). A few months ago, I implemented here for out internal needs the equivalent of HRT (I called them FGT for fine-grained timers) for our VME machines still running a 2.2 kernel. We have boards who should have generated regular interrupts at programmable rate interpreting GPS timestamp signals (IRIG B). Unfortunately the signals were sometimes corrupt and the interrupts were no more regularly spaced. Now the IRIGB signal is only used to feed NTP which acts as a low pass filter with an extremely small cut-off frequency (1mHz or less) and the regular interrupts are generated from the decrementer in parallel with the standard ticks. The code is not really polished and would not scale to a large number of timers but it took me about one day and a half to write and debug it. I can even monitor it from a /proc file: [EMAIL PROTECTED] /root]# cat /proc/fgtimer Calls Lost Name 60865181158 System timer tick 77889762 0 itFast (128Hz) There are a few lost ticks for the main system timer because some operations mask interrupts for too long during early boot, especially with root on NFS which is the case here, they never happen after the root file system is mounted. This is also a somewhat special 2.2 kernel, it essentially has the timekeeping code I submitted for 2.4 and which is still the core of PPC timekeeping. Regards, Gabriel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Section mismatch warnings (was Re: [PATCH] early_pfn_to_nid needs to be __meminit)
On Thu, May 10, 2007 at 02:25:52AM +1000, Stephen Rothwell wrote: > since it is referenced by memmap_init_zone (which is __meminit) via the > early_pfn_in_nid macro when CONFIG_NODES_SPAN_OTHER_NODES is set (which > basically means PowerPC 64). > > This removes a section mismatch warning in those circumstances. Speaking of this, I just tried to compile an official (Linus' git tree) kernel for my old PMac G4 and I get a lot of section mismatch warnings at the end of the compilation: LD vmlinux SYSMAP System.map SYSMAP .tmp_System.map MODPOST vmlinux WARNING: "fee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: "ee_restarts" [arch/powerpc/kernel/built-in] is COMMON symbol WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_string_start (offset 0x8) WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_string_end (offset 0xc) ... WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from dt_struct_end (offset 0x20c) WARNING: arch/powerpc/kernel/built-in.o - Section mismatch: reference to .init.data:.got2 from disp_BAT (offset 0x234) WARNING: "primary_pteg_full" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "next_slot" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: "htab_hash_searches" [arch/powerpc/mm/built-in] is COMMON symbol WARNING: arch/powerpc/mm/built-in.o - Section mismatch: reference to .init.text:early_get_page from .text between 'pte_alloc_one_kernel' (at offset 0xf50) and 'v_mapped_by_bats' WARNING: arch/powerpc/platforms/built-in.o - Section mismatch: reference to .init.data:.got2 from (offset 0x0) ... WARNING: arch/powerpc/platforms/built-in.o - Section mismatch: reference to .init.text:pmac_pcibios_after_init from .machine.desc after 'mach_powermac' (at offset 0xc4) WARNING: mm/built-in.o - Section mismatch: reference to .init.text:set_up_list3s from .text between 'kmem_cache_create' (at offset 0x1de88) and 'kmem_cache_shrink' WARNING: mm/built-in.o - Section mismatch: reference to .init.text:set_up_list3s from .text between 'kmem_cache_create' (at offset 0x1dec0) and 'kmem_cache_shrink' I find these 50 or so warnings so scary that I've not yet tried to boot the kernel. Note that this is a non-modular kernel. Gabriel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/5] firmware: annotate thou shalt not request fw on init or probe
On Tue, Aug 23, 2016 at 05:45:04PM -0700, mcg...@kernel.org wrote: [snip] > --- > Documentation/firmware_class/README| 20 > drivers/base/Kconfig | 2 +- > .../request_firmware-avoid-init-probe-init.cocci | 130 > + > 3 files changed, 151 insertions(+), 1 deletion(-) > create mode 100644 > scripts/coccinelle/api/request_firmware-avoid-init-probe-init.cocci > > diff --git a/Documentation/firmware_class/README > b/Documentation/firmware_class/README > index cafdca8b3b15..056d1cb9d365 100644 > --- a/Documentation/firmware_class/README > +++ b/Documentation/firmware_class/README > @@ -93,6 +93,26 @@ > user contexts to request firmware asynchronously, but can't be called > in atomic contexts. > > +Requirements: > += > + > +You should avoid at all costs requesting firmware on both init and probe > paths > +of your device driver. Reason for this is the complexity needed to ensure a > +firmware will be available for a driver early in boot through different > +build configurations. Consider built-in drivers needing firmware early, or > +consider a driver assuming it will only get firmware after pivot_root(). > + > +Drivers that really need firmware early should use stuff the firmware in Minor grammatical nit: s/use// > +initramfs or consider using CONFIG_EXTRA_FIRMWARE. Using initramfs is much > +more portable to more distributions as not all distributions wish to enable > +CONFIG_EXTRA_FIRMWARE. Should a driver require the firmware being built-in > +it should depend on CONFIG_EXTRA_FIRMWARE. There is no current annotation for > +requiring a firmware on initramfs. > + > +If you're a maintainer you can help police this with: > + > +$ export > COCCI=scripts/coccinelle/api/request_firmware-avoid-init-probe-init.cocci > +$ make coccicheck MODE=report > > about in-kernel persistence: > ---
Re: [PATCH][V2] crypto/nx: fix spelling mistake: "availavle" -> "available"
On Tue, Nov 14, 2017 at 02:32:17PM +, Colin King wrote: > From: Colin Ian King > > Trivial fix to spelling mistake in pr_err error message text. Also > fix spelling mistake in proceeding comment. s/proceeding/preceding/ ? > > Signed-off-by: Colin Ian King > --- > drivers/crypto/nx/nx-842-powernv.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/crypto/nx/nx-842-powernv.c > b/drivers/crypto/nx/nx-842-powernv.c > index f2246a5abcf6..1e87637c412d 100644 > --- a/drivers/crypto/nx/nx-842-powernv.c > +++ b/drivers/crypto/nx/nx-842-powernv.c > @@ -743,8 +743,8 @@ static int nx842_open_percpu_txwins(void) > } > > if (!per_cpu(cpu_txwin, i)) { > - /* shoudn't happen, Each chip will have NX engine */ > - pr_err("NX engine is not availavle for CPU %d\n", i); > + /* shouldn't happen, Each chip will have NX engine */ > + pr_err("NX engine is not available for CPU %d\n", i); > return -EINVAL; > } > } > -- > 2.14.1
Re: [PATCH v2] powerpc/mpc5xxx: Avoid dereferencing potentially freed memory
On Fri, Oct 16, 2015 at 08:20:13AM +0200, Christophe JAILLET wrote: > Le 15/10/2015 08:36, Michael Ellerman a écrit : > >On Thu, 2015-10-15 at 07:56 +0200, Christophe JAILLET wrote: > >>Use 'of_property_read_u32()' instead of 'of_get_property()'+pointer > >>dereference in order to avoid access to potentially freed memory. > >> > >>Use 'of_get_next_parent()' to simplify the while() loop and avoid the > >>need of a temp variable. > >> > >>Signed-off-by: Christophe JAILLET > >>--- > >>v2: Use of_property_read_u32 instead of of_get_property+pointer dereference > >>*** Untested *** > >Thanks. > > > >Can someone with an mpc5xxx test this? > > > >cheers > > > > Hi, > I don't think it is an issue, but while looking at another similar > patch, I noticed that the proposed patch adds a call to > be32_to_cpup() (within of_property_read_u32). > Apparently, powerPC is a BE architecture, so this call should be a no-op. Sadly no more. 32 bit is BE only, but 64 bit can be either BEtter or LEsser. Gabriel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] macintosh: Add module license to ans-lcd
On Mon, Jan 29, 2018 at 01:33:08PM -0600, Larry Finger wrote: > In kernel 4.15, the modprobe step on my PowerBook G5 started complaining that PowerBook G5? Really, could you send a pic! :-) > there was no module license for ans-lcd. > > Signed-off-by: Larry Finger > --- > drivers/macintosh/ans-lcd.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/macintosh/ans-lcd.c b/drivers/macintosh/ans-lcd.c > index 1de81d922d8a..c8e078b911c7 100644 > --- a/drivers/macintosh/ans-lcd.c > +++ b/drivers/macintosh/ans-lcd.c > @@ -201,3 +201,4 @@ anslcd_exit(void) > > module_init(anslcd_init); > module_exit(anslcd_exit); > +MODULE_LICENSE("GPL v2"); > -- > 2.16.1 >
Re: [PATCH] powerpc/traps: Declare unrecoverable_exception() as __noreturn
On Thu, Feb 11, 2021 at 06:34:43AM +, Christophe Leroy wrote: > unrecoverable_exception() is never expected to return, most callers > have an infiniteloop in case it returns. > > Ensure it really never returns by terminating it with a BUG(), and > declare it __no_return. > > It always GCC to really simplify functions calling it. In the exemple below, s/always/allows ? (Otherwise I can't parse it.) > it avoids the stack frame in the likely fast path and avoids code duplication > for the exit. Indeed, nice code generation improvement. > > With this patch: > > 0348 : >348: 81 43 00 84 lwz r10,132(r3) >34c: 71 48 00 02 andi. r8,r10,2 >350: 41 82 00 2c beq 37c >354: 71 4a 40 00 andi. r10,r10,16384 >358: 40 82 00 20 bne 378 >35c: 80 62 00 70 lwz r3,112(r2) >360: 74 63 00 01 andis. r3,r3,1 >364: 40 82 00 28 bne 38c >368: 7d 40 00 a6 mfmsr r10 >36c: 7c 11 13 a6 mtspr 81,r0 >370: 7c 12 13 a6 mtspr 82,r0 >374: 4e 80 00 20 blr >378: 48 00 00 00 b 378 Infinite loop (seems to be on test of MSR_PR)? Gabriel >37c: 94 21 ff f0 stwur1,-16(r1) >380: 7c 08 02 a6 mflrr0 >384: 90 01 00 14 stw r0,20(r1) >388: 48 00 00 01 bl 388 > 388: R_PPC_REL24unrecoverable_exception >38c: 38 e2 00 70 addir7,r2,112 >390: 3d 00 00 01 lis r8,1 >394: 7c c0 38 28 lwarx r6,0,r7 >398: 7c c6 40 78 andcr6,r6,r8 >39c: 7c c0 39 2d stwcx. r6,0,r7 >3a0: 40 a2 ff f4 bne 394 >3a4: 38 60 00 01 li r3,1 >3a8: 4b ff ff c0 b 368 > > Without this patch: > > 0348 : >348: 94 21 ff f0 stwur1,-16(r1) >34c: 93 e1 00 0c stw r31,12(r1) >350: 7c 7f 1b 78 mr r31,r3 >354: 81 23 00 84 lwz r9,132(r3) >358: 71 2a 00 02 andi. r10,r9,2 >35c: 41 82 00 34 beq 390 >360: 71 29 40 00 andi. r9,r9,16384 >364: 40 82 00 28 bne 38c >368: 80 62 00 70 lwz r3,112(r2) >36c: 74 63 00 01 andis. r3,r3,1 >370: 40 82 00 3c bne 3ac >374: 7d 20 00 a6 mfmsr r9 >378: 7c 11 13 a6 mtspr 81,r0 >37c: 7c 12 13 a6 mtspr 82,r0 >380: 83 e1 00 0c lwz r31,12(r1) >384: 38 21 00 10 addir1,r1,16 >388: 4e 80 00 20 blr >38c: 48 00 00 00 b 38c >390: 7c 08 02 a6 mflrr0 >394: 90 01 00 14 stw r0,20(r1) >398: 48 00 00 01 bl 398 > 398: R_PPC_REL24unrecoverable_exception >39c: 80 01 00 14 lwz r0,20(r1) >3a0: 81 3f 00 84 lwz r9,132(r31) >3a4: 7c 08 03 a6 mtlrr0 >3a8: 4b ff ff b8 b 360 >3ac: 39 02 00 70 addir8,r2,112 >3b0: 3d 40 00 01 lis r10,1 >3b4: 7c e0 40 28 lwarx r7,0,r8 >3b8: 7c e7 50 78 andcr7,r7,r10 >3bc: 7c e0 41 2d stwcx. r7,0,r8 >3c0: 40 a2 ff f4 bne 3b4 >3c4: 38 60 00 01 li r3,1 >3c8: 4b ff ff ac b 374 > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/interrupt.h | 2 +- > arch/powerpc/kernel/interrupt.c | 1 - > arch/powerpc/kernel/traps.c | 2 ++ > 3 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/include/asm/interrupt.h > b/arch/powerpc/include/asm/interrupt.h > index dcff30e3919b..fa8bfb91f8df 100644 > --- a/arch/powerpc/include/asm/interrupt.h > +++ b/arch/powerpc/include/asm/interrupt.h > @@ -411,7 +411,7 @@ DECLARE_INTERRUPT_HANDLER(altivec_assist_exception); > DECLARE_INTERRUPT_HANDLER(CacheLockingException); > DECLARE_INTERRUPT_HANDLER(SPEFloatingPointException); > DECLARE_INTERRUPT_HANDLER(SPEFloatingPointRoundException); > -DECLARE_INTERRUPT_HANDLER(unrecoverable_exception); > +DECLARE_INTERRUPT_HANDLER(unrecoverable_exception) __noreturn; > DECLARE_INTERRUPT_HANDLER(WatchdogException); > DECLARE_INTERRUPT_HANDLER(kernel_bad_stack); > > diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c > index eca3be36c18c..7e7106641ca9 100644 > --- a/arch/powerpc/kernel/interrupt.c > +++ b/arch/powerpc/kernel/interrupt.c > @@ -440,7 +440,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct > pt_regs *regs, unsigned > return ret; > } > > -void unrecoverable_exception(struct pt_regs *regs); > void preempt_schedule_irq(void); > > notrace unsigned long interrupt_exit
Re: [PATCH 3/3] perf: Use 64-bit value when comparing sample_regs
On Thu, Mar 06, 2014 at 09:44:47AM +, David Laight wrote: > From: Sukadev Bhattiprolu > > When checking whether a bit representing a register is set in > > sample_regs, a 64-bit mask, use 64-bit value (1LL). > > > > Signed-off-by: Sukadev Bhattiprolu > > --- > > tools/perf/util/unwind.c |4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/tools/perf/util/unwind.c b/tools/perf/util/unwind.c > > index 742f23b..2b888c6 100644 > > --- a/tools/perf/util/unwind.c > > +++ b/tools/perf/util/unwind.c > > @@ -396,11 +396,11 @@ static int reg_value(unw_word_t *valp, struct > > regs_dump *regs, int id, > > { > > int i, idx = 0; > > > > - if (!(sample_regs & (1 << id))) > > + if (!(sample_regs & (1LL << id))) > > return -EINVAL; > > > > for (i = 0; i < id; i++) { > > - if (sample_regs & (1 << i)) > > + if (sample_regs & (1LL << i)) > > idx++; > > } > > There are much faster ways to count the number of set bits, especially > if you might need to check a significant number of bits. > There might even be a function defined somewhere to do it. Indeed, look for Hamming weight (hweight family of functions) in asm/hweight.h and what is included from there. Besides that, many modern processors also have a machine instruction to perform this task. In the processor manuals the instruction is described as population count and the mnemonic starts with "popcnt" on x86 and ppc. Gabriel > Basically you just add up the bits, for 16 bit it would be: > val = (val & 0x) + (val >> 1) & 0x; > val = (val & 0x) + (val >> 2) & 0x; > val = (val & 0x0f0f) + (val >> 4) & 0x0f0f; > val = (val & 0x00ff) + (val >> 8) & 0x00ff; > As the size of the work increases the improvement is more significant. > (Some of the later masking can probably be proven unnecessary.) > > David > > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH] powerpc: add ioremap_wt
On Mon, Feb 03, 2014 at 08:16:49AM +0100, Michael Moese wrote: > Allow for IO memory to be mapped cacheable for performing > PCI read bursts. > > Signed-off-by: Michael Moese > --- > arch/powerpc/include/asm/io.h | 3 +++ > arch/powerpc/mm/pgtable_32.c | 8 > 2 files changed, 11 insertions(+) > > diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h > index 45698d5..9591fff 100644 > --- a/arch/powerpc/include/asm/io.h > +++ b/arch/powerpc/include/asm/io.h > @@ -631,6 +631,8 @@ static inline void iosync(void) > * > * * ioremap_wc enables write combining > * > + * * ioremap_wc enables write thru Typo: _wc -> _wt Looks fine in principle, but there is a significant difference with wc on x86, where read accesses always go to the bus (no read caching). Gabriel > + * > * * iounmap undoes such a mapping and can be hooked > * > * * __ioremap_at (and the pending __iounmap_at) are low level functions to > @@ -652,6 +654,7 @@ extern void __iomem *ioremap(phys_addr_t address, > unsigned long size); > extern void __iomem *ioremap_prot(phys_addr_t address, unsigned long size, > unsigned long flags); > extern void __iomem *ioremap_wc(phys_addr_t address, unsigned long size); > +extern void __iomem *ioremap_wt(phys_addr_t address, unsigned long size); > #define ioremap_nocache(addr, size) ioremap((addr), (size)) > > extern void iounmap(volatile void __iomem *addr); > diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c > index 51f8795..9ab0a54 100644 > --- a/arch/powerpc/mm/pgtable_32.c > +++ b/arch/powerpc/mm/pgtable_32.c > @@ -141,6 +141,14 @@ ioremap_wc(phys_addr_t addr, unsigned long size) > EXPORT_SYMBOL(ioremap_wc); > > void __iomem * > +ioremap_wt(phys_addr_t addr, unsigned long size) > +{ > + return __ioremap_caller(addr, size, _PAGE_WRITETHRU, > + __builtin_return_address(0)); > +} > +EXPORT_SYMBOL(ioremap_wt); > + > +void __iomem * > ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) > { > /* writeable implies dirty for kernel addresses */ > -- > 1.8.5.3 > > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: deb-pkg: Add support for powerpc little endian
On Fri, Sep 05, 2014 at 03:28:47PM +1000, Michael Neuling wrote: > The Debian powerpc little endian architecture is called ppc64le. This Huh? ppc64le or ppc64el? > is the default architecture used by Ubuntu for powerpc. > > The below checks the kernel config to see if we are compiling little > endian and sets the Debian arch appropriately. > > Signed-off-by: Michael Neuling > > diff --git a/scripts/package/builddeb b/scripts/package/builddeb > index 35d5a58..6f4a1af 100644 > --- a/scripts/package/builddeb > +++ b/scripts/package/builddeb > @@ -37,7 +37,7 @@ create_package() { > s390*) > debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x > || true) ;; > ppc*) > - debarch=powerpc ;; > + debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo > ppc64el || echo powerpc) ;; > parisc*) > debarch=hppa ;; > mips*) Gabriel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] powerpc32: use stmw/lmw for non volatile registers save/restore
On Mon, May 23, 2016 at 10:46:36AM +0200, Christophe Leroy wrote: > lmw/stmw have a 1 cycle (2 cycles for lmw on some ppc) in addition > and implies serialising, however it reduces the amount of instructions > hence the amount of instruction fetch compared to the equivalent > operation with several lzw/stw. It means less pressure on cache and Minor typo, s/lzw/lwz/. > less fetching delays on slow memory. > When we transfer 20 registers, it is worth it. > gcc uses stmw/lmw at function entry/exit to save/restore non > volatile register, so lets also do it that way. > > On powerpc64, we can't use lmw/stmw as it only handles 32 bits, so > we move longjmp() and setjmp() from misc.S to misc_64.S, and we > write a 32 bits version in misc_32.S using stmw/lmw > > Signed-off-by: Christophe Leroy > --- > The patch goes on top of "powerpc: inline current_stack_pointer()" or > requires trivial manual merge in arch/powerpc/kernel/misc.S > > arch/powerpc/include/asm/ppc_asm.h | 6 ++-- > arch/powerpc/kernel/misc.S | 61 > -- > arch/powerpc/kernel/misc_32.S | 22 ++ > arch/powerpc/kernel/misc_64.S | 61 > ++ > 4 files changed, 85 insertions(+), 65 deletions(-) > > diff --git a/arch/powerpc/include/asm/ppc_asm.h > b/arch/powerpc/include/asm/ppc_asm.h > index 2b31632..e29b649 100644 > --- a/arch/powerpc/include/asm/ppc_asm.h > +++ b/arch/powerpc/include/asm/ppc_asm.h > @@ -82,10 +82,8 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR) > #else > #define SAVE_GPR(n, base)stw n,GPR0+4*(n)(base) > #define REST_GPR(n, base)lwz n,GPR0+4*(n)(base) > -#define SAVE_NVGPRS(base)SAVE_GPR(13, base); SAVE_8GPRS(14, base); \ > - SAVE_10GPRS(22, base) > -#define REST_NVGPRS(base)REST_GPR(13, base); REST_8GPRS(14, base); \ > - REST_10GPRS(22, base) > +#define SAVE_NVGPRS(base)stmw13, GPR0+4*13(base) > +#define REST_NVGPRS(base)lmw 13, GPR0+4*13(base) > #endif > > #define SAVE_2GPRS(n, base) SAVE_GPR(n, base); SAVE_GPR(n+1, base) > diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S > index 7ce26d4..9de71d8 100644 > --- a/arch/powerpc/kernel/misc.S > +++ b/arch/powerpc/kernel/misc.S > @@ -53,64 +53,3 @@ _GLOBAL(add_reloc_offset) > > .align 3 > 2: PPC_LONG 1b > - > -_GLOBAL(setjmp) > - mflrr0 > - PPC_STL r0,0(r3) > - PPC_STL r1,SZL(r3) > - PPC_STL r2,2*SZL(r3) > - mfcrr0 > - PPC_STL r0,3*SZL(r3) > - PPC_STL r13,4*SZL(r3) > - PPC_STL r14,5*SZL(r3) > - PPC_STL r15,6*SZL(r3) > - PPC_STL r16,7*SZL(r3) > - PPC_STL r17,8*SZL(r3) > - PPC_STL r18,9*SZL(r3) > - PPC_STL r19,10*SZL(r3) > - PPC_STL r20,11*SZL(r3) > - PPC_STL r21,12*SZL(r3) > - PPC_STL r22,13*SZL(r3) > - PPC_STL r23,14*SZL(r3) > - PPC_STL r24,15*SZL(r3) > - PPC_STL r25,16*SZL(r3) > - PPC_STL r26,17*SZL(r3) > - PPC_STL r27,18*SZL(r3) > - PPC_STL r28,19*SZL(r3) > - PPC_STL r29,20*SZL(r3) > - PPC_STL r30,21*SZL(r3) > - PPC_STL r31,22*SZL(r3) > - li r3,0 > - blr > - > -_GLOBAL(longjmp) > - PPC_LCMPI r4,0 > - bne 1f > - li r4,1 > -1: PPC_LL r13,4*SZL(r3) > - PPC_LL r14,5*SZL(r3) > - PPC_LL r15,6*SZL(r3) > - PPC_LL r16,7*SZL(r3) > - PPC_LL r17,8*SZL(r3) > - PPC_LL r18,9*SZL(r3) > - PPC_LL r19,10*SZL(r3) > - PPC_LL r20,11*SZL(r3) > - PPC_LL r21,12*SZL(r3) > - PPC_LL r22,13*SZL(r3) > - PPC_LL r23,14*SZL(r3) > - PPC_LL r24,15*SZL(r3) > - PPC_LL r25,16*SZL(r3) > - PPC_LL r26,17*SZL(r3) > - PPC_LL r27,18*SZL(r3) > - PPC_LL r28,19*SZL(r3) > - PPC_LL r29,20*SZL(r3) > - PPC_LL r30,21*SZL(r3) > - PPC_LL r31,22*SZL(r3) > - PPC_LL r0,3*SZL(r3) > - mtcrf 0x38,r0 > - PPC_LL r0,0(r3) > - PPC_LL r1,SZL(r3) > - PPC_LL r2,2*SZL(r3) > - mtlrr0 > - mr r3,r4 > - blr > diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S > index d9c912b..de419e9 100644 > --- a/arch/powerpc/kernel/misc_32.S > +++ b/arch/powerpc/kernel/misc_32.S > @@ -1086,3 +1086,25 @@ relocate_new_kernel_end: > relocate_new_kernel_size: > .long relocate_new_kernel_end - relocate_new_kernel > #endif > + > +_GLOBAL(setjmp) > + mflrr0 > + li r3, 0 > + stw r0, 0(r3) Huh? Explicitly writing to address 0? Has this code been test run at least once? At least move the li r3,0 to just before the blr. Gabriel > + stw r1, 4(r3) > + stw r2, 8(r3) > + mfcrr12 > + stmwr12, 12(r3) > + blr > + > +_GLOBAL(longjmp) > + lwz r0, 0(r3) > + lwz r1, 4(r3) > + lwz r2, 8(r3) > + lmw r12, 12(r3) > + mtcrf 0x38, r12 > + mtlrr0 > + mr. r3, r4 > + bnelr > + li r3, 1 > + blr > diff --git a/arch/p
Re: [PATCH] powerpc: inline current_stack_pointer()
On Mon, May 23, 2016 at 10:46:02AM +0200, Christophe Leroy wrote: > current_stack_pointeur() is a single instruction function. it > It is not worth breaking the execution flow with a bl/blr for a > single instruction Are you sure that the result is always the same? Calling an external function prevents the compiler from considering the caller of of current_stack_pointer a leaf function, which certainly does not help the compiler, but in a leaf function the compiler is free not to establish a new frame. If the compiler decides to establishes a new frame (typically with "stwu r1,-frame_size(r1)"), *r1 is the previous stack pointer, or the caller's stack pointer, or the current function frame pointer if I remember correctly the ABI definitions. However, if the compiler decides that it can get away without a frame for the function, *r1 is the stack pointer of the caller's caller. Depending on the application, this may or may not be important. By the way, isn't there a GCC builtin which can perform this task, for example builtin_frame_address()? Gabriel > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/reg.h | 7 ++- > arch/powerpc/kernel/misc.S | 4 > arch/powerpc/kernel/ppc_ksyms.c | 2 -- > 3 files changed, 6 insertions(+), 7 deletions(-) > > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h > index c1e82e9..7ce6777 100644 > --- a/arch/powerpc/include/asm/reg.h > +++ b/arch/powerpc/include/asm/reg.h > @@ -1301,7 +1301,12 @@ static inline unsigned long mfvtb (void) > > #define proc_trap() asm volatile("trap") > > -extern unsigned long current_stack_pointer(void); > +static inline unsigned long current_stack_pointer(void) > +{ > + register unsigned long *ptr asm("r1"); > + > + return *ptr; > +} > > extern unsigned long scom970_read(unsigned int address); > extern void scom970_write(unsigned int address, unsigned long value); > diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S > index 0d43219..7ce26d4 100644 > --- a/arch/powerpc/kernel/misc.S > +++ b/arch/powerpc/kernel/misc.S > @@ -114,7 +114,3 @@ _GLOBAL(longjmp) > mtlrr0 > mr r3,r4 > blr > - > -_GLOBAL(current_stack_pointer) > - PPC_LL r3,0(r1) > - blr > diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c > index 9f01e28..eb5c5dc 100644 > --- a/arch/powerpc/kernel/ppc_ksyms.c > +++ b/arch/powerpc/kernel/ppc_ksyms.c > @@ -33,5 +33,3 @@ EXPORT_SYMBOL(store_vr_state); > #ifdef CONFIG_EPAPR_PARAVIRT > EXPORT_SYMBOL(epapr_hypercall_start); > #endif > - > -EXPORT_SYMBOL(current_stack_pointer); > -- > 2.1.0 > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc: Discard ffs() function and use builtin_ffs instead
On Fri, May 13, 2016 at 04:16:57PM +1000, Michael Ellerman wrote: > On Thu, 2016-12-05 at 15:32:22 UTC, Christophe Leroy wrote: > > With the ffs() function as defined in arch/powerpc/include/asm/bitops.h > > GCC will not optimise the code in case of constant parameter, as shown > > by the small exemple below. > > > > int ffs_test(void) > > { > > return 4 << ffs(31); > > } > > > > c0012334 : > > c0012334: 39 20 00 01 li r9,1 > > c0012338: 38 60 00 04 li r3,4 > > c001233c: 7d 29 00 34 cntlzw r9,r9 > > c0012340: 21 29 00 20 subfic r9,r9,32 > > c0012344: 7c 63 48 30 slw r3,r3,r9 > > c0012348: 4e 80 00 20 blr > > > > With this patch, the same function will compile as follows: > > > > c0012334 : > > c0012334: 38 60 00 08 li r3,8 > > c0012338: 4e 80 00 20 blr > > > But what code does it generate when it's not a constant? > > And which gcc version first added the builtin version? It already existed in gcc-2.95, which you do not want to use to compile anything today but I have in a corner of a chroot environment to maintain ~1997 vintage embedded stuff, running a 2.2.12 kernel! Hopefully this clears up your concerns :-) Cheers, Gabriel
Re: [RFC] Design for flag bit outputs from asms
On Mon, May 04, 2015 at 12:33:38PM -0700, Richard Henderson wrote: [snipped] > (3) Note that ppc is both easier and more complicated. > > There we have 8 4-bit registers, although most of the integer > non-comparisons only write to CR0. And the vector non-comparisons > only write to CR1, though of course that's of less interest in the > context of kernel code. Actually vector (Altivec) write to CR6. Standard FPU optionally write to CR1, but the written value does not exactly depend on the result of the last instruction; it is an instead an accrued exception status. > > For the purposes of cr0, the same scheme could certainly work, although > the hook would not insert a hard register use, but rather a pseudo to > be allocated to cr0 (constaint "x"). Yes, but we might also want to leave the choice of a cr register to the compiler. > > That said, it's my understanding that "dot insns", setting cr0 are > expensive in current processor generations. Not that much if I understand properly power7.md and power8.md: no (P7) or one (P8) additional clock for common instructions (add/sub and logical), but nothing else, so they are likely a win. Shift/rotate/sign extensions seem to have more decoding restrictions: the recording ("dot") forms are "cracked" and use 2 integer units. > There's also a lot less > of the x86-style "operate and set a flag based on something useful". > But there is at least an important one, which I occasionally wished I had: the conditional stores. The overflow bit might also be useful, not really for the kernel, but for applications (and mfxer is slow). Regards, Gabriel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100
On Wed, Mar 09, 2016 at 12:38:18AM -0600, Scott Wood wrote: > On Tue, Mar 08, 2016 at 08:59:12AM +0100, Alessio Igor Bogani wrote: > > The mtmsr() function hangs during restart. Make reboot works on > > MVME5100 removing that function call. > > --- > > arch/powerpc/platforms/embedded6xx/mvme5100.c | 2 -- > > 1 file changed, 2 deletions(-) > > Missing signoff > > Do you know why MSR_IP was there to begin with? Huh? The patch sets MSR_IP for reset but it is cleared while running Linux. I don't have any MVME5100, however I do have MVME2400/2700 with the same bridge (Hawk), so I can say that the address space layout is quite standard: memory at 0, ROM at the high end of the 32-bit address space. However the reset method is quite different (no external register set on the Hawk). > Does this board have a switch that determines whether boot vectors > are high or low (I remember some 83xx boards that did), in which > case is this fixing one config by breaking another? For the switch, no AFAICT. And the code is MVME5100 specific so I suspect that it is very unlikely to break other boards. Very likely the source of the problem is that the restart address is remapped (ioremap) but never accessed while the kernel is running (the only access to *restart is in the reboot routine) so we take a DSI exception to fill the hash table when attempting to reboot. It would be enough to move the setting of MSR_IP until after triggering the restart, but this performs a hard reset of the CPU, which will set MSR_IP anyway (granted that the CPU will probably set MSR_IP way before the reset signal comes in). One way to check this hypothesis would be to introduce a write of 0 to the restart address before setting MSR_IP. This said restart is declared as u_char *, so the cast in the out_8 register access is useless and ugly. Gabriel P.S.: my MVME24xx/26x/27xx do not run such a modern kernel.
Re: [1/1] powerpc/embedded6xx: Make reboot works on MVME5100
On Wed, Mar 09, 2016 at 03:26:21PM -0600, Scott Wood wrote: > On Wed, 2016-03-09 at 11:28 +0100, Gabriel Paubert wrote: > > On Wed, Mar 09, 2016 at 12:38:18AM -0600, Scott Wood wrote: > > > On Tue, Mar 08, 2016 at 08:59:12AM +0100, Alessio Igor Bogani wrote: > > > > The mtmsr() function hangs during restart. Make reboot works on > > > > MVME5100 removing that function call. > > > > --- > > > > arch/powerpc/platforms/embedded6xx/mvme5100.c | 2 -- > > > > 1 file changed, 2 deletions(-) > > > > > > Missing signoff > > > > > > Do you know why MSR_IP was there to begin with? > > > > Huh? The patch sets MSR_IP for reset but it is cleared while running Linux. > > I don't have any MVME5100, however I do have MVME2400/2700 with the same > > bridge (Hawk), so I can say that the address space layout is quite standard: > > memory at 0, ROM at the high end of the 32-bit address space. However the > > reset method is quite different (no external register set on the Hawk). > > I meant why the line of code was there, not why the bit was ever set. It > seems odd to me that the pre-reset value of MSR would matter at all -- > shouldn't it get reset along with the rest of the CPU, as you later suggest? I don't know, it may create a soft reset. I have the board documentation and it is not clear (at least not 100% clear to me) whether this reset drives the HRESET of SRESET pin, although HRESET is more likely. > And if it does matter, then why are Alessio and whoever wrote the code > (assuming it wasn't an untested copy-and-paste from somewhere else) seeing > different results regarding whether IP needs to be set or unset? The different results are perfectly explained by taking a DSI exception on accessing *restart (the only access to this address in the whole code, so the hash table has to be filled at this time). With MSR_IP set, DSI exception goes to 0xfff00300, and then probably hangs. > > > > Does this board have a switch that determines whether boot vectors > > > are high or low (I remember some 83xx boards that did), in which > > > case is this fixing one config by breaking another? > > > > For the switch, no AFAICT. And the code is MVME5100 specific so I > > suspect that it is very unlikely to break other boards. > > > > Very likely the source of the problem is that the restart address is > > remapped > > (ioremap) but never accessed while the kernel is running (the only access to > > *restart is in the reboot routine) so we take a DSI exception to fill the > > hash > > table when attempting to reboot. > > > > It would be enough to move the setting of MSR_IP until after triggering the > > restart, but this performs a hard reset of the CPU, which will set MSR_IP > > anyway (granted that the CPU will probably set MSR_IP way before the reset > > signal comes in). > > How can you perform anything after the reset is triggered? There might be > some lag before it takes effect, but depending on that seems hackish and > unreliable, and why would it make a difference? It would make a difference if this reset is connected to SRESET, which does not affect MSR_IP. > > > One way to check this hypothesis would be to introduce a write of 0 to > > the restart address before setting MSR_IP. > > I'm not familiar with the register interface for this board logic, but what > would that demonstrate? Writing a 0 would not trigger the reset, but ensure that the hash table is filled, then you can change MSR_IP and perform the real reset, and be sure that you'll end up executing the reset in FLASH, not in RAM. The alternative solution it to put code at 0x100 that sets MSR_IP and branches to 0xfff00100. > > > This said restart is declared as u_char *, so the cast in the out_8 > > register access is useless and ugly. > > Yes, and restart should be __iomem. Indeed. Gabriel
Re: [PATCH v2 3/6] powerpc: Convert flush_icache_range & friends to C
On Tue, Sep 03, 2019 at 01:31:57PM -0500, Segher Boessenkool wrote: > On Tue, Sep 03, 2019 at 07:05:19PM +0200, Christophe Leroy wrote: > > Le 03/09/2019 à 18:04, Segher Boessenkool a écrit : > > >(Why are they separate though? It could just be one loop var). > > > > Yes it could just be a single loop var, but in that case it would have > > to be reset at the start of the second loop, which means we would have > > to pass 'addr' for resetting the loop anyway, > > Right, I noticed that after hitting send, as usual. > > > so I opted to do it > > outside the inline asm by using to separate loop vars set to their > > starting value outside the inline asm. > > The thing is, the way it is written now, it will get separate registers > for each loop (with proper earlyclobbers added). Not that that really > matters of course, it just feels wrong :-) After "mtmsr %3", it is always possible to copy %0 to %3 and use it as an address register for the second loop. One register less to allocate for the compiler. Constraints of course have to be adjusted. Gabriel > > > Segher
Re: [PATCH v2 13/29] arch: add split IPC system calls where needed
On Fri, Jan 18, 2019 at 05:18:19PM +0100, Arnd Bergmann wrote: > The IPC system call handling is highly inconsistent across architectures, > some use sys_ipc, some use separate calls, and some use both. We also > have some architectures that require passing IPC_64 in the flags, and > others that set it implicitly. > > For the additon of a y2083 safe semtimedop() system call, I chose to only It's not critical, but there are two typos in that line: additon -> addition 2083 -> 2038 Gabriel > support the separate entry points, but that requires first supporting > the regular ones with their own syscall numbers. > > The IPC_64 is now implied by the new semctl/shmctl/msgctl system > calls even on the architectures that require passing it with the ipc() > multiplexer. > > I'm not adding the new semtimedop() or semop() on 32-bit architectures, > those will get implemented using the new semtimedop_time64() version > that gets added along with the other time64 calls. > Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop(). > > Signed-off-by: Arnd Bergmann
Re: [PATCH v13 00/10] powerpc: Switch to CONFIG_THREAD_INFO_IN_TASK
On Thu, Jan 24, 2019 at 04:58:41PM +0100, Christophe Leroy wrote: > > > Le 24/01/2019 à 16:01, Christophe Leroy a écrit : > > > > > > Le 24/01/2019 à 10:43, Christophe Leroy a écrit : > > > > > > > > > On 01/24/2019 01:06 AM, Michael Ellerman wrote: > > > > Christophe Leroy writes: > > > > > Le 12/01/2019 à 10:55, Christophe Leroy a écrit : > > > > > > The purpose of this serie is to activate > > > > > > CONFIG_THREAD_INFO_IN_TASK which > > > > > > moves the thread_info into task_struct. > > > > > > > > > > > > Moving thread_info into task_struct has the following advantages: > > > > > > - It protects thread_info from corruption in the case of stack > > > > > > overflows. > > > > > > - Its address is harder to determine if stack addresses are > > > > > > leaked, making a number of attacks more difficult. > > > > > > > > > > I ran null_syscall and context_switch benchmark selftests > > > > > and the result > > > > > is surprising. There is slight degradation in context_switch and a > > > > > significant one on null_syscall: > > > > > > > > > > Without the serie: > > > > > > > > > > ~# chrt -f 98 ./context_switch --no-altivec --no-vector --no-fp > > > > > 55542 > > > > > 55562 > > > > > 55564 > > > > > 55562 > > > > > 55568 > > > > > ... > > > > > > > > > > ~# ./null_syscall > > > > > 2546.71 ns 336.17 cycles > > > > > > > > > > > > > > > With the serie: > > > > > > > > > > ~# chrt -f 98 ./context_switch --no-altivec --no-vector --no-fp > > > > > 55138 > > > > > 55142 > > > > > 55152 > > > > > 55144 > > > > > 55142 > > > > > > > > > > ~# ./null_syscall > > > > > 3479.54 ns 459.30 cycles > > > > > > > > > > So 0,8% less context switches per second and 37% more time > > > > > for one syscall ? > > > > > > > > > > Any idea ? > > > > > > > > What platform is that on? > > > > > > It is on the 8xx > > On the 83xx, I have a slight improvment: > > Without the serie: > > root@vgoippro:~# ./null_syscall > 921.44 ns 307.15 cycles > > With the serie: > > root@vgoippro:~# ./null_syscall > 918.78 ns 306.26 cycles > The 8xx has very low cache associativity, something like 2, right? In this case it may be access patterns which change the number of cache line transfers when you move things around. Try to move things around in main(), for example allocate a variable of ~1kB on the stack in the function that performs the null_syscalls (use the variable before and after the loop, to avoid clever compiler optimizations). Gabriel > Christophe > > > > > > > > > > > > On 64-bit we have to turn one mtmsrd into two and that's obviously a > > > > slow down. But I don't see that you've done anything similar in 32-bit > > > > code. > > > > > > > > I assume it's patch 8 that causes the slow down? > > > > > > I have not digged into it yet, but why patch 8 ? > > > > > > > The increase of null_syscall duration happens with patch 5 when we > > activate CONFIG_THREAD_INFO_IN_TASK. > >
Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2
On Fri, Aug 05, 2016 at 01:28:02PM +0200, Christophe Leroy wrote: > Signed-off-by: Christophe Leroy > --- > arch/powerpc/kernel/misc_32.S | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S > index e025230..e18055c 100644 > --- a/arch/powerpc/kernel/misc_32.S > +++ b/arch/powerpc/kernel/misc_32.S > @@ -578,9 +578,8 @@ _GLOBAL(__bswapdi2) > rlwimi r9,r4,24,0,7 > rlwimi r10,r3,24,0,7 > rlwimi r9,r4,24,16,23 > - rlwimi r10,r3,24,16,23 > + rlwimi r4,r3,24,16,23 > mr r3,r9 > - mr r4,r10 > blr > Hmmm, are you sure that it works? rlwimi is a bit special since the first operand is both an input and an output of the instruction. In other words, did you test the code? Gabriel
Re: [PATCH] powerpc/8xx: fix single_step debug
On Thu, Aug 18, 2016 at 11:44:20AM +0200, Christophe Leroy wrote: > SPRN_ICR must be read for clearing the internal freeze signal which > is asserted by the single step exception, otherwise the timebase and > decrementer remain freezed Minor nit: s/freezed/frozen/ If the timebase and decrementer are frozen even for a few cycles, this probably upsets timekeeping. I consider this a completely stupid design decision, and maybe I'm not alone. Gabriel > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/reg_8xx.h | 1 + > arch/powerpc/kernel/traps.c| 8 > 2 files changed, 9 insertions(+) > > diff --git a/arch/powerpc/include/asm/reg_8xx.h > b/arch/powerpc/include/asm/reg_8xx.h > index feaf641..6dae71f 100644 > --- a/arch/powerpc/include/asm/reg_8xx.h > +++ b/arch/powerpc/include/asm/reg_8xx.h > @@ -17,6 +17,7 @@ > #define SPRN_DC_DAT 570 /* Read-only data register */ > > /* Misc Debug */ > +#define SPRN_ICR 148 > #define SPRN_DPDR630 > #define SPRN_MI_CAM 816 > #define SPRN_MI_RAM0 817 > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c > index 2cb5892..0f1f0ce 100644 > --- a/arch/powerpc/kernel/traps.c > +++ b/arch/powerpc/kernel/traps.c > @@ -400,8 +400,16 @@ static inline int check_io_access(struct pt_regs *regs) > #define REASON_TRAP 0x2 > > #define single_stepping(regs)((regs)->msr & MSR_SE) > +#ifdef CONFIG_PPC_8xx > +static inline void clear_single_step(struct pt_regs *regs) > +{ > + regs->msr &= ~MSR_SE; > + mfspr(SPRN_ICR); > +} > +#else > #define clear_single_step(regs) ((regs)->msr &= ~MSR_SE) > #endif > +#endif > > #if defined(CONFIG_4xx) > int machine_check_4xx(struct pt_regs *regs) > -- > 2.1.0
Re: [PATCH] powerpc/8xx: fix single_step debug
On Thu, Aug 18, 2016 at 12:13:21PM +0200, Christophe Leroy wrote: > > > Le 18/08/2016 à 11:58, Gabriel Paubert a écrit : > >On Thu, Aug 18, 2016 at 11:44:20AM +0200, Christophe Leroy wrote: > >>SPRN_ICR must be read for clearing the internal freeze signal which > >>is asserted by the single step exception, otherwise the timebase and > >>decrementer remain freezed > > > >Minor nit: s/freezed/frozen/ > > > >If the timebase and decrementer are frozen even for a few cycles, this > >probably upsets timekeeping. I consider this a completely stupid design > >decision, and maybe I'm not alone. > > > >Gabriel > > We could also unset TBF bit (TimeBase Freeze enable) in TBSCR > register (today it is set in > arch/powerpc/platforms/8xx/m8xx_setup.c) but then it would impact > debug done with an external BDM system which expects the decrementer > and TB frozen when it freezes the execution. Ok, I believe that systematically setting it is a mistake, but then I'm always a bit nervous about screwing up timekeeping (it certainly is always a very bad idea when you are driving telescopes). Gabriel
Re: Re:Re: [PATCH] powerpc: Fix a bug in __div64_32 if divisor is zero
On Sun, Aug 23, 2020 at 12:54:33AM +0800, Guohua Zhong wrote: > >In generic version in lib/math/div64.c, there is no checking of 'base' > >either. > >Do we really want to add this check in the powerpc version only ? > > >The only user of __div64_32() is do_div() in > >include/asm-generic/div64.h. Wouldn't it be better to do the check there ? > > >Christophe > > Yet, I have noticed that there is no checking of 'base' in these functions. > But I am not sure how to check is better.As we know that the result is > undefined when divisor is zero. It maybe good to print error and dump stack. > Let the process to know that the divisor is zero by sending SIGFPE. > > diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h > index a3b98c86f077..161c656ee3ee 100644 > --- a/include/asm-generic/div64.h > +++ b/include/asm-generic/div64.h > @@ -43,6 +43,11 @@ > # define do_div(n,base) ({ \ > uint32_t __base = (base); \ > uint32_t __rem; \ > + if (unlikely(base == 0)) { \ > + pr_err("do_div base=%d\n",base);\ > + dump_stack(); \ > + force_sig(SIGFPE); \ > + } > I suspect this will generate a strong reaction. SIGFPE is for user space instruction attempting a division by zero. A division by zero in the kernel is a kernel bug, period, and you don't want to kill a user process for this reason. If it happens in an interrupt, the context of the kernel may not even be related to the current process. Many other architectures (x86 for example) already trigger an exception on a division by zero but the handler will find that the exception happened in kernel context and generate an Oops, not raise a signal in a (possibly innocent) userland process. Gabriel > Then it also needto add this checking in functions of > div64_s64(), div64_u64(), div64_u64_rem(), div_s64_rem and div_u64_rem () > in include/linux/math64.h > > + if (unlikely(divisor == 0)) { > + pr_err("%s divisor=0\n",__func__); > + dump_stack(); > + force_sig(SIGFPE); > + } > > Guohua > > >>lwz r5,0(r3)# get the dividend into r5/r6 > >>lwz r6,4(r3) > >>cmplw r5,r4 > >>@@ -52,6 +55,7 @@ __div64_32: > >> 4:stw r7,0(r3)# return the quotient in *r3 > >>stw r8,4(r3) > >>mr r3,r6 # return the remainder in r3 > >>+5: # return if divisor r4 is zero > >>blr > >> > >> /* > >>diff --git a/arch/powerpc/lib/div64.S b/arch/powerpc/lib/div64.S > >>index 3d5426e7dcc4..1cc9bcabf678 100644 > >>--- a/arch/powerpc/lib/div64.S > >>+++ b/arch/powerpc/lib/div64.S > >>@@ -13,6 +13,9 @@ > >> #include > >> > >> _GLOBAL(__div64_32) > >>+ li r9,0 > >>+ cmplw r4,r9 # check if divisor r4 is zero > >>+ beq 5f # jump to label 5 if r4(divisor) is zero > >>lwz r5,0(r3)# get the dividend into r5/r6 > >>lwz r6,4(r3) > >>cmplw r5,r4 > >>@@ -52,4 +55,5 @@ _GLOBAL(__div64_32) > >> 4:stw r7,0(r3)# return the quotient in *r3 > >>stw r8,4(r3) > >>mr r3,r6 # return the remainder in r3 > >>+5: # return if divisor r4 is zero > >>blr > >> >
Re: [PATCH v2 2/5] powerpc: Allow 4224 bytes of stack expansion for the signal frame
On Fri, Jul 24, 2020 at 07:25:25PM +1000, Michael Ellerman wrote: > We have powerpc specific logic in our page fault handling to decide if > an access to an unmapped address below the stack pointer should expand > the stack VMA. > > The code was originally added in 2004 "ported from 2.4". The rough > logic is that the stack is allowed to grow to 1MB with no extra > checking. Over 1MB the access must be within 2048 bytes of the stack > pointer, or be from a user instruction that updates the stack pointer. > > The 2048 byte allowance below the stack pointer is there to cover the > 288 byte "red zone" as well as the "about 1.5kB" needed by the signal > delivery code. > > Unfortunately since then the signal frame has expanded, and is now > 4224 bytes on 64-bit kernels with transactional memory enabled. Are there really users of transactional memory in the wild? Just asking because Power10 removes TM, and Power9 has had some issues with it AFAICT. Getting rid of it (if possible) would result in smaller signal frames, with simpler signal delivery code (probably slightly faster also). Gabriel
Re: [PATCH v9 29/51] mm/mprotect, powerpc/mm/pkeys, x86/mm/pkeys: Add sysfs interface
On Mon, Dec 18, 2017 at 03:15:51PM -0800, Ram Pai wrote: > On Mon, Dec 18, 2017 at 02:28:14PM -0800, Dave Hansen wrote: > > On 12/18/2017 02:18 PM, Ram Pai wrote: > > > b) minimum number of keys available to the application. > > > if libraries consumes a few, they could provide a library > > > interface to the application informing the number available to > > > the application. The library interface can leverage (b) to > > > provide the information. > > > > OK, let's see a real user of this including a few libraries. Then we'll > > put it in the kernel. > > > > > c) types of disable-rights supported by keys. > > > Helps the application to determine the types of disable-features > > > available. This is helpful, otherwise the app has to > > > make pkey_alloc() call with the corresponding parameter set > > > and see if it suceeds or fails. Painful from an application > > > point of view, in my opinion. > > > > Again, let's see a real-world use of this. How does it look? How does > > an app "fall back" if it can't set a restriction the way it wants to? > > > > Are we *sure* that such an interface makes sense? For instance, will it > > be possible for some keys to be execute-disable while other are only > > write-disable? > > Can it be on x86? > > its not possible on ppc. the keys can *all* be the-same-attributes-disable > all the > time. > > However you are right. Its conceivable that some arch could provide a > feature where it can be x-attribute-disable for key 'a' and > y-attribute-disable for key 'b'. But than its a bit of a headache > for an application to consume such a feature. > > Ben: I recall you requesting this feature. Thoughts? > > > > > > I think on x86 you look for some hardware registers to determine > > > which hardware features are enabled by the kernel. > > > > No, we use CPUID. It's a part of the ISA that's designed for > > enumerating CPU and (sometimes) OS support for CPU features. > > > > > We do not have generic support for something like that on ppc. The > > > kernel looks at the device tree to determine what hardware features > > > are available. But does not have mechanism to tell the hardware to > > > track which of its features are currently enabled/used by the > > > kernel; atleast not for the memory-key feature. > > > > Bummer. You're missing out. > > > > But, you could still do this with a syscall. "Hey, kernel, do you > > support this feature?" > > or do powerpc specific sysfs interface. > or a debugfs interface. getauxval(3) ? With AT_HWCAP or HWCAP2 as parameter already gives information about features supported by the hardware and the kernel. Taking one bit to expose the availability of protection keys to applications does not look impossible. Do I miss something obvious? Gabriel
Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2
On Wed, Aug 10, 2016 at 12:18:15PM +0200, Christophe Leroy wrote: > > > Le 10/08/2016 à 10:56, Gabriel Paubert a écrit : > >On Fri, Aug 05, 2016 at 01:28:02PM +0200, Christophe Leroy wrote: > >>Signed-off-by: Christophe Leroy > >>--- > >> arch/powerpc/kernel/misc_32.S | 3 +-- > >> 1 file changed, 1 insertion(+), 2 deletions(-) > >> > >>diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S > >>index e025230..e18055c 100644 > >>--- a/arch/powerpc/kernel/misc_32.S > >>+++ b/arch/powerpc/kernel/misc_32.S > >>@@ -578,9 +578,8 @@ _GLOBAL(__bswapdi2) > >>rlwimi r9,r4,24,0,7 > >>rlwimi r10,r3,24,0,7 > >>rlwimi r9,r4,24,16,23 > >>- rlwimi r10,r3,24,16,23 > >>+ rlwimi r4,r3,24,16,23 > >>mr r3,r9 > >>- mr r4,r10 > >>blr > >> > > > >Hmmm, are you sure that it works? rlwimi is a bit special since the > >first operand is both an input and an output of the instruction. > > > > > > Oops, you are right ... I just found this: http://hardwarebug.org/2010/01/14/beware-the-builtins/ the bswapdi2 suggested sequence only needs a single mr instruction, the other one is absorbed in a rotlwi. The scheduling looks poor, but it seems impossible to interleave the operations between the two halves without adding another instructions, and the routine is 8 instructions long, which happens to be exactly a cache line on most 32 bit processors. On the other hand gcc did at the time a very poor job (quite an understatement) at bswapdi when compiling for 64 bit processors (see the example). But what do modern compilers generate for bswapdi these days? Do they still call the library or not? After all, bswapdi on 32 bit processors only takes 6 instructions if the input and output registers don't overlap. Gabriel
Re: [PATCH] powerpc/32: Remove one insn in __bswapdi2
On Thu, Aug 11, 2016 at 05:11:19PM -0500, Segher Boessenkool wrote: > On Thu, Aug 11, 2016 at 11:34:37PM +0200, Gabriel Paubert wrote: > > On the other hand gcc did at the time a very poor job (quite an > > understatement) at bswapdi when compiling for 64 bit processors > > (see the example). > > > > But what do modern compilers generate for bswapdi these days? Do they > > still call the library or not? > > Nope. Great, could then these functions be removed from misc_32.S, or are compilers that use libcalls still supported for kernel builds? > > > After all, bswapdi on 32 bit processors only takes 6 instructions if the > > input and output registers don't overlap. > > For this testcase: > === > typedef unsigned long long u64; > u64 bs(u64 x) { return __builtin_bswap64(x); } > === > > we get with -m32: > === > bs: > mr 9,3 > rotlwi 3,4,24 > rlwimi 3,4,8,8,15 > rlwimi 3,4,8,24,31 > rotlwi 4,9,24 > rlwimi 4,9,8,8,15 > rlwimi 4,9,8,24,31 > blr In this case the compiler is constrained by the fact that the input and ouput registers are the same. When inlined with other things it can probably perform better scheduling and interleaving of operations. > === > > and with -m64: > === > .L.bs: > srdi 10,3,32 > mr 9,3 > rotlwi 3,3,24 > rotlwi 8,10,24 > rlwimi 3,9,8,8,15 > rlwimi 8,10,8,8,15 > rlwimi 3,9,8,24,31 > rlwimi 8,10,8,24,31 > sldi 3,3,32 > or 3,3,8 > blr > === > As demonstrated here where the two halves of the 64 bit quantity are byte swapped in an interleaved fashion. Not perfect (I think that with proper ordering the last 2 instructions could be replaced by a rldimi), but reasonable. > Neither as tight as possible, but neither horrible either. > Indeed. Gabriel