Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread Benjamin Herrenschmidt
On Sat, 2008-10-25 at 21:04 -0700, David Miller wrote:
> But back to my original wonder, since I've always tipped off of this
> generic IRQ layer cpu mask, when was it ever defaulting to zero
> and causing the behavior your powerpc guys actually want? :-)

Well, I'm not sure what Kumar wants. Most powerpc SMP setups actually
want to spread interrupts to all CPUs, and those who can't tend to just
not implement set_affinity... So Kumar must have a special case of MPIC
usage here on FSL platforms.

In any case, the platform limitations should be dealt with there or the
user could break it by manipulating affinity via /proc anyway.
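
For reference, the /proc interface in question is /proc/irq/<N>/smp_affinity, which takes a hexadecimal CPU bitmask. A minimal Python sketch of building such a mask follows. The helper names are made up for illustration, and the sketch ignores the comma-separated grouping used on machines with more than 32 CPUs:

```python
def cpus_to_mask(cpus):
    """Return the hex bitmask string for an iterable of CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def mask_to_cpus(mask_str):
    """Inverse helper: decode a hex bitmask string into a sorted CPU list."""
    mask = int(mask_str, 16)
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


# CPU0 only versus both CPUs of a dual-processor box.
print(cpus_to_mask([0]))     # 1
print(cpus_to_mask([0, 1]))  # 3
```

Writing the resulting string to the smp_affinity file as root requests that affinity; whether the platform can honour it is a separate matter.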

But yeah, I do expect default affinity to be all CPUs and in fact, I even
have an -OLD- comment in the code that says

/* let the mpic know we want intrs. default affinity is 0x ...

Now, I've tried to track that down but it's hard because the generic code
seems to have changed in many ways around affinity handling...

So it looks like nowadays, the generic setup_irq() will call
irq_select_affinity() when an interrupt is first requested. Unless
you set CONFIG_AUTO_IRQ_AFFINITY and implement your own
irq_select_affinity(), you will get the default one, which copies
the content of the global irq_default_affinity to the interrupt.

However it does that _after_ your IRQ startup() has been called
(yes, this is very fishy), and so after you did your irq_choose_cpu()...
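
To make that ordering concrete, here is a toy Python model, not kernel code. The class and function names are invented for illustration. Only the order of the two steps is taken from the description above, and the model tracks the recorded mask only, not what the interrupt controller ends up programmed with:

```python
ALL_CPUS = {0, 1}  # assumed dual-CPU machine


class Irq:
    def __init__(self):
        self.affinity = set()  # CPUs recorded for this interrupt
        self.log = []          # order in which the steps ran


def platform_startup(irq):
    # Hypothetical stand-in for a chip startup() hook that picks a CPU itself.
    irq.affinity = {0}
    irq.log.append("startup")


def default_select_affinity(irq, irq_default_affinity):
    # Models the default behaviour: copy the global default mask.
    irq.affinity = set(irq_default_affinity)
    irq.log.append("select_affinity")


def setup_irq(irq, irq_default_affinity):
    # The order under discussion: startup first, default affinity second.
    platform_startup(irq)
    default_select_affinity(irq, irq_default_affinity)
    return irq


irq = setup_irq(Irq(), ALL_CPUS)
print(irq.log)               # ['startup', 'select_affinity']
print(sorted(irq.affinity))  # [0, 1]
```

In this model whatever the startup hook chose is replaced by the global default, which is why the ordering matters.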

This is all very messy, along with hooks for balancing and other confusing
stuff that I suspect keeps changing. I'll have to spend more time next
week to sort out what exactly is happening on powerpc and whether we
get our interrupts spread or not...

That's the downside of having more generic irq code I suppose: now people
keep rewriting half of the generic code with x86 exclusively in mind and
we have to be extra careful :-)

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread Benjamin Herrenschmidt

> What does this all mean to my GigE (dual 1.1 GHz 7455s)? Is this
> thing supposed to be able to spread irq between its cpus?

Depends on the interrupt controller. I don't know that machine
but for example the Apple Dual G5's use an MPIC that can spread
based on an internal HW round robin scheme. This isn't always
the best idea tho for cache reasons... it depends on whether and at what
level your caches are shared between CPUs.

Ben.




Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread David Miller
From: Benjamin Herrenschmidt [EMAIL PROTECTED]
Date: Sun, 26 Oct 2008 17:48:43 +1100

 
> > What does this all mean to my GigE (dual 1.1 GHz 7455s)? Is this
> > thing supposed to be able to spread irq between its cpus?
>
> Depends on the interrupt controller. I don't know that machine
> but for example the Apple Dual G5's use an MPIC that can spread
> based on an internal HW round robin scheme. This isn't always
> the best idea tho for cache reasons... it depends on whether and at what
> level your caches are shared between CPUs.

It's always going to be the wrong thing to do for networking cards,
especially once we start doing RX flow separation in software.


Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread Benjamin Herrenschmidt
On Sun, 2008-10-26 at 00:16 -0700, David Miller wrote:
> From: Benjamin Herrenschmidt [EMAIL PROTECTED]
> Date: Sun, 26 Oct 2008 17:48:43 +1100
>
> > > What does this all mean to my GigE (dual 1.1 GHz 7455s)? Is this
> > > thing supposed to be able to spread irq between its cpus?
> >
> > Depends on the interrupt controller. I don't know that machine
> > but for example the Apple Dual G5's use an MPIC that can spread
> > based on an internal HW round robin scheme. This isn't always
> > the best idea tho for cache reasons... it depends on whether and at what
> > level your caches are shared between CPUs.
>
> It's always going to be the wrong thing to do for networking cards,
> especially once we start doing RX flow separation in software.

True, though I don't have policy in the kernel for that, i.e. it's pretty
much the irqbalance daemon's job to do that. At this stage, the kernel
always tries to spread when it can... at least on powerpc.

Ben.




Re: GPIO - marking individual pins (not) available in device tree

2008-10-26 Thread Matt Sealey



Mitch Bradley wrote:


> I don't use device_type much, if at all, anymore.  Generic name +
> compatible just works better than device_type + specific name.  When
> I write code that has to find a node that is suitable for a given purpose,
> I look for the existence of suitable methods and perhaps other properties.
> It was just too hard to keep the list of device_type values properly
> synchronized with all the possible things that you might want to infer
> from that set of names.

The simple problem comes when you define a device_type for everything.
I do agree it's best not to add any *MORE* that aren't in the IEEE1275
or CHRP etc. bindings, but for those that still exist and are well
defined (serial port probably the best, but network devices too) I
think we should keep using them where possible and where relevant.

> device_type is one of those things that seemed like a good idea at the
> time, but didn't work out as well as I had hoped.


I can imagine a scenario where you would want to perhaps have a serial
port, where you want to say a) that it is a serial port, b) that it is
for a specific purpose without creating some new standard or proprietary
property set and c) tell the world what kind of serial port it is.

How about name = debug or modem or something else, which gives you
a pretty name for what the port is for (and maybe matches the markings
on the outside of a case) but the device_type would always be serial,
and compatible would give you mpc5200b-psc-uart or so. You can find
all the serial ports, you can find the serial port that is assigned to
the modem or debug (this may actually allow the driver to be informed
not to do anything crazy - if you've ever connected a modem to a port
that gets set up to output firmware debug data or whatever, you'll know
sometimes it's kind of difficult to bring the modem back out of its
funk from being hammered with data for the duration of boot), and you
know which driver to attach to it.
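
A rough Python sketch of the three lookups I mean, with nodes modelled as plain dictionaries. The node list and helper names are invented for illustration and are not taken from any real tree or API:

```python
# Hypothetical nodes: the name says what the port is for, device_type says
# what class it is, compatible says which driver fits.
NODES = [
    {"name": "debug", "device_type": "serial", "compatible": "mpc5200b-psc-uart"},
    {"name": "modem", "device_type": "serial", "compatible": "mpc5200b-psc-uart"},
    {"name": "ethernet", "device_type": "network", "compatible": "gianfar"},
]


def find_by_type(nodes, device_type):
    """All nodes of a given class, e.g. every serial port."""
    return [n for n in nodes if n.get("device_type") == device_type]


def find_by_name(nodes, name):
    """The single node with a given purpose, or None if there is none."""
    for node in nodes:
        if node.get("name") == name:
            return node
    return None


serial_ports = find_by_type(NODES, "serial")
modem = find_by_name(NODES, "modem")

print([n["name"] for n in serial_ports])  # ['debug', 'modem']
print(modem["compatible"])                # mpc5200b-psc-uart
```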

I personally think that, while device_type is deprecated and shouldn't
be used for new definitions, the old ones work really well for the
devices they encompass. On the MPC8641D board I have here there are 4
network ports; in the device tree they're all called ethernet,
device_type network, compatible gianfar and have a model TSEC - in a
real OF implementation you shouldn't have to check for a ping method
to make sure it's ACTUALLY a network device :D

(I'd advocate all those ports being renamed to eTSEC{n}, since that
matches the board documentation, for example. While cell-index
tells you which port it is on the back of the board, this is not
user friendly: a simple boot eTSEC0 tftpboot/kernel is more
intuitive than cd /boot/[EMAIL PROTECTED], .properties, backing out
if it wasn't the one you were looking for.. or assuming that the
lowest numbered reg is the first port on the back of the chassis
and finding out that is not how it's connected :)

I'm really big on descriptive device trees that I can just browse
and know what I am looking at without delving. There is already
too much needless cross-referencing in Linux as it is.

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations


Re: GPIO - marking individual pins (not) available in device tree

2008-10-26 Thread Matt Sealey



Stephen Neuendorffer wrote:

> > One thing I had a crazy dream about was a GUI-based device tree
> > builder for platforms.
>
> Why is this crazy?  This is essentially what we do today with PowerPC
> and Microblaze processors in Xilinx FPGAs.  Even for ASIC SOCs, there
> are several commercial 'connect-your-IP on the bus' tools that could (if
> SOC providers thought it was important) generate the 'canonical' device
> tree automagically.


Indeed, I have to sit in front of Quartus all day at the moment, and the
BDF window, pin planner etc. and the constraints scripts for the megacores
just seemed to me to be a great way to spit out a device tree without
doing any real work.


> I think the real question is: if part of the device tree describes
> 'hardware' (either in the SOC or on the board that, more or less,
> doesn't change) and part represents 'hardware configuration' (e.g. my
> board has my one-off hardware hanging off the gpio bank connected to the
> 40 pin header), then how do we separate the two so that the hardware can
> be in a canonical form separate from the configuration.


Personally, I think that goes against the whole point of the device
tree specification anyway. Or at least its greatest benefit, which is
to allow hardware designers to present to software developers a
reasonable description of what can be, and usually is, an esoteric design
decision (which port goes where and what undetectable hardware is
used for X and Y) without exposing the software developer to the
programming model of each individual device. Contrast any other system,
where you might have a huge list of registers and positions, use
a ROM monitor to manually poke at certain registers, and need the
board schematics to get anywhere when you can't read a chip name off
the board.

The definition of the binding defines what every peripheral should
look like if it's present - if the peripheral is multiplexed inside
the chip, then you can just copy-paste one feature and not use the
other. A difference in PHY for something is one thing you just can't
detect sometimes; on different boards, this will be different, but
it is all part of the hardware configuration, and not much to do with
the hardware itself (if you have 12 serial controllers but USB and
ethernet usage means you lose 5 of them to multiplexing, or a SerDes
shared between PCI Express, SATA or RapidIO but only one can be
active.. or a configurable clock module for internal devices which
would have the same quirks as an interrupt controller.. or even an
interrupt controller configured to cascade or slaved to something
else)


> there are even three device tree fragments: one provided by an SOC
> provider, one by a board provider, and one by the user, which can all be
> nicely separated once the great device tree update happens... :)


If you have an SoC provider device tree fragment, does it entertain
every possibility in the chip, or just the most common? Does the
board provider dt fragment then allow disabling certain features in
the previous fragments? See examples above where defining 12 serial ports
on the SoC dts AND usb and ethernet functionality, just can't work,
and the board configuration of these devices is entirely relevant.
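
To show what that layering would have to do, here is a small Python sketch with the fragments modelled as dictionaries. The node names, the apply_fragment helper and the use of a status property with okay/disabled values are all assumptions made for the example:

```python
import copy


def apply_fragment(base, fragment):
    """Return a new tree with the fragment's properties laid over base."""
    merged = copy.deepcopy(base)  # leave the SoC fragment untouched
    for node, props in fragment.items():
        merged.setdefault(node, {}).update(props)
    return merged


# Hypothetical SoC fragment: every peripheral the chip could offer.
soc = {f"serial{i}": {"status": "okay"} for i in range(12)}
soc["usb"] = {"status": "okay"}
soc["ethernet"] = {"status": "okay"}

# Hypothetical board fragment: five serial ports are lost to multiplexing.
board = {f"serial{i}": {"status": "disabled"} for i in range(7, 12)}

tree = apply_fragment(soc, board)
usable_serial = [n for n in sorted(tree)
                 if n.startswith("serial") and tree[n]["status"] == "okay"]
print(len(usable_serial))  # 7
```

Note that the SoC fragment on its own claims 12 usable serial ports, which is not true of this board; only the merged result is.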

Updating a device tree at the user end is very useful, but I do not
think that there is any fundamental difference between the hardware
itself and the board configuration from a DT point of view, except
that you need a binding or an example (but not a canonical device
tree excerpt) to base your final tree on. What is inside the SoC
rarely matches what is brought out of the chip, so a premade
fragment you could just load becomes rather redundant. Just a
useful reference would be better and that's what we have already.

As for user fragments, we have that on OpenFirmware already* and
the idea that we may actually standardize on this kind of stuff
sort of excites me, as it validates the point of removing all the
device tree fixups for Pegasos and Efika from prom_init.c and using
something a little less along the lines of "you have to recompile
your kernel every time". Be it a dtb fragment for U-Boot or a
Forth script for real OF, this is such a great idea, I don't
know why it's not already there :)

* http://www.powerdeveloper.org/platforms/efika/devicetree

--
Matt Sealey [EMAIL PROTECTED]
Genesi, Manager, Developer Relations


Re: GPIO - marking individual pins (not) available in device tree

2008-10-26 Thread David Gibson
On Fri, Oct 24, 2008 at 05:14:26PM -0500, Matt Sealey wrote:


 David Gibson wrote:
 Don't be patronising.

 There is an existing address space defined by the gpio binding.
 Defining another one is pointless redundancy.  This is standard good
 ideas in computer science, no further argument necessary.

 The existing address space, and the patches Anton etc. just submitted
 which I started this discussion to address, don't fulfil certain
 needs.

Such as what?  Apparently none, since elsewhere in this thread you
seem to be happy with the suggestion of using a gpio-header node,
which does use the same address space.

 You could do better than call it insane, by describing how you would
 define a gpio bank that used 3 separate pins which are NOT together
 in a register, using a base address (reg) and base property (offset
 of first pin) with the current system?

Um.. I can't actually follow what you're getting at there, sorry.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: GPIO - marking individual pins (not) available in device tree

2008-10-26 Thread David Gibson
On Sun, Oct 26, 2008 at 04:13:26PM -0500, Matt Sealey wrote:


> Mitch Bradley wrote:
>
> > I don't use device_type much, if at all, anymore.  Generic name +
> > compatible just works better than device_type + specific name.  When
> > I write code that has to find a node that is suitable for a given purpose,
> > I look for the existence of suitable methods and perhaps other properties.
> > It was just too hard to keep the list of device_type values properly
> > synchronized with all the possible things that you might want to infer
> > from that set of names.
>
> The simple problem comes when you define a device_type for everything.
> I do agree it's best not to add any *MORE* that aren't in the IEEE1275
> or CHRP etc. bindings, but for those that still exist and are well
> defined (serial port probably the best, but network devices too) I
> think we should keep using them where possible and where relevant.

device_type in 1275 defines the runtime method interface.  It's *not*
for declaring the general class of the device, although it often
matches that in practice.  Drivers which attempt to use it this way
are buggy.

So, in the case of a real OF implementation, yes, you should include
device_type values as specified by 1275.  Assuming of course that your
implementation really does implement the OF method binding that
matches the stated device_type.  However, flattened trees clearly
can't provide the method interface, and so shouldn't declare the
device_type.

In practice, we do suggest including device_type in certain, limited,
circumstances precisely because there are a whole bunch of buggy
drivers out there which match (at least partly) on device_type.  We
don't want to break these gratuitously, but neither do we want to
encourage any further spread of using device_type incorrectly for
driver matching.  Hence the current policy.
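
As a rough illustration of the difference, here is a Python sketch with nodes modelled as dictionaries. The helper names and both nodes are invented for the example and do not reflect any actual kernel interface:

```python
def match_on_compatible(node, driver_compatibles):
    """Preferred style: match only on the node's compatible list."""
    return any(c in driver_compatibles for c in node.get("compatible", []))


def match_on_device_type(node, wanted_type):
    """Discouraged style: require a particular device_type."""
    return node.get("device_type") == wanted_type


# A node from a real OF implementation may carry device_type...
of_node = {"name": "ethernet", "device_type": "network",
           "compatible": ["gianfar"]}
# ...while a flattened tree node should not declare it.
flat_node = {"name": "ethernet", "compatible": ["gianfar"]}

for node in (of_node, flat_node):
    print(match_on_compatible(node, {"gianfar"}),
          match_on_device_type(node, "network"))
# True True
# True False
```

A driver written in the second style silently fails to bind on the flattened tree, which is the only reason device_type is still tolerated there in limited cases.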

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread Kevin Diggs

Benjamin Herrenschmidt wrote:

> > What does this all mean to my GigE (dual 1.1 GHz 7455s)? Is this
> > thing supposed to be able to spread irq between its cpus?
>
> Depends on the interrupt controller. I don't know that machine
> but for example the Apple Dual G5's use an MPIC that can spread
> based on an internal HW round robin scheme. This isn't always
> the best idea tho for cache reasons... it depends on whether and at what
> level your caches are shared between CPUs.
>
> Ben.


Sorry. I thought GigE was a common name for the machine. It is a dual
450 MHz G4 powermac with a gigabit ethernet and AGP. It now has a
PowerLogix dual 1.1 GHz 7455 in it. I think the L3 caches are
separate? Not sure about the original cpu card. Can the OS tell?

The reason I asked is that I seem to remember a config option that
would restrict the irqs to cpu 0? Help suggested it was needed for
certain PowerMacs. Didn't provide any help as to which ones. My GigE
currently spreads them between the two. I have not noticed any
additional holes in the space-time continuum.

kevin


Re: [PATCH] genirq: Set initial default irq affinity to just CPU0

2008-10-26 Thread Benjamin Herrenschmidt
On Sun, 2008-10-26 at 18:30 -0800, Kevin Diggs wrote:
> The reason I asked is that I seem to remember a config option that
> would restrict the irqs to cpu 0? Help suggested it was needed for
> certain PowerMacs. Didn't provide any help as to which ones. My GigE
> currently spreads them between the two. I have not noticed any
> additional holes in the space-time continuum.

Yeah, a long time ago we had unexplained lockups when spreading
interrupts, hence the config option. I think it's all been fixed since
then.

Ben.

