Re: RFC v2: Zynq Clock Controller

Sören Brinkmann Mon, 25 Mar 2013 17:50:31 -0700

On Mon, Mar 25, 2013 at 07:10:35PM +0100, Lars-Peter Clausen wrote:
> On 03/25/2013 06:59 PM, Sören Brinkmann wrote:
> > Hi,
> > 
> > On Mon, Mar 25, 2013 at 06:19:11PM +0100, Lars-Peter Clausen wrote:
> >> On 03/25/2013 06:08 PM, Sören Brinkmann wrote:
> >>> Hi Lars,
> >>>
> >>> On Mon, Mar 25, 2013 at 03:46:35PM +0100, Lars-Peter Clausen wrote:
> >>>> Hi,
> >>>>
> >>>> On 03/22/2013 11:41 PM, Sören Brinkmann wrote:
> >>>>> Hi Lars,
> >>>>>
> >>>>> On Thu, Mar 21, 2013 at 07:32:52PM +0100, Lars-Peter Clausen wrote:
> >>>>>> On 03/21/2013 12:56 AM, Sören Brinkmann wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I spent some time working on this and incorporating feedback. Here's 
> >>>>>>> an updated proposal for a clock controller for Zynq:
> >>>>>>>
> >>>>>>> Required properties:
> >>>>>>>  - #clock-cells : Must be 1
> >>>>>>>  - compatible : "xlnx,ps7-clkc"  (this may become 'xlnx,zynq-clkc' 
> >>>>>>> terminology differs a bit between Xilinx internal and mainline)
> >>>>>>>  - ps-clk-frequency : Frequency of the oscillator providing ps_clk in 
> >>>>>>> HZ
> >>>>>>>                      (usually 33 MHz oscillators are used for Zynq 
> >>>>>>> platforms)
> >>>>>>>  - clock-output-names : List of strings used to name the clock 
> >>>>>>> outputs. Shall be a list of the outputs given below.
> >>>>>>>
> >>>>>>> Optional properties:
> >>>>>>>  - clocks : as described in the clock bindings
> >>>>>>>  - clock-names : as described in the clock bindings
> >>>>>>>
> >>>>>>> Clock inputs:
> >>>>>>> The following strings are optional parameters to the 'clock-names' 
> >>>>>>> property in
> >>>>>>> order to provide optional (E)MIO clock sources.
> >>>>>>>  - swdt_ext_clk
> >>>>>>>  - gem0_emio_clk
> >>>>>>>  - gem1_emio_clk
> >>>>>>>  - mio_clk_XX          # with XX = 00..53
> >>>>>>>
> >>>>>>> Example:
> >>>>>>>         clkc: clkc {
> >>>>>>>                 #clock-cells = <1>;
> >>>>>>>                 compatible = "xlnx,ps7-clkc";
> >>>>>>>                 ps-clk-frequency = <33333333>;
> >>>>>>
> >>>>>> The input frequency should be a clock as well.
> >>>>> Again, monolithic vs split. I don't see a reason not to just internally
> >>>>> call clk_register_fixed_rate(). That way its children do not have to
> >>>>> cope with a variable name for the xtal.
> >>>>> Also, with my proposal 'clocks' and 'clock-names' would be purely
> >>>>> optional properties, only required if optional external inputs are
> >>>>> present. Having the xtal defined externally would add mandatory entries 
> >>>>> for
> >>>>> those props.
> >>>>
> >>>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>                 clock-output-names = "armpll", "ddrpll", "iopll", 
> >>>>>>> "cpu_6or4x", "cpu_3or2x", "cpu_2x", "cpu_1x", "ddr2x", "ddr3x", 
> >>>>>>> "dci", "lqspi", "smc", "pcap", "gem0", "gem1", "fclk0", "fclk1", 
> >>>>>>> "fclk2", "fclk3", "can0", "can1", "sdio0", "sdio1", "uart0", "uart1", 
> >>>>>>> "spi0", "spi1", "dma", "usb0_aper", "usb1_aper", "gem0_aper", 
> >>>>>>> "gem1_aper", "sdio0_aper", "sdio1_aper", "spi0_aper", "spi1_aper", 
> >>>>>>> "can0_aper", "can1_aper", "i2c0_aper", "i2c1_aper", "uart0_aper", 
> >>>>>>> "uart1_aper", "gpio_aper", "lqspi_aper", "smc_aper", "swdt", 
> >>>>>>> "dbg_trc", "dbg_apb";  /* long list... explanation below */
> >>>>>>>                 /* optional props */
> >>>>>>>                 clocks = <&clkc 16>, <&clk_foo>;
> >>>>>>>                 clock-names = "gem1_emio_clk", "can_mio_clk_23";
> >>>>>>>         };
> >>>>>>>
> >>>>>>> With this revised bindings arbitrary upstream and downstream clock 
> >>>>>>> providers should be supported and it's also possible to loop back an 
> >>>>>>> output as input. The downside of supporting this is, that I don't see 
> >>>>>>> a way around explicitly listing the clock output names in the DT.
> >>>>>>> The reason for this is, that a downstream clock provider should use 
> >>>>>>> of_clk_get_parent_name() to obtain its parent clock name. For a block 
> >>>>>>> with multiple outputs of_clk_get_parent_name() can return a valid 
> >>>>>>> clock name only when 'clock-output-names' is present.
> >>>>>>> Probably the fclks are the only realistic use case to become parent 
> >>>>>>> of downstream clock providers, but I could imagine that e.g. a device 
> >>>>>>> driver like UART wants to use the CCF to model its internal clocks, 
> >>>>>>> hence it would require its parent clock name. Even though a device 
> >>>>>>> driver could use clk_get_parent() and __clk_get_name(), 
> >>>>>>> of_clk_get_parent_name() should probably work as well. I simply have 
> >>>>>>> a bad feeling about breaking of_clk_get_parent_name() for any clock.
> >>>>>>> But after all, I'm open for finding a better solution for this.
> >>>>>>>
> >>>>>>>
> >>>>>>> Similar, inputs for optional clock sources through (E)MIO pins can be 
> >>>>>>> defined as described in the clock bindings using the 'clocks' and 
> >>>>>>> 'clock-names' properties, with 'clock-names' being an arbitrary 
> >>>>>>> subset of the documented names. The actual parent clock names are 
> >>>>>>> internally resolved using of_clk_get_parent_name().
> >>>>>>>
> >>>>>>>
> >>>>>>> Regarding this monolithic solution versus a finer granular split:
> >>>>>>>
> >>>>>>> On cost of more complex probing we would also have:
> >>>>>>>  - one clock driver covering the peripheral clocks
> >>>>>>>  - one for the CPU clock domain
> >>>>>>>  - one for the DDR clock domain
> >>>>>>>  - one for GEM
> >>>>>>>  - one for CAN
> >>>>>>>  - one for the APER clks
> >>>>>>>  - one for the PLLs
> >>>>>>>  - one for fclks
> >>>>>>
> >>>>>> fclks are the same as peripheral clocks except for the gate bit, as 
> >>>>>> far as I
> >>>>>> can see.
> >>>>> And that makes them quite different, since they have to access multiple
> >>>>> registers instead of a single one. Also, the fclks have two dividers.
> >>>>> If you want to cover all of those with a single driver, you need a
> >>>>> plethora of arguments/properties to catch the small differences.
> >>>>>
> >>>>>>
> >>>>>>>  - probably one for the debug clocks (not sure whether we need those)
> >>>>>>>
> >>>>>>> Except for the peripheral one and probably the fclk, they would all 
> >>>>>>> be instantiated just once. So, there is not a lot of reuse going on.
> >>>>>>
> >>>>>> PLLs are going to be instantiated multiple times as well.
> >>>>> As mentioned in the very next sentence, I rather see a single driver
> >>>>> with multiple outputs. Take suspend: My plan is to have a few functions
> >>>>> like zynq_clk_(suspend|resume). That should take care of bypassing
> >>>>> shutting down the PLLs (and probably more). Therefore it's easier to
> >>>>> have them all in a single driver.
> >>>>> And if it turned out, that other clocks require special handling for
> >>>>> such system level functions, that could be addressed that way too with
> >>>>> the monolithic approach.
> >>>>>
> >>>>>>
> >>>>>>> Fclks I would probably also rather put into one driver with four 
> >>>>>>> outputs instead of instantiating a single output driver multiple 
> >>>>>>> times. Same for PLLs.
> >>>>>>>
> >>>>>>> And then there is e.g. a mux for the system watchdog input which 
> >>>>>>> doesn't really fit anywhere and it would be pretty much ridiculous to 
> >>>>>>> have another clock driver just for that one and it would also become 
> >>>>>>> "hidden" in one of the others.
> >>>>>>>
> >>>>>>> In my opinion that's just not necessary. We would create a bunch of 
> >>>>>>> clock drivers including DT bindings for them, probing would become 
> >>>>>>> more complex and it doesn't really help with the probe/naming issue. 
> >>>>>>> So, I don't think it's beneficial to go down that road.
> >>>>>>>
> >>>>>>> The monolithic solution would need one custom driver for the PLLs, DT 
> >>>>>>> bindings wouldn't be required for it though. Everything else should 
> >>>>>>> be internally described using the clock-primitives.
> >>>>>>>
> >>>>>>> Other than having a much simpler probe and init process, I still 
> >>>>>>> think it might be beneficial to have this monolithic block with a 
> >>>>>>> holistic view of the clock tree. For suspend e.g. I think the clock 
> >>>>>>> controller could export functions like zynq_clkc_(suspend|resume) and 
> >>>>>>> then the controller handles the PLL bypassing/shutdown.
> >>>>>>> Regarding full dynamic reparenting, I don't know how exactly that 
> >>>>>>> could work, but with the clock controller there is at least a block 
> >>>>>>> where that intelligence would be going and which has knowledge of all 
> >>>>>>> the 'struct clk *' required to reparent clocks.
> >>>>>>>
> >>>>>>> Regarding the DT description, it is probably controversial what is 
> >>>>>>> considered better. I, like the Tegra folks, think having one clock 
> >>>>>>> controller in there is fine and hides irrelevant implementation 
> >>>>>>> details.
> >>>>>>>
> >>>>>>
> >>>>>> I still don't like the monolithic solution. From my point of view it is
> >>>>>> architecturally inferior, it is a bad abstraction. You argue that for a
> >>>>>> non-monolithic version you'd have to implement a clock driver for each
> >>>>>> different type of clock. But you still have to do this for the 
> >>>>>> monolithic
> >>>>>> version, they'd probably just end up in one huge messy file. Unless 
> >>>>>> you are
> >>>>>> going to duplicate huge amounts of lines you'd probably have functions 
> >>>>>> like
> >>>>>> add_gem_clk, add_peripheral_clk, add_pll_clk and so on in your 
> >>>>>> monolithic drivers.
> >>>>> It probably makes sense to differ between having custom drivers and
> >>>>> describing the whole clock tree in DT.
> >>>>>
> >>>>> From reusability perspective, it may make sense to factor out some code
> >>>>> in drivers. IMHO, this does only apply to the peripheral clocks, since
> >>>>> everything else isn't reused and/or quite Zynq specific.
> >>>>> But a monolithic approach would not even prevent this. You could just
> >>>>> transparently change the implementation: Just add a clock driver,
> >>>>> replace the original code in the controller to use the new driver
> >>>>> instead and you're good. No need to touch DT bindings.
> >>>>>
> >>>>> In Zynq a peripheral clock is essentially described by:
> >>>>>         clk = clk_register_mux(NULL, mux_name, parents, 4, 0,           
> >>>>>          
> >>>>>                         clk_ctrl, 4, 2, 0, lock);                       
> >>>>>          
> >>>>>                                                                         
> >>>>>          
> >>>>>         clk = clk_register_divider(NULL, div_name, mux_name, 0, 
> >>>>> clk_ctrl, 8, 6,  
> >>>>>                         CLK_DIVIDER_ONE_BASED, lock);                   
> >>>>>          
> >>>>>                                                                         
> >>>>>          
> >>>>>         clks[clk0] = clk_register_gate(NULL, clk_name0, div_name,       
> >>>>>          
> >>>>>                         CLK_SET_RATE_PARENT, clk_ctrl, 0, 0, lock);     
> >>>>>          
> >>>>>
> >>>>> If we wrapped this by a driver, any code outside of that driver couldn't
> >>>>> change the mux setting, since its struct clk* is not exposed outside.
> >>>>> To get around this we'd have to reimplement all the clock ops, to 
> >>>>> create a
> >>>>> clock which supports all possible operations.
> >>>>> In the monolithic approach we could simply remember that struct clk* and
> >>>>> work with it.
> >>>>>
> >>>>> Bottom line: Factoring out some parts of the monolithic driver might
> >>>>> make sense for some parts. But it can be done later in a transparent 
> >>>>> way,
> >>>>> especially w/o touching DT bindings.
> >>>>>
> >>>>>
> >>>>> The other thing is describing the whole clock tree in DT:
> >>>>> That would force us to not only describe clocks for which it might make
> >>>>> sense, but also all Zynq specific clocks in custom drivers and DT
> >>>>> bindings and we gain nothing from it.
> >>>>>
> >>>>>>
> >>>>>> The SLCR is a virtual construct, which groups the clock units 
> >>>>>> together. But the
> >>>>>> clock units are the actual entities. E.g. imagine a Zync 2.0, it would 
> >>>>>> probably
> >>>>>> reuse the same clock units, but there would quite likely be different 
> >>>>>> clocks
> >>>>>> mapped at different addresses. If you choose a non-monolithic 
> >>>>>> implementation
> >>>>>> you'd just have to update your devicetree to make this work, with a 
> >>>>>> monolithic
> >>>>>> version you'd have to add a almost identical driver for the v2.
> >>>>> Either way you'll end up with a lot of lines describing the hierarchy.
> >>>>> Either in the dts or in the clock driver.
> >>>>>
> >>>>>
> >>>>> I think, my problem is, that I do not yet know how certain functionality
> >>>>> will be implemented later and what exact use-cases have to be supported.
> >>>>> With the monolithic approach I keep all options open. We hopefully will
> >>>>> never have to touch its DT bindings again. And refactoring the code and
> >>>>> migrating it to use dedicated clock drivers should be transparent
> >>>>> changes, if deemed beneficial.
> >>>>> Furthermore, I have a driver, which has references to all strucht clk*
> >>>>> in the system and can work with the whole clock tree leveraging a system
> >>>>> level view. Separating things in smaller blocks might complicate things
> >>>>> if a use case requires this.
> >>>>> And finally pushing the whole hierarchy description into the DT seems to
> >>>>> be the most restrictive approach, without offering big rewards.
> >>>>
> >>>> I don't see those restrictions you see with a dt binding which has nodes 
> >>>> for
> >>>> each clock block. You seem to be under the impression, that each device 
> >>>> tree
> >>>> node requires its own driver or device. This is not the case, have you 
> >>>> looked
> >>>> at how the current upstream zynq clock support is implemented? This is 
> >>>> still
> >>>> one monolithic driver, but there are multiple dt nodes, one for each 
> >>>> clock
> >>>> block.
> >>> I thought the one "huge messy" file was one thing you wanted to avoid?
> >>
> >> I want to avoid a messy one, yes.
> >>
> >>> Also, how does it address my concern regarding inaccessible
> >>> 'struct clk *'? A clk_register_foo() will only return a single struct
> >>> clk *. So, wrapping several clk_register() calls within one
> >>> clk_register_foo() makes all but the actual output inaccessible.
> >>> And if we do not directly call clk_register_foo() but do the probing
> >>> through DT the clock init code does not even get hold of that struct clk
> >>> *.
> >>
> >> That's what the composite clk driver is for.
> >> http://www.mail-archive.com/[email protected]/msg4k04590.html
> > Is it merged by now? Last time I checked it wasn't and I was seeking a
> > solution w/o this driver. And I think it wouldn't fit all clocks. E.g. the
> > ones with multiple dividers or gates.
> 
> Not yet, but hopefully soon.
> 
> For the clocks with multiple dividers we want a custom clk driver, at least
> for the divider part, otherwise implementing clk_set_rate wouldn't be so easy.
> 
> >>
> >>>
> >>>> This leads to a very clean and well structured code and also nicely
> >>>> structured dt, which represents the tree as it is in hardware. If I 
> >>>> understand
> >>>> you correctly what you suggest is that you basically want to copy 'n 
> >>>> paste the
> >>>> same 10 lines to instantiate a peripheral clock, which you mentioned 
> >>>> above,
> >>>> again and again. This will be quite messy.
> >>> For the peripheral clocks you're right, there is some reuse possible.
> >>> For those I have a helper, like zynq_clk_register_periph(). But for 
> >>> pretty much
> >>> everything else I don't.
> >>>
> >>> In conclusion, I see the DT approach limiting my ability to modify the
> >>> clock tree when implementing system level functionality. 
> >>> Probably quite a stretch, but imagine some smart power management code,
> >>> which might try to reparent all and every clock in the system to certain
> >>> PLLs. How would such a power manager get hold of all the struct clk * it
> >>> needs. In a system probing fully through DT we cannot get hold of those
> >>> and even less if we wrap several clocks in one clock_foo. Unless, we add
> >>> a dummy device which has every clock in the system as its input. And
> >>> that sounds even wackier than having a clock controller which provides
> >>> all the clocks to consumers.
> >>
> >> If you really really need a list of all the zynq clocks the easiest way to
> >> implement this is to add a global list and whenever a clock gets probed via
> >> dt add the clock to the list.
> > A simple list wouldn't do it. I need exact knowledge about which clock
> > is which. So, I pretty much copied the Tegra approach with a
> > 'struct clk *clks[clk_max]' array. And filling such an array from a generic 
> > clock
> > driver seems wrong.
> > By separating clocks we take away their awareness of the bigger
> > hierarchy picture. The clock controller approach maintains this knowledge.
> 
> If you already have the code can you upload it somewhere, this might make it
> easier for me to understand why the solution I have in mind won't work.
Should be accessible here:
https://github.com/sorenb-xlnx/linux-xlnx/tree/zynq-clkc


        Sören


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: RFC v2: Zynq Clock Controller

Reply via email to