Re: ivtv: use arch_phys_wc_add() and require PAT disabled

2018-03-11 Thread Andy Lutomirski



> On Mar 11, 2018, at 12:51 PM, Nick French <n...@ou.edu> wrote:
> 
> On Sat, Mar 10, 2018 at 10:20:23AM -0800, Andy Lutomirski wrote:
>>>> Perhaps the easy answer is to change the fatal is-pat-enabled check to just
>>>> a warning like "you have PAT enabled, so wc is disabled for the 
>>>> framebuffer.
>>>> if you want wc, use the nopat parameter"?
>>> 
>>> I like this idea more and more. I haven't experience any problems running
>>> with PAT-enabled and no write-combining on the framebuffer. Any objections?
>>> 
>> 
>> None from me.
>> 
>> However, since you have the hardware, you could see if you can use the
>> change_page_attr machinery to change the memory type on the framebuffer once
>> you figure out where it is.
> 
> I am certainly willing to try this, but my understanding of the goal of the
> changes that disabled ivtvfb originally is that it was trying to hide the
> architecture-specific memory management from the driver.

Not really. The goal was to eliminate all code that touches MTRRs on PAT 
systems. So mtrr_add got unexported and the arch_phys are legacy hints that do 
nothing on modern machines. 

> 
> Wouldn't (figuring out a way to) expose x86/mm/pageattr internals to the
> driver be doing the opposite? (or maybe I misunderstand your suggestion)

It doesn’t conflict at all.  Obviously the code should be tidy. 

From memory, I see two potentially reasonable real fixes. One is to find a way 
to punch a hole in an ioremap. So you’d find the framebuffer, remove it from 
theproblematic mapping, and then make a new mapping. The second is to change 
the mapping type in place. 

Or maybe you could just iounmap the whole thing after firmware is loaded and 
the framebuffer is found and then redo the mapping right. 


Re: ivtv: use arch_phys_wc_add() and require PAT disabled

2018-03-10 Thread Andy Lutomirski



> On Mar 10, 2018, at 8:57 AM, French, Nicholas A.  wrote:
> 
>> On Wed, Mar 07, 2018 at 11:23:09PM -0600, French, Nicholas A. wrote:
>>> On Thu, Mar 08, 2018 at 04:14:11AM +, Luis R. Rodriguez wrote:
 On Thu, Mar 08, 2018 at 04:06:01AM +, Luis R. Rodriguez wrote:
> On Thu, Mar 08, 2018 at 03:16:29AM +, French, Nicholas A. wrote:
> 
> Ah, I see. So my proposed ioremap_wc call was only "working" by aliasing 
> the
> ioremap_nocache()'d mem area and not actually using write combining at 
> all.
 
 There are some debugging PAT toys out there I think but I haven't played 
 with
 them yet or I forgot how to to confirm or deny this sort of effort, but
 likeley.
>>> 
>>> In fact come to think of it I believe some neurons are telling me that if
>>> two type does not match we'd get an error?
> 
> I can confirm that my original suggested patch just aliases to ivtv-driver's 
> nocache mapping:
> $ sudo modprobe ivtvfb
> $ sudo dmesg
> ...
> x86/PAT: Overlap at 0xd500-0xd580
> x86/PAT: reserve_memtype added [mem 0xd551-0xd56b0fff], track 
> uncached-minus, req write-combining, ret uncached-minus
> ivtvfb0: Framebuffer at 0xd551, mapped to 0xc6a7ed52, size 1665k
> ...
> $ sudo cat /sys/kernel/debug/x86/pat_memtype_list | grep 0xd5
> uncached-minus @ 0xd500-0xd580
> uncached-minus @ 0xd551-0xd56b1000
> 
> So nix that.
> 
>>> No what if the framebuffer driver is just requested as a secondary step
>>> after firmware loading?
>> 
>> Its a possibility. The decoder firmware gets loaded at the beginning of the 
>> decoder
>> memory range and we know its length, so its possible to ioremap_nocache 
>> enough
>> room for the firmware only on init and then ioremap the remaining 
>> non-firmware
>> decoder memory areas appropriately after the firmware load succeeds...
> 
> I looked in more detail, and this would be "hard" due to the way the rest of 
> the
> decoder offsets are determined by either making firmware calls or scanning the
> decoder memory range for magic bytes and other mess.
> 
> I think some smart guy named mcgrof apparently came to the same conclusion 
> in a really old email chain I found 
> [https://lists.gt.net/linux/kernel/2387536]:
> "The ivtv case is the *worst* example we can expect where the firmware
> hides from us the exact ranges for write-combining, that we should somehow
> just hope no one will ever do again."
> :-)
> 
>> Perhaps the easy answer is to change the fatal is-pat-enabled check to just a
>> warning like "you have PAT enabled, so wc is disabled for the framebuffer.
>> if you want wc, use the nopat parameter"?
> 
> I like this idea more and more. I haven't experience any problems running
> with PAT-enabled and no write-combining on the framebuffer. Any objections?
> 
> 

None from me. 

However, since you have the hardware, you could see if you can use the 
change_page_attr machinery to change the memory type on the framebuffer once 
you figure out where it is.

Re: Fw: [PATCH v2] cinergyT2-core: don't do DMA on stack

2016-10-06 Thread Andy Lutomirski
On Wed, Oct 5, 2016 at 11:58 AM, Mauro Carvalho Chehab
 wrote:
> Sorry, forgot to C/C people that are at the "Re: Problem with VMAP_STACK=y"
> thread.
>
> Forwarded message:
>
> Date: Wed,  5 Oct 2016 15:54:18 -0300
> From: Mauro Carvalho Chehab 
> To: Linux Doc Mailing List 
> Cc: Mauro Carvalho Chehab , Mauro Carvalho Chehab 
> , Mauro Carvalho Chehab 
> Subject: [PATCH v2] cinergyT2-core: don't do DMA on stack
>
>
> The USB control messages require DMA to work. We cannot pass
> a stack-allocated buffer, as it is not warranted that the
> stack would be into a DMA enabled area.
>
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>
> Added the fixups made by Johannes Stezenbach
>
>  drivers/media/usb/dvb-usb/cinergyT2-core.c | 45 
> ++
>  1 file changed, 27 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/media/usb/dvb-usb/cinergyT2-core.c 
> b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> index 9fd1527494eb..8267e3777af6 100644
> --- a/drivers/media/usb/dvb-usb/cinergyT2-core.c
> +++ b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> @@ -41,6 +41,7 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>
>  struct cinergyt2_state {
> u8 rc_counter;
> +   unsigned char data[64];
>  };
>
>  /* We are missing a release hook with usb_device data */
> @@ -50,29 +51,36 @@ static struct dvb_usb_device_properties 
> cinergyt2_properties;
>
>  static int cinergyt2_streaming_ctrl(struct dvb_usb_adapter *adap, int enable)
>  {
> -   char buf[] = { CINERGYT2_EP1_CONTROL_STREAM_TRANSFER, enable ? 1 : 0 
> };
> -   char result[64];
> -   return dvb_usb_generic_rw(adap->dev, buf, sizeof(buf), result,
> -   sizeof(result), 0);
> +   struct dvb_usb_device *d = adap->dev;
> +   struct cinergyt2_state *st = d->priv;
> +
> +   st->data[0] = CINERGYT2_EP1_CONTROL_STREAM_TRANSFER;
> +   st->data[1] = enable ? 1 : 0;
> +
> +   return dvb_usb_generic_rw(d, st->data, 2, st->data, 64, 0);
>  }
>
>  static int cinergyt2_power_ctrl(struct dvb_usb_device *d, int enable)
>  {

This...

> -   char buf[] = { CINERGYT2_EP1_SLEEP_MODE, enable ? 0 : 1 };
> -   char state[3];
> -   return dvb_usb_generic_rw(d, buf, sizeof(buf), state, sizeof(state), 
> 0);
> +   struct cinergyt2_state *st = d->priv;
> +
> +   st->data[0] = CINERGYT2_EP1_SLEEP_MODE;

...does not match this:

> +   st->data[1] = enable ? 1 : 0;

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with VMAP_STACK=y

2016-10-05 Thread Andy Lutomirski
On Wed, Oct 5, 2016 at 9:45 AM, Jörg Otte <jrg.o...@gmail.com> wrote:
> 2016-10-05 17:53 GMT+02:00 Andy Lutomirski <l...@amacapital.net>:
>> On Wed, Oct 5, 2016 at 8:21 AM, Jörg Otte <jrg.o...@gmail.com> wrote:
>>> 2016-10-05 11:04 GMT+02:00 Mauro Carvalho Chehab <mche...@s-opensource.com>:
>>>> Em Wed, 5 Oct 2016 09:50:42 +0200 (CEST)
>>>> Jiri Kosina <ji...@kernel.org> escreveu:
>>>>
>>>>> On Wed, 5 Oct 2016, Patrick Boettcher wrote:
>>>>>
>>>>> > > > Thanks for the quick response.
>>>>> > > > Drivers are:
>>>>> > > > dvb_core, dvb_usb, dbv_usb_cynergyT2
>>>>> > >
>>>>> > > This dbv_usb_cynergyT2 is not from Linus' tree, is it? I don't seem
>>>>> > > to be able to find it, and the only google hit I am getting is your
>>>>> > > very mail to LKML :)
>>>>> >
>>>>> > It's a typo, it should say dvb_usb_cinergyT2.
>>>>>
>>>>> Ah, thanks. Same issues there in
>>>>>
>>>>>   cinergyt2_frontend_attach()
>>>>>   cinergyt2_rc_query()
>>>>>
>>>>> I think this would require more in-depth review of all the media drivers
>>>>> and having all this fixed for 4.9. It should be pretty straightforward;
>>>>> all the instances I've seen so far should be just straightforward
>>>>> conversion to kmalloc() + kfree(), as the buffer is not being embedded in
>>>>> any structure etc.
>>>>
>>>> What we're doing on most cases is to put a buffer (usually with 80
>>>> chars for USB drivers) inside the "state" struct (on DVB drivers),
>>>> in order to avoid doing kmalloc/kfree all the times. One such patch is
>>>> changeset c4a98793a63c4.
>>>>
>>>> I'm enclosing a non-tested patch fixing it for the cinergyT2-core.c
>>>> driver.
>>>>
>>>> Thanks,
>>>> Mauro
>>>>
>>>> [PATCH] cinergyT2-core: don't do DMA on stack
>>>>
>>>
>>> Tried the cinergyT2 patch. No success. When I select a TV channel
>>> just nothing happens. There are no error messages.
>>
>> Could you try compiling with CONFIG_DEBUG_VIRTUAL=y and seeing if you
>> get a nice error message?
>>
>> --Andy
>
> Done. Still no error messages in dmesg and syslog.
>
> DVB adapter signals it is tuned to the channel.
> For me it looks like there`s no data reaching the application
> (similar to if I forget to connect an antenna).

I'm surprised.  CONFIG_DEBUG_VIRTUAL=y really ought to have caught the
problem, whatever it is.  You could try CONFIG_DEBUG_SG as well, but I
admit I'm grasping at straws there.  Maybe the DVB people have a
better idea as to what's going on here.

It's plausible that the patch you're testing got rid of the DMA on the
stack but is otherwise buggy.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with VMAP_STACK=y

2016-10-05 Thread Andy Lutomirski
On Wed, Oct 5, 2016 at 8:21 AM, Jörg Otte  wrote:
> 2016-10-05 11:04 GMT+02:00 Mauro Carvalho Chehab :
>> Em Wed, 5 Oct 2016 09:50:42 +0200 (CEST)
>> Jiri Kosina  escreveu:
>>
>>> On Wed, 5 Oct 2016, Patrick Boettcher wrote:
>>>
>>> > > > Thanks for the quick response.
>>> > > > Drivers are:
>>> > > > dvb_core, dvb_usb, dbv_usb_cynergyT2
>>> > >
>>> > > This dbv_usb_cynergyT2 is not from Linus' tree, is it? I don't seem
>>> > > to be able to find it, and the only google hit I am getting is your
>>> > > very mail to LKML :)
>>> >
>>> > It's a typo, it should say dvb_usb_cinergyT2.
>>>
>>> Ah, thanks. Same issues there in
>>>
>>>   cinergyt2_frontend_attach()
>>>   cinergyt2_rc_query()
>>>
>>> I think this would require more in-depth review of all the media drivers
>>> and having all this fixed for 4.9. It should be pretty straightforward;
>>> all the instances I've seen so far should be just straightforward
>>> conversion to kmalloc() + kfree(), as the buffer is not being embedded in
>>> any structure etc.
>>
>> What we're doing on most cases is to put a buffer (usually with 80
>> chars for USB drivers) inside the "state" struct (on DVB drivers),
>> in order to avoid doing kmalloc/kfree all the times. One such patch is
>> changeset c4a98793a63c4.
>>
>> I'm enclosing a non-tested patch fixing it for the cinergyT2-core.c
>> driver.
>>
>> Thanks,
>> Mauro
>>
>> [PATCH] cinergyT2-core: don't do DMA on stack
>>
>
> Tried the cinergyT2 patch. No success. When I select a TV channel
> just nothing happens. There are no error messages.

Could you try compiling with CONFIG_DEBUG_VIRTUAL=y and seeing if you
get a nice error message?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 09/17] staging/lirc_serial: Remove TSC-based timing

2015-06-19 Thread Andy Lutomirski
Hi Mauro, etc:

Are you okay with this change landing in the tip tree?

--Andy

On Fri, Jun 12, 2015 at 4:44 PM, Andy Lutomirski l...@kernel.org wrote:
 It wasn't compiled in by default.  I suspect that the driver was and
 still is broken, though -- it's calling udelay with a parameter
 that's derived from loops_per_jiffy.

 Cc: Jarod Wilson ja...@wilsonet.com
 Cc: de...@driverdev.osuosl.org
 Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
 Signed-off-by: Andy Lutomirski l...@kernel.org
 ---
  drivers/staging/media/lirc/lirc_serial.c | 63 
 ++--
  1 file changed, 4 insertions(+), 59 deletions(-)

 diff --git a/drivers/staging/media/lirc/lirc_serial.c 
 b/drivers/staging/media/lirc/lirc_serial.c
 index dc7984455c3a..465796a686c4 100644
 --- a/drivers/staging/media/lirc/lirc_serial.c
 +++ b/drivers/staging/media/lirc/lirc_serial.c
 @@ -327,9 +327,6 @@ static void safe_udelay(unsigned long usecs)
   * time
   */

 -/* So send_pulse can quickly convert microseconds to clocks */
 -static unsigned long conv_us_to_clocks;
 -
  static int init_timing_params(unsigned int new_duty_cycle,
 unsigned int new_freq)
  {
 @@ -344,7 +341,6 @@ static int init_timing_params(unsigned int new_duty_cycle,
 /* How many clocks in a microsecond?, avoiding long long divide */
 work = loops_per_sec;
 work *= 4295;  /* 4295 = 2^32 / 1e6 */
 -   conv_us_to_clocks = work  32;

 /*
  * Carrier period in clocks, approach good up to 32GHz clock,
 @@ -357,10 +353,9 @@ static int init_timing_params(unsigned int 
 new_duty_cycle,
 pulse_width = period * duty_cycle / 100;
 space_width = period - pulse_width;
 dprintk(in init_timing_params, freq=%d, duty_cycle=%d, 
 -   clk/jiffy=%ld, pulse=%ld, space=%ld, 
 -   conv_us_to_clocks=%ld\n,
 +   clk/jiffy=%ld, pulse=%ld, space=%ld\n,
 freq, duty_cycle, __this_cpu_read(cpu_info.loops_per_jiffy),
 -   pulse_width, space_width, conv_us_to_clocks);
 +   pulse_width, space_width);
 return 0;
  }
  #else /* ! USE_RDTSC */
 @@ -431,63 +426,14 @@ static long send_pulse_irdeo(unsigned long length)
 return ret;
  }

 -#ifdef USE_RDTSC
 -/* Version that uses Pentium rdtsc instruction to measure clocks */
 -
 -/*
 - * This version does sub-microsecond timing using rdtsc instruction,
 - * and does away with the fudged LIRC_SERIAL_TRANSMITTER_LATENCY
 - * Implicitly i586 architecture...  - Steve
 - */
 -
 -static long send_pulse_homebrew_softcarrier(unsigned long length)
 -{
 -   int flag;
 -   unsigned long target, start, now;
 -
 -   /* Get going quick as we can */
 -   rdtscl(start);
 -   on();
 -   /* Convert length from microseconds to clocks */
 -   length *= conv_us_to_clocks;
 -   /* And loop till time is up - flipping at right intervals */
 -   now = start;
 -   target = pulse_width;
 -   flag = 1;
 -   /*
 -* FIXME: This looks like a hard busy wait, without even an 
 occasional,
 -* polite, cpu_relax() call.  There's got to be a better way?
 -*
 -* The i2c code has the result of a lot of bit-banging work, I wonder 
 if
 -* there's something there which could be helpful here.
 -*/
 -   while ((now - start)  length) {
 -   /* Delay till flip time */
 -   do {
 -   rdtscl(now);
 -   } while ((now - start)  target);
 -
 -   /* flip */
 -   if (flag) {
 -   rdtscl(now);
 -   off();
 -   target += space_width;
 -   } else {
 -   rdtscl(now); on();
 -   target += pulse_width;
 -   }
 -   flag = !flag;
 -   }
 -   rdtscl(now);
 -   return ((now - start) - length) / conv_us_to_clocks;
 -}
 -#else /* ! USE_RDTSC */
  /* Version using udelay() */

  /*
   * here we use fixed point arithmetic, with 8
   * fractional bits.  that gets us within 0.1% or so of the right average
   * frequency, albeit with some jitter in pulse length - Steve
 + *
 + * This should use ndelay instead.
   */

  /* To match 8 fractional bits used for pulse/space length */
 @@ -520,7 +466,6 @@ static long send_pulse_homebrew_softcarrier(unsigned long 
 length)
 }
 return (actual-length)  8;
  }
 -#endif /* USE_RDTSC */

  static long send_pulse_homebrew(unsigned long length)
  {
 --
 2.4.2




-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line unsubscribe linux-media in


Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2

2015-06-12 Thread Andy Lutomirski
On Jun 12, 2015 12:59 AM, Jan Beulich jbeul...@suse.com wrote:

  On 12.06.15 at 01:23, toshi.k...@hp.com wrote:
  There are two usages on MTRRs:
   1) MTRR entries set by firmware
   2) MTRR entries set by OS drivers
 
  We can obsolete 2), but we have no control over 1).  As UEFI firmwares
  also set this up, this usage will continue to stay.  So, we should not
  get rid of the MTRR code that looks up the MTRR entries, while we have
  no need to modify them.
 
  Such MTRR entries provide safe guard to /dev/mem, which allows
  privileged user to access a range that may require UC mapping while
  the /dev/mem driver blindly maps it with WB.  MTRRs converts WB to UC in
  such a case.

 But it wouldn't be impossible to simply read the MTRRs upon boot,
 store the information, disable MTRRs, and correctly use PAT to
 achieve the same effect (i.e. the blindly maps part of course
 would need fixing).

This may crash and burn badly when we call a UEFI function or an SMI
happens.  I think we should just leave the MTRRs alone.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-22 Thread Andy Lutomirski
On Wed, Apr 22, 2015 at 8:23 AM, Luis R. Rodriguez mcg...@suse.com wrote:
 On Tue, Apr 21, 2015 at 11:39:39PM -0600, Jason Gunthorpe wrote:
 On Wed, Apr 22, 2015 at 01:39:07AM +0200, Luis R. Rodriguez wrote:
   Mike, do you think the time is right to just remove the iPath driver?
 
  With PAT now being default the driver effectively won't work
  with write-combining on modern kernels. Even if systems are old
  they likely had PAT support, when upgrading kernels PAT will work
  but write-combing won't on ipath.

 Sorry, do you mean the driver already doesn't get WC? Or do you mean
 after some more pending patches are applied?

 No, you have to consider the system used and the effects of calls used
 on the driver in light of this table:

 --
 MTRR Non-PAT   PATLinux ioremap valueEffective memory type
 --
   Non-PAT |  PAT
  PAT
  |PCD
  ||PWT
  |||
 WC   000  WB  _PAGE_CACHE_MODE_WBWC   |   WC
 WC   001  WC  _PAGE_CACHE_MODE_WCWC*  |   WC
 WC   010  UC- _PAGE_CACHE_MODE_UC_MINUS  WC*  |   UC
 WC   011  UC  _PAGE_CACHE_MODE_UCUC   |   UC
 --

 (*) denotes implementation defined and is discouraged

 ioremap_nocache() will use _PAGE_CACHE_MODE_UC_MINUS by default today,
 in the future we want to flip the switch and make _PAGE_CACHE_MODE_UC
 the default. When that flip occurs it will mean ipath cannot get
 write-combining on both non-PAT and PAT systems. Now that is for
 the future, lets review the current situation for ipath.

 For PAT capable systems if mtrr_add() is used today on a Linux system on a
 region mapped with ioremap_nocache() that will mean you effectively nullify 
 the
 mtrr_add() effect as the combinatorial effect above yields an effective memory
 type of UC.

Are you sure?  I thought that ioremap_nocache currently is UC-, so
mtrr_add + ioremap_nocache gets WC even on PAT systems.

Going forward, when mtrr_add is gone, this will change, of course.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-21 Thread Andy Lutomirski
On Tue, Apr 21, 2015 at 3:08 PM, Luis R. Rodriguez mcg...@suse.com wrote:
 On Tue, Apr 21, 2015 at 3:02 PM, Luis R. Rodriguez mcg...@suse.com wrote:
 Andy, can we live without MTRR support on this driver for future kernels? 
 This
 would only leave ipath as the only offending driver.

 Sorry to be clear, can we live with removal of write-combining on this driver?


I personally think so, but a driver maintainer's ack would be nice.

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-16 Thread Andy Lutomirski
On Apr 15, 2015 6:54 PM, Andy Walls awa...@md.metrocast.net wrote:

 On Wed, 2015-04-15 at 17:58 -0700, Andy Lutomirski wrote:
  On Wed, Apr 15, 2015 at 4:59 PM, Andy Walls awa...@md.metrocast.net wrote:
   On Wed, 2015-04-15 at 16:42 -0700, Andy Lutomirski wrote:
   On Wed, Apr 15, 2015 at 3:38 PM, Andy Walls awa...@md.metrocast.net 
   wrote:
  
   
  
   IMO the right solution would be to avoid ioremapping the whole bar at
   startup.  Instead ioremap pieces once the driver learns what they are.
   This wouldn't have any of these problems -- you'd ioremap() register
   regions and you'd ioremap_wc() the framebuffer once you find it.  If
   there are regions of unknown purpose, just don't map them all.
  
   Would this be feasible?
  
   Feasible? Maybe.
  
   Worth the time and effort for end-of-life, convential PCI hardware so I
   can have an optimally performing X display on a Standard Def Analog TV
   screen?   Nope. I don't have that level of nostalgia.
  
 
  The point is actually to let us unexport or delete mtrr_add.

 Understood.


We can
  either severely regress performance on ivtv on PAT-capable hardware if
  we naively switch it to arch_phys_wc_add or we can do something else.
  The something else remains to be determined.

 Maybe ioremap the decoder register area as UC, and ioremap the rest of
 the decoder region to WC. (Does that suck up too many PAT resources?

PAT resources are unlimited.

 Then add PCI reads following any sort of singleton PCI writes in the WC
 region.  I assume PCI rules about write postings before reads still
 apply when WC is set.


I think we need sfence, too, but that's easy.  We also lose the write
sizes.  That is, adjacent writes get combined.  Maybe that's okay.

  
   We sort of know where some things are in the MMIO space due to
   experimentation and past efforts examining the firmware binary.
  
   Documentation/video4linux/cx2341x/fw-*.txt documents some things.  The
   driver code actually codifies a little bit more knowledge.
  
   The driver code for doing transfers between host and card is complex and
   fragile with some streams that use DMA, other streams that use PIO,
   digging VBI data straight out of card memory, and scatter-gather being
   broken on newer firmwares.  Playing around with ioremapping will be hard
   to get right and likely cause something in the code to break for the
   primary use case of the ivtv supported cards.
 
  Ick.

 Yeah.

  If the only thing that really wants WC is the esoteric framebuffer
  thing,

 That appears to be it.

   could we just switch to arch_phys_wc_add and assume that no one
  will care about the regression on new CPUs with ivtv cards?

 That's on the table in my mind.  Not sure if it is the friendliest thing
 to do to users.  Quite honestly though, modern graphics cards have much
 better ouput resolution and performance.  Anyone with a modern system
 really should be using one.  (i.e. MythTV gave up on support for PVR-350
 output for video playback years ago in May 2010.)


 BTW, my 2005 system with multiple conventional PCI slots in it shows a
 'pat' flag in /proc/cpuinfo.  (AMD Athlon(tm) 64 X2 Dual Core Processor
 4200+)  I didn't know it was considered new. :)

Tons of CPUs have that ability, but we often turn it off due to errata
on older CPUs.

--Andy


 Regards,
 Andy


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-15 Thread Andy Lutomirski
On Mon, Apr 13, 2015 at 10:49 AM, Luis R. Rodriguez mcg...@suse.com wrote:

 c) ivtv: the driver does not have the PCI space mapped out separately, and
 in fact it actually does not do the math for the framebuffer, instead it lets
 the device's own CPU do that and assume where its at, see
 ivtvfb_get_framebuffer() and CX2341X_OSD_GET_FRAMEBUFFER, it has a get
 but not a setter. Its not clear if the firmware would make a split easy.
 We'd need ioremap_ucminus() here too and __arch_phys_wc_add().


IMO this should be conceptually easy to split.  Once we get the
framebuffer address, just unmap it (or don't prematurely map it) and
then ioremap the thing.

 From the beginning it seems only framebuffer devices used MTRR/WC, lately it
 seems infiniband drivers also find good use for for it for PIO TX buffers to
 blast some sort of data, in the future I would not be surprised if other
 devices found use for it.

IMO the Infiniband maintainers should fix their code.  Especially in
the server space, there aren't that many MTRRs to go around.  I wrote
arch_phys_wc_add in the first place because my server *ran out of
MTRRs*.

Hey, IB people: can you fix your drivers to use arch_phys_wc_add
(which is permitted to be a no-op) along with ioremap_wc?  Your users
will thank you.

 It may be true that the existing drivers that
 requires the above type of work are corner cases -- but I wouldn't hold my
 breath for that. The ivtv device is a good example of the worst type of
 situations and these days. So perhap __arch_phys_wc_add() and a
 ioremap_ucminus() might be something more than transient unless hardware folks
 get a good memo or already know how to just Do The Right Thing (TM).

I disagree.  We should try to NACK any new code that can't function
without MTRRs.

(Plus, ARM is growing in popularity in the server space, and ARM quite
sensibly doesn't have MTRRs.)

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-15 Thread Andy Lutomirski
On Wed, Apr 15, 2015 at 3:50 PM, Andy Walls awa...@md.metrocast.net wrote:
 On Wed, 2015-04-15 at 13:42 -0700, Andy Lutomirski wrote:
 On Mon, Apr 13, 2015 at 10:49 AM, Luis R. Rodriguez mcg...@suse.com wrote:

  c) ivtv: the driver does not have the PCI space mapped out separately, and
  in fact it actually does not do the math for the framebuffer, instead it 
  lets
  the device's own CPU do that and assume where its at, see
  ivtvfb_get_framebuffer() and CX2341X_OSD_GET_FRAMEBUFFER, it has a get
  but not a setter. Its not clear if the firmware would make a split easy.
  We'd need ioremap_ucminus() here too and __arch_phys_wc_add().
 

 IMO this should be conceptually easy to split.  Once we get the
 framebuffer address, just unmap it (or don't prematurely map it) and
 then ioremap the thing.

 Not so easy.  The main ivtv driver has already set up the PCI device and
 done the mapping for the MPEG-2 decoder/video output engine.  The video
 decoder/output device nodes might already be open by user space calling
 into the main driver, before the ivtvfb module is even loaded.

Surely the MPEG-2 decoder/video engine won't overlap the framebuffer,
though.  Am I missing something?

--Andy


 This could be mitigated by integrating all the ivtvfb module code into
 the main ivtv module.  But even then not every PVR-350 owner wants to
 use the video output OSD as a framebuffer.  Users might just want an
 actual OSD overlaying their TV video playback.

 Regards,
 Andy




-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-15 Thread Andy Lutomirski
On Wed, Apr 15, 2015 at 3:38 PM, Andy Walls awa...@md.metrocast.net wrote:
 Hi All,

 On Mon, 2015-04-13 at 19:49 +0200, Luis R. Rodriguez wrote:
 [snip]
  I only saw a few drivers using overlapping ioremap*()
 calls though on my MTRR review and they are all old devices so likely mostly
 used on non-PAT systems, but there might be other corner cases elsewhere.

 Lets recap, as I see it we have a few options with all this in mind on our
 quest to bury MTRR (and later make strong UC default):

 1) Let drivers do their own get_vm_area() calls as you note and handle the
cut and splicing of ioremap areas

 2) Provide an API to split and splice ioremap ranges

 3) Try to avoid these types of situations and let drivers simply try to
work towards a proper clean non-overlapping different ioremap() ranges

 Let me know if you think of others but please keep in mind the UC- to strong
 UC converstion we want to do later as well. We have ruled out using
 set_memor_wc() now.

 I prefer option 3), its technically possible but only for *new* drivers
 and we'd need either some hard work to split the ioremap() areas or provide
 a a set of *transient* APIs as I had originally proposed to phase / hope for
 final transition. There are 3 drivers to address:

 [snip]

 Just some background for folks:
 The ivtv driver supports cards that perform Standard Definition PAL,
 NTSC, and SECAM TV capture + hardware MPEG-2 encoding and MPEG-2
 decoding + TV output.

 Of the supported cards only *one* card type, the PVR-350 based on the
 CX23415 chip, can perform the MPEG-2 or YUV video decoding and output,
 with an OSD overlay.  PVR-350's are legacy PCI cards that have been end
 of life since 2088 or earlier.  Despite that, there are still used units
 available on Amazon and eBay.

 The ivtvfb driver module is an *optional* companion driver module that
 provides a framebuffer interface for the user to output an X display, FB
 console, or whatever to a standard definition analog PAL, NTSC, or SECAM
 TV or VCR.  It does this by co-opting the OSD overlay of the video
 decoding output engine and having it take up the whole screen.



 c) ivtv: the driver does not have the PCI space mapped out separately, and
 in fact it actually does not do the math for the framebuffer, instead it lets
 the device's own CPU do that and assume where its at, see
 ivtvfb_get_framebuffer() and CX2341X_OSD_GET_FRAMEBUFFER, it has a get
 but not a setter. Its not clear if the firmware would make a split easy.

 The CX2341[56]'s firmware + hardware machine are notorious for bugs and
 being hard to work with.  It would be difficult to determine with any
 certainty that the firmware returned predictable OSD framebuffer
 addresses from one user's system to the next.


 We'd need ioremap_ucminus() here too and __arch_phys_wc_add().

 Yes.

 As a driver writer/maintainer, IMO the name ioremap_ucminus() is jargon,
 without a clear meaning from the name, and maybe too x86 PAT specific?
 The pat.txt file under Documentation didn't explain what UC- meant; I
 had to go searching old LKML emails to understand it was a unchached
 region that could be overridden with write combine regions.

 To me names along, these lines would be better:
ioremap_nocache_weak(), or
ioremap_nocache_mutable(), or
ioremap_nocache()  (with ioremap_nocache_strong() or something for
 the UC version)


IMO the right solution would be to avoid ioremapping the whole bar at
startup.  Instead ioremap pieces once the driver learns what they are.
This wouldn't have any of these problems -- you'd ioremap() register
regions and you'd ioremap_wc() the framebuffer once you find it.  If
there are regions of unknown purpose, just don't map them all.

Would this be feasible?

--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR

2015-04-15 Thread Andy Lutomirski
On Wed, Apr 15, 2015 at 4:59 PM, Andy Walls awa...@md.metrocast.net wrote:
 On Wed, 2015-04-15 at 16:42 -0700, Andy Lutomirski wrote:
 On Wed, Apr 15, 2015 at 3:38 PM, Andy Walls awa...@md.metrocast.net wrote:

 

 IMO the right solution would be to avoid ioremapping the whole bar at
 startup.  Instead ioremap pieces once the driver learns what they are.
 This wouldn't have any of these problems -- you'd ioremap() register
 regions and you'd ioremap_wc() the framebuffer once you find it.  If
 there are regions of unknown purpose, just don't map them all.

 Would this be feasible?

 Feasible? Maybe.

 Worth the time and effort for end-of-life, convential PCI hardware so I
 can have an optimally performing X display on a Standard Def Analog TV
 screen?   Nope. I don't have that level of nostalgia.


The point is actually to let us unexport or delete mtrr_add.  We can
either severely regress performance on ivtv on PAT-capable hardware if
we naively switch it to arch_phys_wc_add or we can do something else.
The something else remains to be determined.


 We sort of know where some things are in the MMIO space due to
 experimentation and past efforts examining the firmware binary.

 Documentation/video4linux/cx2341x/fw-*.txt documents some things.  The
 driver code actually codifies a little bit more knowledge.

 The driver code for doing transfers between host and card is complex and
 fragile with some streams that use DMA, other streams that use PIO,
 digging VBI data straight out of card memory, and scatter-gather being
 broken on newer firmwares.  Playing around with ioremapping will be hard
 to get right and likely cause something in the code to break for the
 primary use case of the ivtv supported cards.

Ick.

If the only thing that really wants WC is the esoteric framebuffer
thing, could we just switch to arch_phys_wc_add and assume that no one
will care about the regression on new CPUs with ivtv cards?


--Andy
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html