On Fri, September 27, 2013 8:24 pm, Artem Bityutskiy wrote:
> On Fri, 2013-09-27 at 19:18 +1000, Patrick Shirkey wrote:
>> On Fri, September 27, 2013 4:19 pm, Artem Bityutskiy wrote:
>> > On Wed, 2013-09-25 at 02:49 +1000, Patrick Shirkey wrote:
>> >> Hi,
>> >>
>> >> A quick update for those who are following this thread.
>> >>
>> >> We are tracing the audio latency when running a combination of JACK
>> >> and PA.
>> >>
>> >> We are currently looking at the PA Stream Buffer as a potential
>> >> bottleneck.
>> >>
>> >> During testing I have seen latency as low as 4ms round trip but also
>> >> as high as 1300ms and the results are not stable on my hda_intel sound
>> >> device.
>> >
>> > I think you said earlier that you are using an x86 desktop for testing.
>> > What I'd try to do is prevent deep C-states. Indeed, if the package
>> > where you run pulseaudio/jack/other related processes is able to enter
>> > a deep C-state, there is an exit latency associated with it.
>> >
>> > To make a long story short, there is the /dev/cpu_dma_latency file,
>> > where you can write the latency you can tolerate (in microseconds).
>> > The kernel will translate this to the deepest C-state the processor
>> > can enter.
>> >
>> > You can write 0 there, which means the CPU won't ever enter any
>> > C-state and will busy-loop when idle. Bad for power consumption. But
>> > you can just experiment to see if this helps to lessen the latency
>> > deviation that you observe.
>> >
>> > You can write a larger number, then the CPU will enter C1 at least,
>> > which is already a lot better for PM.
>> >
>> > You can verify which C-states you hit with the 'turbostat' tool or
>> > powertop. The former comes, I think, from the kernel-tools package in
>> > Fedora. Play with the latency number and use these tools to check
>> > which C-states it corresponds to.
>> >
>> > Ah, and there is a trick. You should open /dev/cpu_dma_latency, write
>> > your latency (as ascii or binary, both are ok), and _do not close it_.
>> > As soon as you close it, the kernel will switch to the default latency
>> > constraint.
>> >
>> > Also, advanced drivers usually use the kernel PMQoS infrastructure and
>> > instruct the system when they cannot tolerate high latency.
>> >
>> > When I do 'git grep PM_QOS_CPU_DMA_LATENCY' in the kernel, I do not
>> > see the HDA driver doing this.
>> >
>> > Anyway, this may not solve the issue, but I'd suggest trying it out to
>> > see if it at least partially helps. And I am very interested to hear
>> > whether it does or not, or maybe you have already tried this.
>> >
>>
>>
>> I can't get turbostat with apt on debian as it has been removed from
>> the acpica-tools package.
>
> Ok. You can easily compile it yourself if you want. It is in the kernel
> tree in tools/power/x86/turbostat/, where you just type 'make'.
>
> Anyway, the only reason I refer to this tool is that you can use it to
> check the C-state residency statistics, and how C-state residency is
> affected by /dev/cpu_dma_latency settings.
>
>> Using powertop I see these stats with /dev/cpu_dma_latency set to 0:
>
> Did you open the file, write 0, and keep the file open? It does not look
> like it, because I see you hit C3.
>
> I do not know how to do this from the console, so I wrote a custom
> script for this.
>
> I have a python script which can do this, I can send it to you, let me
> know in a private e-mail.
>
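For anyone following along without the script, the open/write/keep-open
sequence can be sketched in a few lines of Python. This is only an
illustration, not Artem's script: the helper name hold_cpu_latency and its
path parameter are made up here, and it assumes the value is interpreted in
microseconds, as the kernel PM QoS documentation describes, and that the
process is allowed to write to the device (typically root).

```python
import os
import struct
from contextlib import contextmanager

# Device node named in the thread; overridable so the helper can be tried
# against an ordinary file.
DEFAULT_PATH = "/dev/cpu_dma_latency"


@contextmanager
def hold_cpu_latency(max_latency_us, path=DEFAULT_PATH):
    """Keep a latency request active for as long as the 'with' block runs.

    hold_cpu_latency is a hypothetical helper name, not part of any library.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # Write the value as a binary 32-bit signed integer in native byte
        # order, which is the "binary" form mentioned in the thread.
        os.write(fd, struct.pack("i", max_latency_us))
        # The request only stays in force while the descriptor is open,
        # so hand control back to the caller without closing it.
        yield fd
    finally:
        # Closing the descriptor drops the request again.
        os.close(fd)
```

Typical use would be to wrap the measurement itself, for example
"with hold_cpu_latency(0): time.sleep(60)" while watching powertop in
another terminal. The constraint goes away as soon as the block exits.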
>> Idle
>> Package      | CPU 0
>> POLL   0.0%  | POLL   0.0%  0.0 ms
>> C1     0.3%  | C1     0.4%  0.1 ms
>> C2    17.8%  | C2    17.2%  0.2 ms
>> C3    13.1%  | C3    12.0%  0.1 ms
>
> See, you are hitting C2 and C3. C3 has the highest exit latency. But I
> do not know what that would be for your platform.
>
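On the question of what the exit latency is on a given platform: the
kernel's cpuidle sysfs interface exposes a name and a latency file (in
microseconds) for each idle state, so the numbers can be read directly. A
rough sketch follows, assuming the cpuidle directory is present on the
machine; the function name read_idle_states and its root parameter are
made up for illustration.

```python
from pathlib import Path

# Standard sysfs location of the per-CPU directories.
CPU_ROOT = "/sys/devices/system/cpu"


def read_idle_states(cpu=0, root=CPU_ROOT):
    """Return [(state name, exit latency in microseconds), ...] for one CPU.

    read_idle_states is a hypothetical helper name.
    """
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    # Sort numerically on the index after "state" so state10 follows state9.
    state_dirs = sorted(base.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        latency_us = int((state_dir / "latency").read_text())
        states.append((name, latency_us))
    return states
```

Printing the result of read_idle_states(0) lists each state next to the
latency the kernel compares against the value written to
/dev/cpu_dma_latency.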
I see similar results with powertop while using your script:
./pmqos set cpu-latency 0
Idle
Package      | CPU 0
POLL   0.0%  | POLL   0.0%  0.1 ms
C1     1.0%  | C1     0.5%  0.1 ms
C2    18.9%  | C2    14.7%  0.2 ms
C3     5.4%  | C3     9.5%  0.2 ms
             | CPU 1
             | POLL   0.0%  0.0 ms
             | C1     1.5%  0.2 ms
             | C2    23.1%  0.2 ms
             | C3     1.3%  0.2 ms
----------
turbostat
----------
With turbostat I get the error below, apparently because this processor
doesn't support invariant TSC, which turbostat requires in order to run.
That suggests powertop is the only option for getting this data easily on
this machine.
turbostat -v
GenuineIntel 10 CPUID levels; family:model:stepping 0x6:f:6 (6:15:6)
No invariant TSC
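Where turbostat will not run, one possible cross-check for the powertop
numbers is the cumulative time counter (in microseconds) that the cpuidle
sysfs interface keeps per idle state. The sketch below samples it twice and
turns the difference into percentages. The function names are made up for
illustration, and these counters reflect the states the kernel requested,
which may differ from what the hardware actually entered.

```python
import time
from pathlib import Path

CPU_ROOT = "/sys/devices/system/cpu"


def read_idle_times(cpu=0, root=CPU_ROOT):
    """Return {state name: cumulative time in that state, in microseconds}."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    times = {}
    for state_dir in base.glob("state[0-9]*"):
        name = (state_dir / "name").read_text().strip()
        times[name] = int((state_dir / "time").read_text())
    return times


def residency_percent(before, after, interval_s):
    """Turn two cumulative snapshots into percent of the interval per state."""
    interval_us = interval_s * 1_000_000
    return {name: 100.0 * (after[name] - before[name]) / interval_us
            for name in after}


def sample_residency(cpu=0, interval_s=1.0, root=CPU_ROOT):
    """Read the counters twice, interval_s apart, and return percentages."""
    before = read_idle_times(cpu, root)
    time.sleep(interval_s)
    after = read_idle_times(cpu, root)
    return residency_percent(before, after, interval_s)
```

Running sample_residency(0, 5.0) with and without a latency request held
open should show whether the deeper states really stop being used.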
--
Patrick Shirkey
Boost Hardware Ltd